EXPLAIN EXTENDED

How to create fast database queries

Hierarchical data in MySQL: parents and children in one query

with 23 comments

Answering questions asked on the site.

Michael asks:

I was wondering how to implement a hierarchical query in MySQL (using the ancestry chains version) for a single row, such that it picks up the parents (if any) and any children (if any).

The idea is, I want to be able to jump in at any point, provide an Id of some sort, and be able to draw out the entire hierarchy for that Id, both upwards and downwards.

We need to combine two queries here:

  1. Original hierarchical query that returns all descendants of a given id (a descendancy chain)
  2. A query that would return all ancestors of a given id (an ancestry chain)

An id can have only one parent, that's why we can employ a linked list technique to build an ancestry chain, like shown in this article:

Here's the query to to this (no functions required):

SELECT  @r AS _id,
         (
         SELECT  @r := parent
         FROM    t_hierarchy
         WHERE   id = _id
         ) AS parent,
         @l := @l + 1 AS lvl
 FROM    (
         SELECT  @r := 1218,
                 @l := 0,
                 @cl := 0
         ) vars,
         t_hierarchy h
WHERE    @r <> 0

To combine two queries, we can employ a simple UNION ALL.

The only problem that is left to preserve the correct level, since the ancestry chain query conts level backwards, and the hierarchical query will count it starting from zero.

Let's create a sample table and see what we get:

Table creation details

Now, let's try to UNION ALL the queries as is:

SELECT  CONCAT(REPEAT('    ', lvl  - 1), _id) AS treeitem, parent, lvl AS level
FROM    (
        SELECT  @r AS _id,
                (
                SELECT  @r := parent
                FROM    t_hierarchy
                WHERE   id = _id
                ) AS parent,
                @l := @l + 1 AS lvl
        FROM    (
                SELECT  @r := 1218,
                        @l := 0,
                        @cl := 0
                ) vars,
                t_hierarchy h
        WHERE   @r <> 0
        ORDER BY
                lvl DESC
        ) qi
UNION ALL
SELECT  CONCAT(REPEAT('    ', level  - 1), CAST(hi.id AS CHAR)), parent, level
FROM    (
        SELECT  hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level
        FROM    (
                SELECT  @start_with := 1218,
                        @id := @start_with,
                        @level := 0
                ) vars, t_hierarchy
        WHERE   @id IS NOT NULL
        ) ho
JOIN    t_hierarchy hi
ON      hi.id = ho.id

treeitem parent level
1 0 6
2 1 5
7 2 4
38 7 3
165 38 2
1218 165 1
5473 1218 1
24796 5473 2
24797 5473 2
24798 5473 2
24799 5473 2
24800 5473 2
5474 1218 1
24801 5474 2
24802 5474 2
24803 5474 2
24804 5474 2
24805 5474 2
5475 1218 1
24806 5475 2
24807 5475 2
24808 5475 2
24809 5475 2
24810 5475 2
5476 1218 1
24811 5476 2
24812 5476 2
24813 5476 2
24814 5476 2
24815 5476 2
5477 1218 1
24816 5477 2
24817 5477 2
24818 5477 2
24819 5477 2
24820 5477 2
36 rows fetched in 0.0014s (0.1447s)

We see that the hierarchical order is mangled: the first resultset is upside down, the second one is starting from level = 1.

To fix it, we need to change the code that calculates level a little.

First, we need to reverse the ancestry part.

This can be easily done by sorting it on lvl DESC:

SELECT  CONCAT(REPEAT('    ', level  - 1), _id) AS treeitem, parent, level
FROM    (
        SELECT  @r AS _id,
                (
                SELECT  @r := parent
                FROM    t_hierarchy
                WHERE   id = _id
                ) AS parent,
                @l := @l + 1 AS level
        FROM    (
                SELECT  @r := 1218,
                        @l := 0,
                        @cl := 0
                ) vars,
                t_hierarchy h
        WHERE   @r <> 0
        ORDER BY
                level DESC
        ) qi

treeitem parent level
1 0 6
2 1 5
7 2 4
38 7 3
165 38 2
1218 165 1
6 rows fetched in 0.0003s (0.0684s)

We now have it in correct order but with wrong level values.

Since a level is essentially a rownum here, we can just calculate it as a rownum instead:

SELECT  CONCAT(REPEAT('    ', level  - 1), id) AS treeitem, parent, level
FROM    (
        SELECT  _id AS id, parent,
                @cl := @cl + 1 AS level
        FROM    (
                SELECT  @r AS _id,
                        (
                        SELECT  @r := parent
                        FROM    t_hierarchy
                        WHERE   id = _id
                        ) AS parent,
                        @l := @l + 1 AS level
                FROM    (
                        SELECT  @r := 1218,
                                @l := 0,
                                @cl := 0
                        ) vars,
                        t_hierarchy h
                WHERE   @r <> 0
                ORDER BY
                        level DESC
                ) qi
        ) qo
treeitem parent level
1 0 1
2 1 2
7 2 3
38 7 4
165 38 5
1218 165 6
6 rows fetched in 0.0003s (0.0712s)

We disregard the previously selected level at all, leaving it only inside the inline view qi for ordering purposes.

Now, we need to merge the descendancy tree query, but with levels starting from 6, not from 1.

Since this query will come after the ancestry one, we can use accumulated value of @cl (which we used to calculate level in the previous query) as a seed.

To do that, we can take the level returned by the descendancy tree query, and just add @cl to it.

The only problem it to determine when we should increment @cl and when we should add it to level.

We will just use a boolean field which will help us to tell the sets apart.

Here's the query to do it:

SELECT  CONCAT(REPEAT('    ', level - 1), CAST(id AS CHAR)),
        parent,
        level
FROM    (
        SELECT  id, parent, IF(ancestry, @cl := @cl + 1, level + @cl) AS level
        FROM    (
                SELECT  TRUE AS ancestry, _id AS id, parent, level
                FROM    (
                        SELECT  @r AS _id,
                                (
                                SELECT  @r := parent
                                FROM    t_hierarchy
                                WHERE   id = _id
                                ) AS parent,
                                @l := @l + 1 AS level
                        FROM    (
                                SELECT  @r := 1218,
                                        @l := 0,
                                        @cl := 0
                                ) vars,
                                t_hierarchy h
                        WHERE   @r <> 0
                        ORDER BY
                                level DESC
                        ) qi
                UNION ALL
                SELECT  FALSE, hi.id, parent, level
                FROM    (
                        SELECT  hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level
                        FROM    (
                                SELECT  @start_with := 1218,
                                        @id := @start_with,
                                        @level := 0
                                ) vars, t_hierarchy
                        WHERE   @id IS NOT NULL
                        ) ho
                JOIN    t_hierarchy hi
                ON      hi.id = ho.id
                ) q
        ) q2

CONCAT(REPEAT(' ', level - 1), CAST(id AS CHAR)) parent level
1 0 1
2 1 2
7 2 3
38 7 4
165 38 5
1218 165 6
5473 1218 7
24796 5473 8
24797 5473 8
24798 5473 8
24799 5473 8
24800 5473 8
5474 1218 7
24801 5474 8
24802 5474 8
24803 5474 8
24804 5474 8
24805 5474 8
5475 1218 7
24806 5475 8
24807 5475 8
24808 5475 8
24809 5475 8
24810 5475 8
5476 1218 7
24811 5476 8
24812 5476 8
24813 5476 8
24814 5476 8
24815 5476 8
5477 1218 7
24816 5477 8
24817 5477 8
24818 5477 8
24819 5477 8
24820 5477 8
36 rows fetched in 0.0016s (0.1514s)

, which is the answer to your question.

Hope that helps.


I'm always glad to answer the questions regarding database queries.

Ask me a question

Written by Quassnoi

July 20th, 2009 at 11:00 pm

Posted in MySQL

23 Responses to 'Hierarchical data in MySQL: parents and children in one query'

Subscribe to comments with RSS

  1. This blew my head off! Thanks though, bookmarked

    Fred

    1 Nov 12 at 14:04

  2. Hi, i found this great, but some times my sql server says

    ( ! ) SCREAM: Error suppression ignored for
    ( ! ) Fatal error: Out of memory (allocated 8650752) (tried to allocate 393216 bytes) in C:\wamp\apps\phpmyadmin3.5.1\libraries\display_tbl.lib.php on line 1581
    Call Stack
    # Time Memory Function Location
    1 0.0015 535600 {main}( ) ..\import.php:0
    2 0.0658 6958528 include( ‘C:\wamp\apps\phpmyadmin3.5.1\sql.php’ ) ..\import.php:478

    i guess its taking too much memory..

    Hemant negi

    15 Feb 13 at 18:15

  3. @Hemant: that’s a PHP error.

    Quassnoi

    15 Feb 13 at 22:26

  4. Hi @Quassnoi, this is absolutely amazing! Thanks for doing this!

    I was wondering though how this could be done for the family tree data where a child has multiple parents?

    I saw your post on this in SQL https://explainextended.com/2009/05/24/genealogy-query-on-both-parents/ but I was wondering if you could show how to do this in MySQL too?

    I don’t see how this can be done in both directions (up and down) for both parents, but even if it was just up/down for both parents that would be really great.

    Thanks!

    Jeremy

    19 Oct 13 at 03:13

  5. This is excellent work!

    How would one sort these results by a timestamp? Let’s say I’ve got varied dates for these items and want to display them in either an ascending or descending order by a date instead of an ID, but retain the correct parent/child relationship and have those items correctly sorted by that field as well? Furthermore, how would one get the timestamp (say it’s a simple integer field) column in the final results as well?

    (Sorry, I know this topic is ancient and I know only the basics of MySQL.)

    Spanky

    25 Feb 14 at 23:43

  6. HI,
    Great work!
    When I ran it on my system, your all exact same code. I see that 3,4,5,6 also have parent as 1.
    So when i try to only run the query to get all descendants for id =1, it no where shows 3,4,5,6 as its descendants in the tree.
    The query I ran was this –>

    SELECT CONCAT(REPEAT(‘ ‘, level – 1), CAST(hi.id AS CHAR)) AS treeitem, parent, level
    FROM (
    SELECT hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level
    FROM (
    SELECT @start_with := 0,
    @id := @start_with,
    @level := 0
    ) vars, t_hierarchy
    WHERE @id IS NOT NULL
    ) ho
    JOIN t_hierarchy hi
    ON hi.id = ho.id

    How do I get all the descendants correctly?

    And if the code doesn’t get the descendants correctly then I doubt that it’s going to give correct parents and children tree as shown in this post.

    Prabhdeep

    17 Jan 16 at 02:43

  7. Hi,

    the query that found all the parents:

    SELECT CONCAT(REPEAT(‘ ‘, level – 1), id) AS treeitem, parent, level
    FROM (
    SELECT _id AS id, parent,
    @cl := @cl + 1 AS level
    FROM (
    SELECT @r AS _id,
    (
    SELECT @r := parent
    FROM t_hierarchy
    WHERE id = _id
    ) AS parent,
    @l := @l + 1 AS level
    FROM (
    SELECT @r := 1218,
    @l := 0,
    @cl := 0
    ) vars,
    t_hierarchy h
    WHERE @r 0
    ORDER BY
    level DESC
    ) qi
    ) qo

    does not work in MySQL 5.7.
    May be I am wrong?

    Murat

    8 Feb 16 at 19:04

  8. In additional to the previous post:

    Error Code: 1054 Unknown column ‘_ID’ in ‘where clause’

    Murat

    8 Feb 16 at 19:06

  9. @Murat: please create a setup at http://sqlfiddle.com and post the link

    Quassnoi

    8 Feb 16 at 19:23

  10. Hi,

    the thing is that sqlfiddle.com has MySQL version 5.6.21, not MySQL 5.7.10 that I use

    Murat

    10 Feb 16 at 16:20

  11. @Murat: ok please try to find a way to set it up somewhere else then. But honestly, the solutions using session variables are black magic in MySQL: they rely on undocumented and unsupported behavior which can break between releases (which it seems is exactly what happened).

    Quassnoi

    10 Feb 16 at 16:24

  12. Hi Quassnoi,

    I had to resolve my problem using dynamic sql. Thank you for response.

    Murat

    10 Feb 16 at 17:44

  13. Hi,
    I am Busy to extract information with more or less same think child and parent.

    The Parent can have multiple child’s.
    But what I notice in this result I get also the parent parents.

    In the example you select to start with 1218 but it also displayes the above parents. and I would like only to display down.

    So what I like to do is only display the tree from an selected parent.

    I hope I make myself clear, else send me an message.
    Thanks

    Marcel Snoeck

    11 Feb 16 at 14:35

  14. Hello. I have added an additional column to the t_hierarchy called type_id.

    I am not able to call that column. It says “unknown column type_id”. How can I add additional columns to t-hierarchy?

    Jason Lancaster

    15 Apr 16 at 23:14

  15. what exactly is qo and qi?

    steve

    8 May 16 at 06:01

  16. @steve: nested query aliases. Just random identifiers.

    Quassnoi

    11 May 16 at 18:04

  17. Hay!
    I want to write a single query that gives me whole child of given parent.

    Amit kumar

    9 Jun 16 at 13:19

  18. Great jobP

    Phongveth

    26 Oct 17 at 09:45

  19. Great job

    Phongveth

    26 Oct 17 at 09:45

  20. Works great. How to we select other fields from each node tho? When I add fields like “title” to the query, I just get seemingly random titles from my db alongside the real IDs I wanted.

    cronoklee

    23 Jan 18 at 14:09

  21. @cronoklee
    There are many codes on the internet that were adapted from this page. This might help you: 1https://stackoverflow.com/questions/2441821/getting-all-parent-rows-in-one-sql-query

    Malo

    26 Feb 18 at 16:38

  22. I’m trying to get my head around this.
    Something which I don’t understand from the start is that you’re passing a dynamically assigned variable ‘id’ to your hierarchy_connect_by_parent_eq_prior_id function.
    If ever I’ve tried to do this, it fails. StackOverflow posters et al, say it’s because MySQL doesn’t support it.

    How is it working in your examples?

    Elliot

    22 Feb 19 at 12:48

  23. This is a ten years old article which relies on odd and undocumented behavior of previous versions of MySQL. Newer versions of MySQL support recursive CTE which is what you should be using for this kind of queries.

    Quassnoi

    22 Feb 19 at 12:52

Leave a Reply