EXPLAIN EXTENDED

How to create fast database queries

Linked lists in MySQL: multiple ordering

Comments enabled. I *really* need your comment

Answering questions asked on the site.

Rick McIntosh asks:

I have two tables, one with a list of spreadsheets (a) and the other with a list of the column headings that show in the spreadsheets (b).

a
id parent
1 2
2 0
b
id parent aid
1 1 1
2 0 1
3 2 1
4 6 2
5 4 2
6 0 2

I want to bring the columns back in first the order of the spreadsheets as defined by the a.parent_id then ordered as b.parent_id:

id parent aid aparent
6 0 2 0
4 6 2 0
5 4 2 0
2 0 1 2
3 2 1 2
1 1 1 2

This can be done using the same recursion technique as the one that was used to build a simple linked list.

MySQL does not support recursion directly, but it can be emulated using subquery calls in the SELECT clause of the query, using session variables to store the recursion state.

In this case we need to do the following:

  1. Generate a dummy recordset for recursion that would contain as many rows as the resulting recordset. This is best done by issuing a JOIN on the FOREIGN KEY, without any ordering. The values of the recordset will not be used in the actual query.
  2. Initialize @a and @b to be the first value of a and a 0, accordingly.
  3. In the loop, make a query that would return the next item of b for the current value of @a, or, should it fail, return the first item of b for the next @a. This is best done using a UNION ALL with a LIMIT.
  4. Adjust @a so that is points to the correct item: just select the appropriate value from b

Let's create a sample table:

Table creation details

Here's what this table contains (in id order):

SELECT  *
FROM    a
JOIN    b
ON      b.aid = a.id
ORDER BY
a.id, b.id

id parent id parent aid
1 0 1 2 1
1 0 2 3 1
1 0 3 0 1
2 4 4 5 2
2 4 5 0 2
2 4 6 4 2
3 1 7 8 3
3 1 8 9 3
3 1 9 0 3
4 3 10 12 4
4 3 11 0 4
4 3 12 11 4
5 2 13 0 5
5 2 14 15 5
5 2 15 13 5
15 rows fetched in 0.0006s (0.0029s)

There are 5 records in a and 15 records in b (3 records per item).

Here's the query:

SELECT  b.id, b.parent, a.id AS aid, a.parent AS aparent
FROM    (
SELECT  @a AS _a,
@b AS _b,
@b :=
(
SELECT  id
FROM    b
WHERE   b.aid = _a
AND b.parent = _b
UNION ALL
SELECT  b.id
FROM    a
JOIN    b
ON      b.aid = a.id
WHERE   b.parent = 0
AND a.parent = _a
LIMIT 1
) AS bid,
@b AS _nb,
@a :=
(
SELECT  aid
FROM    b
WHERE   b.id = _nb
) AS aid
FROM    (
SELECT  @a := a.id, @b := 0
FROM    a
WHERE   parent = 0
) vars,
a
JOIN    b
ON      b.aid = a.id
) q
JOIN    a
ON      a.id = aid
JOIN    b
ON      b.id = bid
id parent aid aparent
3 0 1 0
2 3 1 0
1 2 1 0
9 0 3 1
8 9 3 1
7 8 3 1
11 0 4 3
12 11 4 3
10 12 4 3
5 0 2 4
4 5 2 4
6 4 2 4
13 0 5 2
15 13 5 2
14 15 5 2
15 rows fetched in 0.0006s (0.0073s)
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY a index PRIMARY ux_a_parent 4 5 100.00 Using index
1 PRIMARY <derived2> ALL 15 100.00 Using where; Using join buffer
1 PRIMARY b eq_ref PRIMARY PRIMARY 4 q.bid 1 100.00 Using where
2 DERIVED <derived6> system 1 100.00
2 DERIVED a index PRIMARY PRIMARY 4 5 100.00 Using index
2 DERIVED b ref ux_b_aid_parent ux_b_aid_parent 4 20091229_linked.a.id 7 100.00 Using index
6 DERIVED a const ux_a_parent ux_a_parent 4 1 100.00 Using index
5 DEPENDENT SUBQUERY b eq_ref PRIMARY PRIMARY 4 func 1 100.00 Using where
3 DEPENDENT SUBQUERY b eq_ref ux_b_aid_parent ux_b_aid_parent 8 func,func 1 100.00 Using where; Using index
4 DEPENDENT UNION a eq_ref PRIMARY,ux_a_parent ux_a_parent 4 func 1 100.00 Using where; Using index
4 DEPENDENT UNION b eq_ref ux_b_aid_parent ux_b_aid_parent 8 20091229_linked.a.id 1 100.00 Using where; Using index
UNION RESULT <union3,4> ALL
Field or reference '_a' of SELECT #3 was resolved in SELECT #2
Field or reference '_b' of SELECT #3 was resolved in SELECT #2
Field or reference '_a' of SELECT #4 was resolved in SELECT #2
Field or reference '_nb' of SELECT #5 was resolved in SELECT #2
select `20091229_linked`.`b`.`id` AS `id`,`20091229_linked`.`b`.`parent` AS `parent`,`20091229_linked`.`a`.`id` AS `aid`,`20091229_linked`.`a`.`parent` AS `aparent` from (select (@a) AS `_a`,(@b) AS `_b`,(@b:=(select `20091229_linked`.`b`.`id` AS `id` from `20091229_linked`.`b` where ((`20091229_linked`.`b`.`aid` = `_a`) and (`20091229_linked`.`b`.`parent` = `_b`)) union all select `20091229_linked`.`b`.`id` AS `id` from `20091229_linked`.`a` join `20091229_linked`.`b` where () limit 1)) AS `bid`,(@b) AS `_nb`,(@a:=(select `20091229_linked`.`b`.`aid` AS `aid` from `20091229_linked`.`b` where (`20091229_linked`.`b`.`id` = `_nb`))) AS `aid` from (select (@a:='1') AS `@a := a.id`,(@b:=0) AS `@b := 0` from `20091229_linked`.`a` where 1) `vars` join `20091229_linked`.`a` join `20091229_linked`.`b` where (`20091229_linked`.`b`.`aid` = `20091229_linked`.`a`.`id`)) `q` join `20091229_linked`.`a` join `20091229_linked`.`b` where ((`20091229_linked`.`b`.`id` = `q`.`bid`) and (`20091229_linked`.`a`.`id` = `q`.`aid`))

This returns all rows in correct order.

Note that I used aliases (_a, _b and _nb) instead of variables in the subqueries. This helps MySQL to pick correct execution plans and use the indexes: a predicate containing a variable is an UNCACHEABLE SUBQUERY, while a predicate with an alias is a DEPENDENT SUBQUERY which is optimized much better.

Hope that helps.


I'm always glad to answer the questions regarding database queries.

Ask me a question

Written by Quassnoi

December 29th, 2009 at 11:00 pm

Posted in MySQL

Leave a Reply