EXPLAIN EXTENDED

How to create fast database queries

Sorting lists: deleting blocks

Comments enabled. I *really* need your comment

This is article 6 of 6 on linked lists in MySQL:

Now, let's see how can we delete a whole block of items.

The procedure that deletes a block looks like this:

CREATE PROCEDURE prc_delete_block(first INT, last INT)
MAIN: BEGIN
        DECLARE lastchild INT;
        DECLARE firstparent INT;
        DECLARE _loop INT;
        IF first IS NULL OR last IS NULL THEN
                LEAVE MAIN;
        END IF;
        SELECT  (
                SELECT  id
                FROM    (
                        SELECT  l.id
                        FROM    (
                                SELECT  @r AS _current,
                                        @r :=
                                        (
                                        SELECT  id
                                        FROM    t_list
                                        WHERE   parent = _current
                                        ) AS id
                                FROM    (
                                        SELECT  @r := parent
                                        FROM    t_list
                                        WHERE   id = first
                                        ) vars,
                                        t_list
                                WHERE   @r <> last
                                ) q
                        JOIN
                                t_list l
                        ON      l.id = q.id
                        FOR UPDATE
                        ) qq
                WHERE   id = last
                )
        INTO    _loop;
        IF _loop IS NULL THEN
                LEAVE MAIN;
        END IF;
        SELECT  (
                SELECT  id
                FROM    t_list
                WHERE   parent = last
                FOR UPDATE
                ),
                (
                SELECT  parent
                FROM    t_list
                WHERE   id = first
                )
        INTO    lastchild, firstparent;
        DELETE
        FROM    t_list
        WHERE   id IN
                (
                SELECT  id
                FROM    (
                        SELECT  @r AS _current,
                                @r :=
                                (
                                SELECT  id
                                FROM    t_list
                                WHERE   parent = _current
                                ) AS id
                        FROM    (
                                SELECT  @r := parent
                                FROM    t_list
                                WHERE   id = first
                                ) vars,
                                t_list
                        WHERE   @r <> last
                        ) q
                );
        UPDATE  t_list
        SET     parent = firstparent
        WHERE   id = lastchild;
END
$$

CREATE PROCEDURE prc_delete (item INT)
BEGIN
        CALL prc_delete_block(item, item);
END
$$

Again, the old procedure that deletes a single item is now just a wrapper for the new procedure that deletes a whole block.

Let's look closer into this new procedure.

To delete a block, we pass two id's: first and last.

As in updating, we also need to do some sanity check:

  1. Ensure that first and last do exist in the table and last is the descendant of first.
  2. Lock the rows so that they cannot be modified in between.

Note that we didn't lock the rows when moving a block, but do lock them when deleting. Why? Because when moving, the rows if the block itself are not updated. Only three rows were updated: first, target's child and last's parent, only one of them (first) belonging to the block itself.

When deleting the block, though, the rows themselves will be deleted as well. That's why we need to lock them to ensure nothing will happen to them between we do our sanity checks and select the auxiliary data.

Of course, the deletion procedure should be run inside a transaction. I didn't put the transaction control statements into the procedure itself, as MySQL does not support nested transactions and is not very friendly to undeclared savepoints. It will be better to control transactions from the user side.

The check and the locking are done in a single query:

SELECT  (
        SELECT  id
        FROM    (
                SELECT  l.id
                FROM    (
                        SELECT  @r AS _current,
                                @r :=
                                (
                                SELECT  id
                                FROM    t_list
                                WHERE   parent = _current
                                ) AS id
                        FROM    (
                                SELECT  @r := parent
                                FROM    t_list
                                WHERE   id = first
                                ) vars,
                                t_list
                        WHERE   @r <> last
                        ) q
                JOIN
                        t_list l
                ON      l.id = q.id
                FOR UPDATE
                ) qq
        WHERE   id = last
        )
INTO    _loop;
IF _loop IS NULL THEN
        LEAVE MAIN;
END IF;

As when moving, this query selects all the rows between first and last.

Note that there is one extra subquery, aliased qq, where q is joined with t_list.

This may seem an unnecessity: we join on id and select the id, so we get no new data from the join. But let's remember that we actually use t_list in the innermost query only as a rowsource, it has nothing in common with the id's returned from the subselects.

We need to make this extra join to ensure that the rows with id's we need will be locked in t_list.

After we select and lock all the rows from the block, we try to filter out our last row and return it in a SELECT field into variable called _loop.

If we don't succeed, then last is not among the first's descendants and we return immediately.

If we do succeed, then we select and lock the last's child and also select the first's parent. We don't need to lock first, as it's already locked by the moment we got here.

The we DELETE all rows from the block. To to this we just filter all block's id's by putting the query that returns these id's into the IN condition.

Finally, we update the last's child so that it goes right after what was the first's parent.

Let's see how it works.

Before deletion:

SELECT  @r AS _current,
        @r := (
        SELECT  id
        FROM    t_list
        WHERE   parent = _current
        ) AS id
FROM    (
        SELECT  @r := 0
        ) vars,
        t_list
_current id
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
996 997
997 998
998 999
999 rows fetched in 0.0207s (0.0165s)

Now we'll delete rows from 5 to 10:

CALL prc_delete_block(5, 10)

, and see the results:

SELECT  @r AS _current,
        @r := (
        SELECT  id
        FROM    t_list
        WHERE   parent = _current
        ) AS id
FROM    (
        SELECT  @r := 0
        ) vars,
        t_list
_current id
0 1
1 2
2 3
3 4
4 11
11 12
12 13
996 997
997 998
998 999
993 rows fetched in 0.0217s (0.0149s)

The rows got deleted all right.

Written by Quassnoi

March 30th, 2009 at 11:00 pm

Posted in MySQL

Leave a Reply