EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: Happy New Year!. You're welcome to read and comment on it.

Assigning people

Answering questions asked on the site.

Greg asks:

I have two tables, tasks and resources.

Tasks contains cost of each task: first requires 5 people to complete, second requires 3 people etc.

Resources is just a list of people.

How do I assign people to tasks?

This is in MySQL.

Thanks for a nice question, Greg.

Unfortunately, you haven’t sent your table structure, so I will have to make it up.

Let’s create the tables:

Table creation details

I created 100 tasks with random costs from 1 to 10, and 10000 people. This should be enough for any real-world task management system.

I also made the id‘s in tasks and resources sparse to avoid confusion.

Now, we have a set of tasks ordered by some criteria (priority, probably), and a set of people ordered by some other criteria. And we need to assign people to tasks.

It will probably be best to use a queue like algorithm: first bunch of people from the list go to the first task, the next bunch to the second task, etc.

What we need is to assign the queue number to the list of resource’s, and then calculate which resource goes to which task.

To assign the queue number to people we will use an incrementing session variable:

SELECT  r.*,
        @s := @s + 1 AS res
FROM    (
        SELECT  @s := 0
        ) vars,
        resources r
id name res
17 Employee 17 1
34 Employee 34 2
51 Employee 51 3
68 Employee 68 4
85 Employee 85 5
169966 Employee 169966 9998
169983 Employee 169983 9999
170000 Employee 170000 10000
10000 rows fetched in 0.2485s (0.0258s)

Column res in the query above shows the position in the queue.

To select what number in the queue goes to what task, we will also use the session variables, but will increment them by cost, and not by one:

SELECT  t.*,
        @r AS s_res,
        (@r := @r + cost) - 1 AS e_res
FROM    (
        SELECT  @r := 1
        ) vars,
        tasks t
id cost value s_res e_res
13 9 Task 13 1 9
26 6 Task 26 10 15
39 6 Task 39 16 21
52 9 Task 52 22 30
65 9 Task 65 31 39
1274 9 Task 1274 499 507
1287 6 Task 1287 508 513
1300 1 Task 1300 514 514
100 rows fetched in 0.0038s (0.0028s)

Task 13, which goes first in the task list, requires 9 people, that’s why people with queue numbers from 1 to 9 go to this task. Task 26, which goes second, requires 6 people, and the next bunch (from 10 to 15) goes to this task. Et cetera.

In the query above, we got the starting and ending queue position for each task.

Now, we just need to join the tables so that the queue position for each resource goes between the starting and ending positions for each task:

SELECT  *
FROM    (
        SELECT  t.*,
                @r AS s_res,
                (@r := @r + cost) - 1 AS e_res
        FROM    (
                SELECT  @r := 1
                ) vars,
                tasks t
        ) tq
LEFT JOIN
        (
        SELECT  r.*,
                @s := @s + 1 AS res
        FROM    (
                SELECT  @s := 0
                ) vars,
                resources r
        ) rq
ON      rq.res BETWEEN tq.s_res AND tq.e_res
id cost value s_res e_res id name res
13 9 Task 13 1 9 17 Employee 17 1
13 9 Task 13 1 9 34 Employee 34 2
13 9 Task 13 1 9 51 Employee 51 3
13 9 Task 13 1 9 68 Employee 68 4
13 9 Task 13 1 9 85 Employee 85 5
13 9 Task 13 1 9 102 Employee 102 6
13 9 Task 13 1 9 119 Employee 119 7
13 9 Task 13 1 9 136 Employee 136 8
13 9 Task 13 1 9 153 Employee 153 9
26 6 Task 26 10 15 170 Employee 170 10
26 6 Task 26 10 15 187 Employee 187 11
26 6 Task 26 10 15 204 Employee 204 12
26 6 Task 26 10 15 221 Employee 221 13
26 6 Task 26 10 15 238 Employee 238 14
26 6 Task 26 10 15 255 Employee 255 15
39 6 Task 39 16 21 272 Employee 272 16
39 6 Task 39 16 21 289 Employee 289 17
1287 6 Task 1287 508 513 8704 Employee 8704 512
1287 6 Task 1287 508 513 8721 Employee 8721 513
1300 1 Task 1300 514 514 8738 Employee 8738 514
514 rows fetched in 0.0268s (0.2668s)
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY <derived2> ALL 100 100.00
1 PRIMARY <derived4> ALL 10000 100.00
4 DERIVED <derived5> system 1 100.00
4 DERIVED r ALL 10476 100.00
5 DERIVED No tables used
2 DERIVED <derived3> system 1 100.00
2 DERIVED t ALL 100 100.00
3 DERIVED No tables used

The first task got 9 people, the second one got 6 people, etc.

This is quite a universal decision: you can put anything you like into the subqueries.

You can use your own ordering schemas by adding ORDER BY into the subqueries above, add arbitrary WHERE conditions to filter the tasks or people, etc.

Hope that helps.


I’m always glad to answer the questions regarding database queries.

Ask me a question

Written by Quassnoi

May 4th, 2009 at 11:00 pm

Posted in MySQL

Comments are closed.