Endless loops

Comments enabled. I *really* need your comment

Imagine that we are developing an email server that keeps the outgoing messages in a database.

The server is always busy sending the messages if the outgoing queue is not empty. It may have up to 10 connections at once.

If the message is failed to be sent, it's reinserted to the end of the queue until the retry count is exceeded.

As soon as all messages from a batch succeed or fail to be sent, next 10 messages from the queue are fetched.

How do we write a query to fetch a batch of messages from the queue?

This all can be done in a single query run in a loop. No range checking, no reinserting.

Let's create the sample table:

Table creation details

CREATE TABLE t_queue (
        id INT NOT NULL PRIMARY KEY,
        header VARCHAR(50) NOT NULL,
        attempt INT NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt <= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(20);
COMMIT;

INSERT
INTO    t_queue (id, header, attempt)
SELECT  id, CONCAT('Message ', id), 0
FROM    filler;
&#91;/sourcecode&#93;
</div>

SELECT  *
FROM    t_queue

id	header	attempt
1	Message 1	0
2	Message 2	0
3	Message 3	0
4	Message 4	0
5	Message 5	0
6	Message 6	0
7	Message 7	0
8	Message 8	0
9	Message 9	0
10	Message 10	0
11	Message 11	0
12	Message 12	0
13	Message 13	0
14	Message 14	0
15	Message 15	0
16	Message 16	0
17	Message 17	0
18	Message 18	0
19	Message 19	0
20	Message 20	0
20 rows fetched in 0.0006s (0.0018s)

We have a queue of 20 fresh messages, none of them having been attempted to send.

Let's select a batch of the messages to be sent. We need the messages for which the retry counter had not been exceeded, i. e. those having attempt < 3.

We initialize a session variable:

SET @r := -1;

and issue the following query:

SELECT  q.*
FROM    (
        SELECT  @r AS _current,
                @r := COALESCE(
                (
                SELECT  id
                FROM    t_queue
                WHERE   id > _current
                        AND attempt < 3
                ORDER BY
                        id
                LIMIT 1
                ),
                (
                SELECT  MIN(id)
                FROM    t_queue
                WHERE   attempt < 3
                )
                ) AS nid
        FROM    t_queue
        LIMIT 10
        ) ids, t_queue q
WHERE   id = nid
FOR UPDATE
&#91;/sourcecode&#93;

<div class="terminal">
<table class="terminal">
<tr><th>id</th><th>header</th><th>attempt</th></tr>
<tr><td class="integer">1</td><td class="varchar">Message 1</td><td class="integer">0</td></tr>
<tr><td class="integer">2</td><td class="varchar">Message 2</td><td class="integer">0</td></tr>
<tr><td class="integer">3</td><td class="varchar">Message 3</td><td class="integer">0</td></tr>
<tr><td class="integer">4</td><td class="varchar">Message 4</td><td class="integer">0</td></tr>
<tr><td class="integer">5</td><td class="varchar">Message 5</td><td class="integer">0</td></tr>
<tr><td class="integer">6</td><td class="varchar">Message 6</td><td class="integer">0</td></tr>
<tr><td class="integer">7</td><td class="varchar">Message 7</td><td class="integer">0</td></tr>
<tr><td class="integer">8</td><td class="varchar">Message 8</td><td class="integer">0</td></tr>
<tr><td class="integer">9</td><td class="varchar">Message 9</td><td class="integer">0</td></tr>
<tr class="statusbar"><td colspan="100">9 rows fetched in 0.0003s (0.0067s)</td></tr>
</table>
</div>

This selects a list of first <strong>10</strong> messages.

Now, if we just reissue the exactly same query, we get:

<div class="terminal">
<table class="terminal">
<tr><th>id</th><th>header</th><th>attempt</th></tr>
<tr><td class="integer">11</td><td class="varchar">Message 11</td><td class="integer">0</td></tr>
<tr><td class="integer">12</td><td class="varchar">Message 12</td><td class="integer">0</td></tr>
<tr><td class="integer">13</td><td class="varchar">Message 13</td><td class="integer">0</td></tr>
<tr><td class="integer">14</td><td class="varchar">Message 14</td><td class="integer">0</td></tr>
<tr><td class="integer">15</td><td class="varchar">Message 15</td><td class="integer">0</td></tr>
<tr><td class="integer">16</td><td class="varchar">Message 16</td><td class="integer">0</td></tr>
<tr><td class="integer">17</td><td class="varchar">Message 17</td><td class="integer">0</td></tr>
<tr><td class="integer">18</td><td class="varchar">Message 18</td><td class="integer">0</td></tr>
<tr><td class="integer">19</td><td class="varchar">Message 19</td><td class="integer">0</td></tr>
<tr><td class="integer">20</td><td class="varchar">Message 20</td><td class="integer">0</td></tr>
<tr class="statusbar"><td colspan="100">10 rows fetched in 0.0004s (0.0068s)</td></tr>
</table>
</div>

Exactly same query, but the results are different.

Now let's assume that messages <strong>5</strong>, <strong>10</strong> and <strong>15</strong> failed to be sent and update their attempts to <code>3</code>'s:


UPDATE t_queue
SET    attempt = 3
WHERE  id IN (5, 10, 15)

, and rerun the batch query. We get the following result:

id	header	attempt
1	Message 1	0
2	Message 2	0
3	Message 3	0
4	Message 4	0
6	Message 6	0
7	Message 7	0
8	Message 8	0
9	Message 9	0
11	Message 11	0
12	Message 12	0
10 rows fetched in 0.0004s (0.0068s)

The query has wrapped around automatically.

If we repeat the query again, we will get:

id	header	attempt
13	Message 13	0
14	Message 14	0
16	Message 16	0
17	Message 17	0
18	Message 18	0
19	Message 19	0
20	Message 20	0
1	Message 1	0
2	Message 2	0
3	Message 3	0
10 rows fetched in 0.0003s (0.0059s)

The query makes a rollover right inside the resultset: it selects 1 after 20.

So we can basically just run the query inside an endless loop without having to bother about rewinding, rollovers etc. The query will just select 10 next messages again and again.

So how does it work?

The principle is simple:

We use t_queue in the inner subquery as a dummy rowsource, limiting rows as necessary
We keep current id in the session variable @r
For each row in the dummy rowset, we get the id next to @r that satisfies the conditions (attempt < 3)
If there is no next id for a given @r, we select the minimal id satisfying the conditions, thus making a rollover
Finally, we join t_queue on this next id and mark it FOR UPDATE

As the intermediate result is stored in @r between the queries, we assign @r only once (right after connecting), and just run the batch query over and over again.

Written by Quassnoi

April 3rd, 2009 at 11:00 pm

Posted in MySQL

Subscribe in a reader

Subscribe by email

Contacts

Ask me a question

Drop me a line

Should I?

Yes. Feel free to ask questions and write me. An interesting question is a pleasure to answer, and I really enjoy receiving feedback

EXPLAIN EXTENDED

Endless loops

Leave a Reply

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow

EXPLAIN EXTENDED

Endless loops

Share this:

Leave a Reply

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow