<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>EXPLAIN EXTENDED</title>
	<atom:link href="http://explainextended.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://explainextended.com</link>
	<description>How to create fast database queries</description>
	<lastBuildDate>Wed, 25 Aug 2010 13:29:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>20 latest unique records</title>
		<link>http://explainextended.com/2010/08/24/20-latest-unique-records/</link>
		<comments>http://explainextended.com/2010/08/24/20-latest-unique-records/#comments</comments>
		<pubDate>Tue, 24 Aug 2010 19:00:38 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4884</guid>
		<description><![CDATA[From Stack Overflow: I have a logfile which logs the insert/delete/updates from all kinds of tables. I would like to get an overview of for example the last 20 people which records where updated, ordered by the last update (datetime DESC) A common solution for such a task would be writing an aggregate query with [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://stackoverflow.com/questions/3555118/mysql-select-20-latest-entries-in-logfile-from-unique-persons"><strong>Stack Overflow</strong></a>:</p>
<blockquote>
<p>I have a logfile which logs the insert/delete/updates from all kinds of tables.</p>
<p>I would like to get an overview of for example the last 20 people which records where updated, ordered by the last update (<code>datetime DESC</code>)</p>
</blockquote>
<p>A common solution for such a task would be writing an aggregate query with <code>ORDER BY</code> and <code>LIMIT</code>:</p>
<pre class="brush: sql">
SELECT  person, MAX(ts) AS last_update
FROM    logfile
GROUP BY
        person
ORDER BY
        last_update DESC
LIMIT 20
</pre>
<p>What&#8217;s wrong with this solution? Performance, as usual.</p>
<p>Since <code>last_update</code> is an aggregate, it cannot be indexed. And <code>ORDER BY</code> on unindexed fields results in our good old friend, <code>filesort</code>.</p>
<p>Note that even in this case the indexes can be used and the full table scan can be avoided: if there is an index on <code>(person, ts)</code>, <code>MySQL</code> will tend to use a <a href="http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html">loose index scan</a> on this index, which can save this query if there are relatively few persons in the table. However, if there are many (which is what we can expect for a log table), loose index scan can even degrade performance and generally will be avoided by <code>MySQL</code>.</p>
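<p>A minimal sketch of how to check this, assuming the table really has the <code>person</code> and <code>ts</code> columns used in the query above (the index name <code>ix_logfile_person_ts</code> is made up for the example):</p>
<pre class="brush: sql">
-- Hypothetical index name; person and ts come from the query above
CREATE INDEX ix_logfile_person_ts ON logfile (person, ts);

-- If MySQL picks a loose index scan, the Extra column of the plan
-- reads "Using index for group-by"
EXPLAIN
SELECT  person, MAX(ts) AS last_update
FROM    logfile
GROUP BY
        person;
</pre>
<p>Whether the optimizer actually chooses this plan depends on the table statistics, so the output is worth checking on real data.</p>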
<p>We should use another approach here. Let&#8217;s create a sample table and test this approach:<br />
<span id="more-4884"></span><br />
<a href="#" onclick="xcollapse('X8484');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X8484" style="display: none; background: transparent;">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

CREATE TABLE logfile (
        id INT NOT NULL PRIMARY KEY,
        sparse INT NOT NULL,
        dense INT NOT NULL,
        ts DATETIME NOT NULL,
        stuffing VARCHAR(100) NOT NULL,
        KEY ix_logfile_ts_id (ts, id),
        KEY ix_logfile_sparse_ts_id (sparse, ts, id),
        KEY ix_logfile_dense_ts_id (dense, ts, id)
) ENGINE=InnoDB;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(500000);
COMMIT;

INSERT
INTO    logfile
SELECT  id,
        CEILING(RAND(20100824) * 30),
        CEILING(RAND(20100824 &lt;&lt; 1) * 30000),
        &#039;2010-08-24&#039; - INTERVAL RAND(20100824 &lt;&lt; 2) * 10000000 SECOND,
        LPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    filler;
</pre>
</div>
<p>This table has <strong>500,000</strong> records.</p>
<p>Instead of a single field, <code>person</code>, I created two different fields: <code>sparse</code> and <code>dense</code>. The first one has <strong>30</strong> distinct values, while the second one has <strong>30,000</strong>. This will help us to see how data distribution affects performance of different queries.</p>
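<p>The distribution can be verified with a quick count. This is only a sanity check on the sample table created above, not part of the solution:</p>
<pre class="brush: sql">
-- Total rows and the number of distinct values in each grouping column
SELECT  COUNT(*) AS total_rows,
        COUNT(DISTINCT sparse) AS sparse_values,
        COUNT(DISTINCT dense) AS dense_values
FROM    logfile;
</pre>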
<p>Let&#8217;s run our original queries. We&#8217;ll adjust them a little to help <code>MySQL</code> pick the correct plans:</p>
<pre class="brush: sql">
SELECT  *
FROM    (
        SELECT  sparse, MAX(ts) AS last_update
        FROM    logfile
        GROUP BY
                sparse
        ) q
ORDER BY
        last_update DESC
LIMIT 20;
</pre>
<p><a href="#" onclick="xcollapse('X4382');return false;"><strong>View query results</strong></a><br />
</p>
<div id="X4382" style="display: none; background: transparent;">
<div class="terminal">
<table class="terminal">
<tr>
<th>sparse</th>
<th>last_update</th>
</tr>
<tr>
<td class="integer">15</td>
<td class="timestamp">2010-08-23 23:59:58</td>
</tr>
<tr>
<td class="integer">26</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">11</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">30</td>
<td class="timestamp">2010-08-23 23:59:42</td>
</tr>
<tr>
<td class="integer">29</td>
<td class="timestamp">2010-08-23 23:59:32</td>
</tr>
<tr>
<td class="integer">13</td>
<td class="timestamp">2010-08-23 23:58:54</td>
</tr>
<tr>
<td class="integer">27</td>
<td class="timestamp">2010-08-23 23:58:53</td>
</tr>
<tr>
<td class="integer">7</td>
<td class="timestamp">2010-08-23 23:58:46</td>
</tr>
<tr>
<td class="integer">5</td>
<td class="timestamp">2010-08-23 23:58:00</td>
</tr>
<tr>
<td class="integer">12</td>
<td class="timestamp">2010-08-23 23:57:44</td>
</tr>
<tr>
<td class="integer">14</td>
<td class="timestamp">2010-08-23 23:57:24</td>
</tr>
<tr>
<td class="integer">6</td>
<td class="timestamp">2010-08-23 23:56:58</td>
</tr>
<tr>
<td class="integer">2</td>
<td class="timestamp">2010-08-23 23:56:48</td>
</tr>
<tr>
<td class="integer">24</td>
<td class="timestamp">2010-08-23 23:56:13</td>
</tr>
<tr>
<td class="integer">17</td>
<td class="timestamp">2010-08-23 23:56:12</td>
</tr>
<tr>
<td class="integer">23</td>
<td class="timestamp">2010-08-23 23:55:08</td>
</tr>
<tr>
<td class="integer">19</td>
<td class="timestamp">2010-08-23 23:55:07</td>
</tr>
<tr>
<td class="integer">20</td>
<td class="timestamp">2010-08-23 23:53:44</td>
</tr>
<tr>
<td class="integer">10</td>
<td class="timestamp">2010-08-23 23:51:52</td>
</tr>
<tr>
<td class="integer">4</td>
<td class="timestamp">2010-08-23 23:50:53</td>
</tr>
<tr class="statusbar">
<td colspan="100">20 rows fetched in 0.0005s (0.0026s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">30</td>
<td class="double">100.00</td>
<td class="varchar">Using filesort</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">logfile</td>
<td class="varchar">range</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_sparse_ts_id</td>
<td class="varchar">4</td>
<td class="varchar"></td>
<td class="bigint">500133</td>
<td class="double">100.00</td>
<td class="varchar">Using index for group-by</td>
</tr>
</table>
</div>
<pre>
select `q`.`sparse` AS `sparse`,`q`.`last_update` AS `last_update` from (select `20100824_latest`.`logfile`.`sparse` AS `sparse`,max(`20100824_latest`.`logfile`.`ts`) AS `last_update` from `20100824_latest`.`logfile` group by `20100824_latest`.`logfile`.`sparse`) `q` order by `q`.`last_update` desc limit 20
</pre>
</div>
<pre class="brush: sql">
SELECT  *
FROM    (
        SELECT  dense, MAX(ts) AS last_update
        FROM    logfile
        GROUP BY
                dense
        ) q
ORDER BY
        last_update DESC
LIMIT 20;
</pre>
<p><a href="#" onclick="xcollapse('X2066');return false;"><strong>View query results</strong></a><br />
</p>
<div id="X2066" style="display: none; background: transparent;">
<div class="terminal">
<table class="terminal">
<tr>
<th>dense</th>
<th>last_update</th>
</tr>
<tr>
<td class="integer">25324</td>
<td class="timestamp">2010-08-23 23:59:58</td>
</tr>
<tr>
<td class="integer">13060</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">3268</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">2327</td>
<td class="timestamp">2010-08-23 23:59:42</td>
</tr>
<tr>
<td class="integer">23968</td>
<td class="timestamp">2010-08-23 23:59:32</td>
</tr>
<tr>
<td class="integer">1622</td>
<td class="timestamp">2010-08-23 23:58:54</td>
</tr>
<tr>
<td class="integer">29693</td>
<td class="timestamp">2010-08-23 23:58:53</td>
</tr>
<tr>
<td class="integer">655</td>
<td class="timestamp">2010-08-23 23:58:46</td>
</tr>
<tr>
<td class="integer">5802</td>
<td class="timestamp">2010-08-23 23:58:07</td>
</tr>
<tr>
<td class="integer">11843</td>
<td class="timestamp">2010-08-23 23:58:00</td>
</tr>
<tr>
<td class="integer">18894</td>
<td class="timestamp">2010-08-23 23:57:44</td>
</tr>
<tr>
<td class="integer">6180</td>
<td class="timestamp">2010-08-23 23:57:26</td>
</tr>
<tr>
<td class="integer">9398</td>
<td class="timestamp">2010-08-23 23:57:24</td>
</tr>
<tr>
<td class="integer">18012</td>
<td class="timestamp">2010-08-23 23:56:58</td>
</tr>
<tr>
<td class="integer">25758</td>
<td class="timestamp">2010-08-23 23:56:49</td>
</tr>
<tr>
<td class="integer">2379</td>
<td class="timestamp">2010-08-23 23:56:48</td>
</tr>
<tr>
<td class="integer">821</td>
<td class="timestamp">2010-08-23 23:56:39</td>
</tr>
<tr>
<td class="integer">4186</td>
<td class="timestamp">2010-08-23 23:56:13</td>
</tr>
<tr>
<td class="integer">20198</td>
<td class="timestamp">2010-08-23 23:56:12</td>
</tr>
<tr>
<td class="integer">18615</td>
<td class="timestamp">2010-08-23 23:56:01</td>
</tr>
<tr class="statusbar">
<td colspan="100">20 rows fetched in 0.0005s (0.5000s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">30000</td>
<td class="double">100.00</td>
<td class="varchar">Using filesort</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">logfile</td>
<td class="varchar">range</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_dense_ts_id</td>
<td class="varchar">4</td>
<td class="varchar"></td>
<td class="bigint">500133</td>
<td class="double">100.00</td>
<td class="varchar">Using index for group-by</td>
</tr>
</table>
</div>
<pre>
select `q`.`dense` AS `dense`,`q`.`last_update` AS `last_update` from (select `20100824_latest`.`logfile`.`dense` AS `dense`,max(`20100824_latest`.`logfile`.`ts`) AS `last_update` from `20100824_latest`.`logfile` group by `20100824_latest`.`logfile`.`dense`) `q` order by `q`.`last_update` desc limit 20
</pre>
</div>
<p>We see that both queries use the same plan and return <strong>20</strong> records, but the first one is instant, while the second one runs for <strong>500 ms</strong>. Both queries use <strong>filesort</strong>, but in the second case it has to sort <strong>30,000</strong> records (compared to <strong>30</strong> in the first case).</p>
<p>In this case, it is better to use another approach.</p>
<p>With our original query, we take each person and see which record is the latest for this person. But we can just as well do it the other way round: take the records in descending order, one by one, and for each record see if it&#8217;s the latest for this person. If it is, we should return it; if it&#8217;s not, this means that the latest record for this person has already been returned (remember, we take them in descending order).</p>
<p>It&#8217;s easy to see that <strong>20</strong> records returned this way will, first, belong to <strong>20</strong> different people, and, second, be the latest records of their respective persons. This is exactly what we need.</p>
<p>The records can easily be scanned in descending order using the index on <code>(ts, id)</code>. But how do we check if the record is the latest? It&#8217;s simple: we just take the last record for the given person from the index on <code>(person, ts, id)</code> and compare its <code>id</code>. It takes but a single index seek per record and is almost instant.</p>
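<p>As a sketch, the check for a single person can be written as a standalone query. The value <code>15</code> is just an arbitrary example of a <code>sparse</code> value from the sample table:</p>
<pre class="brush: sql">
-- Latest record for one person: a single seek to the end of that
-- person's range in the index on (sparse, ts, id)
SELECT  id
FROM    logfile
WHERE   sparse = 15
ORDER BY
        sparse DESC, ts DESC, id DESC
LIMIT 1;
</pre>
<p>Listing the leading column in <code>ORDER BY</code>, although the <code>WHERE</code> condition already fixes it, appears to be a hint that helps <code>MySQL</code> match the sort to the index. A record taken from the general timeline is the latest for its person exactly when its <code>id</code> equals the <code>id</code> returned by this check.</p>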
<p>Here&#8217;s the query to do it:</p>
<pre class="brush: sql">
SELECT  id, sparse, dense, ts
FROM    logfile lf
WHERE   id =
        (
        SELECT  id
        FROM    logfile lfi
        WHERE   lfi.sparse = lf.sparse
        ORDER BY
                sparse DESC, ts DESC, id DESC
        LIMIT 1
        )
ORDER BY
        ts DESC, id DESC
LIMIT 20
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>sparse</th>
<th>dense</th>
<th>ts</th>
</tr>
<tr>
<td class="integer">121946</td>
<td class="integer">15</td>
<td class="integer">25324</td>
<td class="timestamp">2010-08-23 23:59:58</td>
</tr>
<tr>
<td class="integer">276499</td>
<td class="integer">11</td>
<td class="integer">3268</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">62419</td>
<td class="integer">26</td>
<td class="integer">13060</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">254750</td>
<td class="integer">30</td>
<td class="integer">2327</td>
<td class="timestamp">2010-08-23 23:59:42</td>
</tr>
<tr>
<td class="integer">96079</td>
<td class="integer">29</td>
<td class="integer">23968</td>
<td class="timestamp">2010-08-23 23:59:32</td>
</tr>
<tr>
<td class="integer">290657</td>
<td class="integer">13</td>
<td class="integer">1622</td>
<td class="timestamp">2010-08-23 23:58:54</td>
</tr>
<tr>
<td class="integer">278842</td>
<td class="integer">27</td>
<td class="integer">29693</td>
<td class="timestamp">2010-08-23 23:58:53</td>
</tr>
<tr>
<td class="integer">329318</td>
<td class="integer">7</td>
<td class="integer">655</td>
<td class="timestamp">2010-08-23 23:58:46</td>
</tr>
<tr>
<td class="integer">384956</td>
<td class="integer">5</td>
<td class="integer">11843</td>
<td class="timestamp">2010-08-23 23:58:00</td>
</tr>
<tr>
<td class="integer">386333</td>
<td class="integer">12</td>
<td class="integer">18894</td>
<td class="timestamp">2010-08-23 23:57:44</td>
</tr>
<tr>
<td class="integer">260404</td>
<td class="integer">14</td>
<td class="integer">9398</td>
<td class="timestamp">2010-08-23 23:57:24</td>
</tr>
<tr>
<td class="integer">471000</td>
<td class="integer">6</td>
<td class="integer">18012</td>
<td class="timestamp">2010-08-23 23:56:58</td>
</tr>
<tr>
<td class="integer">172079</td>
<td class="integer">2</td>
<td class="integer">2379</td>
<td class="timestamp">2010-08-23 23:56:48</td>
</tr>
<tr>
<td class="integer">112653</td>
<td class="integer">24</td>
<td class="integer">4186</td>
<td class="timestamp">2010-08-23 23:56:13</td>
</tr>
<tr>
<td class="integer">291683</td>
<td class="integer">17</td>
<td class="integer">20198</td>
<td class="timestamp">2010-08-23 23:56:12</td>
</tr>
<tr>
<td class="integer">144673</td>
<td class="integer">23</td>
<td class="integer">25055</td>
<td class="timestamp">2010-08-23 23:55:08</td>
</tr>
<tr>
<td class="integer">172118</td>
<td class="integer">19</td>
<td class="integer">29039</td>
<td class="timestamp">2010-08-23 23:55:07</td>
</tr>
<tr>
<td class="integer">198913</td>
<td class="integer">20</td>
<td class="integer">9887</td>
<td class="timestamp">2010-08-23 23:53:44</td>
</tr>
<tr>
<td class="integer">491436</td>
<td class="integer">10</td>
<td class="integer">17752</td>
<td class="timestamp">2010-08-23 23:51:52</td>
</tr>
<tr>
<td class="integer">346651</td>
<td class="integer">4</td>
<td class="integer">10951</td>
<td class="timestamp">2010-08-23 23:50:53</td>
</tr>
<tr class="statusbar">
<td colspan="100">20 rows fetched in 0.0007s (0.0034s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">lf</td>
<td class="varchar">index</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_ts_id</td>
<td class="varchar">12</td>
<td class="varchar"></td>
<td class="bigint">20</td>
<td class="double">2500660.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">lfi</td>
<td class="varchar">ref</td>
<td class="varchar">ix_logfile_sparse_ts_id</td>
<td class="varchar">ix_logfile_sparse_ts_id</td>
<td class="varchar">4</td>
<td class="varchar">20100824_latest.lf.sparse</td>
<td class="bigint">27785</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;20100824_latest.lf.sparse&#39; of SELECT #2 was resolved in SELECT #1
select `20100824_latest`.`lf`.`id` AS `id`,`20100824_latest`.`lf`.`sparse` AS `sparse`,`20100824_latest`.`lf`.`dense` AS `dense`,`20100824_latest`.`lf`.`ts` AS `ts` from `20100824_latest`.`logfile` `lf` where (`20100824_latest`.`lf`.`id` = (select `20100824_latest`.`lfi`.`id` from `20100824_latest`.`logfile` `lfi` where (`20100824_latest`.`lfi`.`sparse` = `20100824_latest`.`lf`.`sparse`) order by `20100824_latest`.`lfi`.`sparse` desc,`20100824_latest`.`lfi`.`ts` desc,`20100824_latest`.`lfi`.`id` desc limit 1)) order by `20100824_latest`.`lf`.`ts` desc,`20100824_latest`.`lf`.`id` desc limit 20
</pre>
<p>As we can see, this query uses two different indexes. The first one, on <code>(ts, id)</code>, is used to scan all records according to the overall timeline; the second one, on <code>(sparse, ts, id)</code>, is used to find the <code>id</code> of the latest entry for a person and check if it&#8217;s the same as the record selected from the general timeline.</p>
<p>The query is instant: <strong>3 ms</strong>.</p>
<p>Let&#8217;s check the same query on a column with lots of values:</p>
<pre class="brush: sql">
SELECT  id, sparse, dense, ts
FROM    logfile lf
WHERE   id =
        (
        SELECT  id
        FROM    logfile lfi
        WHERE   lfi.dense = lf.dense
        ORDER BY
                dense DESC, ts DESC, id DESC
        LIMIT 1
        )
ORDER BY
        ts DESC, id DESC
LIMIT 20
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>sparse</th>
<th>dense</th>
<th>ts</th>
</tr>
<tr>
<td class="integer">121946</td>
<td class="integer">15</td>
<td class="integer">25324</td>
<td class="timestamp">2010-08-23 23:59:58</td>
</tr>
<tr>
<td class="integer">276499</td>
<td class="integer">11</td>
<td class="integer">3268</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">62419</td>
<td class="integer">26</td>
<td class="integer">13060</td>
<td class="timestamp">2010-08-23 23:59:56</td>
</tr>
<tr>
<td class="integer">254750</td>
<td class="integer">30</td>
<td class="integer">2327</td>
<td class="timestamp">2010-08-23 23:59:42</td>
</tr>
<tr>
<td class="integer">96079</td>
<td class="integer">29</td>
<td class="integer">23968</td>
<td class="timestamp">2010-08-23 23:59:32</td>
</tr>
<tr>
<td class="integer">290657</td>
<td class="integer">13</td>
<td class="integer">1622</td>
<td class="timestamp">2010-08-23 23:58:54</td>
</tr>
<tr>
<td class="integer">278842</td>
<td class="integer">27</td>
<td class="integer">29693</td>
<td class="timestamp">2010-08-23 23:58:53</td>
</tr>
<tr>
<td class="integer">329318</td>
<td class="integer">7</td>
<td class="integer">655</td>
<td class="timestamp">2010-08-23 23:58:46</td>
</tr>
<tr>
<td class="integer">277612</td>
<td class="integer">15</td>
<td class="integer">5802</td>
<td class="timestamp">2010-08-23 23:58:07</td>
</tr>
<tr>
<td class="integer">384956</td>
<td class="integer">5</td>
<td class="integer">11843</td>
<td class="timestamp">2010-08-23 23:58:00</td>
</tr>
<tr>
<td class="integer">386333</td>
<td class="integer">12</td>
<td class="integer">18894</td>
<td class="timestamp">2010-08-23 23:57:44</td>
</tr>
<tr>
<td class="integer">201899</td>
<td class="integer">7</td>
<td class="integer">6180</td>
<td class="timestamp">2010-08-23 23:57:26</td>
</tr>
<tr>
<td class="integer">260404</td>
<td class="integer">14</td>
<td class="integer">9398</td>
<td class="timestamp">2010-08-23 23:57:24</td>
</tr>
<tr>
<td class="integer">471000</td>
<td class="integer">6</td>
<td class="integer">18012</td>
<td class="timestamp">2010-08-23 23:56:58</td>
</tr>
<tr>
<td class="integer">451808</td>
<td class="integer">26</td>
<td class="integer">25758</td>
<td class="timestamp">2010-08-23 23:56:49</td>
</tr>
<tr>
<td class="integer">172079</td>
<td class="integer">2</td>
<td class="integer">2379</td>
<td class="timestamp">2010-08-23 23:56:48</td>
</tr>
<tr>
<td class="integer">367042</td>
<td class="integer">11</td>
<td class="integer">821</td>
<td class="timestamp">2010-08-23 23:56:39</td>
</tr>
<tr>
<td class="integer">112653</td>
<td class="integer">24</td>
<td class="integer">4186</td>
<td class="timestamp">2010-08-23 23:56:13</td>
</tr>
<tr>
<td class="integer">291683</td>
<td class="integer">17</td>
<td class="integer">20198</td>
<td class="timestamp">2010-08-23 23:56:12</td>
</tr>
<tr>
<td class="integer">127839</td>
<td class="integer">11</td>
<td class="integer">18615</td>
<td class="timestamp">2010-08-23 23:56:01</td>
</tr>
<tr class="statusbar">
<td colspan="100">20 rows fetched in 0.0007s (0.0031s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">lf</td>
<td class="varchar">index</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_ts_id</td>
<td class="varchar">12</td>
<td class="varchar"></td>
<td class="bigint">20</td>
<td class="double">2500660.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">lfi</td>
<td class="varchar">ref</td>
<td class="varchar">ix_logfile_dense_ts_id</td>
<td class="varchar">ix_logfile_dense_ts_id</td>
<td class="varchar">4</td>
<td class="varchar">20100824_latest.lf.dense</td>
<td class="bigint">8</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;20100824_latest.lf.dense&#39; of SELECT #2 was resolved in SELECT #1
select `20100824_latest`.`lf`.`id` AS `id`,`20100824_latest`.`lf`.`sparse` AS `sparse`,`20100824_latest`.`lf`.`dense` AS `dense`,`20100824_latest`.`lf`.`ts` AS `ts` from `20100824_latest`.`logfile` `lf` where (`20100824_latest`.`lf`.`id` = (select `20100824_latest`.`lfi`.`id` from `20100824_latest`.`logfile` `lfi` where (`20100824_latest`.`lfi`.`dense` = `20100824_latest`.`lf`.`dense`) order by `20100824_latest`.`lfi`.`dense` desc,`20100824_latest`.`lfi`.`ts` desc,`20100824_latest`.`lfi`.`id` desc limit 1)) order by `20100824_latest`.`lf`.`ts` desc,`20100824_latest`.`lf`.`id` desc limit 20
</pre>
<p>We see that the query is instant again, despite the data distribution being completely different. This is because the query only skips the records which are not the latest for their persons, so the total number of records to scan is defined by how many records we browse before we encounter the <strong>20th</strong> unique value in our scan. This number drops quickly as the number of distinct persons in the table grows. Even with only <strong>20</strong> distinct persons, evenly distributed, it is about <strong>72</strong> records on average, and the chance of still missing a person after <em>n</em> records is at most 20 &times; 0.95<sup><em>n</em></sup>, which puts the scan under <strong>150</strong> records with about <strong>99%</strong> probability.</p>
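<p>One rough way to see how many records the scan had to visit is to count the records that are at least as recent as the last row returned. The timestamp below is that of the 20th row in the results above:</p>
<pre class="brush: sql">
-- Approximate number of records visited by the descending scan:
-- everything from the newest record down to the 20th returned row.
-- Records sharing that timestamp but not reached by the scan are
-- counted too, so this slightly overestimates.
SELECT  COUNT(*) AS records_scanned
FROM    logfile
WHERE   ts &gt;= &#039;2010-08-23 23:56:01&#039;;
</pre>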
<p>The only problem that can arise here is that the number of distinct persons in the table is <em>less</em> than the <code>LIMIT</code> we set. In this case, once the latest record of every person has been returned, no further record can pass the check, so the <code>LIMIT</code> is never reached and a full index scan (accompanied by an index seek once per record) will ultimately be performed.</p>
<p>To work around this, the following simple query can be run in advance:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    (
        SELECT  DISTINCT sparse
        FROM    logfile
        LIMIT 20
        ) q
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="bigint">20</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0015s)</td>
</tr>
</table>
</div>
<p><a href="#" onclick="xcollapse('X1238');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1238" style="display: none; background: transparent;">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint"></td>
<td class="double"></td>
<td class="varchar">Select tables optimized away</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">logfile</td>
<td class="varchar">range</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_sparse_ts_id</td>
<td class="varchar">4</td>
<td class="varchar"></td>
<td class="bigint">19</td>
<td class="double">100.00</td>
<td class="varchar">Using index for group-by; Using temporary</td>
</tr>
</table>
</div>
<pre>
select count(0) AS `COUNT(*)` from (select distinct `20100824_latest`.`logfile`.`sparse` AS `sparse` from `20100824_latest`.`logfile` limit 20) `q`
</pre>
</div>
<p>This query will return the actual number of distinct persons in the table if there are fewer than <strong>20</strong> (or <strong>20</strong> if there are more).</p>
<p>This query is instant even for the dense data:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    (
        SELECT  DISTINCT dense
        FROM    logfile
        LIMIT 20
        ) q
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="bigint">20</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0024s)</td>
</tr>
</table>
</div>
<p><a href="#" onclick="xcollapse('X1940');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1940" style="display: none; background: transparent;">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint"></td>
<td class="double"></td>
<td class="varchar">Select tables optimized away</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">logfile</td>
<td class="varchar">index</td>
<td class="varchar"></td>
<td class="varchar">ix_logfile_dense_ts_id</td>
<td class="varchar">16</td>
<td class="varchar"></td>
<td class="bigint">500132</td>
<td class="double">12.50</td>
<td class="varchar">Using index; Using temporary</td>
</tr>
</table>
</div>
<pre>
select count(0) AS `COUNT(*)` from (select distinct `20100824_latest`.`logfile`.`dense` AS `dense` from `20100824_latest`.`logfile` limit 20) `q`
</pre>
</div>
<p>This needs to be run as a separate query because <strong>MySQL</strong> does not allow using anything other than constants in the <code>LIMIT</code> clause. The result of this query should be substituted into the <code>LIMIT</code> clause on the client or in a dynamically composed query on the server.</p>
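<p>On the server, one way to compose the query dynamically is a prepared statement, since <code>MySQL</code> accepts <code>?</code> placeholders for <code>LIMIT</code> inside prepared statements. A minimal sketch, with the variable name <code>@lim</code> and the statement name <code>stmt_latest</code> chosen only for illustration:</p>
<pre class="brush: sql">
-- Number of distinct persons, capped at 20
SELECT  COUNT(*)
INTO    @lim
FROM    (
        SELECT  DISTINCT sparse
        FROM    logfile
        LIMIT 20
        ) q;

-- The main query with a placeholder instead of a constant LIMIT
PREPARE stmt_latest FROM &#039;
SELECT  id, sparse, dense, ts
FROM    logfile lf
WHERE   id =
        (
        SELECT  id
        FROM    logfile lfi
        WHERE   lfi.sparse = lf.sparse
        ORDER BY
                sparse DESC, ts DESC, id DESC
        LIMIT 1
        )
ORDER BY
        ts DESC, id DESC
LIMIT ?&#039;;

EXECUTE stmt_latest USING @lim;
DEALLOCATE PREPARE stmt_latest;
</pre>
<p>The same two steps can be done on the client instead: run the counting query first, then substitute its result into the <code>LIMIT</code> of the main query.</p>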
<h3>Summary</h3>
<p>To select a number of latest unique records from a table, one can use aggregate functions. However, this can decrease the query performance.</p>
<p>This can be done more efficiently by creating two different indexes on the table and checking the records taken from the general timeline against the end of the index on the person&#8217;s timeline.</p>
<p>To avoid performance degradation in marginal cases (when the total number of persons in the table is less than <code>LIMIT</code>), it is possible to make an additional check for the total number of distinct records and adjust the <code>LIMIT</code> clause if there are not enough records.</p>
<p><strong>P. S.</strong> I decided to enable comments for the technical posts as well. You are welcome to comment.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/08/24/20-latest-unique-records/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Questions</title>
		<link>http://explainextended.com/2010/07/15/questions/</link>
		<comments>http://explainextended.com/2010/07/15/questions/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 19:00:25 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[Meta]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4866</guid>
		<description><![CDATA[I had tons of work the last month and got a whole bunch of questions in the queue which I haven&#8217;t answered yet. Pawel, Mark, Kate, Felix, Andrew, Ludwig, Santos, Jeremy, John, another John, Anjan and Dave — I do remember about you, guys! Will try to resume the normal blogging schedule the next week. [...]]]></description>
			<content:encoded><![CDATA[<p>I had tons of work last month and got a whole bunch of questions in the queue which I haven&#8217;t answered yet.</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/07/wall-e021.jpg" alt="" title="WALL-E" width="400" height="236" class="aligncenter size-full wp-image-4868 noborder" /></p>
<p><strong>Pawel</strong>, <strong>Mark</strong>, <strong>Kate</strong>, <strong>Felix</strong>, <strong>Andrew</strong>, <strong>Ludwig</strong>, <strong>Santos</strong>, <strong>Jeremy</strong>, <strong>John</strong>, another <strong>John</strong>, <strong>Anjan</strong> and <strong>Dave</strong> — I do remember about you, guys!</p>
<p>Will try to resume the normal blogging schedule the next week.</p>
<p>Stay subscribed!</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/07/15/questions/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Indexing for ORDER BY / LIMIT</title>
		<link>http://explainextended.com/2010/06/30/indexing-for-order-by-limit/</link>
		<comments>http://explainextended.com/2010/06/30/indexing-for-order-by-limit/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 19:00:34 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4831</guid>
		<description><![CDATA[Answering questions asked on the site. Frode Underhill asks: I have some applications that are logging to a MySQL database table. The table is pretty standard on the form: timeBIGINT(20) sourceTINYINT(4) severityENUM textVARCHAR(255) , where source identifies the system that generated the log entry. There are very many entries in the table (>100 million), of [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Frode Underhill</strong> asks:</p>
<blockquote><p>I have some applications that are logging to a <strong>MySQL</strong> database table.</p>
<p>The table is pretty standard on the form:</p>
<table class="excel">
<tr>
<th>time<br/><code>BIGINT(20)</code></th>
<th>source<br/><code>TINYINT(4)</code></th>
<th>severity<br/><code>ENUM</code></th>
<th>text<br/><code>VARCHAR(255)</code></th>
</tr>
</table>
<p>, where <code>source</code> identifies the system that generated the log entry.</p>
<p>There are very many entries in the table (<strong>>100 million</strong>), of which <strong>99.9999%</strong> are debug or info messages.</p>
<p>I&#8217;m making an interface for browsing this log, which means I&#8217;ll be doing queries like</p>
<pre class="brush: sql">
SELECT  *
FROM    log
WHERE   source = 2
        AND severity IN (1,2)
        AND time &gt; 12345
ORDER BY
        time ASC
LIMIT 30
</pre>
<p>, if I want to find debug or info log entries from a certain point in time, or </p>
<pre class="brush: sql">
SELECT  *
FROM    log
WHERE   source = 2
        AND severity IN (1,2)
        AND time &lt; 12345
ORDER BY
        time DESC
LIMIT 30
</pre>
<p>for finding entries right before a certain time.</p>
<p>How would one go about indexing &#038; querying such a table?</p>
<p>I thought I had it figured out (I pretty much just tried every different combination of columns in an index), but there&#8217;s always some set of parameters that results in a really slow query.
</p></blockquote>
<p>The problem is that you cannot use a single index both for filtering and ordering if you have a ranged condition (<code>severity IN (1, 2)</code> in this case).</p>
<p>Recently I wrote an article with a proposal to improve <strong>SQL</strong> optimizers to handle these conditions. If a range has low cardinality (that is, there are few values that can possibly satisfy the range), then the query could be improved by rewriting the range as a series of individual queries, each one using one of the values constituting the range in an equijoin:</p>
<ul>
<li><a href="/2010/05/19/things-sql-needs-determining-range-cardinality/"><strong>Things SQL needs: determining range cardinality</strong></a></li>
</ul>
<p>No optimizers can handle this condition automatically yet, so we&#8217;ll need to emulate it.</p>
<p>Since the <code>severity</code> field is defined as an <code>enum</code> with only <strong>5</strong> values possible, any range condition on this field can be satisfied by no more than <strong>5</strong> distinct values, thus making this table ideal for rewriting the query.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4831"></span><br />
<a href="#" onclick="xcollapse('X1733');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X1733" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

CREATE TABLE t_log (
        id INT NOT NULL,
        ts BIGINT NOT NULL,
        source TINYINT(4) NOT NULL,
        severity ENUM(&#039;DEBUG&#039;,&#039;INFO&#039;,&#039;WARNING&#039;,&#039;ERROR&#039;,&#039;FATAL&#039;) NOT NULL,
        tx VARCHAR(255)
) ENGINE=MyISAM;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(3500);
COMMIT;

INSERT
INTO    t_log
SELECT  (f1.id - 1) * 3000 + f2.id,
        UNIX_TIMESTAMP(&#039;2010-06-29&#039; - INTERVAL (f1.id - 1) * 3000 + f2.id SECOND),
        CEILING(RAND(20100629) * 10),
        5 - FLOOR(LOG10(CEILING(RAND(20100629 &lt;&lt; 1) * 99999))),
        CONCAT(&#039;Message &#039;, (f1.id - 1) * 3000 + f2.id)
FROM    filler f1
CROSS JOIN
        filler f2;

CREATE INDEX ix_log_source_ts ON t_log (source, ts);

CREATE INDEX ix_log_source_severity_ts ON t_log (source, severity, ts);
</pre>
</div>
<p>This <strong>MyISAM</strong> table has <strong>12,250,000</strong> records, with <strong>10</strong> random sources (distributed evenly) and <strong>5</strong> random severities (distributed logarithmically):</p>
<pre class="brush: sql">
SELECT  severity, COUNT(*)
FROM    t_log
GROUP BY
        severity;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>severity</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="char">DEBUG</td>
<td class="bigint">11024646</td>
</tr>
<tr>
<td class="char">INFO</td>
<td class="bigint">1102668</td>
</tr>
<tr>
<td class="char">WARNING</td>
<td class="bigint">110557</td>
</tr>
<tr>
<td class="char">ERROR</td>
<td class="bigint">10948</td>
</tr>
<tr>
<td class="char">FATAL</td>
<td class="bigint">1181</td>
</tr>
</table>
</div>
<p>We also created two indexes (one on <code>(source, ts)</code>, the other one on <code>(source, severity, ts)</code>).</p>
<p>Now, let&#8217;s try to run some queries as is:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_log
WHERE   source = 2
        AND severity IN (1, 2)
        AND ts &lt;= 1277754000
ORDER BY
        source DESC, ts DESC
LIMIT 30
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>ts</th>
<th>source</th>
<th>severity</th>
<th>tx</th>
</tr>
<tr>
<td class="integer">1205</td>
<td class="bigint">1277753995</td>
<td class="tinyint">2</td>
<td class="char">DEBUG</td>
<td class="varchar">Message 1205</td>
</tr>
<tr>
<td class="integer">1227</td>
<td class="bigint">1277753973</td>
<td class="tinyint">2</td>
<td class="char">DEBUG</td>
<td class="varchar">Message 1227</td>
</tr>
<tr>
<td class="integer">1243</td>
<td class="bigint">1277753957</td>
<td class="tinyint">2</td>
<td class="char">DEBUG</td>
<td class="varchar">Message 1243</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="integer">1546</td>
<td class="bigint">1277753654</td>
<td class="tinyint">2</td>
<td class="char">DEBUG</td>
<td class="varchar">Message 1546</td>
</tr>
<tr>
<td class="integer">1575</td>
<td class="bigint">1277753625</td>
<td class="tinyint">2</td>
<td class="char">DEBUG</td>
<td class="varchar">Message 1575</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0013s (0.0027s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_ts</td>
<td class="varchar">9</td>
<td class="varchar"></td>
<td class="bigint">997923</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
</table>
</div>
<pre>
select `20100630_range`.`t_log`.`id` AS `id`,`20100630_range`.`t_log`.`ts` AS `ts`,`20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity`,`20100630_range`.`t_log`.`tx` AS `tx` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` in (1,2)) and (`20100630_range`.`t_log`.`ts` &lt;= 1277754000)) order by `20100630_range`.`t_log`.`source` desc,`20100630_range`.`t_log`.`ts` desc limit 30
</pre>
<p>This is very fast. It uses the index which does not include <code>severity</code>: since the values <strong>1</strong> and <strong>2</strong> are very frequent, it&#8217;s much more efficient just to scan the index in order and filter on them, because the first <strong>30</strong> matching records are found almost immediately. The index preserves the order, that&#8217;s why there is no <code>filesort</code> in the plan.</p>
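<p>To see what the optimizer is avoiding here, the other index can be forced on the same query. This is only a sketch for comparison, assuming the sample <code>t_log</code> table above; <code>FORCE INDEX</code> is a regular <strong>MySQL</strong> index hint, and the exact plan and timing will depend on the server version and the data:</p>
<pre class="brush: sql">
-- Same query, but forced to use the index that includes severity.
-- With two severities in the range, the index order no longer matches
-- ORDER BY ts, so the plan is expected to show "Using filesort".
EXPLAIN
SELECT  *
FROM    t_log FORCE INDEX (ix_log_source_severity_ts)
WHERE   source = 2
        AND severity IN (1, 2)
        AND ts &lt;= 1277754000
ORDER BY
        source DESC, ts DESC
LIMIT 30
</pre>
<p>Since about a tenth of the frequent <code>DEBUG</code> and <code>INFO</code> records belong to each source, that sort would have to process on the order of a million rows, which is why scanning <code>ix_log_source_ts</code> in order wins for these values. The next query switches to the rare severities.</p>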
<pre class="brush: sql">
SELECT  *
FROM    t_log
WHERE   source = 2
        AND severity IN (4, 5)
        AND ts &lt;= 1277754000
ORDER BY
        source DESC, ts DESC
LIMIT 30
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>ts</th>
<th>source</th>
<th>severity</th>
<th>tx</th>
</tr>
<tr>
<td class="integer">2333</td>
<td class="bigint">1277752867</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 2333</td>
</tr>
<tr>
<td class="integer">6139</td>
<td class="bigint">1277749061</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 6139</td>
</tr>
<tr>
<td class="integer">6369</td>
<td class="bigint">1277748831</td>
<td class="tinyint">2</td>
<td class="char">FATAL</td>
<td class="varchar">Message 6369</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="integer">297128</td>
<td class="bigint">1277458072</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 297128</td>
</tr>
<tr>
<td class="integer">298729</td>
<td class="bigint">1277456471</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 298729</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0013s (0.0093s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">10</td>
<td class="varchar"></td>
<td class="bigint">1182</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using filesort</td>
</tr>
</table>
</div>
<pre>
select `20100630_range`.`t_log`.`id` AS `id`,`20100630_range`.`t_log`.`ts` AS `ts`,`20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity`,`20100630_range`.`t_log`.`tx` AS `tx` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` in (4,5)) and (`20100630_range`.`t_log`.`ts` &lt;= 1277754000)) order by `20100630_range`.`t_log`.`source` desc,`20100630_range`.`t_log`.`ts` desc limit 30
</pre>
<p>This is very fast too. The index which includes <code>severity</code> is used (along with the <code>filesort</code> of course, because the order cannot be preserved with multiple values of <code>severity</code>), but the total number of records evaluated is so small that the <code>filesort</code> is not much of a problem.</p>
<p>Now, let&#8217;s try a similar query that includes severity <strong>3</strong>:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_log
WHERE   source = 2
        AND severity IN (3, 4)
        AND ts &lt;= 1277754000
ORDER BY
        source DESC, ts DESC
LIMIT 30
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>ts</th>
<th>source</th>
<th>severity</th>
<th>tx</th>
</tr>
<tr>
<td class="integer">1507</td>
<td class="bigint">1277753693</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 1507</td>
</tr>
<tr>
<td class="integer">2333</td>
<td class="bigint">1277752867</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 2333</td>
</tr>
<tr>
<td class="integer">4154</td>
<td class="bigint">1277751046</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 4154</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="integer">30118</td>
<td class="bigint">1277725082</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 30118</td>
</tr>
<tr>
<td class="integer">31321</td>
<td class="bigint">1277723879</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 31321</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0013s (0.2496s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">10</td>
<td class="varchar"></td>
<td class="bigint">12168</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using filesort</td>
</tr>
</table>
</div>
<pre>
select `20100630_range`.`t_log`.`id` AS `id`,`20100630_range`.`t_log`.`ts` AS `ts`,`20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity`,`20100630_range`.`t_log`.`tx` AS `tx` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` in (3,4)) and (`20100630_range`.`t_log`.`ts` &lt;= 1277754000)) order by `20100630_range`.`t_log`.`source` desc,`20100630_range`.`t_log`.`ts` desc limit 30
</pre>
<p>Now, this runs for almost <strong>250 ms</strong>. Why?</p>
<p>There are <strong>110,557</strong> records with <code>severity = 'WARNING'</code>, about a tenth of them with <code>source = 2</code>. This is too many for a <code>filesort</code> but too few for <code>Using where</code> (filtering the records with the index that preserves the order): too many records would need to be skipped before <strong>30</strong> matching ones are found.</p>
<p>To work around this, we could combine the queries using <code>UNION ALL</code>. Since the original query uses <code>ORDER BY</code> and <code>LIMIT</code>, we may apply them to each of the two separate queries (which will yield at most <strong>60</strong> records) and then apply them once more to the combined resultset (to get the <strong>30</strong> records that are guaranteed to be contained among these <strong>60</strong>):</p>
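<p>A quick way to see how many records each severity contributes to the slice the query actually touches is to count them under the same filter. This is a sketch against the sample table above; no output is shown, since this query was not part of the original test run:</p>
<pre class="brush: sql">
-- Number of candidate records per severity for source 2
-- up to the timestamp used in the queries above.
SELECT  severity, COUNT(*) AS cnt
FROM    t_log
WHERE   source = 2
        AND ts &lt;= 1277754000
GROUP BY
        severity
</pre>
<p>The larger the count for the severities in the <code>IN</code> list, the more records a <code>filesort</code> has to sort; the smaller it is, the more records an ordered index scan has to skip.</p>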
<pre class="brush: sql">
SELECT  *
FROM    (
        SELECT  *
        FROM    t_log
        WHERE   source = 2
                AND severity = 3
                AND ts &lt;= 1277754000
        ORDER BY
                source DESC, ts DESC
        LIMIT 30
        ) q
UNION ALL
SELECT  *
FROM    (
        SELECT  *
        FROM    t_log
        WHERE   source = 2
                AND severity = 4
                AND ts &lt;= 1277754000
        ORDER BY
                source DESC, ts DESC
        LIMIT 30
        ) q
ORDER BY
        ts DESC
LIMIT 30
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>ts</th>
<th>source</th>
<th>severity</th>
<th>tx</th>
</tr>
<tr>
<td class="integer">1507</td>
<td class="bigint">1277753693</td>
<td class="tinyint">2</td>
<td class="varchar">WARNING</td>
<td class="varchar">Message 1507</td>
</tr>
<tr>
<td class="integer">2333</td>
<td class="bigint">1277752867</td>
<td class="tinyint">2</td>
<td class="varchar">ERROR</td>
<td class="varchar">Message 2333</td>
</tr>
<tr>
<td class="integer">4154</td>
<td class="bigint">1277751046</td>
<td class="tinyint">2</td>
<td class="varchar">WARNING</td>
<td class="varchar">Message 4154</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="integer">30118</td>
<td class="bigint">1277725082</td>
<td class="tinyint">2</td>
<td class="varchar">WARNING</td>
<td class="varchar">Message 30118</td>
</tr>
<tr>
<td class="integer">31321</td>
<td class="bigint">1277723879</td>
<td class="tinyint">2</td>
<td class="varchar">ERROR</td>
<td class="varchar">Message 31321</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0013s (0.0037s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">30</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">10</td>
<td class="varchar"></td>
<td class="bigint">11094</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">3</td>
<td class="varchar">UNION</td>
<td class="varchar">&lt;derived4&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">30</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">4</td>
<td class="varchar">DERIVED</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">10</td>
<td class="varchar"></td>
<td class="bigint">1074</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint"></td>
<td class="varchar">UNION RESULT</td>
<td class="varchar">&lt;union1,3&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint"></td>
<td class="double"></td>
<td class="varchar">Using filesort</td>
</tr>
</table>
</div>
<pre>
select `q`.`id` AS `id`,`q`.`ts` AS `ts`,`q`.`source` AS `source`,`q`.`severity` AS `severity`,`q`.`tx` AS `tx` from (select `20100630_range`.`t_log`.`id` AS `id`,`20100630_range`.`t_log`.`ts` AS `ts`,`20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity`,`20100630_range`.`t_log`.`tx` AS `tx` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` = 3) and (`20100630_range`.`t_log`.`ts` &lt;= 1277754000)) order by `20100630_range`.`t_log`.`source` desc,`20100630_range`.`t_log`.`ts` desc limit 30) `q` union all select `q`.`id` AS `id`,`q`.`ts` AS `ts`,`q`.`source` AS `source`,`q`.`severity` AS `severity`,`q`.`tx` AS `tx` from (select `20100630_range`.`t_log`.`id` AS `id`,`20100630_range`.`t_log`.`ts` AS `ts`,`20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity`,`20100630_range`.`t_log`.`tx` AS `tx` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` = 4) and (`20100630_range`.`t_log`.`ts` &lt;= 1277754000)) order by `20100630_range`.`t_log`.`source` desc,`20100630_range`.`t_log`.`ts` desc limit 30) `q` order by `ts` desc limit 30
</pre>
<p>This is much faster.</p>
<p>However, this solution requires composing the query dynamically, depending on the number of the severities in the condition. Is it possible to make this all in one static query that will accept the parameters in the <code>IN</code> list?</p>
<p>We can do it by applying the solution used to retrieve <q>greatest-n-per-group</q> in <strong>MySQL</strong>.</p>
<p>To do this, we will just select the <strong>30</strong>th latest timestamp of each <code>severity</code> and find all records with that timestamp or a later one.</p>
<p>This can be done using a join:</p>
<pre class="brush: sql">
SELECT  *
FROM    (
        SELECT  l.*
        FROM    (
                SELECT  source,
                        severity,
                        (
                        SELECT  ts
                        FROM    t_log li
                        WHERE   li.source = ss.source
                                AND li.severity = ss.severity
                                AND ts &lt;= 1277754000
                        ORDER BY
                                li.source DESC, li.severity DESC, li.ts DESC
                        LIMIT 29, 1
                        ) AS mts
                FROM    (
                        SELECT  DISTINCT source, severity
                        FROM    t_log
                        WHERE   source = 2
                                AND severity IN (3, 4)
                        ) ss
                ) s
        JOIN    t_log l
        ON      l.source &gt;= s.source
                AND l.source &lt;= s.source
                AND l.severity = s.severity
                AND l.ts &gt;= s.mts
                AND l.ts &lt;= 1277754000
        ) q
ORDER BY
        ts DESC
LIMIT 30;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>ts</th>
<th>source</th>
<th>severity</th>
<th>tx</th>
</tr>
<tr>
<td class="integer">1507</td>
<td class="bigint">1277753693</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 1507</td>
</tr>
<tr>
<td class="integer">2333</td>
<td class="bigint">1277752867</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 2333</td>
</tr>
<tr>
<td class="integer">4154</td>
<td class="bigint">1277751046</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 4154</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="integer">30118</td>
<td class="bigint">1277725082</td>
<td class="tinyint">2</td>
<td class="char">WARNING</td>
<td class="varchar">Message 30118</td>
</tr>
<tr>
<td class="integer">31321</td>
<td class="bigint">1277723879</td>
<td class="tinyint">2</td>
<td class="char">ERROR</td>
<td class="varchar">Message 31321</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0014s (0.0040s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">60</td>
<td class="double">100.00</td>
<td class="varchar">Using filesort</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">&lt;derived3&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">2</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">l</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">10</td>
<td class="varchar"></td>
<td class="bigint">30</td>
<td class="double">40833332.00</td>
<td class="varchar">Range checked for each record (index map: 0x3)</td>
</tr>
<tr>
<td class="bigint">3</td>
<td class="varchar">DERIVED</td>
<td class="varchar">&lt;derived5&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">2</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">5</td>
<td class="varchar">DERIVED</td>
<td class="varchar">t_log</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">2</td>
<td class="varchar"></td>
<td class="bigint">1</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index for group-by</td>
</tr>
<tr>
<td class="bigint">4</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">li</td>
<td class="varchar">ref</td>
<td class="varchar">ix_log_source_ts,ix_log_source_severity_ts</td>
<td class="varchar">ix_log_source_severity_ts</td>
<td class="varchar">2</td>
<td class="varchar">ss.source,ss.severity</td>
<td class="bigint">245000</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;ss.source&#39; of SELECT #4 was resolved in SELECT #3
Field or reference &#39;ss.severity&#39; of SELECT #4 was resolved in SELECT #3
select `q`.`id` AS `id`,`q`.`ts` AS `ts`,`q`.`source` AS `source`,`q`.`severity` AS `severity`,`q`.`tx` AS `tx` from (select `20100630_range`.`l`.`id` AS `id`,`20100630_range`.`l`.`ts` AS `ts`,`20100630_range`.`l`.`source` AS `source`,`20100630_range`.`l`.`severity` AS `severity`,`20100630_range`.`l`.`tx` AS `tx` from (select `ss`.`source` AS `source`,`ss`.`severity` AS `severity`,(select `20100630_range`.`li`.`ts` from `20100630_range`.`t_log` `li` where ((`20100630_range`.`li`.`source` = `ss`.`source`) and (`20100630_range`.`li`.`severity` = `ss`.`severity`) and (`20100630_range`.`li`.`ts` &lt;= 1277754000)) order by `20100630_range`.`li`.`source` desc,`20100630_range`.`li`.`severity` desc,`20100630_range`.`li`.`ts` desc limit 29,1) AS `mts` from (select distinct `20100630_range`.`t_log`.`source` AS `source`,`20100630_range`.`t_log`.`severity` AS `severity` from `20100630_range`.`t_log` where ((`20100630_range`.`t_log`.`source` = 2) and (`20100630_range`.`t_log`.`severity` in (3,4)))) `ss`) `s` join `20100630_range`.`t_log` `l` where ((`20100630_range`.`l`.`severity` = `s`.`severity`) and (`20100630_range`.`l`.`source` &gt;= `s`.`source`) and (`20100630_range`.`l`.`source` &lt;= `s`.`source`) and (`20100630_range`.`l`.`ts` &gt;= `s`.`mts`) and (`20100630_range`.`l`.`ts` &lt;= 1277754000))) `q` order by `q`.`ts` desc limit 30
</pre>
<p>All possible values of <code>source</code> and <code>severity</code> are selected using a loose scan (which is instant since there are few of them). Each pair of values is then used as a join condition. A single index range satisfies each pair of values, so each join iteration uses an index efficiently (actually, the access path is reevaluated for each iteration, as shown by <code>Range checked for each record (index map: 0x3)</code>).</p>
<p>The total number of records that would be returned by this query, were there no <code>LIMIT</code>, is <strong>60</strong> or maybe even more (in case of ties on <code>ts</code>). However, we don&#8217;t need to resolve the ties in the subqueries, since the final <code>ORDER BY / LIMIT</code> does this for us.</p>
<p>The query completes in <strong>4 ms</strong>, which is practically instant. Moreover, it does not need to be rewritten to handle different combinations of values: they can be provided in a single <code>IN</code> clause.</p>
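<p>One caveat, stated here as an observation about an edge case rather than something tested above: if a severity has fewer than <strong>30</strong> matching records, <code>LIMIT 29, 1</code> returns no row, the scalar subquery yields <code>NULL</code>, and the condition <code>l.ts &gt;= s.mts</code> filters out every record of that severity. A possible fix, sketched below, wraps the subquery in <code>COALESCE</code> so that such a group falls back to a cutoff of <strong>0</strong>. The fallback value is an assumption that is safe only because <code>ts</code> holds positive <strong>UNIX</strong> timestamps, and the resulting plan has not been verified:</p>
<pre class="brush: sql">
SELECT  *
FROM    (
        SELECT  l.*
        FROM    (
                SELECT  source,
                        severity,
                        -- Fall back to 0 when the group has fewer than 30 records
                        COALESCE(
                        (
                        SELECT  ts
                        FROM    t_log li
                        WHERE   li.source = ss.source
                                AND li.severity = ss.severity
                                AND ts &lt;= 1277754000
                        ORDER BY
                                li.source DESC, li.severity DESC, li.ts DESC
                        LIMIT 29, 1
                        ), 0) AS mts
                FROM    (
                        SELECT  DISTINCT source, severity
                        FROM    t_log
                        WHERE   source = 2
                                AND severity IN (4, 5)
                        ) ss
                ) s
        JOIN    t_log l
        ON      l.source &gt;= s.source
                AND l.source &lt;= s.source
                AND l.severity = s.severity
                AND l.ts &gt;= s.mts
                AND l.ts &lt;= 1277754000
        ) q
ORDER BY
        ts DESC
LIMIT 30;
</pre>
<p>With the fallback in place, a sparse severity simply contributes all of its records up to the given timestamp, and the final <code>ORDER BY / LIMIT</code> trims the result as before.</p>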
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/06/30/indexing-for-order-by-limit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GROUP_CONCAT in SQL Server</title>
		<link>http://explainextended.com/2010/06/21/group_concat-in-sql-server/</link>
		<comments>http://explainextended.com/2010/06/21/group_concat-in-sql-server/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 19:00:49 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4818</guid>
		<description><![CDATA[I&#8217;m finally back from my vacation. Tunisia&#8217;s great: dates, Carthage, sea and stuff. Now, to the questions. Mahen asks: Create a table called Group: Group id prodname 1 X 1 Y 1 Z 2 A 2 B 2 C The resultset should look like this: id prodname 1 X,Y,Z 2 A,B,C Can you please help [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m finally back from my vacation. Tunisia&#8217;s great: dates, Carthage, sea and stuff.</p>
<p>Now, to the questions.</p>
<p><strong>Mahen</strong> asks:</p>
<blockquote><p>
Create a table called <code>Group</code>:</p>
<table class="excel">
<caption>Group</caption>
<tr>
<th>id</th>
<th>prodname</th>
</tr>
<tr>
<td>1</td>
<td>X</td>
</tr>
<tr>
<td>1</td>
<td>Y</td>
</tr>
<tr>
<td>1</td>
<td>Z</td>
</tr>
<tr>
<td>2</td>
<td>A</td>
</tr>
<tr>
<td>2</td>
<td>B</td>
</tr>
<tr>
<td>2</td>
<td>C</td>
</tr>
</table>
<p>The resultset should look like this:</p>
<table class="excel">
<tr>
<th>id</th>
<th>prodname</th>
</tr>
<tr>
<td>1</td>
<td>X,Y,Z</td>
</tr>
<tr>
<td>2</td>
<td>A,B,C</td>
</tr>
</table>
<p>Can you please help me to solve the above problem using a recursive <strong>CTE</strong>?
</p></blockquote>
<p>This is our good old friend, <code>GROUP_CONCAT</code>. It&#8217;s an aggregate function that returns all strings within a group, concatenated. It&#8217;s somewhat different from the other aggregate functions because, first, dealing with the concatenated string can be quite a tedious task for groups with lots of records (large strings tend to overflow), and, second, the result depends on the order of the arguments (which is normally not the case for aggregate functions). It&#8217;s not a part of standard <strong>SQL</strong> and as of now is implemented only by <strong>MySQL</strong>, with some extra vendor-specific keywords (like <code>ORDER BY</code> within the argument list).</p>
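<p>For reference, this is roughly how the requested resultset would be produced in <strong>MySQL</strong> itself. A minimal sketch, assuming the table from the question exists under the name <code>Group</code> (a reserved word, so it has to be quoted with backticks):</p>
<pre class="brush: sql">
-- ORDER BY inside the argument list fixes the order of concatenation;
-- SEPARATOR defaults to a comma and is spelled out here for clarity.
SELECT  id,
        GROUP_CONCAT(prodname ORDER BY prodname SEPARATOR &#039;,&#039;) AS prodname
FROM    `Group`
GROUP BY
        id
</pre>
<p>Note that <strong>MySQL</strong> truncates the result to <code>group_concat_max_len</code> bytes (<strong>1024</strong> by default), which is one form of the overflow mentioned above.</p>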
<p>This functionality, however, is often asked for and I have written some articles about implementing this in <a href="/2009/05/02/group_concat-in-postgresql-without-aggregate-functions/"><strong>PostgreSQL</strong></a> and <a href="/2009/05/02/group_concat-in-postgresql-without-aggregate-functions/"><strong>Oracle</strong></a>.</p>
<p>Now, let&#8217;s see how to do it in <strong>SQL Server</strong>.</p>
<p>Usually, <strong>SQL Server</strong>&#8217;s <code>FOR XML</code> clause is exploited to concatenate the strings. To do this, we obtain a list of group identifiers and, for each group, retrieve all its product names with a subquery appended with <code>FOR XML PATH('')</code>. This makes a single <code>XML</code> column out of the recordset:<br />
<span id="more-4818"></span></p>
<pre class="brush: sql">
WITH    q (id, prodname) AS
        (
        SELECT  1, &#039;X&#039;
        UNION ALL
        SELECT  1, &#039;Y&#039;
        UNION ALL
        SELECT  1, &#039;Z&#039;
        UNION ALL
        SELECT  2, &#039;A&#039;
        UNION ALL
        SELECT  2, &#039;B&#039;
        UNION ALL
        SELECT  2, &#039;C&#039;
        )
SELECT  *
FROM    (
        SELECT  DISTINCT id
        FROM    q
        ) qo
CROSS APPLY
        (
        SELECT  CASE ROW_NUMBER() OVER(ORDER BY prodname) WHEN 1 THEN &#039;&#039; ELSE &#039;, &#039; END + qi.prodname
        FROM    q qi
        WHERE   qi.id = qo.id
        ORDER BY
                prodname
        FOR XML PATH (&#039;&#039;)
        ) qi(r)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>r</th>
</tr>
<tr>
<td class="int">1</td>
<td class="ntext">X, Y, Z</td>
</tr>
<tr>
<td class="int">2</td>
<td class="ntext">A, B, C</td>
</tr>
</table>
</div>
<p>This solution works, but converting to and from <code>XML</code> is not the best way to deal with the strings: things like ampersands, angle brackets, line feeds etc. get mangled and require some additional effort to cope with.</p>
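<p>That additional effort usually means asking for a typed <code>XML</code> result and reading its text value back. Below is a minimal sketch of this widely used workaround, not something taken from the query above: the sample names containing <code>&amp;</code> and <code>&lt;</code> are made up for illustration, and <code>STUFF</code> strips the leading separator instead of the <code>ROW_NUMBER</code> trick:</p>
<pre class="brush: sql">
WITH    q (id, prodname) AS
        (
        SELECT  1, &#039;A &amp; B&#039;
        UNION ALL
        SELECT  1, &#039;C &lt; D&#039;
        )
SELECT  qo.id,
        -- TYPE returns the result as XML, and .value() reads its text back with the entities decoded
        STUFF(
                (
                SELECT  &#039;, &#039; + qi.prodname
                FROM    q qi
                WHERE   qi.id = qo.id
                ORDER BY
                        qi.prodname
                FOR XML PATH (&#039;&#039;), TYPE
                ).value(&#039;.&#039;, &#039;NVARCHAR(MAX)&#039;),
                1, 2, &#039;&#039;
        ) AS r
FROM    (
        SELECT  DISTINCT id
        FROM    q
        ) qo
</pre>
<p>Without <code>TYPE</code> and <code>.value()</code>, the same names would come back as <code>A &amp;amp; B, C &amp;lt; D</code>.</p>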
<p>However, this functionality can really be implemented using a recursive <strong>CTE</strong>.</p>
<p>To do this, we need to do the following:</p>
<ol>
<li>Assign a group-wise row number and a group-wise count to each record (in required order)</li>
<li>Select the first record (that with the row number <strong>1</strong>) from each group in the anchor part of the <code>CTE</code></li>
<li>Recursively append the next record to the previous record. The next record can be obtained by joining on <code>rn = rn + 1</code></li>
<li>Finally, select only the last record from each group (the one whose row number is equal to the group-wise count). It will contain the accumulated string.</li>
</ol>
<p>Here&#8217;s how we do it:</p>
<pre class="brush: sql">
WITH    q (id, prodname) AS
        (
        SELECT  1, &#039;X&#039;
        UNION ALL
        SELECT  1, &#039;Y&#039;
        UNION ALL
        SELECT  1, &#039;Z&#039;
        UNION ALL
        SELECT  2, &#039;A&#039;
        UNION ALL
        SELECT  2, &#039;B&#039;
        UNION ALL
        SELECT  2, &#039;C&#039;
        ),
        qs(id, prodname, rn, cnt) AS
        (
        SELECT  id, prodname,
                ROW_NUMBER() OVER (PARTITION BY id ORDER BY prodname),
                COUNT(*) OVER (PARTITION BY id)
        FROM    q
        ),
        t (id, prodname, gc, rn, cnt) AS
        (
        SELECT  id, prodname,
                CAST(prodname AS NVARCHAR(MAX)), rn, cnt
        FROM    qs
        WHERE   rn = 1
        UNION ALL
        SELECT  qs.id, qs.prodname,
                CAST(t.gc + &#039;, &#039; + qs.prodname AS NVARCHAR(MAX)),
                qs.rn, qs.cnt
        FROM    t
        JOIN    qs
        ON      qs.id = t.id
                AND qs.rn = t.rn + 1
        )
SELECT  id, gc
FROM    t
WHERE   rn = cnt
OPTION (MAXRECURSION 0)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>gc</th>
</tr>
<tr>
<td class="int">2</td>
<td class="ntext">A, B, C</td>
</tr>
<tr>
<td class="int">1</td>
<td class="ntext">X, Y, Z</td>
</tr>
</table>
</div>
<p>As we can see, this only deals with native <code>NVARCHAR</code> and is free from <code>XML</code> conversions.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/06/21/group_concat-in-sql-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LEFT JOIN / IS NULL vs. NOT IN vs. NOT EXISTS: nullable columns</title>
		<link>http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/</link>
		<comments>http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/#comments</comments>
		<pubDate>Thu, 27 May 2010 19:00:15 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4793</guid>
		<description><![CDATA[In one of the previous articles I discussed performance of the three methods to implement an anti-join in MySQL. Just a quick reminder: an anti-join is an operation that returns all records from one table which share a value of a certain column with no records from another table. In SQL, there are at least [...]]]></description>
			<content:encoded><![CDATA[<p>In one of the previous articles I discussed performance of the <a href="/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/">three methods to implement an anti-join in <strong>MySQL</strong></a>.</p>
<p>Just a quick reminder: an anti-join is an operation that returns all records from one table which share a value of a certain column with no records from another table.</p>
<p>In <strong>SQL</strong>, there are at least three methods to implement it:</p>
<h3>LEFT JOIN / IS NULL</h3>
<pre class="brush: sql">
SELECT  o.*
FROM    outer o
LEFT JOIN
        inner i
ON      i.value = o.value
WHERE   i.value IS NULL
</pre>
<h3>NOT IN</h3>
<pre class="brush: sql">
SELECT  o.*
FROM    outer o
WHERE   o.value NOT IN
        (
        SELECT  value
        FROM    inner
        )
</pre>
<h3>NOT EXISTS</h3>
<pre class="brush: sql">
SELECT  o.*
FROM    outer o
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    inner i
        WHERE   i.value = o.value
        )
</pre>
<p>When <code>inner.value</code> is marked as <code>NOT NULL</code>, all these queries are semantically equivalent and with proper indexing have similarly optimized execution plans in <strong>MySQL</strong>.</p>
<p>Now, what if <code>inner.value</code> is nullable and does contain some <code>NULL</code> values?</p>
<p>Let&#8217;s create some sample tables:<br />
<span id="more-4793"></span><br />
<a href="#" onclick="xcollapse('X2583');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X2583" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=MyISAM;

CREATE TABLE t_inner (
        id INT NOT NULL PRIMARY KEY,
        val INT,
        stuffing VARCHAR(200) NOT NULL,
        KEY ix_inner_val (val)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE t_outer (
        id INT NOT NULL PRIMARY KEY,
        val INT,
        stuffing VARCHAR(200) NOT NULL,
        KEY ix_outer_val (val)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(1000000);
COMMIT;

INSERT
INTO    t_inner
SELECT  id,
        NULLIF(CEILING(RAND(20100527) * 100000), 100000),
        RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    filler;

INSERT
INTO    t_outer
SELECT  id,
        NULLIF(CEILING(RAND(20100527 &lt;&lt; 1) * 100000), 100000),
        RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    filler;
</pre>
</div>
<p>There are two identical <strong>MyISAM</strong> tables. Each of the tables contains <strong>1,000,000</strong> random values from <strong>1</strong> to <strong>99,999</strong> and also some <code>NULL</code> values. There is an index on <code>val</code> in both tables.</p>
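<p>To see how many <code>NULL</code> values the script has actually generated, we can compare <code>COUNT(*)</code> to <code>COUNT(val)</code>, since the latter skips <code>NULL</code>s. This is just a quick sanity check against the sample tables above:</p>
<pre class="brush: sql">
SELECT  &#039;t_inner&#039; AS tbl, COUNT(*) AS total, COUNT(*) - COUNT(val) AS nulls
FROM    t_inner
UNION ALL
SELECT  &#039;t_outer&#039;, COUNT(*), COUNT(*) - COUNT(val)
FROM    t_outer
</pre>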
<p>Now, let&#8217;s check the queries.</p>
<h3>NOT EXISTS</h3>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing)), COUNT(*)
FROM    t_outer o
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    t_inner i
        WHERE   i.val = o.val
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal">14600</td>
<td class="bigint">73</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (9.9061s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">ref</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">20100527_anti.o.val</td>
<td class="bigint">10</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;20100527_anti.o.val&#39; of SELECT #2 was resolved in SELECT #1
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` where (not(exists(select NULL from `20100527_anti`.`t_inner` `i` where (`20100527_anti`.`i`.`val` = `20100527_anti`.`o`.`val`))))
</pre>
<p>The query completes in <strong>9.9</strong> seconds. As we can see, it is optimized to use the index on <code>t_inner.val</code> and return on the first match.</p>
<h3>LEFT JOIN / IS NULL</h3>
<pre class="brush: sql">
SELECT  SUM(LENGTH(o.stuffing)), COUNT(*)
FROM    t_outer o
LEFT JOIN
        t_inner i
ON      i.val = o.val
WHERE   i.id IS NULL
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(o.stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal">14600</td>
<td class="bigint">73</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (13.5154s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">i</td>
<td class="varchar">ref</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">20100527_anti.o.val</td>
<td class="bigint">10</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Not exists</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(o.stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` left join `20100527_anti`.`t_inner` `i` on((`20100527_anti`.`i`.`val` = `20100527_anti`.`o`.`val`)) where isnull(`20100527_anti`.`i`.`id`)
</pre>
<p>The query semantics are the same as those of <code>NOT EXISTS</code>, and we even see the <code>Not exists</code> optimization in the plan, however this query performs much more poorly than <code>NOT EXISTS</code>: <strong>13</strong> seconds. Why?</p>
<p><strong>MySQL</strong> documentation on <code>EXPLAIN</code> <a href="http://dev.mysql.com/doc/refman/5.5/en/using-explain.html">states</a> that <code>Not exists</code> is used to optimize the queries similar to the one we have just run: <code>LEFT JOIN</code> with <code>IS NULL</code> predicate applied to a non-nullable column.</p>
<p><strong>MySQL</strong> is aware that such a predicate can only be satisfied by a record resulting from a <code>JOIN</code> miss (i. e. when no matching record was found in the rightmost table) and stops reading records after first index hit.</p>
<p>However, this optimization is implemented in a way that is far from being perfect. Despite the fact that no actual value of <code>id</code> can be returned by such a query, the engine still looks up <code>id</code> in the table (since it&#8217;s not a part of the index). We can see it in the plan: unlike <code>NOT EXISTS</code> query, there is no <code>Using index</code> for <code>t_inner</code>. This means that a table lookup is performed.</p>
<p>Even if we replace <code>id</code> with <code>val</code> in the query, it still performs poorly:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(o.stuffing)), COUNT(*)
FROM    t_outer o
LEFT JOIN
        t_inner i
ON      i.val = o.val
WHERE   i.val IS NULL
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(o.stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal">14600</td>
<td class="bigint">73</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (14.4997s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">i</td>
<td class="varchar">ref</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">20100527_anti.o.val</td>
<td class="bigint">10</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(o.stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` left join `20100527_anti`.`t_inner` `i` on((`20100527_anti`.`i`.`val` = `20100527_anti`.`o`.`val`)) where isnull(`20100527_anti`.`i`.`val`)
</pre>
<p>This time, no table lookups are made but there is no <code>Not exists</code> optimization either.</p>
<p>Despite the fact that the join condition eliminates possibility of an actual <code>NULL</code> being returned by the query and any <code>val IS NULL</code> reaching the <code>WHERE</code> clause is a result of a join miss, <strong>MySQL</strong> still examines all records in <code>t_inner</code>, not stopping after the first hit.</p>
<p>This has been submitted as a <a href="http://bugs.mysql.com/bug.php?id=47454">bug</a>.</p>
<p>Now, what about <code>NOT IN</code>?</p>
<h3>NOT IN</h3>
<p>Unlike the previous two queries, which only differ in implementation, not in semantics, <code>NOT IN</code>, applied as is, would yield different results.</p>
<p><code>NOT EXISTS</code> and <code>IS NULL</code> are two-state predicates, they can only return <code>TRUE</code> or <code>FALSE</code>. <code>NOT IN</code> is a <em>three-state</em> predicate: it can return <code>TRUE</code>, <code>FALSE</code> or <code>NULL</code>.</p>
<p><code>NULL</code> value is returned in two cases:</p>
<ul>
<li>When <code>t_outer.val</code> being tested is <code>NULL</code></li>
<li>When <em>at least one</em> of <code>t_inner.val</code> is <code>NULL</code> and none of the other values matches</li>
</ul>
<p>This means that having but a single <code>NULL</code> in <code>t_inner</code> would prevent the query from returning anything.</p>
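<p>The three states are easy to observe on literal lists, which follow the same rules as a subquery. This is a small illustration with made-up values, not a query against the sample tables:</p>
<pre class="brush: sql">
SELECT  1 NOT IN (2, 3) AS no_nulls,
        1 NOT IN (2, NULL) AS null_in_list,
        NULL NOT IN (2, 3) AS null_tested,
        1 NOT IN (1, NULL) AS match_found
</pre>
<p>In <strong>MySQL</strong>, this returns <strong>1</strong>, <code>NULL</code>, <code>NULL</code> and <strong>0</strong>, respectively. Only the first expression would let a record through the <code>WHERE</code> clause. Note that a match still makes the predicate <code>FALSE</code> rather than <code>NULL</code>, even with a <code>NULL</code> in the list.</p>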
<h4>Naive approach</h4>
<p>Let&#8217;s see what happens if we just substitute <code>NOT IN</code> instead of <code>NOT EXISTS</code>:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing)), COUNT(*)
FROM    t_outer o
WHERE   val NOT IN
        (
        SELECT  val
        FROM    t_inner i
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal"></td>
<td class="bigint">0</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (10.3748s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">index_subquery</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">func</td>
<td class="bigint">20</td>
<td class="double">100.00</td>
<td class="varchar">Using index; Full scan on NULL key</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` where (not(&lt;in_optimizer&gt;(`20100527_anti`.`o`.`val`,&lt;exists&gt;(&lt;index_lookup&gt;(&lt;cache&gt;(`20100527_anti`.`o`.`val`) in t_inner on ix_inner_val checking NULL having trigcond(&lt;is_not_null_test&gt;(`20100527_anti`.`i`.`val`)))))))
</pre>
<p>Since there are <code>NULL</code>s in <code>t_inner</code>, no record in <code>t_outer</code> can satisfy the predicate.</p>
<p><strong>MySQL</strong> does not optimize this very well. It takes but a single index scan to find out if there are <code>NULL</code> values in <code>t_inner</code> and return if there are, but for some reason <strong>MySQL</strong> still applies the condition to each record in <code>t_outer</code>.</p>
<h4>Naive approach, improved</h4>
<p>With a little help from our side, this can be improved:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing)), COUNT(*)
FROM    t_outer o
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    t_inner i
        WHERE   val IS NULL
        )
        AND val NOT IN
        (
        SELECT  val
        FROM    t_inner i
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal"></td>
<td class="bigint">0</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0014s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint"></td>
<td class="double"></td>
<td class="varchar">Impossible WHERE</td>
</tr>
<tr>
<td class="bigint">3</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">index_subquery</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">func</td>
<td class="bigint">20</td>
<td class="double">100.00</td>
<td class="varchar">Using index; Full scan on NULL key</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">ref</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar"></td>
<td class="bigint">4</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` where 0
</pre>
<p>We added an explicit check for <code>NULL</code> values. Since it&#8217;s not correlated, <strong>MySQL</strong> could instantly prove it false, cache it and avoid the table scan altogether.</p>
<h4>Ignoring right side NULLs</h4>
<p>Now, let&#8217;s make a <code>NOT IN</code> query that does not take the <code>NULL</code> values in <code>t_inner</code> into account:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing)), COUNT(*)
FROM    t_outer o
WHERE   val NOT IN
        (
        SELECT  val
        FROM    t_inner i
        WHERE   val IS NOT NULL
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal">13400</td>
<td class="bigint">67</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (10.4060s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">index_subquery</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">func</td>
<td class="bigint">20</td>
<td class="double">100.00</td>
<td class="varchar">Using index; Using where; Full scan on NULL key</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` where (not(&lt;in_optimizer&gt;(`20100527_anti`.`o`.`val`,&lt;exists&gt;(&lt;index_lookup&gt;(&lt;cache&gt;(`20100527_anti`.`o`.`val`) in t_inner on ix_inner_val checking NULL where (`20100527_anti`.`i`.`val` is not null) having trigcond(&lt;is_not_null_test&gt;(`20100527_anti`.`i`.`val`)))))))
</pre>
<p>This time, the query returns records, but not as many as the previous queries did.</p>
<p>We made an additional check for <code>NULL</code> in <code>t_inner</code> but not in <code>t_outer</code>. There are some records in <code>t_outer</code> that have a <code>NULL</code> in <code>val</code>. Both <code>IN</code> and <code>NOT IN</code> would evaluate to <code>NULL</code> and <code>WHERE</code> would filter them out.</p>
<p>We see another glitch in the <strong>MySQL</strong> optimizer here: a <code>Full scan on NULL key</code> is applied. Since <code>NOT IN</code> should always return <code>TRUE</code> when the subquery returns no records (even if the value checked is a <code>NULL</code>), on correlated queries a full scan should be applied to check for the records and find out whether to return <code>NULL</code> or <code>FALSE</code>. However, in this case the <code>IN</code> subquery is not correlated, so the check could only be performed once and cached, like with the <code>LEFT JOIN</code>.</p>
<p>In our case the overhead would be negligible, since the subquery would return on the first match, but it could matter if we had more <code>NULL</code> values in <code>t_outer</code>.</p>
<p>Now, what if we want <code>NULL</code> records on <code>t_outer</code> to be returned as well? We just need to add an additional check for <code>NULL</code>s.</p>
<h4>Ignoring all <code>NULL</code>s</h4>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing)), COUNT(*)
FROM    t_outer o
WHERE   val IS NULL
        OR val NOT IN
        (
        SELECT  val
        FROM    t_inner i
        WHERE   val IS NOT NULL
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>SUM(LENGTH(stuffing))</th>
<th>COUNT(*)</th>
</tr>
<tr>
<td class="decimal">14600</td>
<td class="bigint">73</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0002s (10.4842s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">o</td>
<td class="varchar">ALL</td>
<td class="varchar">ix_outer_val</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1000000</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">i</td>
<td class="varchar">index_subquery</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">ix_inner_val</td>
<td class="varchar">5</td>
<td class="varchar">func</td>
<td class="bigint">20</td>
<td class="double">100.00</td>
<td class="varchar">Using index; Using where; Full scan on NULL key</td>
</tr>
</table>
</div>
<pre>
select sum(length(`20100527_anti`.`o`.`stuffing`)) AS `SUM(LENGTH(stuffing))`,count(0) AS `COUNT(*)` from `20100527_anti`.`t_outer` `o` where (isnull(`20100527_anti`.`o`.`val`) or (not(&lt;in_optimizer&gt;(`20100527_anti`.`o`.`val`,&lt;exists&gt;(&lt;index_lookup&gt;(&lt;cache&gt;(`20100527_anti`.`o`.`val`) in t_inner on ix_inner_val checking NULL where (`20100527_anti`.`i`.`val` is not null) having trigcond(&lt;is_not_null_test&gt;(`20100527_anti`.`i`.`val`))))))))
</pre>
<p>Here, the query returns the same results as <code>NOT EXISTS</code>.</p>
<p><code>Full scan on NULL key</code> is still present in the plan but will never actually be executed because it will be short circuited by the previous <code>IS NULL</code> check.</p>
<h3>Summary</h3>
<p>As was shown in the <a href="/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/">earlier article</a>, <code>LEFT JOIN / IS NULL</code> and <code>NOT IN</code> are best used to implement an anti-join in <strong>MySQL</strong> if the columns on both sides are not nullable.</p>
<p>The situation is different when the columns are nullable:</p>
<ul>
<li><code>NOT EXISTS</code> performs in the most straightforward way: it just checks equality and returns <code>TRUE</code> or <code>FALSE</code> on the first hit / miss.</li>
<li><code>LEFT JOIN / IS NULL</code> either makes an additional table lookup or does not return on the first match and performs more poorly in both cases.</li>
<li><code>NOT IN</code>, having different semantics, requires additional checks for <code>NULL</code> values. These checks should be coded into the query.</li>
</ul>
<p>With nullable columns, <code>NOT EXISTS</code> and <code>NOT IN</code> (with additional checks for <code>NULLS</code>) are the most efficient methods to implement an anti-join in <strong>MySQL</strong>.</p>
<p><code>LEFT JOIN / IS NULL</code> performs poorly.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Things SQL needs: determining range cardinality</title>
		<link>http://explainextended.com/2010/05/19/things-sql-needs-determining-range-cardinality/</link>
		<comments>http://explainextended.com/2010/05/19/things-sql-needs-determining-range-cardinality/#comments</comments>
		<pubDate>Wed, 19 May 2010 19:00:51 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4624</guid>
		<description><![CDATA[What is the problem with this query? SELECT * FROM orders WHERE quantity &#60;= 4 AND urgency &#60;= 4 The problem is indexing strategy, of course. Which columns should we index? If we index quantity, the optimizer will be able to use the index to filter on it. However, filtering on urgency will require scanning [...]]]></description>
			<content:encoded><![CDATA[<p>What is the problem with this query?</p>
<pre class="brush: sql">
SELECT  *
FROM    orders
WHERE   quantity &lt;= 4
        AND urgency &lt;= 4
</pre>
<p><!-- --><br />
The problem is indexing strategy, of course. Which columns should we index?</p>
<p>If we index <code>quantity</code>, the optimizer will be able to use the index to filter on it. However, filtering on <code>urgency</code> will require scanning all records with <code>quantity &lt;= 4</code> and applying the <code>urgency</code> filter to each record found.</p>
<p>Same with <code>urgency</code>. We can use range access on <code>urgency</code> using an index, but this will require filtering on <code>quantity</code>.</p>
<p><q>Why, create a composite index!</q>, some will say.</p>
<p>Unfortunately, that won&#8217;t help much.</p>
<p>A composite <strong>B-Tree</strong> index maintains what is called a <a href="http://en.wikipedia.org/wiki/Lexicographical_order">lexicographical order</a> of the records. This means that an index on <code>(quantity, urgency)</code> will sort on <code>quantity</code>, and only if the quantities are equal, it will take the <code>urgency</code> into account.</p>
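<p>As an illustration, here is how such an index could be created and how its order can be reproduced with a query. The index name <code>ix_orders_quantity_urgency</code> is made up for this sketch:</p>
<pre class="brush: sql">
CREATE INDEX ix_orders_quantity_urgency ON orders (quantity, urgency);

-- The index keeps the records in the same order as this query returns them:
-- urgency only breaks the ties between equal quantities
SELECT  quantity, urgency
FROM    orders
ORDER BY
        quantity, urgency;
</pre>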
<p>The picture below shows how the records would be ordered in such an index:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/05/top.png" alt="" title="Top" width="650" height="450" class="aligncenter size-full wp-image-4752 noborder" /></p>
<p>As we can see, with a single index range scan (i. e. just following the arrows) we cannot select only the records within the dashed rectangle. There is no single index range that could be used to filter on both columns.</p>
<p>Even if we changed the field order in the index, it would just change the direction of the arrows connecting the records:<br />
<span id="more-4624"></span><br />
<img src="http://explainextended.com/wp-content/uploads/2010/05/left.png" alt="" title="Left" width="650" height="450" class="aligncenter size-full wp-image-4754 noborder" /></p>
<p>There is still no single range that contains only the records we need.</p>
<p>Can we improve it?</p>
<p>If we take a closer look at the table contents we will see that despite the fact that there is no single range that contains the records we need (and only them), there are four ranges that do filter our records.</p>
<p>If we rewrote the query condition like this:</p>
<pre class="brush: sql">
SELECT  *
FROM    orders
WHERE   quantity IN (0, 1, 2, 3, 4)
        AND urgency &lt;= 4
</pre>
<p>with the first index, or like that:</p>
<pre class="brush: sql">
SELECT  *
FROM    orders
WHERE   quantity &lt;= 4
        AND urgency IN (0, 1, 2, 3, 4)
</pre>
<p>with the second index, then any decent <strong>SQL</strong> engine would build a decent and efficient plan (<code>Index range</code> in <strong>MySQL</strong>, <code>INLIST ITERATOR</code> in <strong>Oracle</strong> etc.)</p>
<p>The problem is that rewriting this query still requires an <strong>SQL</strong> developer. But this could be done automatically by the optimizer. There are several methods to do that.</p>
<h3>Smallest superset</h3>
<p>A range condition defines a set of the values that could possibly satisfy the condition (i. e. belong to the range). The <q>number</q> of the values that satisfy this condition is called set cardinality. The word number is quoted here because it can be infinite, and not all infinities are created equal: some are more infinite than the others!</p>
<p>However, if we take the column definition into account, we can see that some ranges define finite (and quite constrained) sets of possible values. For instance, a condition like <code>quantity &lt;= 4</code> on a column that is defined as <code>UNSIGNED INT</code> can <em>possibly</em> be satisfied by five values: <strong>0</strong>, <strong>1</strong>, <strong>2</strong>, <strong>3</strong> and <strong>4</strong>.</p>
<p>This set is the smallest superset of all sets of values that satisfy the range condition.</p>
<h3>Loose index scan</h3>
<p>Even with the conditions that theoretically define the infinite sets of values that <em>could</em> satisfy them, practically there is always a finite number of values in the table that <em>do</em> satisfy them (the table itself contains the finite number of records, to begin with).</p>
<p>And most engines keep track of that number in their statistics tables: this is what is called <em>field cardinality</em>, a measure of field uniqueness.</p>
<p>If the range cardinality is expected to be low (either from the set of values that can possibly belong to the range, or from the actual number of distinct values that do belong to the range, according to statistics), it would be a wise idea to rewrite the range condition as an <code>IN</code> condition containing all possible values that can belong or do belong to the range.</p>
<p>This will replace a single <q>less than</q> or <q>greater than</q> with a small number of <q>equals to</q>. And an <q>equals to</q> gives the optimizer much more space to, um, optimize. It can use a <code>HASH JOIN</code>, split the index into a number of continuous ranges or do some other interesting things that can only be done with an equijoin.</p>
<p>With an <code>UNSIGNED INTEGER</code> column, it is easy to generate a set of values that could satisfy the range. But what if we know the range cardinality to be low from the statistics, not from the column datatype?</p>
<p>In this case, we could build the set of possible values using what <strong>MySQL</strong> calls a <a href="http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html">loose index scan</a>.</p>
<p>Basically, it takes the first record from the index and then repeatedly searches for the next lowest record whose key value is greater than the previous one, using the <code>index seek</code> (as opposed to <code>index scan</code>). This means that instead of merely scanning the index and applying the condition to each record, the engine would walk up and down the <strong>B-Tree</strong> to locate the first record with the greater key. It is much more efficient when the number of distinct keys is small. And in fact, <strong>MySQL</strong> does use this method for queries that involve <code>SELECT DISTINCT</code> on an indexed field.</p>
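<p>As a rough illustration (a sketch, not a guaranteed plan), assume the <code>orders</code> table from the queries above has an index with <code>quantity</code> as the leading column. Collecting the distinct keys that fall into the range could then look like this in <strong>MySQL</strong>:</p>
<pre class="brush: sql">
-- Assumes an index on orders with quantity leading, e.g. (quantity, urgency)
EXPLAIN
SELECT  DISTINCT quantity
FROM    orders
WHERE   quantity &lt;= 4
</pre>
<p>If the optimizer picks the loose index scan, the <code>Extra</code> column of the plan reports <code>Using index for group-by</code>; whether it does depends on the statistics and the server version.</p>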
<h3>MIN / MAX</h3>
<p>This method would be useful for the open ranges (like <code>&gt;</code> or <code>&lt;</code>, as opposed to <code>BETWEEN</code>).</p>
<p>By default, the <code>INT</code> column means <code>SIGNED INT</code>. For a condition like <code>quantity &lt;= 4</code> this would mean all integers from <strong>-2,147,483,648</strong> to <strong>4</strong>, which is way too many.</p>
<p>In real tables, the quantities would be something greater than <strong>0</strong>. But not all developers bother to pick the right datatype or add <code>CHECK</code> constraints for their columns (and some databases like <strong>PostgreSQL</strong> lack the unsigned datatypes anyway).</p>
<p>To work around this, we could find the minimum existing value in the index using a single index seek. It would serve as a lower bound for the range. Since in a real table that would most probably be something like <strong>0</strong> or <strong>1</strong>, that would make the range much more constrained.</p>
<p>All three methods could be used at the same time. Since the methods require nothing more than a single lookup of the statistics table and a single index seek, the most efficient method to return the subset of values that could satisfy the range could be chosen at runtime.</p>
<h3>Implementation</h3>
<p>Now, let&#8217;s make a sample table in <strong>PostgreSQL</strong> and see how we could benefit from replacing the low-cardinality ranges with the lists of values:</p>
<p><a href="#" onclick="xcollapse('X5181');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X5181" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_composite (
        id INT NOT NULL,
        uint1 INT NOT NULL,
        uint2 INT NOT NULL,
        real1 DOUBLE PRECISION NOT NULL,
        real2 DOUBLE PRECISION NOT NULL,
        stuffing VARCHAR(200) NOT NULL
        );

SELECT SETSEED(0.20100518);

INSERT
INTO    t_composite
SELECT  n,
        CEILING(RANDOM() * 40),
        CEILING(RANDOM() * 400000),
        CEILING(RANDOM() * 40) * 0.01,
        CEILING(RANDOM() * 400000) * 0.01,
        RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 16000000) n;

ALTER TABLE t_composite ADD CONSTRAINT pk_composite_id PRIMARY KEY (id);

CREATE INDEX ix_composite_uint ON t_composite (uint1, uint2);

CREATE INDEX ix_composite_real ON t_composite (real1, real2);
</pre>
</div>
<p>This table contains <strong>16,000,000</strong> records with two integer fields and two <code>double precision</code> fields.</p>
<p>There are composite indexes on the pairs of fields. These indexes are intentionally created with the least selective column leading to demonstrate the benefits of range transformation.</p>
<p>First, let&#8217;s run a query similar to the original one:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 &lt;= 20
        AND uint2 &lt;= 20
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">75600</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (5.1249s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=171331.57..171331.58 rows=1 width=204)
  -&gt;  Index Scan using ix_composite_uint on t_composite  (cost=0.00..171329.57 rows=796 width=204)
        Index Cond: ((uint1 &lt;= 20) AND (uint2 &lt;= 20))
</pre>
<p>The plan says that the index condition involves both fields. However, only the first field is used in the <strong>B-Tree</strong> search: the second is just being filtered on, though no actual table access is performed yet at this step.</p>
<p>The index keys are not too long; however, there are several million of them that need to be scanned. That&#8217;s why the query takes more than 5 seconds to complete.</p>
<p>The I/O statistics show the following:</p>
<pre class="brush: sql">
SELECT  pg_stat_reset();

SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 &lt;= 20
        AND uint2 &lt;= 20;

SELECT  pg_stat_get_blocks_fetched(&#039;ix_composite_uint&#039;::regclass);
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>pg_stat_get_blocks_fetched</th>
</tr>
<tr>
<td class="int8">21865</td>
</tr>
</table>
</div>
<p>The query required more than twenty thousand index blocks to be read and examined.</p>
<h4>Smallest superset</h4>
<p>Now, let&#8217;s try to substitute a hard-coded list of possible values for the range condition:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 IN
        (
        SELECT  generate_series(0, 20)
        )
        AND uint2 &lt;= 20
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">75600</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0063s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=167.47..167.49 rows=1 width=204)
  -&gt;  Nested Loop  (cost=0.02..167.37 rows=40 width=204)
        -&gt;  HashAggregate  (cost=0.02..0.03 rows=1 width=4)
              -&gt;  Result  (cost=0.00..0.01 rows=1 width=0)
        -&gt;  Index Scan using ix_composite_uint on t_composite  (cost=0.00..166.84 rows=40 width=208)
              Index Cond: ((t_composite.uint1 = (generate_series(0, 20))) AND (t_composite.uint2 &lt;= 20))
</pre>
<p>Instead of a giant single range, there are <strong>21</strong> short ranges examined in a nested loop. This is instant (<strong>6 ms</strong>).</p>
<p>Let&#8217;s look into the I/O statistics again:</p>
<pre class="brush: sql">
SELECT  pg_stat_reset();

SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 IN
        (
        SELECT  generate_series(0, 20)
        )
        AND uint2 &lt;= 20;

SELECT  pg_stat_get_blocks_fetched(&#039;ix_composite_uint&#039;::regclass);
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>pg_stat_get_blocks_fetched</th>
</tr>
<tr>
<td class="int8">64</td>
</tr>
</table>
</div>
<p>Now, only <strong>64</strong> blocks need to be read.</p>
<h4>MIN / MAX</h4>
<p>We took <strong>0</strong> as the initial value, but since theoretically there can be negative numbers in the columns, this assumption is not safe.</p>
<p>We need to get the least value from the table instead of assuming it:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 IN
        (
        SELECT  generate_series(
                (
                SELECT  MIN(uint1)
                FROM    t_composite
                ), 20)
        )
        AND uint2 &lt;= 20
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">75600</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0066s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=171.40..171.41 rows=1 width=204)
  -&gt;  Nested Loop  (cost=3.95..171.29 rows=40 width=204)
        -&gt;  HashAggregate  (cost=3.95..3.96 rows=1 width=4)
              -&gt;  Result  (cost=3.92..3.93 rows=1 width=0)
                    InitPlan 2 (returns $1)
                      -&gt;  Result  (cost=3.91..3.92 rows=1 width=0)
                            InitPlan 1 (returns $0)
                              -&gt;  Limit  (cost=0.00..3.91 rows=1 width=4)
                                    -&gt;  Index Scan using ix_composite_uint on t_composite  (cost=0.00..62571556.68 rows=15999664 width=4)
                                          Filter: (uint1 IS NOT NULL)
        -&gt;  Index Scan using ix_composite_uint on t_composite  (cost=0.00..166.84 rows=40 width=208)
              Index Cond: ((&quot;20100519_cardinality&quot;.t_composite.uint1 = (generate_series($1, 20))) AND (&quot;20100519_cardinality&quot;.t_composite.uint2 &lt;= 20))
</pre>
<p>This is instant again.</p>
<p>Now, what about statistics?</p>
<pre class="brush: sql">
SELECT  pg_stat_reset();

SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND uint1 IN
        (
        SELECT  generate_series(
                (
                SELECT  MIN(uint1)
                FROM    t_composite
                ), 20)
        )
        AND uint2 &lt;= 20;

SELECT  pg_stat_get_blocks_fetched(&#039;ix_composite_uint&#039;::regclass);
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>pg_stat_get_blocks_fetched</th>
</tr>
<tr>
<td class="int8">67</td>
</tr>
</table>
</div>
<p>The query is only <strong>3</strong> block reads heavier, but this time it is guaranteed to be correct.</p>
<h4>Loose index scan</h4>
<p>The <code>real*</code> columns hold the double precision data.</p>
<p>First, let&#8217;s run the original query:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND real1 &lt;= 0.2
        AND real2 &lt;= 0.2
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">83600</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (6.6561s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=206210.82..206210.84 rows=1 width=204)
  -&gt;  Index Scan using ix_composite_real on t_composite  (cost=0.00..206208.84 rows=793 width=204)
        Index Cond: ((real1 &lt;= 0.2::double precision) AND (real2 &lt;= 0.2::double precision))
</pre>
<p>Again, there are way too many block reads:</p>
<pre class="brush: sql">
SELECT  pg_stat_reset();

SELECT  SUM(LENGTH(stuffing))
FROM    t_composite
WHERE   1 = 1
        AND real1 &lt;= 0.2
        AND real2 &lt;= 0.2;

SELECT  pg_stat_get_blocks_fetched(&#039;ix_composite_real&#039;::regclass);
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>pg_stat_get_blocks_fetched</th>
</tr>
<tr>
<td class="int8">30657</td>
</tr>
</table>
</div>
<p>, and the query takes more than <strong>6 seconds</strong>.</p>
<p>For a condition like <code>real1 &lt;= 0.2</code>, the smallest superset of all possible values (that is all possible double-precision values between <strong>0</strong> and <strong>0.2</strong>) would be too large (though still finite of course) to be generated and joined. That&#8217;s why we need to use server-collected statistics to decide whether a loose index scan would be efficient to get the list of all distinct values of <code>real1</code> in the table:</p>
<pre class="brush: sql">
SELECT  n_distinct, most_common_vals, histogram_bounds
FROM    pg_stats
WHERE   schemaname = &#039;20100519_cardinality&#039;
        AND tablename = &#039;t_composite&#039;
        AND attname IN (&#039;real1&#039;, &#039;real2&#039;)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>n_distinct</th>
<th>most_common_vals</th>
<th>histogram_bounds</th>
</tr>
<tr>
<td class="float4">40</td>
<td class="anyarray">{0.09,0.22,0.01,0.37,0.04,0.25,0.28,0.34,0.27,0.06,0.36,0.11,0.08,0.39,0.12,0.2,0.02,0.16,0.17,0.21,0.29,0.18,0.19,0.26,0.3,0.32,0.35,0.1,0.13,0.4,0.15,0.23,0.38,0.03,0.33,0.24,0.07,0.14,0.05,0.31}</td>
<td class="anyarray"></td>
</tr>
<tr>
<td class="float4">362181</td>
<td class="anyarray">{1781.85,128.6,142.73,257.88,332.62,618.61,705.35,829.91,845.82,874.08,1432.82,1469.16,1486.01,1569.16,1866.43,2111.48,2234.48,2282.78,2340.7,2382.54,2468.19,2491.35,2494.73,2508.51,2587.51,2750.98,2876.07,2956.11,3222.62,3463.2,3564.41,3872.25,3.43,13.22,14.3,15.82,16.12,16.36,19.98,21.79,23.03,25.41,28.24,28.96,31.95,35.1,36.31,38.46,39.72,39.87,42.42,49.32,50.84,56.59,60.31,81.8,84.48,84.74,86.62,88.58,91.82,103.48,112.88,114.99,117.55,119.55,122.52,123.04,127.98,128.26,130.56,131.15,134.57,134.91,135.8,139.05,139.41,141.39,146.13,151.92,158.16,158.58,161.71,169.92,170.43,173.3,173.74,178.06,189.43,192.99,199.69,210.98,218.45,225.44,230.16,233.25,233.67,240.9,246.5,249.77}</td>
<td class="anyarray">{0.07,44.3,86.34,129.18,172.01,212.46,254.83,294.83,336.18,374.44,414.75,458.07,498.72,540.88,582.92,621.95,660.28,700.75,738.46,778.95,822.09,857.34,896.27,937.63,977.05,1017.49,1062.93,1104.88,1147.17,1185.99,1225.41,1267.8,1307.45,1347.49,1386.1,1424.05,1465.05,1503.23,1542.03,1581.45,1619.53,1658.57,1697.74,1741.65,1782.59,1823.19,1867.54,1905.59,1945.39,1986.04,2022.32,2061.52,2102.77,2143.07,2180.29,2218.06,2262.62,2302.28,2342.05,2382.29,2416.18,2455.43,2489.25,2528.07,2572.15,2606.91,2648.59,2685.43,2724.9,2765.49,2805.78,2845.94,2886.33,2925.52,2967.07,3005.12,3046.34,3084.44,3122.81,3158.25,3199.74,3242.4,3279.34,3319.41,3359.39,3401.15,3436.32,3477.28,3517.05,3559.6,3599.6,3639.63,3678.82,3720.27,3760.32,3801.89,3844.98,3883.96,3922.63,3963.01,3999.98}</td>
</tr>
<tr class="statusbar">
<td colspan="100">2 rows fetched in 0.0003s (0.0072s)</td>
</tr>
</table>
</div>
<p>From the server statistics we see that there are only <strong>40</strong> distinct values of <code>real1</code>. Hence, a loose index scan as such would be efficient.</p>
<p>Let&#8217;s look into the stats on <code>real2</code>. We see that there are <code>362181</code> distinct values in the table, and the range in question (<code>&lt;= 0.2</code>) corresponds to the first entry in the histogram (<code>BETWEEN 0.07 AND 44.3</code>). Since the histogram splits the values into <strong>100</strong> percentiles, this means that there are about <strong>3622</strong> distinct values from <strong>0.07</strong> to <strong>44.3</strong>, and <code>((0.2 - 0.07) / (44.3 - 0.07) * 362181 / 100) ≈</code> <strong>11</strong> distinct values inside our range, and about <code>11 * 16000000 / 362181 ≈</code> <strong>486</strong> records with these values.</p>
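<p>The arithmetic above can be checked with a plain query. The numbers are copied from the statistics output, and the interpolation assumes the values are spread evenly within the histogram bucket, which is only an approximation:</p>
<pre class="brush: sql">
SELECT  -- share of the first histogram bucket covered by the range,
        -- times the number of distinct values per bucket
        ROUND((0.2 - 0.07) / (44.3 - 0.07) * 362181 / 100) AS distinct_in_range,
        -- estimated distinct values times the average number of records per value
        ROUND(11 * 16000000.0 / 362181) AS records_in_range
</pre>
<p>This should give roughly <strong>11</strong> and <strong>486</strong>, matching the estimates.</p>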
<p>Taking into account that the original query would need to scan <strong>8,000,000</strong> records, the loose index scan seems to be a good idea.</p>
<p>Unfortunately, <strong>PostgreSQL</strong> does not support it directly, but with minimal effort it can be emulated:</p>
<pre class="brush: sql">
WITH    RECURSIVE q (real1) AS
        (
        SELECT  MIN(real1)
        FROM    t_composite
        UNION ALL
        SELECT  (
                SELECT  c.real1
                FROM    t_composite c
                WHERE   c.real1 &gt; q.real1
                        AND c.real1 &lt;= 0.2
                ORDER BY
                        c.real1
                LIMIT 1
                )
        FROM    q
        WHERE   q.real1 IS NOT NULL
        )
SELECT  SUM(LENGTH(stuffing))
FROM    q
JOIN    t_composite c
ON      c.real1 = q.real1
        AND c.real2 &lt;= 0.2
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">83600</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0082s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=17319.78..17319.80 rows=1 width=204)
  CTE q
    -&gt;  Recursive Union  (cost=3.92..401.33 rows=101 width=8)
          -&gt;  Result  (cost=3.92..3.93 rows=1 width=0)
                InitPlan 1 (returns $1)
                  -&gt;  Limit  (cost=0.00..3.92 rows=1 width=8)
                        -&gt;  Index Scan using ix_composite_real on t_composite  (cost=0.00..62694798.76 rows=15999664 width=8)
                              Filter: (real1 IS NOT NULL)
          -&gt;  WorkTable Scan on q  (cost=0.00..39.54 rows=10 width=8)
                Filter: (q.real1 IS NOT NULL)
                SubPlan 2
                  -&gt;  Limit  (cost=0.00..3.93 rows=1 width=8)
                        -&gt;  Index Scan using ix_composite_real on t_composite c  (cost=0.00..314695.34 rows=79998 width=8)
                              Index Cond: ((real1 &gt; $2) AND (real1 &lt;= 0.2::double precision))
  -&gt;  Nested Loop  (cost=0.00..16914.48 rows=1588 width=204)
        -&gt;  CTE Scan on q  (cost=0.00..2.02 rows=101 width=8)
        -&gt;  Index Scan using ix_composite_real on t_composite c  (cost=0.00..166.95 rows=40 width=212)
              Index Cond: ((c.real1 = q.real1) AND (c.real2 &lt;= 0.2::double precision))
</pre>
<p>This is almost instant again. Here are the I/O statistics:</p>
<pre class="brush: sql">
SELECT  pg_stat_reset();

WITH    RECURSIVE q (real1) AS
        (
        SELECT  MIN(real1)
        FROM    t_composite
        UNION ALL
        SELECT  (
                SELECT  c.real1
                FROM    t_composite c
                WHERE   c.real1 &gt; q.real1
                        AND c.real1 &lt;= 0.2
                ORDER BY
                        c.real1
                LIMIT 1
                )
        FROM    q
        WHERE   q.real1 IS NOT NULL
        )
SELECT  SUM(LENGTH(stuffing))
FROM    q
JOIN    t_composite c
ON      c.real1 = q.real1
        AND c.real2 &lt;= 0.2;

SELECT  pg_stat_get_blocks_fetched(&#039;ix_composite_real&#039;::regclass);
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>pg_stat_get_blocks_fetched</th>
</tr>
<tr>
<td class="int8">165</td>
</tr>
</table>
</div>
<p>The number of index reads is reduced greatly again.</p>
<h3>Summary</h3>
<p>In some cases, a range predicate (like <q>less than</q>, <q>greater than</q> or <q>between</q>) can be rewritten as an <code>IN</code> predicate against the list of values that could satisfy the range condition.</p>
<p>Depending on the column datatype, check constraints and statistics, that list could be comprised of all possible values defined by the column&#8217;s domain; all possible values defined by the column&#8217;s minimal and maximal value, or all actual distinct values contained in the table. In the latter case, a loose index scan could be used to retrieve the list of such values.</p>
<p>Since an equality condition is applied to each value in the list, more access and join methods could be used to build the query plan, including range conditions on secondary index columns, hash lookups etc.</p>
<p>Whenever the optimizer builds a plan for a query that contains a range predicate, it should consider rewriting the range condition as an <code>IN</code> predicate and use the latter method if it proves more efficient.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/05/19/things-sql-needs-determining-range-cardinality/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>MAX and MIN on a composite index</title>
		<link>http://explainextended.com/2010/05/08/max-and-min-on-a-composite-index/</link>
		<comments>http://explainextended.com/2010/05/08/max-and-min-on-a-composite-index/#comments</comments>
		<pubDate>Sat, 08 May 2010 19:00:47 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4733</guid>
		<description><![CDATA[Answering questions asked on the site. Ivo Radev asks: I am trying to make a very simple query. We have a log table which different machines write to. Given the machine list, I need to find the latest log timestamp. Currently, the query looks like this: SELECT MAX(log_time) FROM log_table WHERE log_machine IN ($machines) , [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Ivo Radev</strong> asks:</p>
<blockquote><p>I am trying to make a very simple query.</p>
<p>We have a log table which different machines write to. Given the machine list, I need to find the latest log timestamp.</p>
<p>Currently, the query looks like this:</p>
<pre class="brush: sql">
SELECT  MAX(log_time)
FROM    log_table
WHERE   log_machine IN ($machines)
</pre>
<p>, and I pass the comma-separated list of <code>$machines</code> from <strong>PHP</strong>.</p>
<p>The weird thing is that the query is literally instant when there is only one machine (any) in the list but slow when there are multiple machines.</p>
<p>I&#8217;m considering doing it in separate queries and then process the results in PHP. However I&#8217;d like to know if there is a fast solution in MySQL.</p></blockquote>
<p>Most probably, there is a composite index on <code>(log_machine, log_time)</code> which is being used for the query.</p>
<p>Usually, a query like this:</p>
<pre class="brush: sql">
SELECT  MAX(log_time)
FROM    log_table
</pre>
<p>on the indexed field <code>log_time</code> can be served with a single index seek on the index.</p>
<p>Indeed, the <code>MAX(log_time)</code>, by definition, is the latest entry in the index order, and can be fetched merely by finding the trailing index entry. It&#8217;s a matter of several page reads in the <code>B-Tree</code>, each one following the rightmost link to the lower-level page.</p>
<p>Similarly, this query:</p>
<pre class="brush: sql">
SELECT  MAX(log_time)
FROM    log_table
WHERE   log_machine = $my_machine
</pre>
<p>can be served with a single index seek too. However, the index should include <code>log_machine</code> as a leading column.</p>
<p>In this case, a set of records satisfying the <code>WHERE</code> clause of the query is represented by a single logically continuous block of records in the index, each one sharing the same value of <code>log_machine</code>. <code>MAX(log_time)</code> will of course be held by the last record in this block. <strong>MySQL</strong> just finds that last record and takes the <code>log_time</code> out of it.</p>
<p>Now, what if we have a condition on multiple values of <code>log_machine</code>?<br />
<span id="more-4733"></span><br />
The index remains the same, but the record holding <code>MAX(log_time)</code> is not the last record in a single continuous block anymore. Instead, there are multiple blocks, each having its own <code>MAX(log_time)</code>. The overall <code>MAX(log_time)</code> cannot be found merely by taking the last record from one index block: it is not known in advance which block holds it.</p>
<p>On composite indexes, however, <strong>MySQL</strong> offers <a href="http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html"><strong>loose index scan</strong></a>. This means that it jumps over the distinct values of the leading column, doing an index seek (instead of index scan) to retrieve each next value.</p>
<p>As stated in the documentation, this method is ideal for queries like this:</p>
<pre class="brush: sql">
SELECT  log_machine, MAX(log_time)
FROM    log_table
WHERE   log_machine IN ($my_machine_list)
GROUP BY
        log_machine
</pre>
<p>As we said earlier, for each <code>log_machine</code>, its <code>MAX(log_time)</code> can be returned very fast, and the list of the <code>log_machines</code> could be obtained with a loose index scan, by seeking the keys in the index.</p>
<p>This query, however, will not produce a single <code>MAX(log_time)</code>: instead, it will return as many maximums as there are values in the list (which are found in the table, of course).</p>
<p>But this can be easily worked around: we just select the greatest one of these records. Since the subquery will only return several records, the greatest one of them can be found almost instantly.</p>
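<p>Spelled out with an explicit subquery, the idea could be sketched like this (the machine names are placeholders for the real list, and the alias <code>q</code> is arbitrary):</p>
<pre class="brush: sql">
SELECT  MAX(maxtime) AS maxtime
FROM    (
        -- one maximum per machine; a candidate for a loose index scan
        SELECT  MAX(log_time) AS maxtime
        FROM    log_table
        WHERE   log_machine IN (&#039;Machine 3&#039;, &#039;Machine 5&#039;)
        GROUP BY
                log_machine
        ) q
</pre>
<p>The outer query only has to compare as many values as there are machines in the list.</p>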
<p>Let&#8217;s create a sample table:</p>
<p><a href="#" onclick="xcollapse('X3511');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X3511" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

CREATE TABLE log_table (
        id INT NOT NULL PRIMARY KEY,
        log_machine VARCHAR(20) NOT NULL,
        log_time DATETIME NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(1000000);
COMMIT;

INSERT
INTO    log_table
SELECT  id,
        CONCAT(&#039;Machine &#039; , CEILING(RAND(20100508) * 10)),
        &#039;2010-05-08&#039; - INTERVAL CEILING(RAND(20100508 &lt;&lt; 1) * 10000000) SECOND
FROM    filler f1;

CREATE INDEX ix_log_machine_time ON log_table (log_machine, log_time);
</pre>
</div>
<p>The table has <strong>1,000,000</strong> records.</p>
<p>Now, let&#8217;s see how the original query performs:</p>
<pre class="brush: sql">
SELECT  MAX(log_time) AS maxtime
FROM    log_table
WHERE   log_machine IN (&#039;Machine 3&#039;, &#039;Machine 5&#039;, &#039;Machine 7&#039;, &#039;Machine 9&#039;)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>maxtime</th>
</tr>
<tr>
<td class="timestamp">2010-05-07 23:59:49</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.6406s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">log_table</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_machine_time</td>
<td class="varchar">ix_log_machine_time</td>
<td class="varchar">62</td>
<td class="varchar"></td>
<td class="bigint">826326</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
select max(`20100508_max`.`log_table`.`log_time`) AS `maxtime` from `20100508_max`.`log_table` where (`20100508_max`.`log_table`.`log_machine` in (&#39;Machine 3&#39;,&#39;Machine 5&#39;,&#39;Machine 7&#39;,&#39;Machine 9&#39;))
</pre>
<p>The query uses <code>range</code> access to retrieve the records and browses all records to find the maximum. It takes <strong>640 ms</strong> on a table of <strong>1,000,000</strong> log records (which is about a day&#8217;s output of a single web server under a decent but not extreme load).</p>
<p>Now, let&#8217;s try to select the greatest of the group-wise maximums:</p>
<pre class="brush: sql">
SELECT  MAX(log_time) AS maxtime
FROM    log_table
WHERE   log_machine IN (&#039;Machine 3&#039;, &#039;Machine 5&#039;, &#039;Machine 7&#039;, &#039;Machine 9&#039;)
GROUP BY
        log_machine
ORDER BY
        1 DESC
LIMIT 1
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>maxtime</th>
</tr>
<tr>
<td class="timestamp">2010-05-07 23:59:49</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0020s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">SIMPLE</td>
<td class="varchar">log_table</td>
<td class="varchar">range</td>
<td class="varchar">ix_log_machine_time</td>
<td class="varchar">ix_log_machine_time</td>
<td class="varchar">62</td>
<td class="varchar"></td>
<td class="bigint">16</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index for group-by; Using temporary; Using filesort</td>
</tr>
</table>
</div>
<pre>
select max(`20100508_max`.`log_table`.`log_time`) AS `maxtime` from `20100508_max`.`log_table` where (`20100508_max`.`log_table`.`log_machine` in (&#39;Machine 3&#39;,&#39;Machine 5&#39;,&#39;Machine 7&#39;,&#39;Machine 9&#39;)) group by `20100508_max`.`log_table`.`log_machine` order by 1 desc limit 1
</pre>
<p>Now, it&#8217;s instant, as it should be.</p>
<p>As often happens, by appending three seemingly redundant clauses to a query we made <strong>MySQL</strong> choose a more efficient plan, and the query is now instant even with multiple machines in the list.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/05/08/max-and-min-on-a-composite-index/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Things SQL needs: MERGE JOIN that would seek</title>
		<link>http://explainextended.com/2010/05/07/things-sql-needs-merge-join-that-would-seek/</link>
		<comments>http://explainextended.com/2010/05/07/things-sql-needs-merge-join-that-would-seek/#comments</comments>
		<pubDate>Fri, 07 May 2010 19:00:53 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4708</guid>
		<description><![CDATA[One of the most known and least used join algorithms in SQL engines is MERGE JOIN. This algorithm operates on two sorted recordsets, keeping two pointers that chase each other. The Wikipedia entry above describes it quite well in terms of algorithms. I&#8217;ll just make an animated GIF to make it more clear: This is [...]]]></description>
			<content:encoded><![CDATA[<p>One of the best known and least used join algorithms in <strong>SQL</strong> engines is <a href="http://en.wikipedia.org/wiki/Merge_join"><code>MERGE JOIN</code></a>.</p>
<p>This algorithm operates on two sorted recordsets, keeping two pointers that chase each other.</p>
<p>The Wikipedia entry above describes it quite well in terms of algorithms. I&#8217;ll just make an animated <strong>GIF</strong> to make it more clear:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/05/test.gif" alt="" title="Merge join" width="520" height="640" class="aligncenter size-full wp-image-4709 noborder" /></p>
<p>This is quite a nice and elegant algorithm, which, unfortunately, has two major drawbacks:</p>
<ol>
<li>It needs the recordsets to be sorted</li>
<li>Even with the recordsets sorted, it is no better than a <code>HASH JOIN</code></li>
</ol>
<p>The sorting part is essential for this algorithm and there is nothing that can be done about it: the recordsets must be sorted, period. Databases, however, often provide the records in sorted order: from clustered tables, indexes, previously sorted and ordered subqueries, spool tables etc.</p>
<p>But even when the recordsets are already sorted, on equijoins the <code>MERGE JOIN</code> is hardly faster than a <code>HASH JOIN</code>.</p>
<p>Why?<br />
<span id="more-4708"></span></p>
<h3>MERGE JOIN vs. HASH JOIN</h3>
<p>Let&#8217;s remember how the <code>HASH JOIN</code> works:</p>
<ul>
<li>It takes the smaller table and builds a hash table out of it, with the join key as the hash key.</li>
<li>Then it takes each record from the larger table and looks it up in the hash table. If found, the records are returned.</li>
</ul>
<p>We see that there are four major steps involved:</p>
<ol>
<li>Scan the smaller table</li>
<li>Build a hash table (i. e. copy each record from the smaller table into the hash slot)</li>
<li>Scan the larger table</li>
<li>Look up the larger table</li>
</ol>
<p>Since building and looking up the hash table are performed in memory (or, depending on the <strong>SQL</strong> engine implementation, in memory-mapped temporary database, which is almost the same), these steps take negligible time compared to the time required to scan the table.</p>
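<p>The four steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the rows are plain dictionaries, everything fits in memory, and the names <code>hash_join</code>, <code>departments</code> and <code>employees</code> are made up for the example.</p>

```python
from collections import defaultdict

def hash_join(smaller, larger, key):
    """Join two lists of dict rows on `key`, hashing the smaller input."""
    # Steps 1 and 2: scan the smaller input and copy each row into its hash slot.
    table = defaultdict(list)
    for row in smaller:
        table[row[key]].append(row)
    # Steps 3 and 4: scan the larger input and probe the hash table for each row.
    result = []
    for row in larger:
        for match in table.get(row[key], []):
            result.append((match, row))
    return result

# Hypothetical sample data, not taken from the article.
departments = [{"dept": 1, "name": "ops"}, {"dept": 2, "name": "dev"}]
employees = [
    {"dept": 2, "who": "ann"},
    {"dept": 3, "who": "bob"},
    {"dept": 2, "who": "cy"},
]

pairs = hash_join(departments, employees, "dept")
print([(d["name"], e["who"]) for d, e in pairs])
```

<p>Both inputs are read exactly once and in whatever order they arrive, which is the property the comparison with <code>MERGE JOIN</code> relies on.</p>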
<p>But we see that <code>MERGE JOIN</code>, as it is implemented now, also requires scanning both recordsets. Each record has to be visited by a pointer to find out whether its join key is greater than, less than or equal to the key under the other pointer.</p>
<p>This means that both <code>MERGE JOIN</code> and <code>HASH JOIN</code> require scanning both recordsets. However, <code>HASH JOIN</code> does not require any special order, which means it can use a table scan, an index fast full scan or any other method to get the records all at once, while <code>MERGE JOIN</code> needs either to sort the records (which is obviously slow) or to traverse the index with subsequent key lookups (which is not fast either).</p>
<p>In some corner cases <code>MERGE JOIN</code> can indeed be more efficient: say, when the hash table does not fit completely into memory and would require either extensive disk writes or several scans over the source tables, while a <code>MERGE JOIN</code> could be performed on a pair of indexes.</p>
<p>It is also efficient for <code>FULL OUTER JOIN</code>: each record is evaluated, returned and forgotten only once, while <code>HASH JOIN</code> would require a second pass over the records that were never matched.</p>
<h3>Seeks instead of scans</h3>
<p>But does the <code>MERGE JOIN</code> really always need to traverse all records?</p>
<p>Let&#8217;s see some more pictures:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/05/scan.png" alt="" title="Scan" width="620" height="480" class="aligncenter size-full wp-image-4713 noborder" /></p>
<p>Here, the right recordset is <strong>100,000</strong> records ahead of the left recordset. With <code>MERGE JOIN</code>, <strong>100,000</strong> records should be scanned from the left recordset and <strong>100,000</strong> comparisons made.</p>
<p>This is unavoidable if the recordset is a result of a sort operation.</p>
<p>However, <code>MERGE JOIN</code> is usually chosen when there is a more efficient sorted row source available: an index or a spool table (temporary index built in runtime). And both these sources allow efficient random seeks.</p>
<p>If an index served as the left recordset, we could see that the right pointer is too far ahead, and just seek for its value in the left recordset instead of scanning <strong>100,000</strong> records:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/05/seek.png" alt="" title="Seek" width="620" height="480" class="aligncenter size-full wp-image-4714 noborder" /></p>
<p>Here, we can see that the right pointer is <strong>100,000</strong> records ahead, so we could advance the left pointer to the position of the right pointer in just a few reads, traversing the <strong>B-Tree</strong>.</p>
<p>Since indexes usually collect statistics, all we would need to do to decide whether to seek or scan is check the histograms to estimate how many records there are between the current and the opposite pointers. If there are too many, the scan cost would outweigh the seek cost and a seek should be performed. The statistics table itself would not need to be queried too often: since the records are always selected in order, the statistics table could also be read sequentially.</p>
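<p>A minimal Python sketch of this seek-or-scan decision follows. It makes several assumptions that are not in any real engine: a sorted list stands in for the index, <code>bisect_left</code> stands in for a <strong>B-Tree</strong> seek, and the <q>histogram</q> is simply every <code>bucket_size</code>-th key. All function names and the <code>seek_threshold</code> parameter are hypothetical.</p>

```python
from bisect import bisect_left

def build_histogram(keys, bucket_size):
    """Keep every bucket_size-th key: a toy stand-in for index statistics."""
    return keys[::bucket_size]

def estimate_gap(histogram, bucket_size, current_key, target_key):
    """Rough number of keys lying between current_key and target_key."""
    behind = bisect_left(histogram, current_key)
    ahead = bisect_left(histogram, target_key)
    return (ahead - behind) * bucket_size

def advance(keys, pos, target, histogram, bucket_size, seek_threshold, stats):
    """Move pos to the first key >= target, seeking only if the gap looks large."""
    gap = estimate_gap(histogram, bucket_size, keys[pos], target)
    if gap >= seek_threshold:
        stats["seeks"] += 1
        return bisect_left(keys, target, pos)   # the "B-Tree seek"
    while pos < len(keys) and keys[pos] < target:
        stats["scanned"] += 1                   # the ordinary sequential step
        pos += 1
    return pos

def merge_join_with_seek(left, right, bucket_size=100, seek_threshold=200):
    hist_left = build_histogram(left, bucket_size)
    hist_right = build_histogram(right, bucket_size)
    stats = {"seeks": 0, "scanned": 0}
    i = j = 0
    matches = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i = advance(left, i, right[j], hist_left,
                        bucket_size, seek_threshold, stats)
        elif left[i] > right[j]:
            j = advance(right, j, left[i], hist_right,
                        bucket_size, seek_threshold, stats)
        else:
            matches.append(left[i])
            i += 1
            j += 1
    return matches, stats

# Scaled-down version of the picture: the right input starts 99,000 keys ahead.
left_keys = list(range(1, 100001))
right_keys = list(range(99001, 199001))
matches, stats = merge_join_with_seek(left_keys, right_keys)
print(len(matches), stats)
```

<p>With these inputs the left pointer catches up in a single seek and no record is stepped over sequentially. When the gap estimate is small, the same function falls back to the plain scan, so closely interleaved inputs keep their sequential access pattern.</p>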
<h3>Emulation</h3>
<p>Let&#8217;s create a couple of <strong>PostgreSQL</strong> tables and see the performance benefit:</p>
<p><a href="#" onclick="xcollapse('X9277');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X9277" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_left (
        id INT NOT NULL PRIMARY KEY,
        good INT NOT NULL,
        bad INT NOT NULL,
        stuffing VARCHAR(200) NOT NULL
);

INSERT
INTO    t_left
SELECT  s, s, s, RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) s;

CREATE UNIQUE INDEX ix_left_good ON t_left (good);

CREATE UNIQUE INDEX ix_left_bad ON t_left (bad);

CREATE TABLE t_right (
        id INT NOT NULL PRIMARY KEY,
        good INT NOT NULL,
        bad INT NOT NULL,
        stuffing VARCHAR(200) NOT NULL
);

INSERT
INTO    t_right
SELECT  s, s, s + 999000, RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) s;

CREATE UNIQUE INDEX ix_right_good ON t_right (good);

CREATE UNIQUE INDEX ix_right_bad ON t_right (bad);
</pre>
</div>
<p>These two tables have <strong>1,000,000</strong> records each, and a common column that would return only <strong>1,000</strong> records in a join.</p>
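<p>The numbers can be checked without a database. The short Python calculation below only restates the <code>INSERT</code> statements above (<code>t_left.bad = s</code>, <code>t_right.bad = s + 999000</code>, both <code>stuffing</code> values 200 characters long); the variable names are made up for the example.</p>

```python
# Key ranges produced by generate_series(1, 1000000) in the two INSERTs.
left_bad = range(1, 1_000_001)                               # t_left.bad = s
right_bad = range(1 + 999_000, 1_000_000 + 999_000 + 1)      # t_right.bad = s + 999000

# Keys present in both ranges are the rows that survive the join on bad.
overlap = range(max(left_bad.start, right_bad.start),
                min(left_bad.stop, right_bad.stop))
matching_rows = len(overlap)

# Each joined row contributes LENGTH(l.stuffing) + LENGTH(r.stuffing) = 200 + 200.
expected_sum = matching_rows * (200 + 200)
print(matching_rows, expected_sum)
```

<p>So any correct plan for the join on <code>bad</code> should return a sum of 400000 over 1,000 matching rows.</p>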
<p>Here&#8217;s how the plain query runs against these tables:</p>
<pre class="brush: sql">
SELECT  SUM(LENGTH(l.stuffing) + LENGTH(r.stuffing))
FROM    t_left l
JOIN    t_right r
ON      r.bad = l.bad
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">400000</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (1.4062s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=71338.33..71338.35 rows=1 width=408)
  -&gt;  Merge Join  (cost=58737.16..68838.33 rows=1000000 width=408)
        Merge Cond: (l.bad = r.bad)
        -&gt;  Index Scan using ix_left_bad on t_left l  (cost=0.00..56287.36 rows=1000000 width=208)
        -&gt;  Index Scan using ix_right_bad on t_right r  (cost=0.00..56287.36 rows=1000000 width=208)
</pre>
<p>Note that <strong>PostgreSQL</strong> used a <code>MERGE JOIN</code> without any tricks from our side. This is because the table records are too large and could not fit into a hash table all at once.</p>
<p>Of course, <strong>PostgreSQL</strong> could store only the record pointers in the hash table and do the record lookups after the join. However, for some reason it would not select this plan.</p>
<p><code>MERGE JOIN</code>, in our case, is quite efficient, since the indexes are read first and the actual records are only looked up for the matched records (which are not too numerous). However, it still requires traversing <strong>2,000,000</strong> records which takes more than a second.</p>
<p>Now, let&#8217;s emulate the <code>MERGE JOIN</code> doing the seeks instead of scans. To do that, we will write a recursive query:</p>
<pre class="brush: sql">
WITH    RECURSIVE q (l, r) AS
        (
        SELECT  (
                SELECT  l
                FROM    t_left l
                ORDER BY
                        bad
                LIMIT 1
                ),
                (
                SELECT  r
                FROM    t_right r
                ORDER BY
                        bad
                LIMIT 1
                )
        UNION ALL
        SELECT  CASE
                WHEN (q.l).bad &lt; (q.r).bad THEN
                        (
                        SELECT  li
                        FROM    t_left li
                        WHERE   li.bad &gt;= (q.r).bad
                        ORDER BY
                                bad
                        LIMIT 1
                        )
                WHEN (q.l).bad = (q.r).bad THEN
                        (
                        SELECT  li
                        FROM    t_left li
                        WHERE   li.bad &gt; (q.r).bad
                        ORDER BY
                                bad
                        LIMIT 1
                        )
                ELSE
                        l
                END,
                CASE
                WHEN (q.r).bad &lt; (q.l).bad THEN
                        (
                        SELECT  ri
                        FROM    t_right ri
                        WHERE   ri.bad &gt;= (q.l).bad
                        ORDER BY
                                bad
                        LIMIT 1
                        )
                WHEN (q.r).bad = (q.l).bad THEN
                        (
                        SELECT  ri
                        FROM    t_right ri
                        WHERE   ri.bad &gt; (q.l).bad
                        ORDER BY
                                bad
                        LIMIT 1
                        )
                ELSE
                        r
                END
        FROM    q
        WHERE   l IS NOT NULL
                AND r IS NOT NULL
        )
SELECT  SUM(LENGTH((q.l).stuffing) + LENGTH((q.r).stuffing))
FROM    q
WHERE   (q.l).bad = (q.r).bad
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">400000</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0481s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=30.94..30.96 rows=1 width=64)
  CTE q
    -&gt;  Recursive Union  (cost=0.11..28.66 rows=101 width=64)
          -&gt;  Result  (cost=0.11..0.12 rows=1 width=0)
                InitPlan 1 (returns $1)
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_left_bad on t_left l  (cost=0.00..56287.36 rows=1000000 width=36)
                InitPlan 2 (returns $2)
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_right_bad on t_right r  (cost=0.00..56287.36 rows=1000000 width=36)
          -&gt;  WorkTable Scan on q  (cost=0.00..2.65 rows=10 width=64)
                Filter: ((q.l IS NOT NULL) AND (q.r IS NOT NULL))
                SubPlan 3
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_left_bad on t_left li  (cost=0.00..19598.69 rows=333333 width=36)
                              Index Cond: (bad &gt;= ($3).bad)
                SubPlan 4
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_left_bad on t_left li  (cost=0.00..19598.69 rows=333333 width=36)
                              Index Cond: (bad &gt; ($3).bad)
                SubPlan 5
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_right_bad on t_right ri  (cost=0.00..19598.69 rows=333333 width=36)
                              Index Cond: (bad &gt;= ($4).bad)
                SubPlan 6
                  -&gt;  Limit  (cost=0.00..0.06 rows=1 width=36)
                        -&gt;  Index Scan using ix_right_bad on t_right ri  (cost=0.00..19598.69 rows=333333 width=36)
                              Index Cond: (bad &gt; ($4).bad)
  -&gt;  CTE Scan on q  (cost=0.00..2.27 rows=1 width=64)
        Filter: ((l).bad = (r).bad)
</pre>
<p>This query makes a seek each time it needs to advance a pointer. This is not the most efficient way, but despite that, the query completes in under <strong>50 ms</strong>, which is almost <strong>30</strong> times as fast as the plain <code>MERGE JOIN</code> query.</p>
<h3>Summary</h3>
<p>With its current implementation, <code>MERGE JOIN</code> is not the most efficient algorithm. However, for several types of queries it outperforms <code>HASH JOIN</code>.</p>
<p>The main drawback of the <code>MERGE JOIN</code> is its inability to use seeks to advance the record pointers. Even if the opposite pointer is far away, the sequential scan is used instead of a <strong>B-Tree</strong> seek, even if the recordset is an index or a spool table.</p>
<p>To improve this, the accumulated index statistics should be taken into account when deciding whether to perform a seek or a sequential scan to catch up with the opposite pointer. If the statistics show a high number of the records in between, an index seek should be used instead of the index scan.</p>
<p>With this improvement, <code>MERGE JOIN</code> would perform much better, especially when joining two large indexed tables. It would require far fewer resources than a <code>HASH JOIN</code>, and, unlike <code>NESTED LOOPS</code>, the seeks would be performed only when really needed, thus preserving the benefits of sequential access to the tables.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/05/07/things-sql-needs-merge-join-that-would-seek/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Groups holding highest ranked items</title>
		<link>http://explainextended.com/2010/04/22/groups-holding-highest-ranked-items/</link>
		<comments>http://explainextended.com/2010/04/22/groups-holding-highest-ranked-items/#comments</comments>
		<pubDate>Thu, 22 Apr 2010 19:00:51 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4700</guid>
		<description><![CDATA[Answering questions asked on the site. Nate asks: I know you&#8217;ve addressed similar issues related to the greatest-per-group query but this seems to be a different take on that. Example table: t_group item_id group_id score 100 1 2 100 2 3 200 1 1 300 1 4 300 2 2 Each item may be in [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Nate</strong> asks:</p>
<blockquote><p>I know you&#8217;ve addressed similar issues related to the <a href="/2009/11/24/mysql-selecting-records-holding-group-wise-maximum-on-a-unique-column/"><q>greatest-per-group</q> query</a> but this seems to be a different take on that.</p>
<p>Example table:</p>
<table class="excel">
<caption>t_group</caption>
<tr>
<th>item_id</th>
<th>group_id</th>
<th>score</th>
</tr>
<tr>
<td>100</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>100</td>
<td>2</td>
<td>3</td>
</tr>
<tr>
<td>200</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>300</td>
<td>1</td>
<td>4</td>
</tr>
<tr>
<td>300</td>
<td>2</td>
<td>2</td>
</tr>
</table>
<p>Each item may be in multiple groups.  Each instance of an item in that group is given a score (how relevant it is to the group).</p>
<p>So given the data above, when querying for group <strong>1</strong> it should return items <strong>200</strong> and <strong>300</strong> (item <strong>100</strong>&#8216;s highest score is for group <strong>2</strong>, so it&#8217;s excluded).
</p></blockquote>
<p>The classical <q>greatest-n-per-group</q> problem requires selecting a single record from each group holding a group-wise maximum. This case is a little bit different: for a given group, we need to select all records holding an item-wise maximum.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4700"></span><br />
<a href="#" onclick="xcollapse('X3879');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X3879" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

CREATE TABLE t_groups (
        item_id INT NOT NULL,
        group_id INT NOT NULL,
        score INT NOT NULL,
        PRIMARY KEY (group_id, item_id),
        KEY ix_groups_gsi (item_id, score, group_id)
) ENGINE=InnoDB;              

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(1000000);
COMMIT;

INSERT
INTO    t_groups
SELECT  (id - 1) % 1000 + 1,
        (id - 1) div 1000 + 1,
        CEILING(RAND(20100422) * 10000)
FROM    filler;
</pre>
</div>
<p>This table contains <strong>1,000,000</strong> records: <strong>1,000</strong> items in <strong>1,000</strong> groups with random scores.</p>
<p>Let&#8217;s write a query which would return us all items whose largest score is in group <strong>1</strong>.</p>
<p>To do this, we need to select all items from group <strong>1</strong> and check that no other group has a greater value of <code>score</code> for that item. The most intuitive query for this would look like this:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_groups go
WHERE   group_id = 1
        AND NOT EXISTS
        (
        SELECT  group_id
        FROM    t_groups gi
        WHERE   gi.item_id = go.item_id
                AND (gi.score, gi.group_id) &gt; (go.score, go.group_id)
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>item_id</th>
<th>group_id</th>
<th>score</th>
</tr>
<tr>
<td class="integer">288</td>
<td class="integer">1</td>
<td class="integer">9997</td>
</tr>
<tr>
<td class="integer">778</td>
<td class="integer">1</td>
<td class="integer">9995</td>
</tr>
<tr>
<td class="integer">970</td>
<td class="integer">1</td>
<td class="integer">9999</td>
</tr>
<tr class="statusbar">
<td colspan="100">3 rows fetched in 0.0002s (0.4210s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">go</td>
<td class="varchar">ref</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">4</td>
<td class="varchar">const</td>
<td class="bigint">1496</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">gi</td>
<td class="varchar">ref</td>
<td class="varchar">ix_groups_gsi</td>
<td class="varchar">ix_groups_gsi</td>
<td class="varchar">4</td>
<td class="varchar">20100422_rank.go.item_id</td>
<td class="bigint">366</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;20100422_rank.go.item_id&#39; of SELECT #2 was resolved in SELECT #1
Field or reference &#39;20100422_rank.go.score&#39; of SELECT #2 was resolved in SELECT #1
Field or reference &#39;20100422_rank.go.group_id&#39; of SELECT #2 was resolved in SELECT #1
select `20100422_rank`.`go`.`item_id` AS `item_id`,`20100422_rank`.`go`.`group_id` AS `group_id`,`20100422_rank`.`go`.`score` AS `score` from `20100422_rank`.`t_groups` `go` where ((`20100422_rank`.`go`.`group_id` = 1) and (not(exists(select `20100422_rank`.`gi`.`group_id` AS `group_id` from `20100422_rank`.`t_groups` `gi` where ((`20100422_rank`.`gi`.`item_id` = `20100422_rank`.`go`.`item_id`) and ((`20100422_rank`.`gi`.`score`,`20100422_rank`.`gi`.`group_id`) &gt; (`20100422_rank`.`go`.`score`,`20100422_rank`.`go`.`group_id`)))))))
</pre>
<p>This query returns <strong>3</strong> items and their scores. We see that these items are not ranked higher in any other group.</p>
<p>However, this query is quite inefficient: it takes almost half a second. This is way too long for <strong>1K</strong> items. Let&#8217;s see how it can be optimized.</p>
<p>If we look into the query plan, we will see that no <code>range</code> access path is used for the subquery, even though the condition could easily be optimized using such an access path. Instead, <strong>MySQL</strong> uses only the <code>ref</code> access on <code>item_id</code>, combined with <code>Using where; Using index</code>.</p>
<p>What does it all mean?</p>
<p>We have the composite index on <code>(item_id, score, group_id)</code>. This means that within the index, the records are ordered first by <code>item_id</code>, then by <code>score</code> and then, in case of a tie, by <code>group_id</code>.</p>
<p>For group <strong>1</strong>, <strong>MySQL</strong> should perform <strong>1,000</strong> comparisons: for each item within the group, the engine should make sure that no other group has a higher score for the same item.</p>
<p>Ideally, <strong>MySQL</strong> should have found each given set of <code>(item_id, score, group_id)</code> in the index and then just make a single next key search to check if this record is last in the index within the given <code>item_id</code>. That would show as a <code>range</code> search in the query plan, since we actually are checking values between <code>(item_id, score, group_id)</code> and <code>(item_id, +INF, +INF)</code>.</p>
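<p>The <q>find the entry, then look at the next key</q> idea can be sketched in Python with a sorted list of tuples standing in for the composite index. This is an assumption-laden illustration, not how any engine stores its indexes; <code>is_item_maximum</code> is a made-up name and the data is the five-row example from the question.</p>

```python
from bisect import bisect_right

# (item_id, score, group_id) tuples, sorted like the composite index.
index = sorted([
    (100, 2, 1),
    (100, 3, 2),
    (200, 1, 1),
    (300, 4, 1),
    (300, 2, 2),
])

def is_item_maximum(index, item_id, score, group_id):
    """True if no entry of the same item sorts after (item_id, score, group_id)."""
    # One seek: the position just past our own entry in the sorted index.
    nxt = bisect_right(index, (item_id, score, group_id))
    # One next-key read: does the following entry still belong to this item?
    return nxt == len(index) or index[nxt][0] != item_id

# Items of group 1 whose own entry is the last one within the item.
group_1 = [(item, score) for item, score, group in index if group == 1]
winners = [item for item, score in group_1
           if is_item_maximum(index, item, score, 1)]
print(winners)
```

<p>Each check costs one seek and one neighbouring read, no matter how many groups the item belongs to.</p>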
<p>However, <strong>MySQL</strong> cannot do such things in a correlated subquery. Instead, it uses the <code>ref</code> access path: takes all records with the given <code>item_id</code> (i. e. between <code>(item_id, -INF, -INF)</code> and <code>(item_id, +INF, +INF)</code>) and traverses them applying the <code>WHERE</code> filter to each record.</p>
<p>For group <strong>1</strong>, only <strong>3</strong> items are returned. This means that for the majority of items (<strong>997</strong> items), the whole index range for the items (<strong>1,000</strong> records per item) had to be scanned.</p>
<p>And since the matching record is always the last one in the index range (because it&#8217;s the one with the greatest score), this means that even for returned items, the whole range had to be scanned too. The only difference is that the last record in the range satisfies the <code>WHERE</code> condition. No wonder it takes so long.</p>
<p>To speed up the query we need to trick <strong>MySQL</strong> a little.</p>
<p>Since it&#8217;s always the last record in the index range that satisfies the <code>WHERE</code> condition, why don&#8217;t we just take it and compare? If we see that it holds our group, we return <code>TRUE</code>; if it doesn&#8217;t, we return <code>FALSE</code> right away.</p>
<p>Here&#8217;s how we can do this:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_groups go
WHERE   group_id = 1
        AND group_id =
        (
        SELECT  group_id
        FROM    t_groups gi
        WHERE   gi.item_id = go.item_id
        ORDER BY
                item_id DESC, score DESC, group_id DESC
        LIMIT 1
        )
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>item_id</th>
<th>group_id</th>
<th>score</th>
</tr>
<tr>
<td class="integer">288</td>
<td class="integer">1</td>
<td class="integer">9997</td>
</tr>
<tr>
<td class="integer">778</td>
<td class="integer">1</td>
<td class="integer">9995</td>
</tr>
<tr>
<td class="integer">970</td>
<td class="integer">1</td>
<td class="integer">9999</td>
</tr>
<tr class="statusbar">
<td colspan="100">3 rows fetched in 0.0002s (0.0100s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">go</td>
<td class="varchar">ref</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">4</td>
<td class="varchar">const</td>
<td class="bigint">1496</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DEPENDENT SUBQUERY</td>
<td class="varchar">gi</td>
<td class="varchar">ref</td>
<td class="varchar">ix_groups_gsi</td>
<td class="varchar">ix_groups_gsi</td>
<td class="varchar">4</td>
<td class="varchar">20100422_rank.go.item_id</td>
<td class="bigint">366</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
Field or reference &#39;20100422_rank.go.item_id&#39; of SELECT #2 was resolved in SELECT #1
select `20100422_rank`.`go`.`item_id` AS `item_id`,`20100422_rank`.`go`.`group_id` AS `group_id`,`20100422_rank`.`go`.`score` AS `score` from `20100422_rank`.`t_groups` `go` where ((`20100422_rank`.`go`.`group_id` = 1) and (1 = (select `20100422_rank`.`gi`.`group_id` AS `group_id` from `20100422_rank`.`t_groups` `gi` where (`20100422_rank`.`gi`.`item_id` = `20100422_rank`.`go`.`item_id`) order by `20100422_rank`.`gi`.`item_id` desc,`20100422_rank`.`gi`.`score` desc,`20100422_rank`.`gi`.`group_id` desc limit 1)))
</pre>
<p>Instead of checking the <em>existence</em> of the last row in a range, we check its <em>value</em>. This means that exactly one index record will be evaluated for each item within the group.</p>
<p>Note that the execution plan used by this query looks <em>exactly</em> the same as the first query&#8217;s one. Same <code>ref</code> condition on <code>item_id</code>, same <code>Using where; Using index</code>. The only difference (not shown in the <code>EXPLAIN</code> output) is that the index is traversed in <em>descending</em> order now. Instead of fetching <strong>1,000</strong> records from the beginning to check the existence of the last one, we just take one record from the end and check its value.</p>
<p>Regular readers of my blog already know why we use the seemingly redundant <code>item_id DESC</code> in the <code>ORDER BY</code> clause here. To make <strong>MySQL</strong> use a descending index access path instead of a filesort in an ordered query, we should list <em>all</em> columns that constitute the index, even if some of them are filtered by the <code>WHERE</code> condition.</p>
<p>The second query completes in only <strong>10 ms</strong> which is <strong>40</strong> times as fast as the original query.</p>
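<p>As a sanity check that the two forms of the query agree, the sketch below runs both against the five-row example from the question. It uses Python&#8217;s built-in <code>sqlite3</code> module as a stand-in for <strong>MySQL</strong>, which is an assumption made purely for convenience: the tuple comparison is spelled out with <code>OR</code>, the aliases are renamed to <code>o</code> and <code>i</code>, and nothing here says anything about <strong>MySQL</strong> timings or plans.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_groups (
        item_id INT NOT NULL,
        group_id INT NOT NULL,
        score INT NOT NULL,
        PRIMARY KEY (group_id, item_id)
    );
    CREATE INDEX ix_groups_gsi ON t_groups (item_id, score, group_id);
    INSERT INTO t_groups (item_id, group_id, score)
    VALUES (100, 1, 2), (100, 2, 3), (200, 1, 1), (300, 1, 4), (300, 2, 2);
""")

# First form: no other row of the same item may rank higher.
not_exists_sql = """
    SELECT  o.item_id
    FROM    t_groups o
    WHERE   o.group_id = ?
            AND NOT EXISTS
            (
            SELECT  1
            FROM    t_groups i
            WHERE   i.item_id = o.item_id
                    AND (i.score > o.score
                         OR (i.score = o.score AND i.group_id > o.group_id))
            )
    ORDER BY o.item_id
"""

# Second form: take the top row of the item and compare its group.
last_row_sql = """
    SELECT  o.item_id
    FROM    t_groups o
    WHERE   o.group_id = ?
            AND o.group_id =
            (
            SELECT  i.group_id
            FROM    t_groups i
            WHERE   i.item_id = o.item_id
            ORDER BY i.item_id DESC, i.score DESC, i.group_id DESC
            LIMIT 1
            )
    ORDER BY o.item_id
"""

def items_for(sql, group_id):
    return [row[0] for row in conn.execute(sql, (group_id,))]

for group_id in (1, 2):
    print(group_id, items_for(not_exists_sql, group_id), items_for(last_row_sql, group_id))
```

<p>For group <strong>1</strong> both forms return items <strong>200</strong> and <strong>300</strong>, as the question expects.</p>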
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/04/22/groups-holding-highest-ranked-items/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hierarchical query in MySQL: limiting parents</title>
		<link>http://explainextended.com/2010/04/18/hierarchical-query-in-mysql-limiting-parents/</link>
		<comments>http://explainextended.com/2010/04/18/hierarchical-query-in-mysql-limiting-parents/#comments</comments>
		<pubDate>Sun, 18 Apr 2010 19:00:08 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4691</guid>
		<description><![CDATA[Answering questions asked on the site. James asks: Your series on hierarchical queries in MySQL is tremendous! I&#8217;m using it to create a series of threaded conversations. I&#8217;m wondering if there is a way to paginate these results. Specifically, let&#8217;s say I want to limit the conversations to return 10 root nodes (parent=0) and all [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>James</strong> asks:</p>
<blockquote><p>Your series on <a href="/2009/03/17/hierarchical-queries-in-mysql/">hierarchical queries in MySQL</a> is tremendous! I&#8217;m using it to create a series of threaded conversations.</p>
<p>I&#8217;m wondering if there is a way to paginate these results.</p>
<p>Specifically, let&#8217;s say I want to limit the conversations to return <strong>10</strong> root nodes (<code>parent=0</code>) and all of their children in a query.</p>
<p>I can&#8217;t just limit the final query, because that will clip off children. I&#8217;ve tried to add <code>LIMIT</code>s to your stored functions, but I&#8217;m not getting the magic just right.</p>
<p>How would you go about doing this?
</p></blockquote>
<p>A quick reminder: <strong>MySQL</strong> does not support recursion (either <code>CONNECT BY</code> style or recursive <strong>CTE</strong> style), so using an adjacency list model is a somewhat complicated task.</p>
<p>However, it is still possible. The main idea is storing the recursion state in a session variable and calling a user-defined function repeatedly to iterate over the tree, thus emulating recursion. The article mentioned in the question shows how to do that.</p>
<p>Normally, reading and assigning session variables in the same query is discouraged in <strong>MySQL</strong>, since the order of evaluation is not guaranteed. However, in this case we only use the table as a dummy recordset and no values of the records are actually used in the function, so the actual values returned by the function are completely defined by the function itself. The table is only used to ensure that the function is called enough times, and to present its results in the form of a native resultset (which can be returned or joined with).</p>
<p>To do something with the logic of the function (like imposing a limit on the parent nodes without doing the same on the child nodes), we should therefore tweak the function code, not the query that calls the function. The only thing that matters in such a query is the number of records returned, and we don&#8217;t know it at design time.</p>
<p>Limiting the parent nodes is quite simple: we just use another session variable to track the number of parent branches yet to be returned and stop processing as soon as the limit is hit, that is, when the variable becomes zero.</p>
<p>Let&#8217;s create a sample table and see how to do this:<br />
<span id="more-4691"></span><br />
<a href="#" onclick="xcollapse('X1510');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X1510" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE filler (
        id INT NOT NULL PRIMARY KEY AUTO_INCREMENT
) ENGINE=Memory;

CREATE TABLE t_hierarchy (
        id int(10) unsigned NOT NULL AUTO_INCREMENT,
        parent int(10) unsigned NOT NULL,
        root INT NOT NULL,
        PRIMARY KEY (id),
        KEY ix_hierarchy_parent (parent, id),
        KEY ix_hierarchy_root (root)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DELIMITER $$

CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
        DECLARE _cnt INT;
        SET _cnt = 1;
        WHILE _cnt &lt;= cnt DO
                INSERT
                INTO    filler
                SELECT  _cnt;
                SET _cnt = _cnt + 1;
        END WHILE;
END
$$

DELIMITER ;

START TRANSACTION;
CALL prc_filler(100000);
COMMIT;

INSERT
INTO    t_hierarchy
SELECT  id,
        CASE (id - 1) % 8
        WHEN 0 THEN 0
        ELSE FLOOR((((id - 1) % 8) + 1) / 2) + ((id - 1) div 8) * 8
        END,
        ((id - 1) div 8) * 8 + 1
FROM    filler;
</pre>
</div>
<p>There are <strong>100,000</strong> hierarchical records in multiple trees (<strong>8</strong> records in each tree).</p>
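<p>Before turning to the stored function, the idea of counting down the root nodes can be sketched in plain Python. This is not a port of the function: <code>build_forest</code>, <code>walk</code> and <code>max_roots</code> are hypothetical names, the forest is kept in memory, and <code>max_roots</code> simply counts the whole trees that are returned. The parent formula is copied from the <code>INSERT</code> statement above.</p>

```python
from collections import defaultdict

def build_forest(n):
    """Children lists for ids 1..n, using the parent formula from the INSERT."""
    children = defaultdict(list)
    for node_id in range(1, n + 1):
        k = (node_id - 1) % 8
        parent = 0 if k == 0 else (k + 1) // 2 + ((node_id - 1) // 8) * 8
        children[parent].append(node_id)
    return children

def walk(children, start_with=0, max_roots=None):
    """Yield (id, level) depth-first, stopping after max_roots whole trees."""
    roots_left = max_roots
    for root in children[start_with]:
        if roots_left is not None:
            if roots_left == 0:
                return              # limit hit: stop before the next root
            roots_left -= 1         # only root nodes consume the limit
        stack = [(root, 1)]
        while stack:
            node_id, level = stack.pop()
            yield node_id, level
            # Push children in reverse so the smallest id is visited first.
            for child in reversed(children.get(node_id, [])):
                stack.append((child, level + 1))

forest = build_forest(24)           # three trees of eight nodes each
for node_id, level in walk(forest, max_roots=2):
    print("    " * (level - 1) + str(node_id))
```

<p>The limit is checked only when a new root is about to be started, so a tree that has been started is always returned with all of its children.</p>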
<p>To limit the number of trees returned, we create the function similar to the one created in the earlier posts and add a little condition that would decrease the session variable, <code>@parent_limit</code>, each time a parent entry is returned. When this variable hits zero, it&#8217;s a signal to stop processing the records:</p>
<pre class="brush: sql">
CREATE FUNCTION hierarchy_connect_by_parent_eq_prior_id(value INT) RETURNS INT
NOT DETERMINISTIC
READS SQL DATA
BEGIN
        DECLARE _id INT;
        DECLARE _parent INT;
        DECLARE _next INT;
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET @id = NULL;

        SET _parent = @id;
        SET _id = -1;

        IF @id IS NULL THEN
                RETURN NULL;
        END IF;

        LOOP
                SELECT  MIN(id)
                INTO    @id
                FROM    t_hierarchy
                WHERE   parent = _parent
                        AND id &gt; _id;
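                IF @id IS NOT NULL OR _parent = @start_with THEN
                        SET @level = @level + 1;
                        -- Top level: the next root was found or the roots are exhausted, count it against the limit
                        IF _parent = @start_with AND @parent_limit &gt; 0 THEN
                                SET @parent_limit = @parent_limit - 1;
                        END IF;
                        -- Limit exhausted: clear @id so that the calling query stops returning rows
                        IF @parent_limit = 0 THEN
                                SET @id = NULL;
                        END IF;
                        RETURN @id;
                END IF;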
                SET @level := @level - 1;
                SELECT  id, parent
                INTO    _id, _parent
                FROM    t_hierarchy
                WHERE   id = _parent;
        END LOOP;
END;
</pre>
<p>Let&#8217;s check it:</p>
<pre class="brush: sql">
SELECT  CONCAT(REPEAT(&#039;    &#039;, level - 1), CAST(hi.id AS CHAR)) AS treeitem, parent, level, lmt
FROM    (
        SELECT  hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level, @parent_limit as lmt
        FROM    (
                SELECT  @start_with := 0,
                        @parent_limit := 4,
                        @id := @start_with,
                        @level := 0
                ) vars, t_hierarchy
        WHERE   @id IS NOT NULL
        ) ho
JOIN    t_hierarchy hi
ON      hi.id = ho.id
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>treeitem</th>
<th>parent</th>
<th>level</th>
<th>lmt</th>
</tr>
<tr>
<td class="blob">1</td>
<td class="integer">0</td>
<td class="blob">1</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">    2</td>
<td class="integer">1</td>
<td class="blob">2</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">        4</td>
<td class="integer">2</td>
<td class="blob">3</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">            8</td>
<td class="integer">4</td>
<td class="blob">4</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">        5</td>
<td class="integer">2</td>
<td class="blob">3</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">    3</td>
<td class="integer">1</td>
<td class="blob">2</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">        6</td>
<td class="integer">3</td>
<td class="blob">3</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">        7</td>
<td class="integer">3</td>
<td class="blob">3</td>
<td class="blob">3</td>
</tr>
<tr>
<td class="blob">9</td>
<td class="integer">0</td>
<td class="blob">1</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">    10</td>
<td class="integer">9</td>
<td class="blob">2</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">        12</td>
<td class="integer">10</td>
<td class="blob">3</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">            16</td>
<td class="integer">12</td>
<td class="blob">4</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">        13</td>
<td class="integer">10</td>
<td class="blob">3</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">    11</td>
<td class="integer">9</td>
<td class="blob">2</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">        14</td>
<td class="integer">11</td>
<td class="blob">3</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">        15</td>
<td class="integer">11</td>
<td class="blob">3</td>
<td class="blob">2</td>
</tr>
<tr>
<td class="blob">17</td>
<td class="integer">0</td>
<td class="blob">1</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">    18</td>
<td class="integer">17</td>
<td class="blob">2</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">        20</td>
<td class="integer">18</td>
<td class="blob">3</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">            24</td>
<td class="integer">20</td>
<td class="blob">4</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">        21</td>
<td class="integer">18</td>
<td class="blob">3</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">    19</td>
<td class="integer">17</td>
<td class="blob">2</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">        22</td>
<td class="integer">19</td>
<td class="blob">3</td>
<td class="blob">1</td>
</tr>
<tr>
<td class="blob">        23</td>
<td class="integer">19</td>
<td class="blob">3</td>
<td class="blob">1</td>
</tr>
<tr class="statusbar">
<td colspan="100">24 rows fetched in 0.0008s (0.0711s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">25</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">hi</td>
<td class="varchar">eq_ref</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">4</td>
<td class="varchar">ho.id</td>
<td class="bigint">1</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">&lt;derived3&gt;</td>
<td class="varchar">system</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">1</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">t_hierarchy</td>
<td class="varchar">index</td>
<td class="varchar"></td>
<td class="varchar">PRIMARY</td>
<td class="varchar">4</td>
<td class="varchar"></td>
<td class="bigint">100650</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
<tr>
<td class="bigint">3</td>
<td class="varchar">DERIVED</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint"></td>
<td class="double"></td>
<td class="varchar">No tables used</td>
</tr>
</table>
</div>
<pre>
select concat(repeat(&#39;    &#39;,(`ho`.`level` - 1)),cast(`20100418_limit`.`hi`.`id` as char charset latin1)) AS `treeitem`,`20100418_limit`.`hi`.`parent` AS `parent`,`ho`.`level` AS `level`,`ho`.`lmt` AS `lmt` from (select `hierarchy_connect_by_parent_eq_prior_id`(`20100418_limit`.`t_hierarchy`.`id`) AS `id`,(@level) AS `level`,(@parent_limit) AS `lmt` from (select (@start_with:=0) AS `@start_with := 0`,(@parent_limit:=4) AS `@parent_limit := 4`,(@id:=(@start_with)) AS `@id := @start_with`,(@level:=0) AS `@level := 0`) `vars` join `20100418_limit`.`t_hierarchy` where ((@id) is not null)) `ho` join `20100418_limit`.`t_hierarchy` `hi` where (`20100418_limit`.`hi`.`id` = `ho`.`id`)
</pre>
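<p>This returns the first <strong>3</strong> branches in proper hierarchical order, almost instantly. Note that we need to set <code>@parent_limit</code> to <strong>4</strong>, i.e. one greater than the number of branches we want: the counter is decremented each time a root is reached, and the root that brings it to zero is discarded rather than returned.</p>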
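<p>To avoid hard-coding the off-by-one, the limit can be derived from the number of trees wanted. This is only a sketch: <code>@trees</code> is a variable name I made up for illustration, and the query otherwise repeats the one above (without the <code>lmt</code> column), so it relies on the same session-variable evaluation order:</p>
<pre class="brush: sql">
-- Number of trees we actually want (hypothetical helper variable)
SET @trees = 3;

SELECT  CONCAT(REPEAT(&#039;    &#039;, level - 1), CAST(hi.id AS CHAR)) AS treeitem, parent, level
FROM    (
        SELECT  hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level
        FROM    (
                SELECT  @start_with := 0,
                        -- one more than requested: the root that zeroes the counter is discarded
                        @parent_limit := @trees + 1,
                        @id := @start_with,
                        @level := 0
                ) vars, t_hierarchy
        WHERE   @id IS NOT NULL
        ) ho
JOIN    t_hierarchy hi
ON      hi.id = ho.id;
</pre>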
<p>Note that it is possible to achieve similar behavior without using any recursive functions at all.</p>
<p>On most forums the filtering is performed on the topic starters, so it&#8217;s often a good idea to store the id of the topic starter along with each reply. In the sample table I did that as well: the <code>root</code> column holds the id of the record that starts the tree.</p>
<p>With this, the filtering required becomes very simple:</p>
<pre class="brush: sql">
SELECT  h.*
FROM    (
        SELECT  id
        FROM    t_hierarchy
        WHERE   parent = 0
        ORDER BY
                id
        LIMIT 3
        ) q
JOIN    t_hierarchy h
ON      h.root = q.id
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>root</th>
</tr>
<tr>
<td class="integer">1</td>
<td class="integer">0</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">2</td>
<td class="integer">1</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">3</td>
<td class="integer">1</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">4</td>
<td class="integer">2</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">5</td>
<td class="integer">2</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">6</td>
<td class="integer">3</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">7</td>
<td class="integer">3</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">8</td>
<td class="integer">4</td>
<td class="integer">1</td>
</tr>
<tr>
<td class="integer">9</td>
<td class="integer">0</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">10</td>
<td class="integer">9</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">11</td>
<td class="integer">9</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">12</td>
<td class="integer">10</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">13</td>
<td class="integer">10</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">14</td>
<td class="integer">11</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">15</td>
<td class="integer">11</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">16</td>
<td class="integer">12</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">17</td>
<td class="integer">0</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">18</td>
<td class="integer">17</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">19</td>
<td class="integer">17</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">20</td>
<td class="integer">18</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">21</td>
<td class="integer">18</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">22</td>
<td class="integer">19</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">23</td>
<td class="integer">19</td>
<td class="integer">17</td>
</tr>
<tr>
<td class="integer">24</td>
<td class="integer">20</td>
<td class="integer">17</td>
</tr>
<tr class="statusbar">
<td colspan="100">24 rows fetched in 0.0007s (0.0024s)</td>
</tr>
</table>
</div>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>filtered</th>
<th>Extra</th>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">&lt;derived2&gt;</td>
<td class="varchar">ALL</td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="varchar"></td>
<td class="bigint">3</td>
<td class="double">100.00</td>
<td class="varchar"></td>
</tr>
<tr>
<td class="bigint">1</td>
<td class="varchar">PRIMARY</td>
<td class="varchar">h</td>
<td class="varchar">ref</td>
<td class="varchar">ix_hierarchy_root</td>
<td class="varchar">ix_hierarchy_root</td>
<td class="varchar">4</td>
<td class="varchar">q.id</td>
<td class="bigint">4</td>
<td class="double">100.00</td>
<td class="varchar">Using where</td>
</tr>
<tr>
<td class="bigint">2</td>
<td class="varchar">DERIVED</td>
<td class="varchar">t_hierarchy</td>
<td class="varchar">ref</td>
<td class="varchar">ix_hierarchy_parent</td>
<td class="varchar">ix_hierarchy_parent</td>
<td class="varchar">4</td>
<td class="varchar"></td>
<td class="bigint">16830</td>
<td class="double">100.00</td>
<td class="varchar">Using where; Using index</td>
</tr>
</table>
</div>
<pre>
select `20100418_limit`.`h`.`id` AS `id`,`20100418_limit`.`h`.`parent` AS `parent`,`20100418_limit`.`h`.`root` AS `root` from (select `20100418_limit`.`t_hierarchy`.`id` AS `id` from `20100418_limit`.`t_hierarchy` where (`20100418_limit`.`t_hierarchy`.`parent` = 0) order by `20100418_limit`.`t_hierarchy`.`id` limit 3) `q` join `20100418_limit`.`t_hierarchy` `h` where (`20100418_limit`.`h`.`root` = `q`.`id`)
</pre>
<p>This solution, which has been used by numerous forum engines for ages, is more efficient, since no function calls are involved (<strong>0.0024s</strong> against <strong>0.0711s</strong> in the runs above), and simpler too.</p>
<p>However, it does not preserve the hierarchical order and does not allow sorting on anything but the topic starter, so if you need either of these, the recursive function is still the way to go.</p>
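<p>If keeping the records of each tree together is enough, an explicit <code>ORDER BY</code> on the root can be added to the query above. This is a minimal sketch, not a replacement for the recursive function: within a tree the rows come back in <code>id</code> order, which is not the depth-first order shown earlier:</p>
<pre class="brush: sql">
SELECT  h.*
FROM    (
        SELECT  id
        FROM    t_hierarchy
        WHERE   parent = 0
        ORDER BY
                id
        LIMIT 3
        ) q
JOIN    t_hierarchy h
ON      h.root = q.id
-- group the rows by tree, then by id within each tree
ORDER BY
        h.root, h.id;
</pre>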
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/04/18/hierarchical-query-in-mysql-limiting-parents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
