<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>EXPLAIN EXTENDED &#187; PostgreSQL</title>
	<atom:link href="http://explainextended.com/category/postgresql/feed/" rel="self" type="application/rss+xml" />
	<link>http://explainextended.com</link>
	<description>How to create fast database queries</description>
	<lastBuildDate>Mon, 02 Jan 2012 00:31:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>PostgreSQL: parametrizing a recursive CTE</title>
		<link>http://explainextended.com/2010/12/24/postgresql-parametrizing-a-recursive-cte/</link>
		<comments>http://explainextended.com/2010/12/24/postgresql-parametrizing-a-recursive-cte/#comments</comments>
		<pubDate>Fri, 24 Dec 2010 20:00:45 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=5146</guid>
		<description><![CDATA[The anchor part of a recursive CTE cannot be easily parametrized in a view. To work around this, we can wrap the CTE in a set-returning function that accepts the parameter and uses it in the anchor part.]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Jan Suchal</strong> asks:</p>
<blockquote>
<p>We&#8217;ve started playing with <strong>PostgreSQL</strong> and recursive queries. Looking at example that does basic graph traversal from <a href="http://www.postgresql.org/docs/9.0/static/queries-with.html">http://www.postgresql.org/docs/9.0/static/queries-with.html</a>.</p>
<p>We would like to have a view called <code>paths</code> defined like this:</p>
<pre class="brush: sql">
WITH RECURSIVE
        search_graph(id, path) AS
        (
        SELECT  id, ARRAY[id]
        FROM    node
        UNION ALL
        SELECT  g.dest, sg.path || g.dest
        FROM    search_graph sg
        JOIN    graph g
        ON      g.source = sg.id
                AND NOT g.dest = ANY(sg.path)
        )
SELECT  path
FROM    search_graph
</pre>
<p>By calling</p>
<pre class="brush: sql">
SELECT  *
FROM    paths
WHERE   path[1] = :node_id
</pre>
<p>we would get all paths from a certain node.</p>
<p>The problem here is with performance. When you want this to be quick you need to add a condition for the anchor part of the <code>UNION</code> like this:</p>
<pre class="brush: sql">
WITH RECURSIVE
        search_graph(id, path) AS
        (
        SELECT  id, ARRAY[id]
        FROM    node
        WHERE   id = :node_id
        UNION ALL
        SELECT  g.dest, sg.path || g.dest
        FROM    search_graph sg
        JOIN    graph g
        ON      g.source = sg.id
                AND NOT g.dest = ANY(sg.path)
        )
SELECT  path
FROM    search_graph
</pre>
<p>Now it&#8217;s perfectly fast, but we cannot create a view because that would only contain paths from one specific node.</p>
<p>Any ideas?</p>
</blockquote>
<p>An often overlooked feature of <strong>PostgreSQL</strong> is its ability to create set-returning functions and use them in the <code>SELECT</code> list.</p>
<p>Each record is cross-joined with the set the function returns for it, and the result of the join is added to the resultset.</p>
<p>This is best demonstrated with <code>generate_series</code>, probably the most used <strong>PostgreSQL</strong> set-returning function.<br />
<span id="more-5146"></span></p>
<h3>Emulating CROSS APPLY with set-returning functions</h3>
<p>Let&#8217;s write a simple query that returns numbers from <strong>1</strong> to <strong>3</strong>:</p>
<pre class="brush: sql">
SELECT  id
FROM    (
        VALUES
        (1),
        (2),
        (3)
        ) vals (id)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
</tr>
<tr>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
</tr>
</table>
</div>
<p>Now, let&#8217;s add <code>generate_series(1, id)</code> to the <code>SELECT</code> list. As we can see, each of the three values is passed as an argument to <code>generate_series</code>:</p>
<pre class="brush: sql">
SELECT  id, generate_series(1, id)
FROM    (
        VALUES
        (1),
        (2),
        (3)
        ) vals (id)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>generate_series</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
</table>
</div>
<p>We see that each record of the original query was cross-joined with the set returned by the function. The final resultset has <strong>6</strong> records instead of <strong>3</strong>, because the three cross-joins produced sets of <strong>1</strong>, <strong>2</strong> and <strong>3</strong> records, respectively.</p>
<p>This is almost what <code>CROSS APPLY</code> does in <strong>SQL Server</strong>, but with some limitations and caveats.</p>
<h3>Limitations of set-returning functions</h3>
<ul>
<li>
<p>One should create a function explicitly. Anonymous blocks won&#8217;t work.</p>
</li>
<li>
<p>The functions should be written in <strong>SQL</strong> or <strong>C</strong>. Procedural language set-returning functions cannot be used in <code>SELECT</code> lists.</p>
</li>
<li>
<p>If the function returns an empty set, the result of the cross join will be an empty set too (i.e. the corresponding record won&#8217;t be returned at all). This is exactly how <code>CROSS APPLY</code> behaves in <strong>SQL Server</strong>.</p>
<p>However, the latter supports <code>OUTER APPLY</code> which always returns a single record with a <code>NULL</code> in corresponding fields instead of the empty set, and it&#8217;s impossible to emulate it in <strong>PostgreSQL</strong> using this syntax.</p>
<p>A workaround would be to develop the function so that it would return a single <code>NULL</code> instead of an empty set.</p>
</li>
<li>
<p>The function should be marked <code>VOLATILE</code> so that the optimizer executes it each time it&#8217;s called. Extra care is needed when several set-returning functions are used in the same <code>SELECT</code> list, like this:</p>
<pre class="brush: sql">
SELECT  generate_series(1, 3), generate_series(1, 2)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>generate_series</th>
<th>generate_series</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
</tr>
</table>
</div>
<p>This query returns <strong>6</strong> records, the same number a cross join of the two sets would give.</p>
<pre class="brush: sql">
SELECT  generate_series(1, 3), generate_series(1, 3)
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>generate_series</th>
<th>generate_series</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
</table>
</div>
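<p>This query returns only <strong>3</strong> records, not the <strong>9</strong> that a cross join of the two sets would give. In <strong>PostgreSQL</strong> versions before <strong>10</strong>, several set-returning functions in the same <code>SELECT</code> list are advanced in parallel, and the number of records returned is the least common multiple of the set sizes: <strong>6</strong> for sets of <strong>3</strong> and <strong>2</strong> records, but only <strong>3</strong> for two sets of <strong>3</strong> records. So this syntax does not behave as a true cross join when more than one function is used.</p>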
</li>
</ul>
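<p>The workaround for the missing <code>OUTER APPLY</code> described in the list above could look roughly like this. It is only a sketch: <code>fn_series_or_null</code> is a hypothetical function made up for this illustration. It wraps <code>generate_series</code> and adds a single <code>NULL</code> row when the series would be empty:</p>
<pre class="brush: sql">
-- Hypothetical wrapper (not part of PostgreSQL): returns 1 .. n,
-- or a single NULL when that series would be empty
CREATE OR REPLACE FUNCTION fn_series_or_null(n INT)
RETURNS SETOF INT
AS
$$
        SELECT  generate_series(1, $1)
        UNION ALL
        SELECT  NULL::INT
        WHERE   $1 IS NULL OR $1 &lt; 1;
$$
LANGUAGE sql
VOLATILE;

-- The record with id = 0 is kept, with a NULL in the second column
SELECT  id, fn_series_or_null(id)
FROM    (
        VALUES
        (0),
        (2)
        ) vals (id)
</pre>
<p>With this function, the record with <code>id = 0</code> should stay in the resultset instead of being dropped, which is what <code>OUTER APPLY</code> would do.</p>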
<h3>Parametrizing the path query</h3>
<p>Now we know enough to create our own function and test it. To do this, we will first create some sample tables:</p>
<p><a href="#" onclick="xcollapse('X2468');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X2468" style="display: none; background: transparent;">
<pre class="brush: sql">
CREATE TABLE node
        (
        id BIGINT NOT NULL PRIMARY KEY,
        name VARCHAR(100) NOT NULL
        );

CREATE TABLE graph
        (
        source BIGINT NOT NULL,
        dest BIGINT NOT NULL,
        data FLOAT NOT NULL,
        PRIMARY KEY (source, dest)
        );

CREATE INDEX
        ix_graph_dest
ON      graph (dest);

SELECT  SETSEED(0.20101224);

INSERT
INTO    node
SELECT  s, &#039;Node &#039; || s
FROM    generate_series(1, 10000) s;

INSERT
INTO    graph (source, dest, data)
SELECT  source, dest, RANDOM()
FROM    (
        SELECT  DISTINCT
                CEIL(RANDOM() * 10000) AS source,
                CEIL(RANDOM() * 10000) AS dest
        FROM    generate_series(1, 10000)
        ) q
WHERE   source &lt;&gt; dest
</pre>
</div>
<p>There are <strong>10,000</strong> nodes with <strong>9,996</strong> random edges between them.</p>
<p>Here&#8217;s what the function would look like:</p>
<pre class="brush: sql">
CREATE OR REPLACE FUNCTION fn_search_graph_cte(parent BIGINT)
RETURNS TABLE
        (
        path BIGINT[]
        )
AS
$$
        WITH RECURSIVE
                search_graph(id, path) AS
                (
                SELECT  $1, ARRAY[$1]
                UNION ALL
                SELECT  g.dest, sg.path || g.dest
                FROM    search_graph sg
                JOIN    graph g
                ON      g.source = sg.id
                        AND NOT g.dest = ANY(sg.path)
                )
        SELECT  path
        FROM    search_graph;
$$
LANGUAGE sql
VOLATILE;
</pre>
<p>The anchor part of the recursive <strong>CTE</strong> is parametrized.</p>
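<p>Apart from the <code>SELECT</code> list, the function can also be called in the <code>FROM</code> clause, which comes close to the filtered view from the question. A minimal usage sketch, where <code>:node_id</code> is the same client-side placeholder the question uses:</p>
<pre class="brush: sql">
-- Paths starting from a single node, passed as a parameter
SELECT  path
FROM    fn_search_graph_cte(:node_id)
</pre>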
<h3>Paths from a single node</h3>
<p>Now, let&#8217;s compare the performance of the queries that select paths from a single node:</p>
<pre class="brush: sql">
WITH RECURSIVE
        search_graph(id, path) AS
        (
        SELECT  id, ARRAY[id]
        FROM    node
        UNION ALL
        SELECT  g.dest, sg.path || g.dest
        FROM    search_graph sg
        JOIN    graph g
        ON      g.source = sg.id
                AND NOT g.dest = ANY(sg.path)
        )
SELECT  path::TEXT
FROM    search_graph
WHERE   path[1] = 69
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>path</th>
</tr>
<tr>
<td class="text">{69}</td>
</tr>
<tr>
<td class="text">{69,3804}</td>
</tr>
<tr>
<td class="text">{69,3642}</td>
</tr>
<tr>
<td class="text">{69,3642,3768}</td>
</tr>
<tr>
<td class="text">{69,3642,2925}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683,8668}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683,8668,3705}</td>
</tr>
<tr class="statusbar">
<td colspan="100">8 rows fetched in 0.0002s (0.5156s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on search_graph  (cost=151096.75..187130.22 rows=7999 width=32)
  Filter: (path[1] = 69)
  CTE search_graph
    -&gt;  Recursive Union  (cost=0.00..151096.75 rows=1599710 width=40)
          -&gt;  Seq Scan on node  (cost=0.00..164.00 rows=10000 width=8)
          -&gt;  Hash Join  (cost=288.91..11893.85 rows=158971 width=40)
                Hash Cond: (sg.id = g.source)
                Join Filter: (g.dest &lt;&gt; ALL (sg.path))
                -&gt;  WorkTable Scan on search_graph sg  (cost=0.00..2000.00 rows=100000 width=40)
                -&gt;  Hash  (cost=163.96..163.96 rows=9996 width=16)
                      -&gt;  Seq Scan on graph g  (cost=0.00..163.96 rows=9996 width=16)
</pre>
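<p>As the plan shows, the filter <code>path[1] = 69</code> is applied on top of the <strong>CTE</strong> scan: the query materializes the whole <strong>CTE</strong> and only then filters it for the paths beginning at <strong>69</strong>. It takes more than <strong>500 ms</strong>.</p>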
<pre class="brush: sql">
SELECT  fn_search_graph_cte(id)::TEXT
FROM    node
WHERE   id = 69
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>fn_search_graph_cte</th>
</tr>
<tr>
<td class="text">{69}</td>
</tr>
<tr>
<td class="text">{69,3642}</td>
</tr>
<tr>
<td class="text">{69,3804}</td>
</tr>
<tr>
<td class="text">{69,3642,2925}</td>
</tr>
<tr>
<td class="text">{69,3642,3768}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683,8668}</td>
</tr>
<tr>
<td class="text">{69,3642,2925,5683,8668,3705}</td>
</tr>
<tr class="statusbar">
<td colspan="100">8 rows fetched in 0.0003s (0.0030s)</td>
</tr>
</table>
</div>
<pre>
Index Scan using node_pkey on node  (cost=0.00..8.52 rows=1 width=8)
  Index Cond: (id = 69)
</pre>
<p>This query uses the function. Since the anchor part of the <strong>CTE</strong> yields only one record, it is much faster and completes in <strong>3 ms</strong>.</p>
<h3>All paths</h3>
<p>Let&#8217;s check how long it takes to return all paths.</p>
<p>First, let&#8217;s use the <strong>CTE</strong>:</p>
<pre class="brush: sql">
WITH RECURSIVE
        search_graph(id, path) AS
        (
        SELECT  id, ARRAY[id]
        FROM    node
        UNION ALL
        SELECT  g.dest, sg.path || g.dest
        FROM    search_graph sg
        JOIN    graph g
        ON      g.source = sg.id
                AND NOT g.dest = ANY(sg.path)
        )
SELECT  COUNT(*), AVG(ARRAY_LENGTH(path, 1))
FROM    search_graph
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
<th>avg</th>
</tr>
<tr>
<td class="int8">159522</td>
<td class="numeric">10.3358157495517860</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.5469s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=191089.51..191089.52 rows=1 width=32)
  CTE search_graph
    -&gt;  Recursive Union  (cost=0.00..151096.75 rows=1599710 width=40)
          -&gt;  Seq Scan on node  (cost=0.00..164.00 rows=10000 width=8)
          -&gt;  Hash Join  (cost=288.91..11893.85 rows=158971 width=40)
                Hash Cond: (sg.id = g.source)
                Join Filter: (g.dest &lt;&gt; ALL (sg.path))
                -&gt;  WorkTable Scan on search_graph sg  (cost=0.00..2000.00 rows=100000 width=40)
                -&gt;  Hash  (cost=163.96..163.96 rows=9996 width=16)
                      -&gt;  Seq Scan on graph g  (cost=0.00..163.96 rows=9996 width=16)
  -&gt;  CTE Scan on search_graph  (cost=0.00..31994.20 rows=1599710 width=32)
</pre>
<p>Now, the function:</p>
<pre class="brush: sql">
SELECT  COUNT(*), AVG(ARRAY_LENGTH(path, 1))
FROM    (
        SELECT  fn_search_graph_cte(id) AS path
        FROM    node
        ) q
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
<th>avg</th>
</tr>
<tr>
<td class="int8">159522</td>
<td class="numeric">10.3358157495517860</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (2.1093s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=2814.01..2814.02 rows=1 width=32)
  -&gt;  Seq Scan on node  (cost=0.00..2664.00 rows=10000 width=8)
</pre>
<p>We see that the function introduces some overhead: it needs to be called <strong>10,000</strong> times, and this takes about <strong>2</strong> seconds (as opposed to about <strong>500 ms</strong> for the plain <strong>CTE</strong>). At this rate, the function is more efficient as long as the number of root nodes is less than roughly <strong>25%</strong> of all nodes.</p>
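<p>The break-even estimate can be reproduced from the timings reported above. This is only back-of-the-envelope arithmetic on two single measurements, so the result is approximate:</p>
<pre class="brush: sql">
-- 2.1093 s for 10,000 function calls vs. 0.5469 s for the plain CTE
SELECT  2.1093 / 10000 * 1000 AS ms_per_call,
        0.5469 / (2.1093 / 10000) AS break_even_calls
</pre>
<p>This gives roughly <strong>0.21 ms</strong> per call and a break-even point of about <strong>2,600</strong> calls, or about a quarter of the <strong>10,000</strong> nodes.</p>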
<h3>Conclusion</h3>
<p>A set-returning function is a good replacement for a view over a recursive <strong>CTE</strong> whose anchor part cannot be easily parametrized.</p>
<p>The sets returned by the function, when called in a <code>SELECT</code> list, are cross-joined with the corresponding records. When the filter on the <strong>CTE</strong> is selective, the function will only be applied a few times, to the records satisfying the filter.</p>
<p>A function call, however, introduces some overhead which should be taken into account. In some cases, when the filter is not very selective, it is better to use the benefits of set-based operations to generate the view and then filter it than to call a function multiple times, adding the overhead of each call.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/12/24/postgresql-parametrizing-a-recursive-cte/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Date ranges: overlapping with priority</title>
		<link>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/</link>
		<comments>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 19:00:31 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4663</guid>
		<description><![CDATA[Answering questions asked on the site. Jason Foster asks: We have a table of student registrations: Students student_code course_code course_section session_cd 987654321 ESC102H1 Y 20085 998766543 ELEE203H F 20085 course_code and course_section identify a course, session_cd is an academic session, e. g. 20085, 20091, 20079. The courses (stored in another table) have associated values for [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Jason Foster</strong> asks:</p>
<blockquote>
<p>We have a table of student registrations: </p>
<table class="excel">
<caption>Students</caption>
<tr>
<th>student_code</th>
<th>course_code</th>
<th>course_section</th>
<th>session_cd</th>
</tr>
<tr>
<td>987654321</td>
<td>ESC102H1</td>
<td>Y</td>
<td>20085</td>
</tr>
<tr>
<td>998766543</td>
<td>ELEE203H</td>
<td>F</td>
<td>20085</td>
</tr>
</table>
<p><code>course_code</code> and <code>course_section</code> identify a course, <code>session_cd</code> is an academic session, e. g. <strong>20085</strong>, <strong>20091</strong>, <strong>20079</strong>.</p>
<p>The courses (stored in another table) have associated values for <q>engineering design</q>, <q>complementary studies</q>, etc., like that:</p>
<table class="excel">
<caption>Courses</caption>
<tr>
<th>course_code</th>
<th>course_section</th>
<th>start_session</th>
<th>end_session</th>
<th>design</th>
<th>science</th>
<th>studies</th>
</tr>
<tr>
<td>ESC102H1</td>
<td>F</td>
<td>20071</td>
<td>20099</td>
<td>10</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>AER201Y1</td>
<td>Y</td>
<td>20059</td>
<td>NULL</td>
<td>0</td>
<td>0</td>
<td>30</td>
</tr>
</table>
<p>, or like that:</p>
<table class="excel">
<caption>In-house courses</caption>
<tr>
<th>course_code</th>
<th>course_section</th>
<th>student_code</th>
<th>design</th>
<th>science</th>
<th>studies</th>
</tr>
<tr>
<td>ESC102H1</td>
<td>F</td>
<td>998766543</td>
<td>10</td>
<td>0</td>
<td>0</td>
</tr>
</table>
<p>We are required by an external accreditation body to add up all of the units of <q>engineering design</q>, <q>complementary studies</q>, etc., taken by an individual student.</p>
<p>Where it gets really messy is that we have multiple data feeds for the associated values of courses. For example we have a set from the <strong>Registrar&#8217;s Office</strong>, the <strong>Civil Department</strong>, our <strong>In-House</strong> version, etc.</p>
<p>The rule is that <strong>In-House</strong> beats <strong>Civil</strong> beats the <strong>Registrar&#8217;s Office</strong> in the case of any duplication within the overlapping intervals.</p>
<p>The <code>session_cd</code> is of the form <code>YYYY{1,5,9}</code>.</p>
</blockquote>
<p>Basically, we have three sets here.</p>
<p>To get the course hours for a given student, we should find the student&#8217;s record in the in-house set or, failing that, check whether the session falls within a range in one of the external sets (<strong>Civil</strong> or <strong>Registrar</strong>). If ranges from both sets contain the academic session in which the student took the course, <strong>Civil</strong> should be taken.</p>
<p>The first part is quite simple: we just <code>LEFT JOIN</code> students with the in-house courses and get the hours for the courses that have a match. The real problem is the next part: searching for the ranges containing a given value.</p>
<p>As I already mentioned in previous posts, relational databases are generally not that efficient for queries like this. It&#8217;s easy to use an index to find a column value within a given range, but <strong>B-Tree</strong> indexes are of little help in searching for a range, defined by two columns, that contains a given value.</p>
<p>However, in this case, the <a href="http://en.wikipedia.org/wiki/Data_domain">data domain</a> of <code>session_cd</code> is quite a limited set. For each pair of <code>session_start</code> and <code>session_end</code> it is easy to create a set of <em>all</em> possible values between <code>session_start</code> and <code>session_end</code>.</p>
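<p>As an illustration, here is how a single range could be expanded into its sessions. The sketch assumes the six-digit encoding used by the sample tables later in this post (<code>year * 100 + term</code>, e.g. <strong>200805</strong>) and borrows the conversion expressions from the final query. The range itself is made up:</p>
<pre class="brush: sql">
-- Convert both ends of the range to consecutive ordinals (three per year),
-- generate every ordinal in between, then convert back to session codes
SELECT  cs,
        (cs / 3) * 100 + (ARRAY[1, 5, 9])[cs % 3 + 1] AS session_cd
FROM    (
        SELECT  generate_series(e, l) AS cs
        FROM    (
                SELECT  session_start * 3 / 100 + CASE (session_start % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS e,
                        session_end * 3 / 100 + CASE (session_end % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS l
                FROM    (
                        VALUES
                        (200805, 200905)
                        ) r (session_start, session_end)
                ) q
        ) q
</pre>
<p>The ordinals run from <strong>6025</strong> to <strong>6028</strong>, so this should return four sessions: <strong>200805</strong>, <strong>200809</strong>, <strong>200901</strong> and <strong>200905</strong>.</p>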
<p>The overlapping parts of the session ranges from the two sets will yield two records for each session in the overlap. Of these two records, we need to take the one with the higher priority (that is, <strong>Civil</strong>) by using <code>DISTINCT ON</code> with additional sorting on the source (<strong>Civil</strong> goes first).</p>
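<p>A minimal sketch of this priority pick, on made-up rows. The source numbers follow the convention of the final query in this post (<strong>2</strong> for <strong>Civil</strong>, <strong>3</strong> for <strong>Registrar</strong>); the course, sessions and hours are invented for the illustration:</p>
<pre class="brush: sql">
-- For each (session, course) pair, keep only the row with the lowest source number
SELECT  DISTINCT ON (current_session, id) *
FROM    (
        VALUES
        (3, 1, 200805, 10),
        (2, 1, 200805, 20),
        (3, 1, 200809, 30)
        ) q (source, id, current_session, hours1)
ORDER BY
        current_session, id, source
</pre>
<p>For session <strong>200805</strong>, covered by both sources, only the <strong>Civil</strong> row (<strong>20</strong> hours) survives; for session <strong>200809</strong>, covered by the <strong>Registrar</strong> alone, its row is returned as is.</p>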
<p>Then we just join the relevant records to the subset of <code>students</code> that has no corresponding records in the in-house version.</p>
<p>Finally, we need to union this with the in-house recordset.<br />
<span id="more-4663"></span></p>
<h3>Pictures</h3>
<p>Here&#8217;s the same thing in pictures:</p>
<ol>
<li>
<p>Within each source, courses are defined by a single record holding the start and end session:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/lines.png" alt="" title="Lines" width="300" height="600" class="aligncenter size-full wp-image-4676 noborder" /></p>
</li>
<li>
<p>To find the superposition of the courses (with regard to priority), we split each range into a number of records, each corresponding to a single session, then combine these records into a single recordset, ordered by session and then by source:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/records.png" alt="" title="Records" width="300" height="450" class="aligncenter size-full wp-image-4682 noborder" /></p>
</li>
<li>
<p>From each session, we take the single record with the highest priority and use it to join with the students table:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/bricks.png" alt="" title="Bricks" width="300" height="600" class="aligncenter size-full wp-image-4675 noborder" /></p>
</li>
</ol>
<h3>Query</h3>
<p>Now, let&#8217;s create some sample tables and see how it works:</p>
<p><a href="#" onclick="xcollapse('X8397');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X8397" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_inhouse
        (
        course INT NOT NULL,
        student INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL,
        PRIMARY KEY (course, student)
        );

CREATE TABLE t_civil
        (
        id INT NOT NULL PRIMARY KEY,
        session_start INT NOT NULL, session_end INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL
        );

CREATE TABLE t_registrar
        (
        id INT NOT NULL PRIMARY KEY,
        session_start INT NOT NULL, session_end INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL
        );

CREATE TABLE t_student
        (
        id INT NOT NULL,
        course INT NOT NULL,
        session INT NOT NULL,
        PRIMARY KEY (id, course, session)
        );

SELECT  SETSEED(0.20100407);

INSERT
INTO    t_civil
SELECT  n,
        ((e / 3)) * 100 + (ARRAY[1, 5, 9])[e % 3 + 1],
        (((e + l) / 3)) * 100 + (ARRAY[1, 5, 9])[(e + l) % 3 + 1],
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n,
                6000 + CEILING(RANDOM() * 10)::INTEGER AS e,
                CEILING(RANDOM() * 20)::INTEGER AS l
        FROM    generate_series(1, 200) n
        ) q;

INSERT
INTO    t_registrar
SELECT  n,
        ((e / 3)) * 100 + (ARRAY[1, 5, 9])[e % 3 + 1],
        (((e + l) / 3)) * 100 + (ARRAY[1, 5, 9])[(e + l) % 3 + 1],
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n,
                6000 + CEILING(RANDOM() * 10)::INTEGER AS e,
                CEILING(RANDOM() * 20)::INTEGER AS l
        FROM    generate_series(1, 200) n
        ) q;

INSERT
INTO    t_inhouse
SELECT  n,
        s,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n, generate_series(1, 100) s
        FROM    (
                SELECT  n
                FROM    generate_series(1, 200) n
                ) q
        ) q;

INSERT
INTO    t_student
SELECT  i, c,
        ((s / 3)) * 100 + (ARRAY[1, 5, 9])[s % 3 + 1]
FROM    (
        SELECT  *, 6000 + CEILING(RANDOM() * 30)::INTEGER AS s,
                RANDOM() AS rnd
        FROM    (
                SELECT  *,
                        generate_series(1, 200) c
                FROM    (
                        SELECT  i
                        FROM    generate_series(1, 500) i
                        ) q
                ) q
        ) q
WHERE   rnd &lt; 0.1
</pre>
</div>
<p>The tables contain <strong>500</strong> students, random civil and registrar ranges for <strong>200</strong> courses, and <strong>20,000</strong> in-house records for the first <strong>100</strong> students.</p>
<p>And here&#8217;s the query (limited to the first <strong>10</strong> records for the sake of readability):</p>
<pre class="brush: sql">
SELECT  1 AS source, s.id AS student, i.course AS course, hours1, hours2, hours3
FROM    t_student s
JOIN    t_inhouse i
ON      i.student = s.id
        AND i.course = s.course
UNION ALL
SELECT  source, s.id, q.id, hours1, hours2, hours3
FROM    t_student s
JOIN    (
        SELECT  DISTINCT ON (current_session, id) *
        FROM    (
                SELECT  *,
                        ((cs / 3)) * 100 + (ARRAY[1, 5, 9])[cs % 3 + 1] AS current_session
                FROM    (
                        SELECT  *,
                                generate_series(e, l) AS cs
                        FROM    (
                                SELECT  *,
                                        session_start * 3 / 100 + CASE (session_start % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS e,
                                        session_end * 3 / 100 + CASE (session_end % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS l
                                FROM    (
                                        SELECT  2 AS source, *
                                        FROM    t_civil
                                        UNION ALL
                                        SELECT  3 AS source, *
                                        FROM    t_registrar
                                        ) q
                                ) q
                        ) q
                ) q
        ORDER BY
                current_session, id, source
        ) q
ON      q.current_session = s.session
        AND q.id = s.course
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    t_inhouse ih
        WHERE   ih.student = s.id
                AND ih.course = s.course
        )
ORDER BY
        student, course
LIMIT 10
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>source</th>
<th>student</th>
<th>course</th>
<th>hours1</th>
<th>hours2</th>
<th>hours3</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">21</td>
<td class="int4">25</td>
<td class="int4">44</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">18</td>
<td class="int4">49</td>
<td class="int4">49</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">26</td>
<td class="int4">12</td>
<td class="int4">26</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">27</td>
<td class="int4">32</td>
<td class="int4">38</td>
<td class="int4">22</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">39</td>
<td class="int4">39</td>
<td class="int4">27</td>
<td class="int4">36</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">44</td>
<td class="int4">32</td>
<td class="int4">26</td>
<td class="int4">30</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">51</td>
<td class="int4">3</td>
<td class="int4">21</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">54</td>
<td class="int4">7</td>
<td class="int4">10</td>
<td class="int4">34</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">57</td>
<td class="int4">46</td>
<td class="int4">34</td>
<td class="int4">10</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">63</td>
<td class="int4">50</td>
<td class="int4">18</td>
<td class="int4">7</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0008s (0.0968s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=2323.47..2323.50 rows=10 width=20)
  -&gt;  Sort  (cost=2323.47..2328.52 rows=2019 width=20)
        Sort Key: s.id, i.course
        -&gt;  Append  (cost=817.31..2279.84 rows=2019 width=20)
              -&gt;  Merge Join  (cost=817.31..1837.75 rows=1979 width=20)
                    Merge Cond: ((i.course = s.course) AND (i.student = s.id))
                    -&gt;  Index Scan using t_inhouse_pkey on t_inhouse i  (cost=0.00..825.94 rows=20000 width=20)
                    -&gt;  Sort  (cost=817.21..842.18 rows=9986 width=8)
                          Sort Key: s.course, s.id
                          -&gt;  Seq Scan on t_student s  (cost=0.00..153.86 rows=9986 width=8)
              -&gt;  Nested Loop Anti Join  (cost=362.94..421.91 rows=40 width=24)
                    -&gt;  Hash Join  (cost=362.94..404.44 rows=50 width=28)
                          Hash Cond: ((((((q.cs / 3) * 100) + (&#39;{1,5,9}&#39;::integer[])[((q.cs % 3) + 1)])) = s.session) AND (q.id = s.course))
                          -&gt;  Unique  (cost=59.29..62.29 rows=200 width=40)
                                -&gt;  Sort  (cost=59.29..60.29 rows=400 width=40)
                                      Sort Key: ((((q.cs / 3) * 100) + (&#39;{1,5,9}&#39;::integer[])[((q.cs % 3) + 1)])), q.id, q.source
                                      -&gt;  Subquery Scan q  (cost=0.00..42.00 rows=400 width=40)
                                            -&gt;  Result  (cost=0.00..33.00 rows=400 width=56)
                                                  -&gt;  Append  (cost=0.00..8.00 rows=400 width=56)
                                                        -&gt;  Seq Scan on t_civil  (cost=0.00..4.00 rows=200 width=56)
                                                        -&gt;  Seq Scan on t_registrar  (cost=0.00..4.00 rows=200 width=56)
                          -&gt;  Hash  (cost=153.86..153.86 rows=9986 width=12)
                                -&gt;  Seq Scan on t_student s  (cost=0.00..153.86 rows=9986 width=12)
                    -&gt;  Index Scan using t_inhouse_pkey on t_inhouse ih  (cost=0.00..0.35 rows=1 width=8)
                          Index Cond: ((ih.course = s.course) AND (ih.student = s.id))
</pre>
<p>The query returns the required hours and the source of these hours for each of the courses a student attended.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: using recursive functions in nested sets</title>
		<link>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/</link>
		<comments>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 20:00:45 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4528</guid>
		<description><![CDATA[In the previous article, I discussed a way to improve nested sets model in PostgreSQL. The approach shown in the article used an analytical function to filter all immediate children of a node in a recursive CTE. This allowed us to filter a node&#8217;s children on the level more efficiently than R-Tree or B-Tree approaches [...]]]></description>
			<content:encoded><![CDATA[<p>In the previous article, I discussed <a href="/2010/03/01/postgresql-nested-sets-and-r-tree/">a way to improve the nested sets model in <strong>PostgreSQL</strong></a>.</p>
<p>The approach shown in the article used an analytical function to filter all immediate children of a node in a recursive <strong>CTE</strong>.</p>
<p>This allowed us to filter a node&#8217;s children by level more efficiently than the <strong>R-Tree</strong> or <strong>B-Tree</strong> approaches do (since they rely on <code>COUNT(*)</code>).</p>
<p>That solution was pure <strong>SQL</strong> and it was quite fast, but not optimal.</p>
<p>The drawback of that solution is that it still needs to fetch all children of a node to apply the analytic function to them. This can take a long time for nodes at the top of the hierarchy. And since the top of the hierarchy is what is usually shown on the start page, it would be very nice to improve this query a little further.</p>
<p>We can do it by creating and using a simple recursive <strong>SQL</strong> function. This function does not even require <strong>PL/pgSQL</strong> to be enabled.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4528"></span><br />
<a href="#" onclick="xcollapse('X9881');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X9881" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_hierarchy (
        id INT NOT NULL,
        parent INT NOT NULL,
        lft INT NOT NULL,
        rgt INT NOT NULL,
        data VARCHAR(100) NOT NULL,
        stuffing VARCHAR(100) NOT NULL
);

INSERT
INTO    t_hierarchy
WITH RECURSIVE
        ini AS
        (
        SELECT  8 AS level, 5 AS children
        ),
        range AS
        (
        SELECT  level, children,
                (
                SELECT  SUM(POW(children, n)::INTEGER * ((n &lt; level)::INTEGER + 1))
                FROM    generate_series(level, 0, -1) n
                ) width
        FROM    ini
        ),
        q AS
        (
        SELECT  s AS id, 0 AS parent, level, children,
                1 + width * (s - 1) AS lft,
                1 + width * s - 1 AS rgt,
                width / children AS width
        FROM    (
                SELECT  r.*, generate_series(1, children) s
                FROM    range r
                ) q2
        UNION ALL
        SELECT  id * children + position, id, level - 1, children,
                1 + lft + width * (position - 1),
                1 + lft + width * position - 1,
                width / children
        FROM    (
                SELECT  generate_series(1, children) AS position, q.*
                FROM    q
                ) q2
        WHERE   level &gt; 0
        )
SELECT  id, parent, lft, rgt, &#039;Value &#039; || id, RPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    q;

ALTER TABLE t_hierarchy ADD CONSTRAINT pk_hierarchy_id PRIMARY KEY (id);
CREATE UNIQUE INDEX ux_hierarchy_lft ON t_hierarchy (lft);
CREATE UNIQUE INDEX ux_hierarchy_rgt ON t_hierarchy (rgt);
CREATE INDEX ix_hierarchy_parent ON t_hierarchy (parent);
CREATE INDEX ix_hierarchy_sets ON t_hierarchy USING GIST(POLYGON(BOX(POINT(-1, lft), POINT(1, rgt))));
</pre>
</div>
<p>If we run the query introduced in the previous article to fetch all children up to level <strong>2</strong> from a top-level node, we get the following results:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, lft, rgt, 1 AS lvl
        FROM    t_hierarchy
        WHERE   id = 1
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt, lvl + 1
                FROM    q
                JOIN    t_hierarchy hc
                ON      hc.lft &gt; q.lft
                        AND hc.lft &lt; q.rgt
                WHERE   lvl &lt;= 2
                ORDER BY
                        MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft), hc.lft
                ) q2
        )
SELECT  *
FROM    q
</pre>
<p><a href="#" onclick="xcollapse('X3758');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X3758" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>lvl</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">585937</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">2</td>
<td class="int4">117188</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">117189</td>
<td class="int4">234375</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">234376</td>
<td class="int4">351562</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">351563</td>
<td class="int4">468749</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">468750</td>
<td class="int4">585936</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">31</td>
<td class="int4">3</td>
<td class="int4">23439</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">32</td>
<td class="int4">23440</td>
<td class="int4">46876</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">46877</td>
<td class="int4">70313</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">34</td>
<td class="int4">70314</td>
<td class="int4">93750</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">35</td>
<td class="int4">93751</td>
<td class="int4">117187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">36</td>
<td class="int4">117190</td>
<td class="int4">140626</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">37</td>
<td class="int4">140627</td>
<td class="int4">164063</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">38</td>
<td class="int4">164064</td>
<td class="int4">187500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">39</td>
<td class="int4">187501</td>
<td class="int4">210937</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">40</td>
<td class="int4">210938</td>
<td class="int4">234374</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">41</td>
<td class="int4">234377</td>
<td class="int4">257813</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">43</td>
<td class="int4">281251</td>
<td class="int4">304687</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">44</td>
<td class="int4">304688</td>
<td class="int4">328124</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">45</td>
<td class="int4">328125</td>
<td class="int4">351561</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">46</td>
<td class="int4">351564</td>
<td class="int4">375000</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">47</td>
<td class="int4">375001</td>
<td class="int4">398437</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">48</td>
<td class="int4">398438</td>
<td class="int4">421874</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">49</td>
<td class="int4">421875</td>
<td class="int4">445311</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">50</td>
<td class="int4">445312</td>
<td class="int4">468748</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">51</td>
<td class="int4">468751</td>
<td class="int4">492187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">52</td>
<td class="int4">492188</td>
<td class="int4">515624</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">53</td>
<td class="int4">515625</td>
<td class="int4">539061</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">54</td>
<td class="int4">539062</td>
<td class="int4">562498</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">55</td>
<td class="int4">562499</td>
<td class="int4">585935</td>
<td class="int4">3</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0169s (14.7499s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on q  (cost=3923687.62..4086447.04 rows=8137971 width=16)
  CTE q
    -&gt;  Recursive Union  (cost=0.00..3923687.62 rows=8137971 width=16)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy  (cost=0.00..8.54 rows=1 width=12)
                Index Cond: (id = 1)
          -&gt;  Subquery Scan q2  (cost=363885.01..376091.97 rows=813797 width=16)
                -&gt;  Unique  (cost=363885.01..367954.00 rows=813797 width=20)
                      -&gt;  Sort  (cost=363885.01..365919.50 rows=813797 width=20)
                            Sort Key: (max(hc.rgt) OVER (?)), hc.lft
                            -&gt;  WindowAgg  (cost=265682.87..283993.30 rows=813797 width=20)
                                  -&gt;  Sort  (cost=265682.87..267717.36 rows=813797 width=20)
                                        Sort Key: q.id, hc.lft
                                        -&gt;  Nested Loop  (cost=5335.66..185791.16 rows=813797 width=20)
                                              -&gt;  WorkTable Scan on q  (cost=0.00..0.22 rows=3 width=16)
                                                    Filter: (lvl &lt;= 2)
                                              -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5335.66..57861.32 rows=271266 width=12)
                                                    Recheck Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
                                                    -&gt;  Bitmap Index Scan on ux_hierarchy_lft  (cost=0.00..5267.84 rows=271266 width=0)
                                                          Index Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
</pre>
</div>
<p>This runs for almost <strong>15 seconds</strong>, which is far too long.</p>
<p>This can be improved by exploiting these two properties of the nested sets model:</p>
<ol>
<li>
<p>The first immediate child of a node is the node with the smallest <code>lft</code> greater than the node&#8217;s own <code>lft</code> (and less than the node&#8217;s <code>rgt</code>)</p>
</li>
<li>
<p>The next sibling of a node is the node with the smallest <code>lft</code> greater than the node&#8217;s <code>rgt</code> (and less than the parent&#8217;s <code>rgt</code>)</p>
</li>
</ol>
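<p>As a rough illustration of these two properties (a sketch against the sample table above, not the final solution), each lookup can be written as an ordered range search on <code>lft</code> with <code>LIMIT 1</code>. The aliases <code>hn</code> and <code>hs</code> and the node ids <strong>1</strong> and <strong>6</strong> are chosen here only for the example:</p>
<pre class="brush: sql">
-- Property 1: the first child of node 1 is the node with the
-- smallest lft inside the parent's (lft, rgt) range
SELECT  hc.id, hc.lft, hc.rgt
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft &gt; hp.lft
        AND hc.lft &lt; hp.rgt
WHERE   hp.id = 1
ORDER BY
        hc.lft
LIMIT 1;

-- Property 2: the next sibling of node 6 is the node with the
-- smallest lft that is greater than node 6's rgt but still inside
-- the parent's range. In the sample table the top-level nodes have
-- parent = 0, which matches no row, so for them this returns nothing.
SELECT  hs.id, hs.lft, hs.rgt
FROM    t_hierarchy hn
JOIN    t_hierarchy hp
ON      hp.id = hn.parent
JOIN    t_hierarchy hs
ON      hs.lft &gt; hn.rgt
        AND hs.lft &lt; hp.rgt
WHERE   hn.id = 6
ORDER BY
        hs.lft
LIMIT 1;
</pre>
<p>Judging by the <code>lft</code> and <code>rgt</code> values in the output above, the first query should return node <strong>6</strong> and the second one node <strong>7</strong>. For the last child of a parent, the second query returns no rows, which is the condition that ends the iteration over siblings.</p>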
<p>If we recursively traverse the nodes this way, we can find the first child as well as all of its siblings. This is enough to build a hierarchy, and the level filter can be implemented simply by limiting the recursion depth.</p>
<p>However, recursive <strong>CTE</strong>s only allow one level of recursion. We cannot nest the <code>WITH</code> clause.</p>
<p>To work around that, we can use <strong>PostgreSQL</strong>&#8217;s ability to run set-returning functions recursively. We will use the function-based recursion to iterate along the parent-child axis, and the <strong>CTE</strong>-based recursion to iterate along the sibling axis.</p>
<p>We need to create a function that takes a node&#8217;s id as input and returns a set of its children as output, with the function recursively applied to each of the children. To find the set of children, we will implement a recursive <strong>CTE</strong> that finds the first child in the anchor part and the next sibling in the recursive part.</p>
<p>Here&#8217;s the function:</p>
<pre class="brush: sql">
CREATE OR REPLACE FUNCTION fn_get_children(id INT, level INT)
RETURNS SETOF INT[] AS
$$
        WITH    RECURSIVE q AS
                (
                SELECT  (hc).id, (hc).lft, (hc).rgt, prgt
                FROM    (
                        SELECT  (
                                SELECT  hc
                                FROM    t_hierarchy hc
                                WHERE   hc.lft &gt; hp.lft
                                        AND hc.lft &lt; hp.rgt
                                ORDER BY
                                        hc.lft
                                LIMIT 1
                                ) hc,
                                rgt AS prgt
                        FROM    t_hierarchy hp
                        WHERE   hp.id = $1
                        ) q2
                UNION ALL
                SELECT  (hc).id, (hc).lft, (hc).rgt, prgt
                FROM    (
                        SELECT  (
                                SELECT  hc
                                FROM    t_hierarchy hc
                                WHERE   hc.lft &gt; q.rgt
                                        AND hc.lft &lt; q.prgt
                                ORDER BY
                                        hc.lft
                                LIMIT 1
                                ) hc,
                                prgt
                        FROM    q
                        WHERE   q.lft IS NOT NULL
                        ) q2
                )
        SELECT  CASE which
                WHEN 1 THEN ARRAY[q.id, $2]
                ELSE fn_get_children(q.id, $2 - 1)
                END
        FROM    (
                VALUES (1), (2)
                ) vals(which)
        CROSS JOIN
                q
        WHERE   q.id IS NOT NULL
                AND $2 &gt; 0
        ORDER BY
                id, which;
$$ LANGUAGE sql;
</pre>
<p>The function accepts a node&#8217;s <code>id</code> and a level as input, and returns a set of arrays, each holding the id of one of the node&#8217;s descendants and its level. The level returned by the function decreases with depth: it represents not the level as such, but the number of levels left before the bottom set by the filter is reached. But since the initial level is set by the user, it is easy to convert it to the actual level.</p>
<p>Let&#8217;s run the function:</p>
<pre class="brush: sql">
SELECT  c[1], 3 - c[2]
FROM    fn_get_children(1, 2) c;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>c</th>
<th>?column?</th>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">34</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">35</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">31</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">32</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">38</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">40</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">39</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">36</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">37</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">41</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">43</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">44</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">45</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">47</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">50</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">46</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">48</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">49</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">53</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">55</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">51</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">52</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">54</td>
<td class="int4">2</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0023s (0.0658s)</td>
</tr>
</table>
</div>
<pre>
Function Scan on fn_get_children c  (cost=0.00..262.50 rows=1000 width=32)
</pre>
<p>As we can see, the function returned all children and grandchildren of node <strong>1</strong> along with their levels, and did it in only <strong>65 ms</strong>.</p>
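<p>If the other columns of the returned nodes are needed, the function&#8217;s output can be joined back to the table. This is only a minimal sketch, assuming the same call as above; the alias <code>f(c)</code> and the column name <code>lvl</code> are names picked for this example:</p>
<pre class="brush: sql">
-- c[1] holds the node id, c[2] the number of levels left,
-- so with a depth limit of 2 the actual level is 3 - c[2]
SELECT  h.id, h.parent, h.lft, h.rgt, 3 - c[2] AS lvl
FROM    fn_get_children(1, 2) AS f(c)
JOIN    t_hierarchy h
ON      h.id = c[1];
</pre>
<p>More generally, for a call with a depth limit of <code>n</code>, the level relative to the starting node is <code>n + 1 - c[2]</code>. Note that the join does not guarantee the order of the output, so an explicit <code>ORDER BY</code> may be needed.</p>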
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: nested sets and R-Tree</title>
		<link>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/</link>
		<comments>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 20:00:04 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4506</guid>
		<description><![CDATA[A feedback on one of my previous articles comparing adjacency list and nested sets models for PostgreSQL. Jay writes: In your series on adjacency lists vs nested sets, you discuss geometric types and R-Tree indexes in MySQL, but you don&#8217;t discuss them when discussing the same subject with PostgreSQL, which also has geometric types and [...]]]></description>
			<content:encoded><![CDATA[<p>Feedback on one of my previous articles comparing <a href="/2009/09/24/adjacency-list-vs-nested-sets-postgresql/">adjacency list and nested sets models for <strong>PostgreSQL</strong></a>.</p>
<p><strong>Jay</strong> writes:</p>
<blockquote>
<p>In your series on adjacency lists vs nested sets, you discuss <a href="/2009/09/29/adjacency-list-vs-nested-sets-mysql/">geometric types and <strong>R-Tree</strong> indexes in <strong>MySQL</strong></a>, but you don&#8217;t discuss them when discussing the same subject with <strong>PostgreSQL</strong>, which also has geometric types and <strong>R-Tree</strong> indexing (mostly available through <a href="http://www.postgresql.org/docs/8.4/static/gist-examples.html"><strong>GiST</strong> indexes</a>).</p>
<p>To make it simple I added the following line after the data insertion part of the script at Nested Sets &#8211; Postgresql:</p>
<pre class="brush: sql">
ALTER TABLE t_hierarchy ADD COLUMN sets POLYGON;
UPDATE t_hierarchy SET sets = POLYGON(BOX(POINT(-1,lft), POINT(1, rgt)));
</pre>
<p>It needed to be a <code>POLYGON</code> instead of a <code>BOX</code> since there is a <code>@>(POLYGON,POLYGON)</code> function but no <code>@>(BOX,BOX)</code> function, and the polygon was cast from the box to create the rectangle shape required.</p>
<p>It outperforms the adjacency list on <q>all descendants</q>; outperforms it on <q>all ancestors</q> (not by much); performs reasonably well on <q>all descendants up to a certain level</q> on items with few descendants (e. g. <strong>31415</strong>) and badly on items with many descendants (e. g. <strong>42</strong>).</p>
<p>It still completes in less than <strong>20</strong> seconds though, which is an improvement over <strong>1</strong> minute.</p>
</blockquote>
<p><strong>PostgreSQL</strong> does indeed support <strong>R-Tree</strong> indexes (through the <strong>GiST</strong> interface), and they can be used to improve the efficiency of the nested sets model.</p>
<p>Let&#8217;s create a sample table and try some of the queries that <strong>Jay</strong> proposed:<br />
<span id="more-4506"></span><br />
<a href="#" onclick="xcollapse('X9066');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X9066" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_hierarchy (
        id INT NOT NULL,
        parent INT NOT NULL,
        lft INT NOT NULL,
        rgt INT NOT NULL,
        data VARCHAR(100) NOT NULL,
        stuffing VARCHAR(100) NOT NULL
);

INSERT
INTO    t_hierarchy
WITH RECURSIVE
        ini AS
        (
        SELECT  8 AS level, 5 AS children
        ),
        range AS
        (
        SELECT  level, children,
                (
                SELECT  SUM(POW(children, n)::INTEGER * ((n &lt; level)::INTEGER + 1))
                FROM    generate_series(level, 0, -1) n
                ) width
        FROM    ini
        ),
        q AS
        (
        SELECT  s AS id, 0 AS parent, level, children,
                1 + width * (s - 1) AS lft,
                1 + width * s - 1 AS rgt,
                width / children AS width
        FROM    (
                SELECT  r.*, generate_series(1, children) s
                FROM    range r
                ) q2
        UNION ALL
        SELECT  id * children + position, id, level - 1, children,
                1 + lft + width * (position - 1),
                1 + lft + width * position - 1,
                width / children
        FROM    (
                SELECT  generate_series(1, children) AS position, q.*
                FROM    q
                ) q2
        WHERE   level &gt; 0
        )
SELECT  id, parent, lft, rgt, &#039;Value &#039; || id, RPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    q;

ALTER TABLE t_hierarchy ADD CONSTRAINT pk_hierarchy_id PRIMARY KEY (id);
CREATE INDEX ix_hierarchy_lft ON t_hierarchy (lft);
CREATE INDEX ix_hierarchy_rgt ON t_hierarchy (rgt);
CREATE INDEX ix_hierarchy_parent ON t_hierarchy (parent);
CREATE INDEX ix_hierarchy_sets ON t_hierarchy USING GIST(POLYGON(BOX(POINT(-1, lft), POINT(1, rgt))));
</pre>
</div>
<p>To make the management of the table easier, I didn&#8217;t create an additional column with the geometric representation of the nested sets, but instead just defined an index on a derived expression, so that updating the <code>lft</code> and <code>rgt</code> columns is enough to update the set.</p>
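<p>One thing to keep in mind with this design: the planner considers an expression index only when the query contains the same expression as the index definition. A minimal sketch of such a condition, assuming the bounds of node <strong>42</strong> (<code>lft = 257814</code>, <code>rgt = 281250</code> in this sample table) are hard-coded just for the illustration:</p>
<pre class="brush: sql">
-- The left-hand side repeats the indexed expression verbatim, so
-- ix_hierarchy_sets is a candidate for this search. A plain condition
-- such as "lft &gt; 257814 AND rgt &lt; 281250" would not match the index.
EXPLAIN
SELECT  id
FROM    t_hierarchy
WHERE   POLYGON(BOX(POINT(-1, lft), POINT(1, rgt))) &lt;@ POLYGON(BOX(POINT(-1, 257814), POINT(1, 281250)));
</pre>
<p>Whether the index is actually chosen still depends on the planner&#8217;s cost estimates, so the resulting plan may differ between installations.</p>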
<p>Now, let&#8217;s see how these queries perform.</p>
<h3>All descendants</h3>
<pre class="brush: sql">
SELECT  SUM(LENGTH(hc.stuffing)), COUNT(*)
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
</pre>
<p><a href="#" onclick="xcollapse('X3393');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X3393" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
<th>count</th>
</tr>
<tr>
<td class="int8">1953100</td>
<td class="int8">19531</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0003s (0.2139s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=8253.58..8253.60 rows=1 width=101)
  -&gt;  Nested Loop  (cost=136.32..8241.37 rows=2441 width=101)
        -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
              Index Cond: (id = 42)
        -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=136.32..8129.10 rows=2441 width=109)
              Recheck Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
              -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                    Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
</pre>
</div>
<p>Quite fast, <strong>213 ms</strong>.</p>
<h3>All ancestors</h3>
<pre class="brush: sql">
SELECT  hc.id, hc.lft, hc.rgt, hc.parent
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) @&gt; POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
</pre>
<p><a href="#" onclick="xcollapse('X4471');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4471" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>parent</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">585937</td>
<td class="int4">0</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">234376</td>
<td class="int4">351562</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">8</td>
</tr>
<tr class="statusbar">
<td colspan="100">3 rows fetched in 0.0007s (0.0127s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=136.32..8241.37 rows=2441 width=16)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 42)
  -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=136.32..8129.10 rows=2441 width=16)
        Recheck Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) @&gt; polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
        -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
              Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) @&gt; polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
</pre>
</div>
<p>Extremely fast: under <strong>13 ms</strong>.</p>
<h3>All descendants up to a certain level</h3>
<pre class="brush: sql">
SELECT  hc.id, hc.lft, hc.rgt, hc.parent
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hcp
        WHERE   POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hcp.lft), POINT(1, hcp.rgt)))
        ) -
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hpp
        WHERE   POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hpp.lft), POINT(1, hpp.rgt)))
        ) &lt;= 2
</pre>
<p><a href="#" onclick="xcollapse('X2169');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X2169" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>parent</th>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="int4">215</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0039s (20.2523s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=0.03..40113216.41 rows=814 width=16)
  Join Filter: (((SubPlan 1) - (SubPlan 2)) &lt;= 2)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 42)
  -&gt;  Index Scan using ix_hierarchy_sets on t_hierarchy hc  (cost=0.03..9692.12 rows=2441 width=16)
        Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
  SubPlan 1
    -&gt;  Aggregate  (cost=8214.53..8214.54 rows=1 width=0)
          -&gt;  Bitmap Heap Scan on t_hierarchy hcp  (cost=136.32..8208.43 rows=2441 width=0)
                Recheck Cond: (polygon(box(point((-1)::double precision, ($0)::double precision), point(1::double precision, ($1)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
                -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                      Index Cond: (polygon(box(point((-1)::double precision, ($0)::double precision), point(1::double precision, ($1)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
  SubPlan 2
    -&gt;  Aggregate  (cost=8214.53..8214.54 rows=1 width=0)
          -&gt;  Bitmap Heap Scan on t_hierarchy hpp  (cost=136.32..8208.43 rows=2441 width=0)
                Recheck Cond: (polygon(box(point((-1)::double precision, ($2)::double precision), point(1::double precision, ($3)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
                -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                      Index Cond: (polygon(box(point((-1)::double precision, ($2)::double precision), point(1::double precision, ($3)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
</pre>
</div>
<p>This, exactly as was mentioned by <strong>Jay</strong>, is much faster than using a <strong>B-Tree</strong> index but still too slow: <strong>20</strong> seconds.</p>
<h3>Analysis</h3>
<p>The nested sets model, improved by using the <strong>R-Tree</strong> indexes, provides a way to tell if two records are in the same ancestry chain.</p>
<p>However, even with the <strong>R-Tree</strong>, the model provides no simple means to tell how deeply a record is nested.</p>
<p>To check it, an <strong>R-Tree</strong> index scan has to be made which returns all of the record&#8217;s ancestors, and then the number of the ancestors has to be compared with that of the parent node.</p>
<p>For a record with lots of ancestors (which was the case for the record <strong>42</strong> we used in the test queries), this means that thousands of records should be checked in a nested loop, out of which only a dozen will be returned.</p>
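<p>As a rough sketch of what such a depth check involves (written here with plain range conditions rather than the <strong>R-Tree</strong> operators, and counting the record itself as its own ancestor), the nesting level of a single record can be computed by counting the records whose interval encloses it:</p>
<pre class="brush: sql">
-- Illustrative only: nesting level of record 42 in the nested sets model.
-- Every ancestor (and the record itself) has lft &lt;= hc.lft and rgt &gt;= hc.rgt.
SELECT  COUNT(*) AS lvl
FROM    t_hierarchy hc
JOIN    t_hierarchy hp
ON      hp.lft &lt;= hc.lft
        AND hp.rgt &gt;= hc.rgt
WHERE   hc.id = 42
</pre>
<p>Filtering descendants on level means running a count like this for every candidate record, which is where the time goes.</p>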
<p>Ironically, in real-world models this type of query is the one used most often, and it is used precisely against the records with lots of descendants.</p>
<p>Usually, when hierarchical data are stored in a database, they are presented to a user in the form of a tree view. When the user opens the catalog, the first-level entries are shown; when the user clicks on the <q>expand</q> button of an entry, all immediate children of the entry should be shown.</p>
<p>Since users usually start browsing from the beginning, clicking the expand buttons on the first-level or second-level entries is what happens most often. And, unfortunately, it takes the most time to execute these queries.</p>
<p>The adjacency list model solves this problem efficiently, since fetching all immediate children requires a single index scan. This makes showing the immediate children extremely fast.</p>
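<p>For comparison, here is a minimal sketch of the adjacency list lookup. It assumes the table also stores each record&#8217;s parent in a column named <code>parent</code> with an index on it; the column name is an assumption made for illustration:</p>
<pre class="brush: sql">
-- Hypothetical column name: parent (id of the parent record), indexed.
-- A single index scan on parent returns all immediate children.
SELECT  id, lft, rgt
FROM    t_hierarchy
WHERE   parent = 42
</pre>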
<p>A user can also click on <q>expand all</q> which should just return all children of the given entry.</p>
<p>However, clicking on <q>expand all</q> on a high-level entry will return too many records, so the time to download them or render them in the GUI will be much longer than that required to fetch them out of the table. A properly written GUI usually limits the level of the records returned so that it remains responsive, which, in its turn, implies the same problem of filtering on level.</p>
<p>The low-level entries (for which it makes sense to implement <q>expand all</q> without any limitations) can be queried for their descendants with the <strong>R-Tree</strong> query in the nested sets model or with a recursive query in the adjacency list model almost equally fast, since low-level entries contain few records.</p>
<p>The same applies to selecting all ancestors. Despite the fact that the nested sets model slightly outperforms the adjacency list model on this type of query, the absolute numbers are very small and the times that both queries take are almost imperceptible to the naked eye. A hierarchy is seldom more than a dozen levels deep, and fetching each ancestor even with a recursive query requires but one unique index scan per ancestor.</p>
<p>However, one may still be forced to use the nested sets model. This may be the way an ORM stores its data in the database; a heavily used legacy schema too old and scary to touch; or just some obscure model which mostly requires fetching all descendants fast with an occasional need to filter on the level.</p>
<p>Here are some methods to deal with it.</p>
<h3>Analytic functions</h3>
<p>Though there is no efficient way to filter all descendants on the level, there still is a way to fetch all <em>immediate</em> children of a record.</p>
<p>If we select all records within <code>lft</code> and <code>rgt</code> of a given entry and order them by <code>lft</code>, the first record will be the first immediate child of the entry.</p>
<p>All descendants of the first child will be returned before the second child and have <code>rgt</code> less than that of the first child.</p>
<p>This means that if we record the <code>MAX(rgt)</code> fetched so far, it will be that of the last immediate child of the entry fetched so far:</p>
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>MAX(rgt)</th>
</tr>
<tr>
<td>  2</td>
<td>2</td>
<td>11</td>
<td>11</td>
</tr>
<tr>
<td>    3</td>
<td>3</td>
<td>4</td>
<td>11</td>
</tr>
<tr>
<td>    4</td>
<td>5</td>
<td>8</td>
<td>11</td>
</tr>
<tr>
<td>      5</td>
<td>6</td>
<td>7</td>
<td>11</td>
</tr>
<tr>
<td>    6</td>
<td>9</td>
<td>10</td>
<td>11</td>
</tr>
<tr>
<td>  7</td>
<td>12</td>
<td>15</td>
<td>15</td>
</tr>
<tr>
<td>    8</td>
<td>13</td>
<td>14</td>
<td>15</td>
</tr>
</table>
<p>This means that each value of <code>MAX(rgt)</code> will correspond to exactly one immediate child; and the first entry in the recordset holding the value of <code>MAX(rgt)</code> will be that first child.</p>
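<p>The running maximum from the table above can be reproduced on a tiny inline recordset (the values are taken from the table; the name <code>sample</code> is just for illustration):</p>
<pre class="brush: sql">
WITH    sample (id, lft, rgt) AS
        (
        VALUES
        (2, 2, 11), (3, 3, 4), (4, 5, 8), (5, 6, 7),
        (6, 9, 10), (7, 12, 15), (8, 13, 14)
        )
-- With ORDER BY and no explicit frame, the window covers all rows
-- from the start up to the current one, so MAX is a running maximum.
SELECT  id, lft, rgt,
        MAX(rgt) OVER (ORDER BY lft) AS max_rgt
FROM    sample
ORDER BY
        lft
</pre>
<p>Only records <strong>2</strong> and <strong>7</strong> are the first to hold their value of <code>max_rgt</code>, and these are exactly the immediate children.</p>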
<p>Several methods exist to <a href="/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/">select records holding group-wise maximum in PostgreSQL</a>. In this case, it will be best to use <strong>PostgreSQL</strong>&#8217;s <code>DISTINCT ON</code>.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (MAX(hc.rgt) OVER (ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft &gt; hp.lft
        AND hc.lft &lt; hp.rgt
WHERE   hp.id = 42
ORDER BY
        MAX(hc.rgt) OVER (ORDER BY hc.lft), hc.lft
</pre>
<p><a href="#" onclick="xcollapse('X5039');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X5039" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
</tr>
<tr class="statusbar">
<td colspan="100">5 rows fetched in 0.0008s (0.1642s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=116073.33..117429.66 rows=271267 width=12)
  -&gt;  Sort  (cost=116073.33..116751.50 rows=271267 width=12)
        Sort Key: (max(hc.rgt) OVER (?)), hc.lft
        -&gt;  WindowAgg  (cost=86845.19..91592.36 rows=271267 width=12)
              -&gt;  Sort  (cost=86845.19..87523.35 rows=271267 width=12)
                    Sort Key: hc.lft
                    -&gt;  Nested Loop  (cost=5761.00..62364.22 rows=271267 width=12)
                          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
                                Index Cond: (id = 42)
                          -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5761.00..58286.67 rows=271267 width=12)
                                Recheck Cond: ((hc.lft &gt; hp.lft) AND (hc.lft &lt; hp.rgt))
                                -&gt;  Bitmap Index Scan on ix_hierarchy_lft  (cost=0.00..5693.19 rows=271267 width=0)
                                      Index Cond: ((hc.lft &gt; hp.lft) AND (hc.lft &lt; hp.rgt))
</pre>
</div>
<p>This is reasonably fast (only <strong>160 ms</strong>).</p>
<p>Using the recursive queries introduced in <strong>PostgreSQL 8.4</strong>, this approach can be extended to select the descendants up to any level (provided as a parameter to the query).</p>
<p>Here&#8217;s the query to select all children and grandchildren:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, lft, rgt, 1 AS lvl
        FROM    t_hierarchy
        WHERE   id = 42
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt, lvl + 1
                FROM    q
                JOIN    t_hierarchy hc
                ON      hc.lft &gt; q.lft
                        AND hc.lft &lt; q.rgt
                WHERE   lvl &lt;= 2
                ORDER BY
                        MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft), hc.lft
                ) q2
        )
SELECT  *
FROM    q
</pre>
<p><a href="#" onclick="xcollapse('X4763');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4763" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>lvl</th>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="int4">3</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0042s (0.4342s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on q  (cost=3923702.09..4086462.51 rows=8138021 width=16)
  CTE q
    -&gt;  Recursive Union  (cost=0.00..3923702.09 rows=8138021 width=16)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy  (cost=0.00..8.54 rows=1 width=12)
                Index Cond: (id = 42)
          -&gt;  Subquery Scan q2  (cost=363886.28..376093.31 rows=813802 width=16)
                -&gt;  Unique  (cost=363886.28..367955.29 rows=813802 width=20)
                      -&gt;  Sort  (cost=363886.28..365920.79 rows=813802 width=20)
                            Sort Key: (max(hc.rgt) OVER (?)), hc.lft
                            -&gt;  WindowAgg  (cost=265683.50..283994.05 rows=813802 width=20)
                                  -&gt;  Sort  (cost=265683.50..267718.01 rows=813802 width=20)
                                        Sort Key: q.id, hc.lft
                                        -&gt;  Nested Loop  (cost=5335.67..185791.26 rows=813802 width=20)
                                              -&gt;  WorkTable Scan on q  (cost=0.00..0.22 rows=3 width=16)
                                                    Filter: (lvl &lt;= 2)
                                              -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5335.67..57861.34 rows=271267 width=12)
                                                    Recheck Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
                                                    -&gt;  Bitmap Index Scan on ix_hierarchy_lft  (cost=0.00..5267.85 rows=271267 width=0)
                                                          Index Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
</pre>
</div>
<p>This is also reasonably fast: only <strong>434 ms</strong>. It is slower than the same adjacency list query (which completes in several milliseconds), but still much faster than the <strong>R-Tree</strong> solution and, of course, than the least efficient <strong>B-Tree</strong> solution involving <code>COUNT(*)</code>. It can ease your life if you have to deal with a nested sets model.</p>
<p><strong>To be continued.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Six degrees of separation</title>
		<link>http://explainextended.com/2010/02/27/six-degrees-of-separation/</link>
		<comments>http://explainextended.com/2010/02/27/six-degrees-of-separation/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 20:00:45 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4477</guid>
		<description><![CDATA[Answering questions asked on the site. Kathy asks: I am developing a social network site in PostgreSQL and want to find out if two people are no more than 6 friends apart. If your site grows popular, most probably, they are not. But we better check. On most social networks, friendship is a symmetric relationship [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Kathy</strong> asks:</p>
<blockquote>
<p>I am developing a social network site in <strong>PostgreSQL</strong> and want to find out if two people are no more than <strong>6</strong> friends apart.</p>
</blockquote>
<p>If your site grows popular, most probably, <a href="http://en.wikipedia.org/wiki/Six_degrees_of_separation">they are not</a>. But we&#8217;d better check.</p>
<p>On most social networks, friendship is a symmetric relationship (however, <a href="http://livejournal.com">LiveJournal</a> is a notable exception). This means that if Alice is a friend to Bob, then Bob is a friend to Alice as well.</p>
<p>The friendship relationship is best stored in a many-to-many link table with a <code>PRIMARY KEY</code> on both link fields and an additional check constraint: the friend with the lesser id should be stored in the first column. This is to avoid storing a relationship twice: the <code>PRIMARY KEY</code> won&#8217;t be violated if the same record is inserted with the columns swapped, but the check constraint will. The check constraint will also forbid storing a person as a friend of themselves.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4477"></span><br />
<a href="#" onclick="xcollapse('X5926');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X5926" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE friends (
        orestes INT NOT NULL,
        pylades INT NOT NULL,
        CHECK (orestes &lt; pylades)
);

SELECT  SETSEED(0.20100227);

INSERT
INTO    friends
SELECT  o, p
FROM    (
        SELECT  o, SUM(FLOOR(RANDOM() * 100000) + 1) OVER (PARTITION BY o ORDER BY n) AS p
        FROM    (
                SELECT  o, generate_series(1, 20) n
                FROM    generate_series(1, 1000000) o
                ) q
        ) q2
WHERE   o &lt; p
        AND p &lt;= 1000000;

ALTER TABLE friends ADD CONSTRAINT pk_friends_op PRIMARY KEY (orestes, pylades);

CREATE UNIQUE INDEX ux_friends_po ON friends (pylades, orestes);
</pre>
</div>
<p>This table stores records for <strong>1,000,000</strong> people having <strong>20</strong> friends each on average. The first column is named <code>orestes</code> and the second one <code>pylades</code>.</p>
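<p>With the check constraint in place, an application adding a friendship has to put the two ids in order first. One possible way to do that (a sketch, not part of the sample data script) is to let <code>LEAST</code> and <code>GREATEST</code> sort the pair; the ids below are arbitrary:</p>
<pre class="brush: sql">
-- The two ids may arrive in any order; the lesser one goes into orestes.
-- This only shows the normalized pair; it does not modify the table.
SELECT  LEAST(a, b) AS orestes,
        GREATEST(a, b) AS pylades
FROM    (
        VALUES  (654321, 123456)
        ) v (a, b)
</pre>
<p>The resulting pair can then be used in an <code>INSERT</code> into <code>friends</code> regardless of which of the two people initiated the friendship.</p>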
<p>With the new <strong>PostgreSQL 8.4</strong>, it is easy to write a recursive query that traverses the relationship graph up to the given level and stops on the first match.</p>
<p>However, each recursion step requires a join, and as the number of records in the input recordset for the recursion grows with the level, the joins become less and less efficient. The number of records grows exponentially, and on level <strong>6</strong> there will be about <strong>20 ^ 6 = 64,000,000</strong> records on input. This is just too much for a join with a table of <strong>20,000,000</strong> records.</p>
<p>As the chain length increases, the tree diverges, the number of the records grows and it becomes more and more costly to join them with the table.</p>
<p>To work around this, we should use Bogdan the tunnel builder&#8217;s algorithm.</p>
<blockquote><p>The British and French governments submit a tender to build a tunnel under the Channel. Many companies apply, all demanding years of time and billions of pounds, so their offers are refused.</p>
<p>One day, Bogdan drops in and offers his services.</p>
<p><q>How much money would you demand for your work?</q>, the official asks. <q>Me and my brother Roman are good eaters, so we will need to buy a decent meal every day. 50 pounds a day will be OK.</q></p>
<p><q>That&#8217;s pretty cheap; and how much time will you need?</q> <q>Me and my brother Roman are fast diggers; one mile a day I think we will dig.</q></p>
<p><q>Oh, that&#8217;s pretty fast! But how will you be able to work on such a low budget in such a short time?</q>, the official asks out of curiosity.</p>
<p><q>That&#8217;s simple</q>, Bogdan answers, <q>I start digging from the British coast, my brother Roman starts digging from the French coast; the moment we meet, the work is over</q>.</p>
<p><q>OK</q>, says the official, <q>but what if you don&#8217;t meet?</q>.</p>
<p><q>No worries then: you get two tunnels for the price of one</q>.
</p></blockquote>
<p>We should do a similar thing here. Instead of traversing the <strong>6</strong> levels from the beginning, we will traverse just <strong>3</strong> levels from each side, then join the resulting recordsets and hope some matching records will be found.</p>
<p>Traversing only <strong>3</strong> levels will be quite fast; and the resulting recordsets will be of moderate size so joining them will be easy.</p>
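<p>The difference is easy to estimate, assuming about <strong>20</strong> friends per person and ignoring the overlaps between the friend lists:</p>
<pre class="brush: sql">
-- Rough estimate of the records produced on the deepest level:
-- one traversal of 6 levels versus two traversals of 3 levels each.
SELECT  POWER(20, 6) AS one_sided,
        2 * POWER(20, 3) AS two_sided
</pre>
<p>That is <strong>64,000,000</strong> records against <strong>16,000</strong>.</p>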
<p>As an extra, we will return the shortest path from one person to the other. To do this, we will need to record the friendship chain in an array. <strong>PostgreSQL</strong> does not offer an easy way to reverse an array, so in the first recordset, we will <em>append</em> the friends to the array, while in the second one we will <em>prepend</em> them. This way, we only need to concatenate the resulting <del>tunnels</del> arrays.</p>
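<p>The array operations involved can be tried in isolation. In this sketch the chains are made up for illustration (the two halves are assumed to meet at person <strong>890237</strong>):</p>
<pre class="brush: sql">
SELECT  ARRAY[123456] || 890237 AS appended,    -- forward chain grows at the end
        278175 || ARRAY[654321] AS prepended,   -- backward chain grows at the start
        -- the meeting person is in both chains, so the slice [2:3] skips it
        (ARRAY[123456, 890237] || (ARRAY[890237, 278175, 654321])[2:3])::TEXT AS chain
</pre>
<p>Since the person on which the two chains meet is the last element of the first array and the first element of the second one, the second array is sliced from its second element before concatenation.</p>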
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q1 (person, chain, lvl) AS
        (
        SELECT  123456, ARRAY[123456], 1
        UNION ALL
        SELECT  friend, chain || friend, lvl + 1
        FROM    (
                SELECT  q1.*,
                        friend
                FROM    q1
                JOIN    (
                        SELECT  orestes AS me, pylades AS friend
                        FROM    friends
                        UNION ALL
                        SELECT  pylades AS me, orestes AS friend
                        FROM    friends
                        ) f
                ON      person = me
                WHERE   lvl &lt;= 3
                ) qo
        ),
        q2 (person, chain, lvl) AS
        (
        SELECT  654321, ARRAY[654321], 1
        UNION ALL
        SELECT  friend, friend || chain, lvl + 1
        FROM    (
                SELECT  q2.*,
                        friend
                FROM    q2
                JOIN    (
                        SELECT  orestes AS me, pylades AS friend
                        FROM    friends
                        UNION ALL
                        SELECT  pylades AS me, orestes AS friend
                        FROM    friends
                        ) f
                ON      person = me
                WHERE   lvl &lt;= 3
                ) qo
        )
SELECT  (q1.chain || q2.chain[2:q2.lvl])::TEXT AS chain
FROM    q1
JOIN    q2
ON      q2.person = q1.person
ORDER BY
        q1.lvl + q2.lvl
LIMIT 1
</pre>
<p><a href="#" onclick="xcollapse('X5520');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X5520" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>chain</th>
</tr>
<tr>
<td class="text">{123456,890237,278175,654321}</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0003s (0.5313s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=1138629880.05..1138629880.05 rows=1 width=72)
  CTE q1
    -&gt;  Recursive Union  (cost=0.00..71529.77 rows=2753901 width=40)
          -&gt;  Result  (cost=0.00..0.01 rows=1 width=0)
          -&gt;  Nested Loop  (cost=0.00..1645.17 rows=275390 width=40)
                Join Filter: (q1.person = &quot;20100227_friends&quot;.friends.orestes)
                -&gt;  WorkTable Scan on q1  (cost=0.00..0.22 rows=3 width=40)
                      Filter: (lvl &lt;= 3)
                -&gt;  Append  (cost=0.00..88.96 rows=30 width=8)
                      -&gt;  Index Scan using pk_friends_op on friends  (cost=0.00..37.55 rows=17 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.orestes = q1.person)
                      -&gt;  Index Scan using ux_friends_po on friends  (cost=0.00..51.40 rows=13 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.pylades = q1.person)
  CTE q2
    -&gt;  Recursive Union  (cost=0.00..71529.77 rows=2753901 width=40)
          -&gt;  Result  (cost=0.00..0.01 rows=1 width=0)
          -&gt;  Nested Loop  (cost=0.00..1645.17 rows=275390 width=40)
                Join Filter: (q2.person = &quot;20100227_friends&quot;.friends.orestes)
                -&gt;  WorkTable Scan on q2  (cost=0.00..0.22 rows=3 width=40)
                      Filter: (lvl &lt;= 3)
                -&gt;  Append  (cost=0.00..88.96 rows=30 width=8)
                      -&gt;  Index Scan using pk_friends_op on friends  (cost=0.00..37.55 rows=17 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.orestes = q2.person)
                      -&gt;  Index Scan using ux_friends_po on friends  (cost=0.00..51.40 rows=13 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.pylades = q2.person)
  -&gt;  Sort  (cost=1138486820.51..1233286454.49 rows=37919853589 width=72)
        Sort Key: ((q1.lvl + q2.lvl))
        -&gt;  Merge Join  (cost=849904.33..948887552.57 rows=37919853589 width=72)
              Merge Cond: (q1.person = q2.person)
              -&gt;  Sort  (cost=424952.16..431836.92 rows=2753901 width=40)
                    Sort Key: q1.person
                    -&gt;  CTE Scan on q1  (cost=0.00..55078.02 rows=2753901 width=40)
              -&gt;  Materialize  (cost=424952.16..459375.93 rows=2753901 width=40)
                    -&gt;  Sort  (cost=424952.16..431836.92 rows=2753901 width=40)
                          Sort Key: q2.person
                          -&gt;  CTE Scan on q2  (cost=0.00..55078.02 rows=2753901 width=40)
</pre>
</div>
<p>Note that the recursive reference to the CTE cannot be used more than once in the recursive part of the query. To work around that, we had to join it to a derived table (a <code>UNION ALL</code> of two copies of the table with the columns swapped). However, <strong>PostgreSQL</strong>&#8217;s optimizer was smart enough to push the join predicate into the derived table and distribute the queries so that each part uses a corresponding index efficiently. This helps to traverse the tree and build the recordsets from both ends.</p>
<p>Each of the recordsets has only about <strong>8,000</strong> records, so scanning and joining them is very fast.</p>
<p>The whole query takes just a little longer than <strong>0.5</strong> seconds.</p>
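<p>The derived table used in the joins above can also be queried on its own. Here is a minimal sketch listing all friends of one person, whichever column they are stored in:</p>
<pre class="brush: sql">
-- Both directions of the symmetric relationship, presented as (me, friend).
SELECT  friend
FROM    (
        SELECT  orestes AS me, pylades AS friend
        FROM    friends
        UNION ALL
        SELECT  pylades AS me, orestes AS friend
        FROM    friends
        ) f
WHERE   me = 123456
</pre>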
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/27/six-degrees-of-separation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sargability of monotonic functions: example</title>
		<link>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/</link>
		<comments>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 20:00:25 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4407</guid>
		<description><![CDATA[In my previous article I presented a proposal to add sargability of monotonic functions into the SQL engines. In a nutshell: a monotonic function is a function that preserves the order of the argument so that it gives the larger results for the larger values of the argument. It is easy to prove that a [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous article I presented a proposal to add <a href="/2010/02/19/things-sql-needs-sargability-of-monotonic-functions/">sargability of monotonic functions</a> into the <strong>SQL</strong> engines.</p>
<p>In a nutshell: a monotonic function is a function that preserves the order of its argument, so that larger arguments give larger results. It is easy to prove that a <strong>B-Tree</strong> with each key replaced by the result of the function remains a valid <strong>B-Tree</strong> and hence can be used to search for ranges of function results just like it is used to search for ranges of values.</p>
<p>With a little effort, a <strong>B-Tree</strong> can also be used to search for the ranges of piecewise monotonic functions: those whose domain can be split into a number of continuous pieces with the function being monotonic within each piece (although it need not be monotonic, or even continuous, across the pieces).</p>
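<p>As a minimal hand-written sketch of this idea (the inline sample values are hypothetical and not part of this article's data): <code>EXP()</code> is increasing, so a range condition on <code>EXP(value)</code> can be rewritten as a range condition on <code>value</code> itself, with the bounds passed through the inverse function <code>LN()</code>. Rounding at the bounds is ignored here.</p>
<pre class="brush: sql">
-- Same intent as: WHERE EXP(value) BETWEEN 2 AND 3
-- The sample values 0, 0.25, ..., 2 are generated inline for illustration
SELECT  value, EXP(value) AS result
FROM    (
        SELECT  num / 4.0 AS value
        FROM    generate_series(0, 8) num
        ) q
WHERE   value BETWEEN LN(2.0) AND LN(3.0)
</pre>
<p>On a real table, a condition of this shape is a plain range over the column and can be served by an ordinary index on it.</p>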
<p>In this article, I&#8217;ll demonstrate the algorithm to do that (implemented in pure <strong>SQL</strong> on <strong>PostgreSQL</strong>).</p>
<p>I will show how the performance of a simple query</p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.1234 AND 0.1235
</pre>
<p>could be improved if the sargability of monotonic functions were implemented in the optimizer.<br />
<span id="more-4407"></span><br />
To do this, I will create a sample table:</p>
<p><a href="#" onclick="xcollapse('X1002');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X1002" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_sine (
        id INT NOT NULL PRIMARY KEY,
        value DOUBLE PRECISION NOT NULL
);

CREATE INDEX ix_sine_value ON t_sine (value);

SELECT  SETSEED(0.20100223);

INSERT
INTO    t_sine
SELECT  num, num / 10000.00 + RANDOM()
FROM    generate_series(1, 1000000) num;

ANALYZE t_sine;
</pre>
</div>
<p>This table contains <strong>1,000,000</strong> records with <code>value</code> ranging from <strong>0</strong> to <strong>101</strong>: each value is <code>num / 10000</code> plus a random fraction.</p>
<p>To select the records we need, we can use a very simple and straightforward query:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<p><a href="#" onclick="xcollapse('X1272');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1272" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>value</th>
</tr>
<tr>
<td class="int4">3663</td>
<td class="float8">0.46150185738802</td>
</tr>
<tr>
<td class="int4">19263</td>
<td class="float8">2.68015610060766</td>
</tr>
<tr>
<td class="int4">23783</td>
<td class="float8">2.68013202354237</td>
</tr>
<tr>
<td class="int4">86110</td>
<td class="float8">8.963312032599</td>
</tr>
<tr>
<td class="int4">128053</td>
<td class="float8">13.0278523004308</td>
</tr>
<tr>
<td class="int4">150339</td>
<td class="float8">15.2465362691633</td>
</tr>
<tr>
<td class="int4">185849</td>
<td class="float8">19.310986539682</td>
</tr>
<tr>
<td class="int4">186788</td>
<td class="float8">19.3110526885197</td>
</tr>
<tr>
<td class="int4">191391</td>
<td class="float8">19.3110088731334</td>
</tr>
<tr>
<td class="int4">210841</td>
<td class="float8">21.5297331408583</td>
</tr>
<tr>
<td class="int4">212511</td>
<td class="float8">21.529697893659</td>
</tr>
<tr>
<td class="int4">247639</td>
<td class="float8">25.5941842560224</td>
</tr>
<tr>
<td class="int4">373019</td>
<td class="float8">38.1605504324339</td>
</tr>
<tr>
<td class="int4">373416</td>
<td class="float8">38.1606072025172</td>
</tr>
<tr>
<td class="int4">458141</td>
<td class="float8">46.6624391236514</td>
</tr>
<tr>
<td class="int4">462683</td>
<td class="float8">46.6623900452435</td>
</tr>
<tr>
<td class="int4">462704</td>
<td class="float8">46.662440645238</td>
</tr>
<tr>
<td class="int4">463233</td>
<td class="float8">46.6624209528782</td>
</tr>
<tr>
<td class="int4">520118</td>
<td class="float8">52.945639446865</td>
</tr>
<tr>
<td class="int4">522686</td>
<td class="float8">52.9456708737895</td>
</tr>
<tr>
<td class="int4">561721</td>
<td class="float8">57.0100855686799</td>
</tr>
<tr>
<td class="int4">686886</td>
<td class="float8">69.5764806582652</td>
</tr>
<tr>
<td class="int4">711952</td>
<td class="float8">71.7951245983548</td>
</tr>
<tr>
<td class="int4">716508</td>
<td class="float8">71.7952263388403</td>
</tr>
<tr>
<td class="int4">778116</td>
<td class="float8">78.0783171531357</td>
</tr>
<tr>
<td class="int4">877138</td>
<td class="float8">88.4260205388732</td>
</tr>
<tr>
<td class="int4">903050</td>
<td class="float8">90.6446926224232</td>
</tr>
<tr>
<td class="int4">942345</td>
<td class="float8">94.7092782433405</td>
</tr>
<tr>
<td class="int4">946181</td>
<td class="float8">94.7092303088069</td>
</tr>
<tr>
<td class="int4">966815</td>
<td class="float8">96.9279387165383</td>
</tr>
<tr>
<td class="int4">999931</td>
<td class="float8">100.992433886895</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0022s (3.2500s)</td>
</tr>
</table>
</div>
<pre>
Seq Scan on t_sine  (cost=0.00..25406.00 rows=5000 width=12)
  Filter: ((sin(value) &gt;= 0.4452::double precision) AND (sin(value) &lt;= 0.4453::double precision))
</pre>
</div>
<p>which returns <strong>31</strong> records in <strong>3.25</strong> seconds.</p>
<p>The query uses a full table scan with the filter applied to each record.</p>
<p>Let&#8217;s try to improve it.</p>
<h3>Function description</h3>
<p>According to the notation I proposed in the previous article, the monotonicity of the function <code>SIN()</code> should be described as follows:</p>
<p><code>SIN(arg FLOAT) MONOTONIC PIECEWISE<br />
DEFINED BY FLOOR(arg / PI() + 0.5)<br />
CASE PIECE % 2<br />
WHEN 0 THEN INCREASING INVERSE PIECE * PI() + ASIN(RESULT)<br />
ELSE DECREASING INVERSE PIECE * PI() - ASIN(RESULT)<br />
END</code></p>
<p>This means that:</p>
<ol>
<li>
<p>The function is piecewise monotonic.</p>
</li>
<li>
<p>The pieces are defined by the function <code>FLOOR(arg / PI() + 0.5)</code> (which essentially returns the number of the half-wave the argument belongs to).</p>
</li>
<li>
<p>The direction of monotonicity depends on the piece.</p>
</li>
<li>
<p>On even pieces (such as piece <strong>0</strong>, which covers the arguments from <strong>-π/2</strong> to <strong>π/2</strong>), the function increases.</p>
</li>
<li>
<p>On odd pieces, the function decreases.</p>
</li>
<li>
<p>A single inverse expression is provided for each direction of monotonicity.</p>
</li>
</ol>
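<p>A quick way to sanity-check the piece function and the two inverse expressions is to feed a few arguments through <code>SIN()</code> and map the results back. This is only a sketch over hypothetical inline values (the integers from <strong>0</strong> to <strong>10</strong>); up to rounding errors, the <code>inverse</code> column should reproduce <code>value</code>.</p>
<pre class="brush: sql">
-- For each sample value: find its piece, then apply the inverse
-- expression for that piece to SIN(value)
SELECT  value,
        FLOOR(value / PI() + 0.5) AS piece,
        CASE FLOOR(value / PI() + 0.5)::INTEGER % 2
        WHEN 0 THEN FLOOR(value / PI() + 0.5) * PI() + ASIN(SIN(value))
        ELSE FLOOR(value / PI() + 0.5) * PI() - ASIN(SIN(value))
        END AS inverse
FROM    (
        SELECT  num::DOUBLE PRECISION AS value
        FROM    generate_series(0, 10) num
        ) q
</pre>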
<p>Note that mathematically the function is strictly monotonic on each of its pieces. However, due to rounding errors, different arguments can yield the same function result, so a function value may map back to a range of arguments rather than a single value.</p>
<p>In theory, it is possible to write a single expression which would map the function&#8217;s result to the pair of values defining the beginning and the end of such a range. However, the expression would be quite complex. So for illustration purposes I&#8217;ll make do with a single inverse function that yields an approximation of the back mapping. To find the exact range, some extra effort will be required.</p>
<h3>Building the pieces</h3>
<p>The function is piecewise monotonic and the pieces are defined by a function. For the pieces to be continuous, the function that defines them should itself be monotonic over its entire domain.</p>
<p>The function that defines the pieces is <code>FLOOR(arg / PI() + 0.5)</code>.</p>
<p>It is a composition of three functions:</p>
<ul>
<li>
<p><code>OPERATOR_DIVISION(arg1 FLOAT, arg2 FLOAT)<br />
MONOTONIC OVER (arg1)<br />
CASE WHEN arg2 &gt; 0 THEN INCREASING INVERSE RESULT * arg2<br />
WHEN arg2 = 0 THEN UNDEFINED<br />
WHEN arg2 &lt; 0 THEN DECREASING INVERSE RESULT * arg2<br />
END</code></p>
</li>
<li>
<p><code>OPERATOR_PLUS(arg1 FLOAT, arg2 FLOAT) MONOTONIC<br />
OVER (arg1) STRICTLY INCREASING INVERSE RESULT - arg2,<br />
OVER (arg2) STRICTLY INCREASING INVERSE RESULT - arg1</code></p>
</li>
<li>
<p><code>FLOOR(arg FLOAT) MONOTONIC INCREASING<br />
INVERSE<br />
FROM RESULT EXACT<br />
TO RESULT + 1 EXACT EXCLUDE</code></p>
</li>
</ul>
<p>which, given the values of the constants provided in the secondary arguments, are all increasing over the argument. As we know from math, a composition of increasing functions is also increasing.</p>
<p>Each function is defined with a single inverse condition which maps the result of the function back to a value <em>near</em> the range of the arguments yielding the result. The exact range has to be found using an index seek over (hopefully) not too many records.</p>
<p>Sequentially applying the inverse expressions of each of the constituent functions to the result of the piece defining function, we get the following inverse expression for the latter:</p>
<ol>
<li><code>FLOOR(OPERATOR_PLUS(OPERATOR_DIVISION(arg, PI()), 0.5)) = PIECE</code></li>
<li><code>OPERATOR_PLUS(OPERATOR_DIVISION(arg, PI()), 0.5) ∈ [ PIECE, PIECE + 1 )</code></li>
<li><code>OPERATOR_DIVISION(arg, PI()) ∈ [ ≈(PIECE - 0.5), ≈((PIECE + 1) - 0.5) ]</code></li>
<li><code>arg ∈ [ ≈((PIECE - 0.5) * PI()), ≈(((PIECE + 1) - 0.5) * PI()) ]</code></li>
</ol>
<p>For each piece, we now have a pair of values <em>approximately</em> defining the range of values belonging to the piece.</p>
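<p>To make the result of the derivation concrete, here is a small sketch that evaluates the approximate bounds for the first few pieces. The column names <code>approx_lower</code> and <code>approx_upper</code> are mine, chosen for illustration only.</p>
<pre class="brush: sql">
-- Approximate range of arguments for pieces 0 to 4,
-- using the expressions from the last step of the derivation
SELECT  piece,
        (piece - 0.5) * PI() AS approx_lower,
        ((piece + 1) - 0.5) * PI() AS approx_upper
FROM    generate_series(0, 4) piece
</pre>
<p>The upper bound of each piece coincides with the lower bound of the next one, which is why the exact first and last keys still have to be located in the index.</p>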
<p>To find out the exact bounds, we need to do the following:</p>
<ol>
<li>
<p>Calculate the piece for the minimal <code>value</code>.</p>
</li>
<li>
<p>Find the approximate upper bound for the piece.</p>
</li>
<li>
<p>Scanning the keys to the left, find the <strong>rightmost</strong> key to the <strong>left of the upper bound</strong> that belongs to the current (or previous) piece.</p>
</li>
<li>
<p>Scanning the keys to the right, find the <strong>first</strong> key of the <strong>next</strong> piece.</p>
</li>
<li>
<p>Scanning a single key to the left, find the <strong>last</strong> key of the <strong>current</strong> piece.</p>
</li>
<li>
<p>Recursively repeat steps <strong>1</strong> to <strong>5</strong>, taking the first value of the next piece found in step <strong>4</strong> as the seed for step <strong>1</strong>, until step <strong>4</strong> fails (which means that there are no more pieces).</p>
</li>
</ol>
<p>This procedure guarantees that we always get the correct bounds even with the inexact inverse value, since it correctly handles both overflow and underflow of the inverse value, as shown in the pictures below:</p>
<h4>Overflow</h4>
<p><img src="http://explainextended.com/wp-content/uploads/2010/02/overflow.png" alt="" title="Overflow" width="700" height="500" class="size-full wp-image-4420 noborder" /></p>
<h4>Underflow</h4>
<p><img src="http://explainextended.com/wp-content/uploads/2010/02/underflow.png" alt="" title="Underflow" width="700" height="500" class="aligncenter size-full wp-image-4419 noborder" /></p>
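<p>Before going recursive, steps <strong>2</strong> to <strong>5</strong> can be sketched for a single piece. The query below hardcodes piece <strong>0</strong> and assumes the <code>t_sine</code> table and its index created above; the aliases <code>next_first</code> and <code>current_last</code> are mine.</p>
<pre class="brush: sql">
SELECT  next_first,
        -- Step 5: one key to the left of the next piece
        (
        SELECT  value
        FROM    t_sine
        WHERE   value &lt; next_first
        ORDER BY
                value DESC
        LIMIT 1
        ) AS current_last
FROM    (
        -- Step 4: the first key to the right of the key found on step 3
        SELECT  (
                SELECT  value
                FROM    t_sine
                WHERE   value &gt;
                        (
                        -- Steps 2 and 3: rightmost key at or below the
                        -- approximate upper bound that is not past piece 0
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &lt;= ((0 + 1) - 0.5) * PI()
                                AND FLOOR(value / PI() + 0.5) &lt;= 0
                        ORDER BY
                                value DESC
                        LIMIT 1
                        )
                ORDER BY
                        value
                LIMIT 1
                ) AS next_first
        ) q
</pre>
<p>On the sample table, the two columns should correspond to <code>nextv</code> and <code>maxv</code> of the first piece.</p>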
<p>Here's a query that selects the first and the last key of each piece:</p>
<p><a href="#" onclick="xcollapse('X822');return false;"><strong>View the query</strong></a><br />
</p>
<div id="X822" style="display: none; ">
<pre class="brush: sql">
WITH    RECURSIVE
        d AS (
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  minv, FLOOR(minv / PI() + 0.5) AS piece
                        FROM    (
                                SELECT  MIN(value) AS minv
                                FROM    t_sine
                                ) q
                        ) q2
                ) q3
        UNION ALL
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  nextv AS minv, nextpiece AS piece
                        FROM    d
                        WHERE   nextpiece IS NOT NULL
                        ) q2
                ) q3
        )
SELECT  *
FROM    d
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>piece</th>
<th>minv</th>
<th>maxv</th>
<th>nextv</th>
<th>nextpiece</th>
</tr>
<tr>
<td class="float8">0</td>
<td class="float8">0.0172837972298265</td>
<td class="float8">1.57060804706216</td>
<td class="float8">1.57081433883309</td>
<td class="float8">1</td>
</tr>
<tr>
<td class="float8">1</td>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">4.71268981658742</td>
<td class="float8">2</td>
</tr>
<tr>
<td class="float8">2</td>
<td class="float8">4.71268981658742</td>
<td class="float8">7.85390641669333</td>
<td class="float8">7.85409013534784</td>
<td class="float8">3</td>
</tr>
<tr>
<td class="float8">3</td>
<td class="float8">7.85409013534784</td>
<td class="float8">10.9955484666698</td>
<td class="float8">10.9956333589859</td>
<td class="float8">4</td>
</tr>
<tr>
<td class="float8">4</td>
<td class="float8">10.9956333589859</td>
<td class="float8">14.1371444690436</td>
<td class="float8">14.1372372000463</td>
<td class="float8">5</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="float8">31</td>
<td class="float8">95.8185816861346</td>
<td class="float8">98.9601405610755</td>
<td class="float8">98.9601765494391</td>
<td class="float8">32</td>
</tr>
<tr>
<td class="float8">32</td>
<td class="float8">98.9601765494391</td>
<td class="float8">100.992433886895</td>
<td class="float8"></td>
<td class="float8"></td>
</tr>
<tr class="statusbar">
<td colspan="100">33 rows fetched in 0.0057s (0.0484s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on d  (cost=65.97..67.99 rows=101 width=40)
  CTE d
    -&gt;  Recursive Union  (cost=0.08..65.97 rows=101 width=16)
          -&gt;  Subquery Scan q  (cost=0.08..0.80 rows=1 width=8)
                InitPlan 5 (returns $5)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 4 (returns $4)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                      InitPlan 10 (returns $8)
                        -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                              -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                    Filter: (value IS NOT NULL)
                SubPlan 3
                  -&gt;  Limit  (cost=0.22..0.26 rows=1 width=8)
                        InitPlan 2 (returns $3)
                          -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                                InitPlan 1 (returns $2)
                                  -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $2)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($3)[1])
                SubPlan 7
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 6 (returns $6)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $6)
                SubPlan 9
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 8 (returns $7)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $7)
          -&gt;  WorkTable Scan on d  (cost=0.04..6.31 rows=10 width=16)
                Filter: (d.nextpiece IS NOT NULL)
                InitPlan 15 (returns $13)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 14 (returns $12)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                SubPlan 13
                  -&gt;  Limit  (cost=0.19..0.23 rows=1 width=8)
                        InitPlan 12 (returns $11)
                          -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                                InitPlan 11 (returns $10)
                                  -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $10)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($11)[1])
                SubPlan 17
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 16 (returns $14)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $14)
                SubPlan 19
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 18 (returns $15)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $15)
</pre>
</div>
<p>Despite being huge in size, the query is very efficient and completes in only <strong>48 ms</strong>.</p>
<h3>Locating values within the pieces</h3>
<p>Now that we have the exact bounds of each piece, we need to locate the records within each piece.</p>
<p>Since we don't have an exact inverse function here, the basic idea is the same as above: given the approximate inverse, locate the exact bound using the iterative approach:</p>
<ol>
<li>Locate the first key to the left of the inverse that yields a function result less than the one sought, going no further than the first key of the piece. Should this search fail, the first key of the piece is the lower bound.</li>
<li>Locate the first key to the right of the one found in the previous step that yields a function result equal to or greater than the one sought, going no further than the last key of the piece. Return <code>NULL</code> should it fail.</li>
</ol>
<p>Since we have an inclusive range here, we don't need the third step (final scan to the left to find the rightmost least value) that we used when searching for the pieces.</p>
<p>This algorithm searches for the lower bound; to search for the upper bound, we just need to invert both the directions and the tests (<q>left</q> becomes <q>right</q>, <q>less</q> becomes <q>greater</q>, etc.).</p>
<p>Since the monotonicity of the function varies from piece to piece, we should take this into account. For the pieces where the function is <code>DECREASING</code>, we should swap the order of the bounds: the upper bound of the expression becomes the lower bound of the range of values and vice versa. This can be handled merely by substituting the conditions into the very same <code>CASE</code> expression that defines the monotonicity.</p>
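<p>As an illustration of the swap, the sketch below evaluates the approximate range of <code>value</code> for the condition <code>SIN(value) BETWEEN 0.4452 AND 0.4453</code> on pieces <strong>0</strong> and <strong>1</strong>. The column names <code>approx_llimit</code> and <code>approx_ulimit</code> are mine; these are only the starting points for the search, not the exact bounds.</p>
<pre class="brush: sql">
-- On piece 0 SIN() increases, so the bounds keep their order;
-- on piece 1 it decreases, so 0.4453 gives the lower limit
SELECT  piece,
        CASE piece % 2
        WHEN 0 THEN piece * PI() + ASIN(0.4452)
        ELSE piece * PI() - ASIN(0.4453)
        END AS approx_llimit,
        CASE piece % 2
        WHEN 0 THEN piece * PI() + ASIN(0.4453)
        ELSE piece * PI() - ASIN(0.4452)
        END AS approx_ulimit
FROM    generate_series(0, 1) piece
</pre>
<p>The ranges come out at roughly <strong>0.4614</strong> to <strong>0.4615</strong> and <strong>2.6801</strong> to <strong>2.6802</strong>, which is where the first three rows returned by the original query lie.</p>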
<p>When we locate the upper and the lower bounds for each piece, we should just join <code>t_sine</code> on the following condition:</p>
<pre class="brush: sql">
ON value BETWEEN llimit AND ulimit
</pre>
<p>It can happen that the lower bound found by the algorithm exceeds the upper bound. This is a perfectly normal situation: it means that no keys match the condition, so the range is empty. The <code>BETWEEN</code> predicate handles this by matching nothing.</p>
<p>It can also happen that one of the bounds is <code>NULL</code>. This is also a valid situation, meaning that no value within the piece exceeds the lower bound (or falls short of the upper one). <code>BETWEEN</code> takes care of this too, since a comparison with <code>NULL</code> is never true.</p>
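<p>Both cases are easy to check in isolation. The following sketch uses hypothetical inline bounds and tests the constant <strong>5</strong> against each pair:</p>
<pre class="brush: sql">
-- A normal range, an inverted range and two ranges with a NULL bound
SELECT  llimit, ulimit,
        5 BETWEEN llimit AND ulimit AS matches
FROM    (
        VALUES
        (1, 10),
        (10, 1),
        (NULL, 10),
        (1, NULL)
        ) AS q (llimit, ulimit)
</pre>
<p>Only the first pair yields <code>TRUE</code>. The inverted range yields <code>FALSE</code> and the pairs with a <code>NULL</code> bound yield <code>NULL</code>, neither of which satisfies a join condition.</p>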
<h3>Final query</h3>
<p>And here's the final query. Be ready, you'll have to spin your mouse wheel a lot:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        d AS (
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  minv, FLOOR(minv / PI() + 0.5) AS piece
                        FROM    (
                                SELECT  MIN(value) AS minv
                                FROM    t_sine
                                ) q
                        ) q2
                ) q3
        UNION ALL
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  nextv AS minv, nextpiece AS piece
                        FROM    d
                        WHERE   nextpiece IS NOT NULL
                        ) q2
                ) q3
        )
SELECT  l.*, s.*, SIN(value)
FROM    (
        SELECT  minv, maxv,
                CASE piece::INTEGER % 2
                WHEN 0 THEN
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &gt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= LEAST(piece * PI() + ASIN(0.4452), maxv)
                                        AND value &gt;= minv
                                        AND SIN(value) &lt; 0.4452
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                ),
                                minv
                                )
                                AND value &lt;= maxv
                                AND SIN(value) &gt;= 0.4452
                        ORDER BY
                                value
                        LIMIT 1
                        )
                ELSE
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &gt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= LEAST(piece * PI() - ASIN(0.4453), maxv)
                                        AND value &gt;= minv
                                        AND SIN(value) &gt; 0.4453
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                ),
                                minv
                                )
                                AND value &lt;= maxv
                                AND SIN(value) &lt;= 0.4453
                        ORDER BY
                                value
                        LIMIT 1
                        )
                END AS llimit,
                CASE piece::INTEGER % 2
                WHEN 0 THEN
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &lt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &gt;= GREATEST(piece * PI() + ASIN(0.4453), minv)
                                        AND value &lt;= maxv
                                        AND SIN(value) &gt; 0.4453
                                ORDER BY
                                        value
                                LIMIT 1
                                ),
                                maxv
                                )
                                AND value &gt;= minv
                                AND SIN(value) &lt;= 0.4453
                        ORDER BY
                                value DESC
                        LIMIT 1
                        )
                ELSE
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &lt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &gt;= GREATEST(piece * PI() - ASIN(0.4452), minv)
                                        AND value &lt;= maxv
                                        AND SIN(value) &lt; 0.4452
                                ORDER BY
                                        value
                                LIMIT 1
                                ),
                                maxv
                                )
                                AND value &gt;= minv
                                AND SIN(value) &gt;= 0.4452
                        ORDER BY
                                value DESC
                        LIMIT 1
                        )
                END AS ulimit
        FROM    d
        ) l
JOIN    t_sine s
ON      value BETWEEN llimit AND ulimit
</pre>
<p><a href="#" onclick="xcollapse('X1409');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1409" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>minv</th>
<th>maxv</th>
<th>llimit</th>
<th>ulimit</th>
<th>id</th>
<th>value</th>
<th>sin</th>
</tr>
<tr>
<td class="float8">0.0172837972298265</td>
<td class="float8">1.57060804706216</td>
<td class="float8">0.46150185738802</td>
<td class="float8">0.46150185738802</td>
<td class="int4">3663</td>
<td class="float8">0.46150185738802</td>
<td class="float8">0.44529334884391</td>
</tr>
<tr>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">2.68013202354237</td>
<td class="float8">2.68015610060766</td>
<td class="int4">23783</td>
<td class="float8">2.68013202354237</td>
<td class="float8">0.445256434133825</td>
</tr>
<tr>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">2.68013202354237</td>
<td class="float8">2.68015610060766</td>
<td class="int4">19263</td>
<td class="float8">2.68015610060766</td>
<td class="float8">0.445234875325918</td>
</tr>
<tr>
<td class="float8">7.85409013534784</td>
<td class="float8">10.9955484666698</td>
<td class="float8">8.963312032599</td>
<td class="float8">8.963312032599</td>
<td class="int4">86110</td>
<td class="float8">8.963312032599</td>
<td class="float8">0.445261178083286</td>
</tr>
<tr>
<td class="float8">10.9956333589859</td>
<td class="float8">14.1371444690436</td>
<td class="float8">13.0278523004308</td>
<td class="float8">13.0278523004308</td>
<td class="int4">128053</td>
<td class="float8">13.0278523004308</td>
<td class="float8">0.445275287664458</td>
</tr>
<tr>
<td class="float8">14.1372372000463</td>
<td class="float8">17.2787474542275</td>
<td class="float8">15.2465362691633</td>
<td class="float8">15.2465362691633</td>
<td class="int4">150339</td>
<td class="float8">15.2465362691633</td>
<td class="float8">0.445226320346027</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">185849</td>
<td class="float8">19.310986539682</td>
<td class="float8">0.445229561181329</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">191391</td>
<td class="float8">19.3110088731334</td>
<td class="float8">0.445249558810258</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">186788</td>
<td class="float8">19.3110526885197</td>
<td class="float8">0.44528879096528</td>
</tr>
<tr>
<td class="float8">20.4204805635758</td>
<td class="float8">23.561747650259</td>
<td class="float8">21.529697893659</td>
<td class="float8">21.5297331408583</td>
<td class="int4">212511</td>
<td class="float8">21.529697893659</td>
<td class="float8">0.445247526124325</td>
</tr>
<tr>
<td class="float8">20.4204805635758</td>
<td class="float8">23.561747650259</td>
<td class="float8">21.529697893659</td>
<td class="float8">21.5297331408583</td>
<td class="int4">210841</td>
<td class="float8">21.5297331408583</td>
<td class="float8">0.445215965240229</td>
</tr>
<tr>
<td class="float8">23.5619455649868</td>
<td class="float8">26.7032224601433</td>
<td class="float8">25.5941842560224</td>
<td class="float8">25.5941842560224</td>
<td class="int4">247639</td>
<td class="float8">25.5941842560224</td>
<td class="float8">0.445240672513922</td>
</tr>
<tr>
<td class="float8">36.1283162861556</td>
<td class="float8">39.2698307414278</td>
<td class="float8">38.1605504324339</td>
<td class="float8">38.1606072025172</td>
<td class="int4">373019</td>
<td class="float8">38.1605504324339</td>
<td class="float8">0.445236698722671</td>
</tr>
<tr>
<td class="float8">36.1283162861556</td>
<td class="float8">39.2698307414278</td>
<td class="float8">38.1605504324339</td>
<td class="float8">38.1606072025172</td>
<td class="int4">373416</td>
<td class="float8">38.1606072025172</td>
<td class="float8">0.445287530670759</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">462683</td>
<td class="float8">46.6623900452435</td>
<td class="float8">0.445291469623201</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">463233</td>
<td class="float8">46.6624209528782</td>
<td class="float8">0.445263795157205</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">458141</td>
<td class="float8">46.6624391236514</td>
<td class="float8">0.445247524983561</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">462704</td>
<td class="float8">46.662440645238</td>
<td class="float8">0.445246162542927</td>
</tr>
<tr>
<td class="float8">51.8363581438176</td>
<td class="float8">54.9774493159346</td>
<td class="float8">52.945639446865</td>
<td class="float8">52.9456708737895</td>
<td class="int4">520118</td>
<td class="float8">52.945639446865</td>
<td class="float8">0.445234079463435</td>
</tr>
<tr>
<td class="float8">51.8363581438176</td>
<td class="float8">54.9774493159346</td>
<td class="float8">52.945639446865</td>
<td class="float8">52.9456708737895</td>
<td class="int4">522686</td>
<td class="float8">52.9456708737895</td>
<td class="float8">0.445205939128705</td>
</tr>
<tr>
<td class="float8">54.9779645633645</td>
<td class="float8">58.1193389566675</td>
<td class="float8">57.0100855686799</td>
<td class="float8">57.0100855686799</td>
<td class="int4">561721</td>
<td class="float8">57.0100855686799</td>
<td class="float8">0.445218087206929</td>
</tr>
<tr>
<td class="float8">67.5442983367741</td>
<td class="float8">70.6858307610475</td>
<td class="float8">69.5764806582652</td>
<td class="float8">69.5764806582652</td>
<td class="int4">686886</td>
<td class="float8">69.5764806582652</td>
<td class="float8">0.445240002733585</td>
</tr>
<tr>
<td class="float8">70.6861408688866</td>
<td class="float8">73.8273966856673</td>
<td class="float8">71.7951245983548</td>
<td class="float8">71.7952263388403</td>
<td class="int4">711952</td>
<td class="float8">71.7951245983548</td>
<td class="float8">0.445297446856164</td>
</tr>
<tr>
<td class="float8">70.6861408688866</td>
<td class="float8">73.8273966856673</td>
<td class="float8">71.7951245983548</td>
<td class="float8">71.7952263388403</td>
<td class="int4">716508</td>
<td class="float8">71.7952263388403</td>
<td class="float8">0.445206347880861</td>
</tr>
<tr>
<td class="float8">76.9690608614191</td>
<td class="float8">80.1104563690938</td>
<td class="float8">78.0783171531357</td>
<td class="float8">78.0783171531357</td>
<td class="int4">778116</td>
<td class="float8">78.0783171531357</td>
<td class="float8">0.445290957467673</td>
</tr>
<tr>
<td class="float8">86.3938749629512</td>
<td class="float8">89.5353266705379</td>
<td class="float8">88.4260205388732</td>
<td class="float8">88.4260205388732</td>
<td class="int4">877138</td>
<td class="float8">88.4260205388732</td>
<td class="float8">0.445225639446173</td>
</tr>
<tr>
<td class="float8">89.5354929595716</td>
<td class="float8">92.6769711233795</td>
<td class="float8">90.6446926224232</td>
<td class="float8">90.6446926224232</td>
<td class="int4">903050</td>
<td class="float8">90.6446926224232</td>
<td class="float8">0.445286610427917</td>
</tr>
<tr>
<td class="float8">92.6770082124449</td>
<td class="float8">95.8184746547915</td>
<td class="float8">94.7092303088069</td>
<td class="float8">94.7092782433405</td>
<td class="int4">946181</td>
<td class="float8">94.7092303088069</td>
<td class="float8">0.445247543713327</td>
</tr>
<tr>
<td class="float8">92.6770082124449</td>
<td class="float8">95.8184746547915</td>
<td class="float8">94.7092303088069</td>
<td class="float8">94.7092782433405</td>
<td class="int4">942345</td>
<td class="float8">94.7092782433405</td>
<td class="float8">0.44529046414363</td>
</tr>
<tr>
<td class="float8">95.8185816861346</td>
<td class="float8">98.9601405610755</td>
<td class="float8">96.9279387165383</td>
<td class="float8">96.9279387165383</td>
<td class="int4">966815</td>
<td class="float8">96.9279387165383</td>
<td class="float8">0.445232181707114</td>
</tr>
<tr>
<td class="float8">98.9601765494391</td>
<td class="float8">100.992433886895</td>
<td class="float8">100.992433886895</td>
<td class="float8">100.992433886895</td>
<td class="int4">999931</td>
<td class="float8">100.992433886895</td>
<td class="float8">0.445263903548165</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0071s (0.1153s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=67.09..25557916.04 rows=11222222 width=36)
  CTE d
    -&gt;  Recursive Union  (cost=0.08..65.97 rows=101 width=16)
          -&gt;  Subquery Scan q  (cost=0.08..0.80 rows=1 width=8)
                InitPlan 5 (returns $5)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 4 (returns $4)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                      InitPlan 10 (returns $8)
                        -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                              -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                    Filter: (value IS NOT NULL)
                SubPlan 3
                  -&gt;  Limit  (cost=0.22..0.26 rows=1 width=8)
                        InitPlan 2 (returns $3)
                          -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                                InitPlan 1 (returns $2)
                                  -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $2)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($3)[1])
                SubPlan 7
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 6 (returns $6)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $6)
                SubPlan 9
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 8 (returns $7)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $7)
          -&gt;  WorkTable Scan on d  (cost=0.04..6.31 rows=10 width=16)
                Filter: (d.nextpiece IS NOT NULL)
                InitPlan 15 (returns $13)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 14 (returns $12)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                SubPlan 13
                  -&gt;  Limit  (cost=0.19..0.23 rows=1 width=8)
                        InitPlan 12 (returns $11)
                          -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                                InitPlan 11 (returns $10)
                                  -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $10)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($11)[1])
                SubPlan 17
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 16 (returns $14)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $14)
                SubPlan 19
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 18 (returns $15)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $15)
  -&gt;  CTE Scan on d  (cost=0.00..2.02 rows=101 width=24)
  -&gt;  Index Scan using ix_sine_value on t_sine s  (cost=1.12..2570.38 rows=111111 width=12)
        Index Cond: ((s.value &gt;= CASE ((d.piece)::integer % 2) WHEN 0 THEN (SubPlan 30) ELSE (SubPlan 32) END) AND (s.value &lt;= CASE ((d.piece)::integer % 2) WHEN 0 THEN (SubPlan 34) ELSE (SubPlan 36) END))
        SubPlan 30
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 29 (returns $24)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) + 0.461397604523314::double precision), $18)) AND (value &gt;= $19))
                              Filter: (sin(value) &lt; 0.4452::double precision)
                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &gt;= COALESCE($24, $19)) AND (value &lt;= $18))
                      Filter: (sin(value) &gt;= 0.4452::double precision)
        SubPlan 32
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 31 (returns $25)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) - 0.461509285667814::double precision), $18)) AND (value &gt;= $19))
                              Filter: (sin(value) &gt; 0.4453::double precision)
                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &gt;= COALESCE($25, $19)) AND (value &lt;= $18))
                      Filter: (sin(value) &lt;= 0.4453::double precision)
        SubPlan 34
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 33 (returns $26)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) + 0.461509285667814::double precision), $19)) AND (value &lt;= $18))
                              Filter: (sin(value) &gt; 0.4453::double precision)
                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &lt;= COALESCE($26, $18)) AND (value &gt;= $19))
                      Filter: (sin(value) &lt;= 0.4453::double precision)
        SubPlan 36
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 35 (returns $27)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) - 0.461397604523314::double precision), $19)) AND (value &lt;= $18))
                              Filter: (sin(value) &lt; 0.4452::double precision)
                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &lt;= COALESCE($27, $18)) AND (value &gt;= $19))
                      Filter: (sin(value) &gt;= 0.4452::double precision)
  SubPlan 22
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 21 (returns $20)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) + 0.461397604523314::double precision), $18)) AND (value &gt;= $19))
                        Filter: (sin(value) &lt; 0.4452::double precision)
          -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &gt;= COALESCE($20, $19)) AND (value &lt;= $18))
                Filter: (sin(value) &gt;= 0.4452::double precision)
  SubPlan 24
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 23 (returns $21)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) - 0.461509285667814::double precision), $18)) AND (value &gt;= $19))
                        Filter: (sin(value) &gt; 0.4453::double precision)
          -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &gt;= COALESCE($21, $19)) AND (value &lt;= $18))
                Filter: (sin(value) &lt;= 0.4453::double precision)
  SubPlan 26
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 25 (returns $22)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) + 0.461509285667814::double precision), $19)) AND (value &lt;= $18))
                        Filter: (sin(value) &gt; 0.4453::double precision)
          -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &lt;= COALESCE($22, $18)) AND (value &gt;= $19))
                Filter: (sin(value) &lt;= 0.4453::double precision)
  SubPlan 28
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 27 (returns $23)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) - 0.461397604523314::double precision), $19)) AND (value &lt;= $18))
                        Filter: (sin(value) &lt; 0.4452::double precision)
          -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &lt;= COALESCE($23, $18)) AND (value &gt;= $19))
                Filter: (sin(value) &gt;= 0.4452::double precision)
</pre>
</div>
<p>This <strong>200</strong>-line monster completes in only <strong>110 ms</strong>, or <strong>30 times</strong> as fast as the original <strong>3</strong>-liner:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<p>Both queries yield the same results.</p>
<h3>Summary</h3>
<p>This example demonstrates that <strong>B-Tree</strong> indexes can be used to search on predicates involving monotonic functions, and shows the performance gain this achieves.</p>
<p>The performance gain is over <strong>30</strong> times for a table that fits completely into the cache, and will grow as the number of cache misses increases.</p>
<p>The <strong>SQL</strong> implementation of the algorithm is in fact not optimal, since the iterative searches for the value boundaries are implemented as subqueries. Each subquery has to reenter the <strong>B-Tree</strong> and traverse it starting from the root. A native algorithm working within the optimizer could avoid this by caching the key position in the index and issuing <code>next_key</code> / <code>prev_key</code> commands, which would improve performance even further.</p>
<p>Sargability of monotonic functions, as shown above, can help make queries like the one described in this article much more legible, maintainable and efficient.</p>
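<p>As a rough illustration of the idea (not the query used above), the same rows can be approximated by inverting the function on each monotonic piece directly: on even pieces <code>SIN</code> ascends and on odd pieces it descends, so each piece contributes one plain range on <code>value</code>. The sketch below assumes that all values in <code>t_sine</code> fall within pieces <strong>0</strong> to <strong>32</strong> (the largest value shown above is about <strong>101</strong>); the upper bound passed to <code>generate_series</code> is that assumption, and rows sitting exactly on a boundary may differ because of floating-point rounding in <code>ASIN</code>:</p>
<pre class="brush: sql">
-- Sketch only: one index range per monotonic piece of SIN.
-- The piece count (0..32) is an assumption based on the sample data.
SELECT  s.*
FROM    generate_series(0, 32) piece
JOIN    t_sine s
ON      s.value BETWEEN
        -- lower bound of the range within this piece
        CASE piece % 2
        WHEN 0 THEN piece * PI() + ASIN(0.4452)
        ELSE piece * PI() - ASIN(0.4453)
        END
        AND
        -- upper bound of the range within this piece
        CASE piece % 2
        WHEN 0 THEN piece * PI() + ASIN(0.4453)
        ELSE piece * PI() - ASIN(0.4452)
        END
</pre>
<p>The query above additionally rechecks <code>SIN(value)</code> against the bounds when locating <code>llimit</code> and <code>ulimit</code>, which this sketch does not.</p>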
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Searching for arbitrary portions of a date</title>
		<link>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/</link>
		<comments>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 20:00:13 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4132</guid>
		<description><![CDATA[From Stack Overflow: I have a Ruby on Rails application with a PostgreSQL database; several tables have created_at and updated_at timestamp attributes. When displayed, those dates are formatted in the user&#8217;s locale; for example, the timestamp 2009-10-15 16:30:00.435 becomes the string 15.10.2009 &#8211; 16:30 (the date format for this example being dd.mm.yyyy - hh.mm). The [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://stackoverflow.com/questions/2175844/is-it-possible-to-search-for-dates-as-strings-in-a-database-agnostic-way"><strong>Stack Overflow</strong></a>:</p>
<blockquote><p>I have a <strong>Ruby on Rails</strong> application with a <strong>PostgreSQL</strong> database; several tables have <code>created_at</code> and <code>updated_at</code> timestamp attributes.</p>
<p>When displayed, those dates are formatted in the user&#8217;s locale; for example, the timestamp <strong>2009-10-15 16:30:00.435</strong> becomes the string <strong>15.10.2009 &#8211; 16:30</strong> (the date format for this example being <code>dd.mm.yyyy - hh.mm</code>).</p>
<p>The requirement is that the user must be able to search for records by date, as if they were strings formatted in the current locale.</p>
<p>For example, searching for <strong>15.10.2009</strong> would return records with dates on <strong>October 15th 2009</strong>; searching for <strong>15.10</strong> would return records with dates on <strong>October 15th</strong> of any year, searching for <strong>15</strong> would return all dates that match <strong>15</strong> (be it day, month or year).</p></blockquote>
<p>The simplest solution would be to just retrieve the locale string from the client, format the dates according to that string and search them using the <code>LIKE</code> or <code>~</code> operators (the latter, as we all know, matches against <a href="http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-POSIX-TABLE"><strong>POSIX</strong> regular expressions</a>).</p>
<p>However, this would not be very efficient.</p>
<p>Let&#8217;s create a sample table and see:<br />
<span id="more-4132"></span><br />
<a href="#" onclick="xcollapse('X10740');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X10740" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_dates (
        id INT NOT NULL PRIMARY KEY,
        date TIMESTAMP NOT NULL,
        name VARCHAR(20) NOT NULL,
        stuffing VARCHAR(200) NOT NULL
);

CREATE INDEX ix_dates_date ON t_dates (date);

CREATE INDEX ix_dates_parts ON t_dates
USING GIN((
        ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 11) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
        ));

SELECT  SETSEED(0.20100202);

INSERT
INTO    t_dates (id, date, name, stuffing)
SELECT  id,
        TO_TIMESTAMP(&#039;2010-02-02&#039;, &#039;YYYY-MM-DD&#039;) -
        (id || &#039; hour&#039;)::INTERVAL +
        (FLOOR(RANDOM() * 3600) || &#039; second&#039;)::INTERVAL,
        &#039;Date &#039; || id,
        RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) id;
</pre>
</div>
<p>The table contains <strong>1,000,000</strong> records with random timestamps spanning more than <strong>114 years</strong>.</p>
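<p>A quick way to check these figures, as a sketch against the <code>t_dates</code> table created above (the column aliases are illustrative):</p>
<pre class="brush: sql">
-- Row count and the range of timestamps actually stored
SELECT  COUNT(*) AS cnt,
        MIN(date) AS first_date,
        MAX(date) AS last_date,
        AGE(MAX(date), MIN(date)) AS span
FROM    t_dates
</pre>
<p>One row per hour going back <strong>1,000,000</strong> hours gives about 1,000,000 / 24 / 365.25, or roughly <strong>114</strong> years.</p>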
<p>Assuming that the client&#8217;s date format is set to <code>dd.mm.yy hh24.mi.ss</code>, let&#8217;s count the records that match the search string <code>'20.12'</code>. We also assume that the beginning and the end of the string act as field separators:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
</tr>
<tr>
<td class="int8">4235</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (7.9138s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=47262.58..47262.59 rows=1 width=0)
  -&gt;  Seq Scan on t_dates  (cost=0.00..47258.97 rows=1444 width=0)
        Filter: (to_char(date, &#39;dd.mm.yy hh24.mi.ss&#39;::text) ~ &#39;(^|[^\\d])20\\.12([^\\d]|$)&#39;::text)
</pre>
<p>This query runs for almost <strong>8 seconds</strong>. Let&#8217;s see which values it returns:</p>
<pre class="brush: sql">
SELECT  id, date
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
ORDER BY
        MD5(id::TEXT)
LIMIT 10
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>date</th>
</tr>
<tr>
<td class="int4">622924</td>
<td class="timestamp">1939-01-10 20:12:31</td>
</tr>
<tr>
<td class="int4">781217</td>
<td class="timestamp">1920-12-20 07:11:26</td>
</tr>
<tr>
<td class="int4">501772</td>
<td class="timestamp">1952-11-05 20:12:03</td>
</tr>
<tr>
<td class="int4">956539</td>
<td class="timestamp">1900-12-20 04:57:58</td>
</tr>
<tr>
<td class="int4">523679</td>
<td class="timestamp">1950-05-08 01:20:12</td>
</tr>
<tr>
<td class="int4">141308</td>
<td class="timestamp">1993-12-20 04:07:37</td>
</tr>
<tr>
<td class="int4">648220</td>
<td class="timestamp">1936-02-21 20:12:30</td>
</tr>
<tr>
<td class="int4">236980</td>
<td class="timestamp">1983-01-20 20:12:29</td>
</tr>
<tr>
<td class="int4">413051</td>
<td class="timestamp">1962-12-20 13:20:54</td>
</tr>
<tr>
<td class="int4">323566</td>
<td class="timestamp">1973-03-06 02:20:12</td>
</tr>
</table>
</div>
<p>We see the matches on <strong>day-month</strong>, <strong>hour-minute</strong> and <strong>minute-second</strong>. There are no <strong>month-year</strong> or <strong>year-hour</strong> matches, since there is no <strong>20th</strong> month and the year-hour separator is not a period.</p>
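<p>To see how the separator rule in the pattern behaves, it can be tried on a few hand-formatted strings. This is only an illustrative sketch: the three literals and the column aliases are made up for the example and do not come from the table.</p>
<pre class="brush: sql">
-- Each literal imitates a date formatted as dd.mm.yy hh24.mi.ss
SELECT  &#039;20.12.20 07.11.26&#039; ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039; AS day_month,
        &#039;10.01.39 20.12.31&#039; ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039; AS hour_minute,
        &#039;20.01.20 12.00.00&#039; ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039; AS year_hour
</pre>
<p>The first two strings should match. The third should not, because its <code>20</code> and <code>12</code> are separated by a space rather than a period.</p>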
<p>The query seems to return correct values but is quite slow.</p>
<p>To improve this query we can use <strong>PostgreSQL</strong>&#8217;s <code>GIN</code> indexing abilities.</p>
<p>A <a href="http://www.postgresql.org/docs/8.4/static/textsearch-indexes.html"><code>GIN</code> index</a> is a way to index one record with several keys.</p>
<p>A plain index is a <strong>B-Tree</strong> structure that stores the pointers to the records (the <code>ctid</code>&#8217;s) in the leaf nodes. Such an index only accepts a single expression as a key and builds a single sort order over these expressions, so each record can be pointed to at most once. There is a one-to-many mapping between keys and records: a key can point to many records, but a record can be pointed to by at most one key.</p>
<p>A <code>GIN</code> index, on the other hand, accepts an array of expressions as a parameter and uses each element of the array as a key. This way, the mapping becomes many-to-many: each key can point to many records, and each record can be pointed to by many keys.</p>
<p>Usually, <code>GIN</code> indexes are used for full-text indexing: the piece of text stored in a record is split into separate words and each word is indexed separately, so that a search for any word can be performed using the index.</p>
<p>However, in <strong>PostgreSQL</strong>, <code>GIN</code> indexes support integer arrays as well. And we can use this support to improve our query.</p>
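<p>The array operator that such an index can serve is easy to try on literals first. The sketch below uses a hand-written array of date parts (an assumption made for illustration, not a value read from the table) and checks whether every element of the left array is present in the right one:</p>
<pre class="brush: sql">
-- &lt;@ is true only if ALL elements on the left occur on the right
SELECT  ARRAY[20, 12] &lt;@ ARRAY[1920, 20, 12, 20, 7, 11, 26] AS both_present,
        ARRAY[20, 13] &lt;@ ARRAY[1920, 20, 12, 20, 7, 11, 26] AS one_missing
</pre>
<p>The first column should be true and the second false, since <code>13</code> does not occur in the right array. The position of the elements does not matter, which is why this check can only serve as a coarse filter.</p>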
<p>As many of you may have noted, I created a <code>GIN</code> index in the table creation script. Here&#8217;s how it looks:</p>
<pre class="brush: sql">
CREATE INDEX ix_dates_parts ON t_dates
USING GIN((
        ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
        ));
</pre>
<p>Each record is split into an array of <strong>8</strong> integers, each representing a portion of the date that is commonly used in date formats. <strong>6</strong> of them just represent the date parts, and there are two extra integers that represent a <strong>2</strong>-digit year and a <strong>12</strong>-hour (<strong>AM/PM</strong>) hour.</p>
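<p>This way, any record in year <strong>2010</strong> gets indexed with both <strong>2010</strong> and <strong>10</strong>, and any record with hour <strong>19</strong> is meant to get indexed with both <strong>19</strong> and <strong>7</strong>. Note that the expression as written, <code>(hour + 1) % 12 + 1</code>, maps hour <strong>19</strong> to <strong>9</strong>; the conventional <strong>12</strong>-hour value requires <code>(hour + 11) % 12 + 1</code>, changed in both the index definition and the query. The sample search in this article is not affected, since neither <strong>20</strong> nor <strong>12</strong> has to be matched as a <strong>12</strong>-hour value.</p>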
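<p>A quick way to check a <strong>12</strong>-hour mapping is to evaluate it for every hour of the day. The query below is a sketch that is not part of the original benchmark; the column aliases are illustrative. It puts the expression used in the index next to <code>(h + 11) % 12 + 1</code>:</p>
<pre class="brush: sql">
-- h runs over the 24-hour clock values 0..23
SELECT  h AS hour24,
        (h + 1) % 12 + 1 AS as_in_index,
        (h + 11) % 12 + 1 AS hour12
FROM    generate_series(0, 23) h
</pre>
<p>For <code>h = 19</code> the two expressions give <strong>9</strong> and <strong>7</strong> respectively, and for <code>h = 0</code> they give <strong>2</strong> and <strong>12</strong>.</p>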
<p>This covers most formatting options, and can be extended to include the less common ones.</p>
<p>To make use of this index we should provide an additional predicate in the <code>WHERE</code> clause. This predicate will take a user-provided array as an input and search for the records that contain <em>all</em> elements of the user-provided array in the date parts.</p>
<p>Here&#8217;s how our query looks now:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
        AND ARRAY[20, 12] &lt;@ ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
</tr>
<tr>
<td class="int8">4235</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.2366s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=3595.90..3595.91 rows=1 width=0)
  -&gt;  Bitmap Heap Scan on t_dates  (cost=104.76..3595.90 rows=1 width=0)
        Recheck Cond: (&#39;{20,12}&#39;::integer[] &lt;@ ARRAY[(date_part(&#39;year&#39;::text, date))::integer, ((date_part(&#39;year&#39;::text, date))::integer % 100), (date_part(&#39;month&#39;::text, date))::integer, (date_part(&#39;day&#39;::text, date))::integer, (date_part(&#39;hour&#39;::text, date))::integer, ((((date_part(&#39;hour&#39;::text, date))::integer + 1) % 12) + 1), (date_part(&#39;minute&#39;::text, date))::integer, (date_part(&#39;second&#39;::text, date))::integer])
        Filter: (to_char(date, &#39;dd.mm.yy hh24.mi.ss&#39;::text) ~ &#39;(^|[^\\d])20\\.12([^\\d]|$)&#39;::text)
        -&gt;  Bitmap Index Scan on ix_dates_parts  (cost=0.00..104.76 rows=1000 width=0)
              Index Cond: (&#39;{20,12}&#39;::integer[] &lt;@ ARRAY[(date_part(&#39;year&#39;::text, date))::integer, ((date_part(&#39;year&#39;::text, date))::integer % 100), (date_part(&#39;month&#39;::text, date))::integer, (date_part(&#39;day&#39;::text, date))::integer, (date_part(&#39;hour&#39;::text, date))::integer, ((((date_part(&#39;hour&#39;::text, date))::integer + 1) % 12) + 1), (date_part(&#39;minute&#39;::text, date))::integer, (date_part(&#39;second&#39;::text, date))::integer])
</pre>
<p>This returns the same records but does it about <strong>33</strong> times as fast (<strong>0.24 s</strong> against <strong>7.9 s</strong>), since the <code>GIN</code> index is used for coarse filtering and the fine-filtering operator is applied to the selected results only.</p>
<p>For the index to work, the expression we used to create the index should be provided verbatim on the right side of the <code>&lt;@</code> (is contained by) operator.</p>
<p>This solution does not cover all possible conditions: a client can use less obvious formats. With a proper design, however, this should not be a problem. The client application should just ignore the parts which are formatted in a way not suitable for the index and not put them into the array (but of course leave them in the regular expression).</p>
<p>This makes the index less selective but still usable, and the query performance will still be improved greatly.</p>
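<p>As a sketch of that approach, suppose the client formats dates as <code>dd Mon yyyy hh24.mi.ss</code> (a hypothetical format, not one used above) and the user searches for <code>20 Dec</code>. The month abbreviation is not among the indexed integers, so only the day goes into the array, while the regular expression still checks the formatted string:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
        -- fine filter: the full pattern, including the month name
WHERE   TO_CHAR(date, &#039;dd Mon yyyy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20 Dec&#039;
        -- coarse filter: only the part the index knows about
        AND ARRAY[20] &lt;@ ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
</pre>
<p>The array expression repeats the index definition verbatim. A single-element array matches many more records than <code>ARRAY[20, 12]</code> would, so more rows have to be rechecked by the regular expression.</p>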
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: Selecting records holding group-wise maximum</title>
		<link>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/</link>
		<comments>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/#comments</comments>
		<pubDate>Thu, 26 Nov 2009 20:00:25 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3777</guid>
		<description><![CDATA[Continuing the series on selecting records holding group-wise maximums: How do I select the whole records, grouped on grouper and holding a group-wise maximum (or minimum) on other column? In this article, I&#8217;ll describe several ways to do this in PostgreSQL 8.4. PostgreSQL 8.4 syntax is much richer than that of MySQL. The former can [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing the series on <a href="/2009/11/24/mysql-selecting-records-holding-group-wise-maximum-on-a-unique-column/">selecting records holding group-wise maximums</a>:</p>
<blockquote><p>How do I select the <em>whole</em> records, grouped on <code>grouper</code> and holding a group-wise maximum (or minimum) on other column?</p></blockquote>
<p>In this article, I&#8217;ll describe several ways to do this in <strong>PostgreSQL 8.4</strong>.</p>
<p><strong>PostgreSQL 8.4</strong> syntax is much richer than that of <strong>MySQL</strong>. The former can use the analytic functions, recursive <strong>CTE</strong>&#8217;s and proprietary syntax extensions, all of which can be used for this task.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-3777"></span><br />
<a href="#" onclick="xcollapse('X10125');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X10125" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_distinct (
      id INT NOT NULL PRIMARY KEY,
      orderer INT NOT NULL,
      glow INT NOT NULL,
      ghigh INT NOT NULL,
      stuffing VARCHAR(200) NOT NULL
);

CREATE INDEX ix_distinct_glow_id ON t_distinct (glow, id);
CREATE INDEX ix_distinct_ghigh_id ON t_distinct (ghigh, id);
CREATE INDEX ix_distinct_glow_orderer_id ON t_distinct (glow, orderer, id);
CREATE INDEX ix_distinct_ghigh_orderer_id ON t_distinct (ghigh, orderer, id);

SELECT  SETSEED(0.20091126);

INSERT
INTO    t_distinct (id, orderer, glow, ghigh, stuffing)
SELECT  id, FLOOR(RANDOM() * 9) + 1,
        (id - 1) % 10 + 1,
        (id - 1) % 10000 + 1,
        LPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) id;
</pre>
</div>
<p>This table has <strong>1,000,000</strong> records:</p>
<ul>
<li><code>id</code> is the <code>PRIMARY KEY</code></li>
<li><code>orderer</code> is filled with random values from <strong>1</strong> to <strong>9</strong></li>
<li><code>glow</code> is a low cardinality grouping field (<strong>10</strong> distinct values)</li>
<li><code>ghigh</code> is a high cardinality grouping field (<strong>10,000</strong> distinct values)</li>
<li><code>stuffing</code> is an asterisk-filled <code>VARCHAR(200)</code> column added to emulate payload of the actual tables</li>
</ul>
<h3>Analytic functions</h3>
<p><strong>PostgreSQL 8.4</strong> supports analytic functions (called window functions in the <strong>PostgreSQL</strong> documentation). These functions extend the aggregate abilities: they work on the groups rather than on the individual records, but return their values to each individual record instead of shrinking the set. For instance, <code>ROW_NUMBER</code> enumerates the records within the group according to the ordering condition, and <code>DENSE_RANK</code> enumerates distinct values of the ordering column (it assigns the same number to the records with the same value of the ordering column).</p>
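<p>The difference between the two functions is easiest to see on a tiny inline data set. The sketch below uses a made-up <code>VALUES</code> list with hypothetical columns <code>grp</code> and <code>val</code> rather than the sample table:</p>
<pre class="brush: sql">
SELECT  grp, val,
        -- unique sequence number within each group
        ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val) AS rn,
        -- equal values of val share the same number
        DENSE_RANK() OVER (PARTITION BY grp ORDER BY val) AS dr
FROM    (
        VALUES  (1, 10), (1, 10), (1, 20), (2, 5), (2, 7)
        ) AS v (grp, val)
</pre>
<p>In group <strong>1</strong>, the two records with <code>val = 10</code> get <code>rn</code> values <strong>1</strong> and <strong>2</strong> but the same <code>dr = 1</code>, so filtering on <code>rn = 1</code> returns one record per group while filtering on <code>dr = 1</code> returns all ties.</p>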
<p>Let&#8217;s make a query to select the records holding the group-wise minimums of <code>id</code>. Since <code>id</code> is a <code>PRIMARY KEY</code> we don&#8217;t have to worry about ties.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
SELECT  id, orderer, glow, ghigh
FROM    (
        SELECT  *, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY id) AS rn
        FROM    t_distinct
        ) q
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X4310');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4310" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">4</td>
<td class="int4">9</td>
<td class="int4">4</td>
<td class="int4">4</td>
</tr>
<tr>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">5</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">4</td>
<td class="int4">6</td>
<td class="int4">6</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">6</td>
<td class="int4">7</td>
<td class="int4">7</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">8</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">7</td>
<td class="int4">9</td>
<td class="int4">9</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">2</td>
<td class="int4">10</td>
<td class="int4">10</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0507s (11.7804s)</td>
</tr>
</table>
</div>
<pre>
Subquery Scan q  (cost=246866.84..279366.84 rows=5000 width=16)
  Filter: (q.rn = 1)
  -&gt;  WindowAgg  (cost=246866.84..266866.84 rows=1000000 width=220)
        -&gt;  Sort  (cost=246866.84..249366.84 rows=1000000 width=220)
              Sort Key: t_distinct.glow, t_distinct.id
              -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=220)
</pre>
</div>
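<p>This works, but is <em>very</em> inefficient (almost <strong>12</strong> seconds). <strong>PostgreSQL</strong> chooses to sort in this case, but it is not very good at sorting tables with large rows.</p>
<p>This can be improved by making the subquery select only <code>glow</code> and <code>id</code> (both covered by the <code>ix_distinct_glow_id</code> index) and then joining the table back on <code>id</code>. According to the plan, <strong>PostgreSQL</strong> still chooses a sequential scan and a sort, but the sorted rows are now much narrower (width <strong>8</strong> instead of <strong>220</strong>):</p>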
<pre class="brush: sql">
SELECT  di.id, di.orderer, di.glow, di.ghigh
FROM    (
        SELECT  id, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY id) AS rn
        FROM    t_distinct d
        ) dd
JOIN    t_distinct di
ON      di.id = dd.id
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X4873');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4873" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">4</td>
<td class="int4">9</td>
<td class="int4">4</td>
<td class="int4">4</td>
</tr>
<tr>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">5</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">4</td>
<td class="int4">6</td>
<td class="int4">6</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">6</td>
<td class="int4">7</td>
<td class="int4">7</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">8</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">7</td>
<td class="int4">9</td>
<td class="int4">9</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">2</td>
<td class="int4">10</td>
<td class="int4">10</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.9997s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=140907.84..205920.84 rows=5000 width=16)
  -&gt;  Subquery Scan dd  (cost=140907.84..173407.84 rows=5000 width=4)
        Filter: (dd.rn = 1)
        -&gt;  WindowAgg  (cost=140907.84..160907.84 rows=1000000 width=8)
              -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=8)
                    Sort Key: d.glow, d.id
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=8)
  -&gt;  Index Scan using t_distinct_pkey on t_distinct di  (cost=0.00..6.49 rows=1 width=16)
        Index Cond: (di.id = dd.id)
</pre>
</div>
<p>This is much faster (<strong>4 s</strong>) but there is still much room for improvement.</p>
<p>If we wish to order by <code>orderer</code> we need to define a method to resolve ties.</p>
<p>Using the same approach we can return all records with ties:</p>
<pre class="brush: sql">
SELECT  COUNT(*), SUM(id)
FROM    (
        SELECT  *, DENSE_RANK() OVER (PARTITION BY glow ORDER BY orderer) AS dr
        FROM    t_distinct d
        ) dd
WHERE   dr = 1
</pre>
<p><a href="#" onclick="xcollapse('X2727');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X2727" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
<th>sum</th>
</tr>
<tr>
<td class="int8">111058</td>
<td class="int8">55543096995</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0078s (16.9997s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=279391.85..279391.86 rows=1 width=4)
  -&gt;  Subquery Scan dd  (cost=246866.84..279366.84 rows=5000 width=4)
        Filter: (dd.dr = 1)
        -&gt;  WindowAgg  (cost=246866.84..266866.84 rows=1000000 width=220)
              -&gt;  Sort  (cost=246866.84..249366.84 rows=1000000 width=220)
                    Sort Key: d.glow, d.orderer
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=220)
</pre>
</div>
<p>Or we can resolve ties by returning the record with the maximum <code>id</code> among those holding the minimum value of the <code>orderer</code>:</p>
<pre class="brush: sql">
SELECT  di.id, di.orderer, di.glow, di.ghigh
FROM    (
        SELECT  id, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY orderer, id DESC) AS rn
        FROM    t_distinct d
        ) dd
JOIN    t_distinct di
ON      di.id = dd.id
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X7172');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X7172" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (4.8593s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=140907.84..208420.84 rows=5000 width=16)
  -&gt;  Subquery Scan dd  (cost=140907.84..175907.84 rows=5000 width=4)
        Filter: (dd.rn = 1)
        -&gt;  WindowAgg  (cost=140907.84..163407.84 rows=1000000 width=12)
              -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=12)
                    Sort Key: d.glow, d.orderer, d.id
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=12)
  -&gt;  Index Scan using t_distinct_pkey on t_distinct di  (cost=0.00..6.49 rows=1 width=16)
        Index Cond: (di.id = dd.id)
</pre>
</div>
<p>As you can see, all these queries are elegant but rather inefficient.</p>
<h3>Using DISTINCT ON</h3>
<p><strong>PostgreSQL</strong> implements another way to return the whole records holding group-wise maximums or minimums.</p>
<p>By using a special clause, <a href="http://www.postgresql.org/docs/8.4/interactive/queries-select-lists.html"><code>DISTINCT ON</code></a>, we can return records holding only the distinct values of certain columns. For this to work correctly, one needs to define an <code>ORDER BY</code> condition in addition to <code>DISTINCT ON</code>, with the leading expressions being the same as those used in <code>DISTINCT ON</code>. This guarantees that all the records belonging to each group would be consecutive if not for the <code>DISTINCT ON</code> clause.</p>
<p><code>DISTINCT ON</code> is applied after the <code>ORDER BY</code> condition. It just returns the first record from each group, skipping the others. This is very easy to do: return a record if the grouping expression changed from the previous row; don&#8217;t return if it didn&#8217;t.</p>
<p>This is quite similar to <strong>MySQL</strong>&#8216;s extension for <code>GROUP BY</code>, but, unlike <strong>MySQL</strong>, this solution guarantees correct order and the fact that all values returned will be taken from a single record.</p>
<p>This query cannot be used to return all records with ties (since the values of the grouping column won&#8217;t be distinct), but it will work if the ties are impossible (as in selecting a maximum <code>id</code>), or if a correct condition for resolving ties is provided.</p>
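<p>Before running it against the sample table, the behavior can be sketched on a small inline data set (a made-up <code>VALUES</code> list with hypothetical columns <code>grp</code> and <code>val</code>):</p>
<pre class="brush: sql">
-- ORDER BY starts with the DISTINCT ON expression;
-- the rest of the ordering decides which record of each group comes first
SELECT  DISTINCT ON (grp) grp, val
FROM    (
        VALUES  (1, 10), (1, 30), (2, 5), (2, 7)
        ) AS v (grp, val)
ORDER BY
        grp, val DESC
</pre>
<p>This should return one record per group, <code>(1, 30)</code> and <code>(2, 7)</code>, that is the record with the greatest <code>val</code> in each <code>grp</code>.</p>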
<p>Here&#8217;s the query to return records holding <code>MAX(id)</code>:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (glow) id, orderer, glow, ghigh
FROM    t_distinct
ORDER BY
        glow, id DESC
</pre>
<p><a href="#" onclick="xcollapse('X9913');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X9913" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999991</td>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">9991</td>
</tr>
<tr>
<td class="int4">999992</td>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">9992</td>
</tr>
<tr>
<td class="int4">999993</td>
<td class="int4">6</td>
<td class="int4">3</td>
<td class="int4">9993</td>
</tr>
<tr>
<td class="int4">999994</td>
<td class="int4">5</td>
<td class="int4">4</td>
<td class="int4">9994</td>
</tr>
<tr>
<td class="int4">999995</td>
<td class="int4">4</td>
<td class="int4">5</td>
<td class="int4">9995</td>
</tr>
<tr>
<td class="int4">999996</td>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">9996</td>
</tr>
<tr>
<td class="int4">999997</td>
<td class="int4">3</td>
<td class="int4">7</td>
<td class="int4">9997</td>
</tr>
<tr>
<td class="int4">999998</td>
<td class="int4">2</td>
<td class="int4">8</td>
<td class="int4">9998</td>
</tr>
<tr>
<td class="int4">999999</td>
<td class="int4">8</td>
<td class="int4">9</td>
<td class="int4">9999</td>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">6</td>
<td class="int4">10</td>
<td class="int4">10000</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.3593s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=140907.84..145907.84 rows=10 width=16)
  -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=16)
        Sort Key: glow, id
        -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=16)
</pre>
</div>
<p>And here&#8217;s the one to return the <code>MAX(id)</code> within the <code>MIN(orderer)</code>:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (glow) id, orderer, glow, ghigh
FROM    t_distinct
ORDER BY
        glow, orderer, id DESC
</pre>
<p><a href="#" onclick="xcollapse('X9326');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X9326" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.9530s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=140907.84..145907.84 rows=10 width=16)
  -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=16)
        Sort Key: glow, orderer, id
        -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=16)
</pre>
</div>
<p>Both queries are somewhat more efficient than their window function counterparts, but still take almost <strong>4 seconds</strong>. This is almost <strong>40 times</strong> as much as the same queries in <strong>MySQL</strong>, even without any improvements.</p>
<p>Unlike <strong>MySQL</strong>, <strong>PostgreSQL</strong> does not implement a loose index scan, which would allow jumping over the distinct index records. However, it can be emulated using recursive <strong>CTE</strong>&#8217;s.</p>
<h3>Recursive CTE&#8217;s to emulate loose index scan</h3>
<p>The main idea here is simple:</p>
<ul>
<li>In the anchor part of the <strong>CTE</strong> take the lowest value of the key</li>
<li>In the recursive part of the <strong>CTE</strong> take the next value of the key by using <code>&gt;</code> or <code>&lt;</code> operators along with the <code>ORDER BY</code> and <code>LIMIT 1</code></li>
</ul>
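<p>In its barest form, the two steps above can be sketched as a query that only lists the distinct values of <code>glow</code>. This is a simplified illustration rather than one of the benchmarked queries, and the <strong>CTE</strong> name <code>keys</code> is arbitrary:</p>
<pre class="brush: sql">
WITH    RECURSIVE keys AS
        (
        -- anchor part: the lowest value of the key
        SELECT  (
                SELECT  glow
                FROM    t_distinct
                ORDER BY
                        glow
                LIMIT 1
                ) AS glow
        UNION ALL
        -- recursive part: the next value of the key after the previous one
        SELECT  (
                SELECT  glow
                FROM    t_distinct
                WHERE   glow &gt; k.glow
                ORDER BY
                        glow
                LIMIT 1
                )
        FROM    keys k
        WHERE   k.glow IS NOT NULL
        )
SELECT  glow
FROM    keys
WHERE   glow IS NOT NULL
</pre>
<p>When no greater value is left, the scalar subquery yields a <code>NULL</code>, which stops the recursion on the next step and is filtered out by the outer query.</p>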
<p><strong>PostgreSQL</strong>&#8217;s syntax allows compacting a whole table record into a single field (which can be exploded later). This will allow us to avoid joins by placing the whole recursive part into a subquery which will use the index efficiently.</p>
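<p>A minimal sketch of that syntax, assuming the sample table above: the table alias <code>d</code> used as a column returns the whole record as one composite value, and its fields are later extracted with the parenthesized form.</p>
<pre class="brush: sql">
SELECT  (q.d).id, (q.d).glow
FROM    (
        -- d is a single column holding the entire t_distinct record
        SELECT  d
        FROM    t_distinct d
        WHERE   d.id = 1
        ) q
</pre>
<p>The parentheses around <code>q.d</code> are required so that the name is parsed as a composite column rather than as a schema-qualified table.</p>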
<p>Here&#8217;s the query to return records holding group-wise <code>MAX(id)</code>:</p>
<pre class="brush: sql">
WITH    RECURSIVE rows AS
        (
        SELECT  d
        FROM    (
                SELECT  d
                FROM    t_distinct d
                ORDER BY
                        glow DESC, id DESC
                LIMIT 1
                ) q
        UNION ALL
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow &lt; (r.d).glow
                ORDER BY
                        di.glow DESC, di.id DESC
                LIMIT 1
                )
        FROM    rows r
        WHERE   d IS NOT NULL
        )
SELECT  (d).id, (d).orderer, (d).glow, (d).ghigh
FROM    rows
WHERE   d IS NOT NULL
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">6</td>
<td class="int4">10</td>
<td class="int4">10000</td>
</tr>
<tr>
<td class="int4">999999</td>
<td class="int4">8</td>
<td class="int4">9</td>
<td class="int4">9999</td>
</tr>
<tr>
<td class="int4">999998</td>
<td class="int4">2</td>
<td class="int4">8</td>
<td class="int4">9998</td>
</tr>
<tr>
<td class="int4">999997</td>
<td class="int4">3</td>
<td class="int4">7</td>
<td class="int4">9997</td>
</tr>
<tr>
<td class="int4">999996</td>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">9996</td>
</tr>
<tr>
<td class="int4">999995</td>
<td class="int4">4</td>
<td class="int4">5</td>
<td class="int4">9995</td>
</tr>
<tr>
<td class="int4">999994</td>
<td class="int4">5</td>
<td class="int4">4</td>
<td class="int4">9994</td>
</tr>
<tr>
<td class="int4">999993</td>
<td class="int4">6</td>
<td class="int4">3</td>
<td class="int4">9993</td>
</tr>
<tr>
<td class="int4">999992</td>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">9992</td>
</tr>
<tr>
<td class="int4">999991</td>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">9991</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0005s (0.0049s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=232.94..234.96 rows=100 width=32)
  Filter: (d IS NOT NULL)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..232.94 rows=101 width=32)
          -&gt;  Subquery Scan q  (cost=0.00..2.24 rows=1 width=32)
                -&gt;  Limit  (cost=0.00..2.23 rows=1 width=40)
                      -&gt;  Index Scan Backward using ix_distinct_glow_id on t_distinct d  (cost=0.00..2231423.39 rows=1000000 width=40)
          -&gt;  WorkTable Scan on rows r  (cost=0.00..22.87 rows=10 width=32)
                Filter: (r.d IS NOT NULL)
                SubPlan 1
                  -&gt;  Limit  (cost=0.00..2.27 rows=1 width=40)
                        -&gt;  Index Scan Backward using ix_distinct_glow_id on t_distinct di  (cost=0.00..755578.45 rows=333333 width=40)
                              Index Cond: (glow &lt; ($1).glow)
</pre>
<p>As you can see, this query takes only <strong>5 ms</strong>, next to instant. This is because on each iteration step, the whole record can be returned in a single index seek for the greatest value of the key which is less than the previous value.</p>
<p>If we wanted to resolve the ties with more complex conditions, the query would become a little more complex too.</p>
<p>Let&#8217;s consider the query to resolve ties by selecting <code>MAX(id)</code> within the <code>MIN(orderer)</code>, just like in the previous example.</p>
<p>The indexes we created order all columns in the same direction: <code>(glow ASC, orderer ASC, id ASC)</code>. Of course, the whole index could be used as well if <em>all</em> directions were reversed: <code>(glow DESC, orderer DESC, id DESC)</code>.</p>
<p>However, if only some of the directions are reversed, like in <code>(orderer DESC, id ASC)</code> (which is what we need here), the index can no longer be used for ordering.</p>
<p>The same problem was mentioned in one of the previous articles on <a href="http://explainextended.com/2009/11/24/mysql-selecting-records-holding-group-wise-maximum-on-a-unique-column/">selecting records holding group-wise maximums in <strong>MySQL</strong></a>. This is the reason <code>MAX(id)</code> is less efficient than <code>MIN(id)</code> with a loose index scan (which is described in more detail in that article). However, <strong>MySQL</strong> deals with it automatically, while here we need to implement it by hand.</p>
<p>To do this, we need to use the same trick as we did in <strong>MySQL</strong>: select the <code>MIN(orderer)</code> and the <code>MAX(id)</code> within this <code>orderer</code> in two different queries, which use two different index seeks, each in the appropriate direction.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE groups AS
        (
        SELECT  d
        FROM    (
                SELECT  d
                FROM    t_distinct d
                ORDER BY
                        glow, orderer
                LIMIT 1
                ) q
        UNION ALL
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow &gt; (g.d).glow
                ORDER BY
                        di.glow, di.orderer
                LIMIT 1
                )
        FROM    groups g
        WHERE   d IS NOT NULL
        ),
        rows AS
        (
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow = (g.d).glow
                        AND di.orderer = (g.d).orderer
                ORDER BY
                        id DESC
                LIMIT 1
                ) di
        FROM    groups g
        WHERE   d IS NOT NULL
        )
SELECT  (di).id, (di).orderer, (di).glow, (di).ghigh
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0005s (0.0058s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=588.47..590.47 rows=100 width=32)
  CTE groups
    -&gt;  Recursive Union  (cost=0.00..243.79 rows=101 width=32)
          -&gt;  Subquery Scan q  (cost=0.00..2.35 rows=1 width=32)
                -&gt;  Limit  (cost=0.00..2.34 rows=1 width=40)
                      -&gt;  Index Scan using ix_distinct_glow_orderer_id on t_distinct d  (cost=0.00..2343052.63 rows=1000000 width=40)
          -&gt;  WorkTable Scan on groups g  (cost=0.00..23.94 rows=10 width=32)
                Filter: (g.d IS NOT NULL)
                SubPlan 1
                  -&gt;  Limit  (cost=0.00..2.37 rows=1 width=40)
                        -&gt;  Index Scan using ix_distinct_glow_orderer_id on t_distinct di  (cost=0.00..791389.47 rows=333333 width=40)
                              Index Cond: (glow &gt; ($1).glow)
  CTE rows
    -&gt;  CTE Scan on groups g  (cost=0.00..344.68 rows=100 width=32)
          Filter: (d IS NOT NULL)
          SubPlan 3
            -&gt;  Limit  (cost=0.00..3.43 rows=1 width=36)
                  -&gt;  Index Scan Backward using ix_distinct_glow_orderer_id on t_distinct di  (cost=0.00..38073.13 rows=11111 width=36)
                        Index Cond: ((glow = ($3).glow) AND (orderer = ($3).orderer))
</pre>
<p>We see both <code>Index Scan</code> and <code>Index Scan Backward</code> in the plan above. The first one finds the <code>MIN(orderer)</code>, the second one finds the <code>MAX(id)</code> within the previously found value of the <code>orderer</code>.</p>
<p>Note that unlike <strong>MySQL</strong>, in <strong>PostgreSQL</strong> it&#8217;s enough to use just a single <code>ORDER BY id DESC</code> condition in the subquery which selects the top <code>id</code> within the records with the lowest <code>orderer</code>. <strong>PostgreSQL</strong>&#8217;s optimizer is smart enough to pick the correct index (that is, the index on <code>(glow, orderer, id)</code>) to serve this query.</p>
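<p>To check this in isolation, the inner subquery can be run with constants in place of the outer references. This is just a sketch: the values <code>1</code> and <code>1</code> are placeholders taken from the first row of the resultset above.</p>
<pre class="brush: sql">
-- The inner subquery of the rows CTE, with the outer references
-- replaced by constants (assumed example values).
EXPLAIN
SELECT  *
FROM    t_distinct
WHERE   glow = 1
        AND orderer = 1
ORDER BY
        id DESC
LIMIT 1
</pre>
<p>The plan should resemble <code>SubPlan 3</code> above: both equality conditions go into the index condition, and the backward scan provides the <code>id DESC</code> order.</p>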
<p>This query is also nearly instant: it takes about <strong>6 ms</strong>.</p>
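<p>If adding another index is acceptable, a different approach is possible, since <code>CREATE INDEX</code> in <strong>PostgreSQL</strong> accepts a sort direction for each column. The sketch below was not measured for this article, and the index name is hypothetical. With such an index, a single seek per group could return the row with the lowest <code>orderer</code> and the highest <code>id</code> within it:</p>
<pre class="brush: sql">
-- Hypothetical index: same columns as before, but id is stored in descending order.
CREATE INDEX ix_distinct_glow_orderer_id_desc ON t_distinct (glow, orderer, id DESC);

WITH    RECURSIVE rows AS
        (
        -- Anchor: the first index entry is MIN(glow), MIN(orderer), MAX(id).
        SELECT  d
        FROM    (
                SELECT  d
                FROM    t_distinct d
                ORDER BY
                        glow, orderer, id DESC
                LIMIT 1
                ) q
        UNION ALL
        -- Recursive step: jump to the first entry of the next glow.
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow &gt; (r.d).glow
                ORDER BY
                        di.glow, di.orderer, di.id DESC
                LIMIT 1
                )
        FROM    rows r
        WHERE   r.d IS NOT NULL
        )
SELECT  (d).id, (d).orderer, (d).glow, (d).ghigh
FROM    rows
WHERE   d IS NOT NULL
</pre>
<p>The price is an extra index to maintain on every write, which is why the two-seek query above may still be preferable.</p>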
<h4>Summary</h4>
<p>Unlike <strong>MySQL</strong>, <strong>PostgreSQL</strong> implements several clean and documented ways to select the records holding group-wise maximums, including window functions and <code>DISTINCT ON</code>.</p>
<p>However, due to the lack of loose index scan support in <strong>PostgreSQL</strong>&#8217;s optimizer and its less efficient usage of indexes, the queries using these features take too long.</p>
<p>To work around these problems and speed up the queries that group on low-cardinality columns, the solution described in this article can be used.</p>
<p>This solution uses recursive <strong>CTE</strong>&#8217;s to emulate a loose index scan and is very efficient if the grouping columns have low cardinality.</p>
<p><strong>To be continued.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recursive CTE&#8217;s: PostgreSQL</title>
		<link>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/</link>
		<comments>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 20:00:24 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3721</guid>
		<description><![CDATA[In the previous article on recursive CTE&#8216;s in SQL Server I demonstrated that they are not really set-based. SQL Server implements the recursive CTE&#8216;s syntax, but forbids all operations that do not distribute over UNION ALL, and each recursive step sees only a single record from the previous step. Now, let&#8217;s check the same operations [...]]]></description>
			<content:encoded><![CDATA[<p>In the previous article on <a href="/2009/11/18/sql-server-are-the-recursive-ctes-really-set-based/">recursive <strong>CTE</strong>&#8216;s in <strong>SQL Server</strong></a> I demonstrated that they are not really set-based.</p>
<p><strong>SQL Server</strong> implements the recursive <strong>CTE</strong>&#8217;s syntax, but forbids all operations that do not distribute over <code>UNION ALL</code>, and each recursive step sees only a single record from the previous step.</p>
<p>Now, let&#8217;s check the same operations in <strong>PostgreSQL 8.4</strong>.</p>
<p>To do this, we will write a query that selects only the very first branch of a tree: that is, each item would be the first child of its parent. To do this, we should select the item that is the first child of the root, then select the first child of that item, and so on.</p>
<p>This is a set-based operation.</p>
<p><strong>Oracle</strong>&#8217;s <code>CONNECT BY</code> syntax, despite not being set-based, offers some limited set-based capabilities: you can use the <code>ORDER SIBLINGS BY</code> clause to define the order in which the siblings are returned. However, this would require some additional work to efficiently return only the first branch.</p>
<p>In a true set-based system, this is much simpler.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-3721"></span></p>
<pre class="brush: sql">
CREATE TABLE t_recursive (
        id INT NOT NULL PRIMARY KEY,
        parent INT NOT NULL,
        orderer INT NOT NULL,
        data VARCHAR(100) NOT NULL
        );

CREATE INDEX ix_recursive_parent_orderer ON t_recursive (parent, orderer);

SELECT  SETSEED(0.20091123);

INSERT
INTO    t_recursive
SELECT  s, (s - 1) / 5, FLOOR(RANDOM() * 10000), &#039;Item &#039; || s
FROM    generate_series(1, 1000000) s;
</pre>
<p>This table contains <strong>1,000,000</strong> records and implements an <a href="/2009/09/24/adjacency-list-vs-nested-sets-postgresql/">adjacency tree hierarchy</a>.</p>
<p>Each item has at most <strong>5</strong> children and a randomly filled column, <code>orderer</code>, which defines its order among its siblings.</p>
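<p>The fan-out can be verified with a quick sanity check. This sketch only assumes the table was filled by the script above; the column aliases are illustrative:</p>
<pre class="brush: sql">
-- Count the children of every parent, then take the extremes.
-- Only parents that actually have children show up in the inner query.
SELECT  MIN(cnt) AS min_children, MAX(cnt) AS max_children
FROM    (
        SELECT  parent, COUNT(*) AS cnt
        FROM    t_recursive
        GROUP BY
                parent
        ) q
</pre>
<p>Since the integer division in <code>(s - 1) / 5</code> assigns ids <strong>1</strong> to <strong>5</strong> to parent <strong>0</strong>, ids <strong>6</strong> to <strong>10</strong> to parent <strong>1</strong> and so on, both values should come out as <strong>5</strong>.</p>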
<p>Now, let&#8217;s try to make a query that would select the first branch, the order being defined by the value provided in <code>orderer</code>.</p>
<p>To do this, we should make the anchor step return the first child of the root item, and the recursive steps return the first child of the previously returned item.</p>
<p>To return the first child of a given parent in the <code>orderer</code> order, we can use <strong>PostgreSQL</strong>&#8217;s <code>DISTINCT ON</code> functionality.</p>
<p>This works like <code>DISTINCT</code>, but can return the whole row rather than just the column <code>DISTINCT</code> is being applied to.</p>
<p>If two rows share the value of the column <code>DISTINCT ON</code> is being applied to, only one of the rows will be returned. Which row it will be is defined by the <code>ORDER BY</code> clause.</p>
<p>This solves problems like <q>return a single row that holds group-wise maximum</q>.</p>
<p>This query:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (parent) *
FROM    t_recursive
ORDER BY
        parent, orderer
</pre>
<p>is the same as this one:</p>
<pre class="brush: sql">
SELECT  (q.r).*
FROM    (
        SELECT  r, ROW_NUMBER() OVER (PARTITION BY parent ORDER BY orderer) AS rn
        FROM    t_recursive r
        ) q
WHERE   rn = 1
</pre>
<p>, but more legible and in some cases more efficient.</p>
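<p>Since the <code>ORDER BY</code> clause alone decides which row survives, reversing the sort within the group turns the same query into a group-wise maximum. A minimal sketch (the trailing <code>id</code> is an assumed tie-breaker, added only to make the choice deterministic):</p>
<pre class="brush: sql">
-- For each parent, keep the child with the highest orderer;
-- ties on orderer are resolved by the lowest id.
SELECT  DISTINCT ON (parent) *
FROM    t_recursive
ORDER BY
        parent, orderer DESC, id
</pre>
<p>Note that the leftmost <code>ORDER BY</code> expressions have to match the <code>DISTINCT ON</code> expressions.</p>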
<p>To apply this clause to the task we need, we should just use <code>DISTINCT ON (parent)</code> in both anchor and recursive parts:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (parent) *
                FROM    t_recursive
                WHERE   parent = 0
                ORDER BY
                        parent, orderer
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (c.parent) c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.parent, c.orderer
                ) q2
        )
SELECT  *
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">0</td>
<td class="int4">1686</td>
<td class="varchar">Item 3</td>
</tr>
<tr>
<td class="int4">19</td>
<td class="int4">3</td>
<td class="int4">3370</td>
<td class="varchar">Item 19</td>
</tr>
<tr>
<td class="int4">98</td>
<td class="int4">19</td>
<td class="int4">42</td>
<td class="varchar">Item 98</td>
</tr>
<tr>
<td class="int4">492</td>
<td class="int4">98</td>
<td class="int4">1762</td>
<td class="varchar">Item 492</td>
</tr>
<tr>
<td class="int4">2464</td>
<td class="int4">492</td>
<td class="int4">2295</td>
<td class="varchar">Item 2464</td>
</tr>
<tr>
<td class="int4">12322</td>
<td class="int4">2464</td>
<td class="int4">2050</td>
<td class="varchar">Item 12322</td>
</tr>
<tr>
<td class="int4">61614</td>
<td class="int4">12322</td>
<td class="int4">768</td>
<td class="varchar">Item 61614</td>
</tr>
<tr>
<td class="int4">308074</td>
<td class="int4">61614</td>
<td class="int4">1925</td>
<td class="varchar">Item 308074</td>
</tr>
<tr class="statusbar">
<td colspan="100">8 rows fetched in 0.0004s (0.0042s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=3188.19..3207.83 rows=982 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..3188.19 rows=982 width=230)
          -&gt;  Unique  (cost=0.00..15.46 rows=2 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Unique  (cost=313.84..314.33 rows=98 width=23)
                -&gt;  Sort  (cost=313.84..314.08 rows=98 width=23)
                      Sort Key: c.parent, c.orderer
                      -&gt;  Nested Loop  (cost=0.00..310.60 rows=98 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.40 rows=20 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>This gives us a single branch containing just the records we need: each one being the first child of its parent.</p>
<p>Now, what if we wanted to return a list of records, each being the <em>second</em> child to its parent?</p>
<p>Since each recursive step takes only one record as an input (and returns one record as an output), we can just replace <code>DISTINCT ON</code> (which returns the first child of each group) with an <code>OFFSET 1 LIMIT 1</code>:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_recursive
                WHERE   parent = 0
                ORDER BY
                        orderer
                OFFSET 1 LIMIT 1
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.orderer
                OFFSET 1 LIMIT 1
                ) q2
        )
SELECT  *
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">0</td>
<td class="int4">2540</td>
<td class="varchar">Item 1</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">1</td>
<td class="int4">3405</td>
<td class="varchar">Item 6</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">6</td>
<td class="int4">2884</td>
<td class="varchar">Item 33</td>
</tr>
<tr>
<td class="int4">166</td>
<td class="int4">33</td>
<td class="int4">3084</td>
<td class="varchar">Item 166</td>
</tr>
<tr>
<td class="int4">833</td>
<td class="int4">166</td>
<td class="int4">1848</td>
<td class="varchar">Item 833</td>
</tr>
<tr>
<td class="int4">4169</td>
<td class="int4">833</td>
<td class="int4">993</td>
<td class="varchar">Item 4169</td>
</tr>
<tr>
<td class="int4">20850</td>
<td class="int4">4169</td>
<td class="int4">3126</td>
<td class="varchar">Item 20850</td>
</tr>
<tr>
<td class="int4">104251</td>
<td class="int4">20850</td>
<td class="int4">3021</td>
<td class="varchar">Item 104251</td>
</tr>
<tr>
<td class="int4">521256</td>
<td class="int4">104251</td>
<td class="int4">5492</td>
<td class="varchar">Item 521256</td>
</tr>
<tr class="statusbar">
<td colspan="100">9 rows fetched in 0.0005s (0.0043s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=1564.44..1564.66 rows=11 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=3.09..1564.44 rows=11 width=230)
          -&gt;  Limit  (cost=3.09..6.18 rows=1 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Limit  (cost=155.79..155.79 rows=1 width=23)
                -&gt;  Sort  (cost=155.79..155.91 rows=49 width=23)
                      Sort Key: c.orderer
                      -&gt;  Nested Loop  (cost=0.00..155.30 rows=49 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.20 rows=10 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>This query returns a whole branch of the items that are second children to their parents.</p>
<p>Both of these queries can be run in <strong>SQL Server 2005</strong>: despite the fact that they do not distribute over <code>UNION ALL</code>, they can be rewritten using <code>ROW_NUMBER()</code>.</p>
<p>As was shown in the <a href="/2009/11/18/sql-server-are-the-recursive-ctes-really-set-based/">previous article</a>, for some strange reason, <strong>SQL Server 2005</strong> does not forbid this clause, but rather implements it incorrectly. This, however, allows writing the query we&#8217;re after.</p>
<p>Now, let&#8217;s make a query that would recursively return the first <em>two children</em>.</p>
<p>This requires a set-based solution, since the first two children can come from different parents.</p>
<p>A grandchild that is third to its grandfather won&#8217;t count, even if it was a first child to the first child of the grandfather.</p>
<p>A first child to the third child of the grandfather won&#8217;t count either, even if it was the first grandchild to its grandfather.</p>
<p>To do this right, on each step the query should accept <strong>2</strong> items, return <strong>2</strong> items and process the items accepted <em>at once</em>.</p>
<p>This is also easy in a true set-based recursive <strong>CTE</strong>:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_recursive r
                WHERE   parent = 0
                ORDER BY
                        orderer
                LIMIT 2
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.orderer
                LIMIT 2
                ) q
        )
SELECT  *
FROM    rows r
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">0</td>
<td class="int4">1686</td>
<td class="varchar">Item 3</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">0</td>
<td class="int4">2540</td>
<td class="varchar">Item 1</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">2181</td>
<td class="varchar">Item 8</td>
</tr>
<tr>
<td class="int4">19</td>
<td class="int4">3</td>
<td class="int4">3370</td>
<td class="varchar">Item 19</td>
</tr>
<tr>
<td class="int4">98</td>
<td class="int4">19</td>
<td class="int4">42</td>
<td class="varchar">Item 98</td>
</tr>
<tr>
<td class="int4">99</td>
<td class="int4">19</td>
<td class="int4">1351</td>
<td class="varchar">Item 99</td>
</tr>
<tr>
<td class="int4">497</td>
<td class="int4">99</td>
<td class="int4">1245</td>
<td class="varchar">Item 497</td>
</tr>
<tr>
<td class="int4">496</td>
<td class="int4">99</td>
<td class="int4">1255</td>
<td class="varchar">Item 496</td>
</tr>
<tr>
<td class="int4">2486</td>
<td class="int4">497</td>
<td class="int4">205</td>
<td class="varchar">Item 2486</td>
</tr>
<tr>
<td class="int4">2484</td>
<td class="int4">496</td>
<td class="int4">362</td>
<td class="varchar">Item 2484</td>
</tr>
<tr>
<td class="int4">12431</td>
<td class="int4">2486</td>
<td class="int4">29</td>
<td class="varchar">Item 12431</td>
</tr>
<tr>
<td class="int4">12423</td>
<td class="int4">2484</td>
<td class="int4">311</td>
<td class="varchar">Item 12423</td>
</tr>
<tr>
<td class="int4">62119</td>
<td class="int4">12423</td>
<td class="int4">1113</td>
<td class="varchar">Item 62119</td>
</tr>
<tr>
<td class="int4">62120</td>
<td class="int4">12423</td>
<td class="int4">1121</td>
<td class="varchar">Item 62120</td>
</tr>
<tr>
<td class="int4">310602</td>
<td class="int4">62120</td>
<td class="int4">341</td>
<td class="varchar">Item 310602</td>
</tr>
<tr>
<td class="int4">310605</td>
<td class="int4">62120</td>
<td class="int4">468</td>
<td class="varchar">Item 310605</td>
</tr>
<tr class="statusbar">
<td colspan="100">16 rows fetched in 0.0007s (0.0043s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows r  (cost=3122.65..3123.09 rows=22 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..3122.65 rows=22 width=230)
          -&gt;  Limit  (cost=0.00..6.18 rows=2 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive r  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Limit  (cost=311.58..311.58 rows=2 width=23)
                -&gt;  Sort  (cost=311.58..311.82 rows=98 width=23)
                      Sort Key: c.orderer
                      -&gt;  Nested Loop  (cost=0.00..310.60 rows=98 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.40 rows=20 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>On each step, this query takes all children of the rows in the previous set and returns the first two of them.</p>
<p>We see that <strong>PostgreSQL</strong>, unlike <strong>SQL Server</strong>, implements the recursive <strong>CTE</strong>&#8217;s in a truly set-based way.</p>
<p>The recursive part of the query can accept a set on input, return a set on output and do the set-based operations on the set received.</p>
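<p>To make the sets passed between the steps visible, the last query can be extended with a step counter. This is only a sketch: the <code>depth</code> column is an addition that is not part of the original query.</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                -- Anchor: the first two children of the root form the set for step 1.
                SELECT  r.*, 1 AS depth
                FROM    t_recursive r
                WHERE   parent = 0
                ORDER BY
                        orderer
                LIMIT 2
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                -- Recursive step: all children of the previous set, of which the first two are kept.
                SELECT  c.*, r.depth + 1
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.orderer
                LIMIT 2
                ) q
        )
SELECT  depth, id, parent, orderer
FROM    rows
ORDER BY
        depth, orderer
</pre>
<p>Every <code>depth</code> value should appear on at most two rows, and those two rows may have different parents.</p>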
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shuffling rows: PostgreSQL</title>
		<link>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/</link>
		<comments>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 19:00:08 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3342</guid>
		<description><![CDATA[Answering questions asked on the site. Josh asks: I am building a music application and need to create a playlist of arbitrary length from the tracks stored in the database. This playlist should be shuffled and a track can repeat only after at least 10 other tracks had been played. Is it possible to do [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Josh</strong> asks:</p>
<blockquote><p>
I am building a music application and need to create a playlist of arbitrary length from the tracks stored in the database.</p>
<p>This playlist should be shuffled and a track can repeat only after at least <strong>10</strong> other tracks had been played.</p>
<p>Is it possible to do this with a single <strong>SQL</strong> query or I need to create a cursor?</p>
<p>This is in <strong>PostgreSQL 8.4</strong>
</p></blockquote>
<p><strong>PostgreSQL 8.4</strong> is a wise choice, since it introduces some new features that ease this task.</p>
<p>To do this we just need to keep a running set that would hold the previous <strong>10</strong> tracks so that we could filter on them.</p>
<p><strong>PostgreSQL 8.4</strong> supports recursive <strong>CTE</strong>&#8217;s, which allow iterating over resultsets, and arrays, which can easily be used to keep the set of the <strong>10</strong> latest tracks.</p>
<p>Here&#8217;s what we should do to build the playlist:</p>
<ol>
<li>We make a recursive <strong>CTE</strong> that would generate as many records as we need and just use <strong>LIMIT</strong> to limit the number</li>
<li>The base part of the <strong>CTE</strong> is just a random record (fetched with <code>ORDER BY RANDOM() LIMIT 1</code>)</li>
<li>The base part also defines the <strong>queue</strong>. This is an array which holds the <strong>10</strong> latest records selected. It is initialized in the base part with the <code>id</code> of the random track just selected</li>
<li>The recursive part of the <strong>CTE</strong> joins the previous record with the table, making sure that no record from the latest <strong>10</strong> will be selected on this step. To do this, we just use the array operator <code>&lt;@</code> (<em>contained by</em>)</li>
<li>The recursive part adds the newly selected record to the queue. The queue should be no more than <strong>10</strong> records long, which is why we apply the array slicing operator to it (<code>[1:10]</code>)</li>
</ol>
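<p>The two array operations from the last two steps can be tried separately on literal values. This is just an illustration: the arrays and the slice length of <strong>3</strong> are made up for brevity.</p>
<pre class="brush: sql">
-- already_queued: is track 3 contained in the queue {1,2,3}?
-- new_queue: prepend track 4 to the queue and keep only the first 3 elements.
SELECT  ARRAY[3] &lt;@ ARRAY[1, 2, 3] AS already_queued,
        (4 || ARRAY[1, 2, 3])[1:3] AS new_queue
</pre>
<p>This returns <code>t</code> and <code>{4,1,2}</code>: the newest track goes to the head of the queue, and the oldest one falls off its tail.</p>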
<p>Let&#8217;s create a sample table:<br />
<span id="more-3342"></span><br />
<a href="#" onclick="xcollapse('X179');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X179" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_track
        (
        id INT NOT NULL PRIMARY KEY,
        name VARCHAR(20) NOT NULL
        );

INSERT
INTO    t_track
SELECT  s, &#039;Track &#039; || s
FROM    generate_series(1, 1000) s;

ANALYZE t_track;
</pre>
</div>
<p>This table is quite simple: it just contains <strong>1,000</strong> tracks with generated names.</p>
<p>And here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        shuffle AS
        (
        SELECT  *
        FROM    (
                SELECT  id, name, ARRAY[id] AS queue
                FROM    t_track
                ORDER BY
                        RANDOM()
                LIMIT 1
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  t.id, t.name, (t.id || s.queue)[1:10]
                FROM    shuffle s
                JOIN    t_track t
                ON      NOT ARRAY[t.id] &lt;@ s.queue
                ORDER BY
                        RANDOM()
                LIMIT 1
                ) q
        )
SELECT  id, name, queue::VARCHAR
FROM    shuffle
LIMIT 30
</pre>
<p><a href="#" onclick="xcollapse('X7857');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X7857" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>name</th>
<th>queue</th>
</tr>
<tr>
<td class="int4">739</td>
<td class="varchar">Track 739</td>
<td class="varchar">{739}</td>
</tr>
<tr>
<td class="int4">811</td>
<td class="varchar">Track 811</td>
<td class="varchar">{811,739}</td>
</tr>
<tr>
<td class="int4">216</td>
<td class="varchar">Track 216</td>
<td class="varchar">{216,811,739}</td>
</tr>
<tr>
<td class="int4">192</td>
<td class="varchar">Track 192</td>
<td class="varchar">{192,216,811,739}</td>
</tr>
<tr>
<td class="int4">286</td>
<td class="varchar">Track 286</td>
<td class="varchar">{286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">287</td>
<td class="varchar">Track 287</td>
<td class="varchar">{287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">856</td>
<td class="varchar">Track 856</td>
<td class="varchar">{856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">371</td>
<td class="varchar">Track 371</td>
<td class="varchar">{371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">336</td>
<td class="varchar">Track 336</td>
<td class="varchar">{336,371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">558</td>
<td class="varchar">Track 558</td>
<td class="varchar">{558,336,371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">99</td>
<td class="varchar">Track 99</td>
<td class="varchar">{99,558,336,371,856,287,286,192,216,811}</td>
</tr>
<tr>
<td class="int4">462</td>
<td class="varchar">Track 462</td>
<td class="varchar">{462,99,558,336,371,856,287,286,192,216}</td>
</tr>
<tr>
<td class="int4">653</td>
<td class="varchar">Track 653</td>
<td class="varchar">{653,462,99,558,336,371,856,287,286,192}</td>
</tr>
<tr>
<td class="int4">682</td>
<td class="varchar">Track 682</td>
<td class="varchar">{682,653,462,99,558,336,371,856,287,286}</td>
</tr>
<tr>
<td class="int4">329</td>
<td class="varchar">Track 329</td>
<td class="varchar">{329,682,653,462,99,558,336,371,856,287}</td>
</tr>
<tr>
<td class="int4">365</td>
<td class="varchar">Track 365</td>
<td class="varchar">{365,329,682,653,462,99,558,336,371,856}</td>
</tr>
<tr>
<td class="int4">72</td>
<td class="varchar">Track 72</td>
<td class="varchar">{72,365,329,682,653,462,99,558,336,371}</td>
</tr>
<tr>
<td class="int4">841</td>
<td class="varchar">Track 841</td>
<td class="varchar">{841,72,365,329,682,653,462,99,558,336}</td>
</tr>
<tr>
<td class="int4">159</td>
<td class="varchar">Track 159</td>
<td class="varchar">{159,841,72,365,329,682,653,462,99,558}</td>
</tr>
<tr>
<td class="int4">521</td>
<td class="varchar">Track 521</td>
<td class="varchar">{521,159,841,72,365,329,682,653,462,99}</td>
</tr>
<tr>
<td class="int4">736</td>
<td class="varchar">Track 736</td>
<td class="varchar">{736,521,159,841,72,365,329,682,653,462}</td>
</tr>
<tr>
<td class="int4">759</td>
<td class="varchar">Track 759</td>
<td class="varchar">{759,736,521,159,841,72,365,329,682,653}</td>
</tr>
<tr>
<td class="int4">142</td>
<td class="varchar">Track 142</td>
<td class="varchar">{142,759,736,521,159,841,72,365,329,682}</td>
</tr>
<tr>
<td class="int4">607</td>
<td class="varchar">Track 607</td>
<td class="varchar">{607,142,759,736,521,159,841,72,365,329}</td>
</tr>
<tr>
<td class="int4">331</td>
<td class="varchar">Track 331</td>
<td class="varchar">{331,607,142,759,736,521,159,841,72,365}</td>
</tr>
<tr>
<td class="int4">957</td>
<td class="varchar">Track 957</td>
<td class="varchar">{957,331,607,142,759,736,521,159,841,72}</td>
</tr>
<tr>
<td class="int4">985</td>
<td class="varchar">Track 985</td>
<td class="varchar">{985,957,331,607,142,759,736,521,159,841}</td>
</tr>
<tr>
<td class="int4">702</td>
<td class="varchar">Track 702</td>
<td class="varchar">{702,985,957,331,607,142,759,736,521,159}</td>
</tr>
<tr>
<td class="int4">914</td>
<td class="varchar">Track 914</td>
<td class="varchar">{914,702,985,957,331,607,142,759,736,521}</td>
</tr>
<tr>
<td class="int4">569</td>
<td class="varchar">Track 569</td>
<td class="varchar">{569,914,702,985,957,331,607,142,759,736}</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0012s (0.1246s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=3444.86..3445.13 rows=11 width=94)
  CTE shuffle
    -&gt;  Recursive Union  (cost=23.50..3444.86 rows=11 width=94)
          -&gt;  Subquery Scan q  (cost=23.50..23.51 rows=1 width=94)
                -&gt;  Limit  (cost=23.50..23.50 rows=1 width=13)
                      -&gt;  Sort  (cost=23.50..26.00 rows=1000 width=13)
                            Sort Key: (random())
                            -&gt;  Seq Scan on t_track  (cost=0.00..18.50 rows=1000 width=13)
          -&gt;  Subquery Scan q  (cost=342.10..342.11 rows=1 width=94)
                -&gt;  Limit  (cost=342.10..342.10 rows=1 width=45)
                      -&gt;  Sort  (cost=342.10..367.08 rows=9990 width=45)
                            Sort Key: (random())
                            -&gt;  Nested Loop  (cost=17.00..292.15 rows=9990 width=45)
                                  Join Filter: (NOT (ARRAY[t.id] &lt;@ s.queue))
                                  -&gt;  WorkTable Scan on shuffle s  (cost=0.00..0.20 rows=10 width=32)
                                  -&gt;  Materialize  (cost=17.00..27.00 rows=1000 width=13)
                                        -&gt;  Seq Scan on t_track t  (cost=0.00..16.00 rows=1000 width=13)
  -&gt;  CTE Scan on shuffle  (cost=0.00..0.28 rows=11 width=94)
</pre>
</div>
<p>This query selects the first <strong>30</strong> records, but the <code>LIMIT</code> clause can be changed to select an arbitrary number of records (including more than <strong>1,000</strong>), since we don&#8217;t apply any limit inside the recursive part of the query.</p>
<p>Normally, the <code>queue</code> would be hidden, but I left it in to illustrate what&#8217;s going on. As you can see, the <code>queue</code> holds the <code>id</code>s of the last <strong>10</strong> records.</p>
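<p>The same principle can be sketched outside the database. The snippet below is only an illustration in Python, not the query itself: the function name <code>shuffle_tracks</code> and its parameters are hypothetical, and a <code>deque</code> with a fixed length stands in for the <code>queue</code> array. Each step picks a random track that is not in the queue of the previous row, which mirrors the <code>NOT (ARRAY[t.id] &lt;@ s.queue)</code> join filter in the plan above.</p>

```python
import random
from collections import deque


def shuffle_tracks(track_ids, count, window=10, seed=None):
    """Return `count` (pick, queue) pairs; a track never repeats
    while it is still among the last `window` picks.

    Assumes len(track_ids) > window, so a candidate always exists.
    """
    rng = random.Random(seed)
    queue = deque(maxlen=window)  # newest id first, oldest falls off the end
    rows = []
    for _ in range(count):
        # Same idea as the join filter: skip ids still held in the queue
        candidates = [t for t in track_ids if t not in queue]
        pick = rng.choice(candidates)
        queue.appendleft(pick)
        rows.append((pick, list(queue)))
    return rows


if __name__ == "__main__":
    # 1,000 hypothetical track ids, 30 picks, as in the result set above
    for pick, queue in shuffle_tracks(range(1, 1001), 30):
        print(pick, queue)
```

<p>As with the <code>LIMIT</code> clause in the query, <code>count</code> may exceed the number of tracks, because only the last <code>window</code> picks are excluded at each step.</p>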
<p>The query runs for <strong>120 ms</strong>, which is quite fast but could be improved further using the approaches described in <a href="/2009/07/18/postgresql-8-4-sampling-random-rows/"><strong>PostgreSQL 8.4: sampling random rows</strong></a>. However, that would make the query too hard to read, and <code>ORDER BY RANDOM()</code> is just fine to demonstrate the principle.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

