<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>EXPLAIN EXTENDED &#187; PostgreSQL</title>
	<atom:link href="http://explainextended.com/category/postgresql/feed/" rel="self" type="application/rss+xml" />
	<link>http://explainextended.com</link>
	<description>How to create fast database queries</description>
	<lastBuildDate>Wed, 25 Aug 2010 13:29:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Date ranges: overlapping with priority</title>
		<link>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/</link>
		<comments>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 19:00:31 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4663</guid>
		<description><![CDATA[Answering questions asked on the site. Jason Foster asks: We have a table of student registrations: Students student_code course_code course_section session_cd 987654321 ESC102H1 Y 20085 998766543 ELEE203H F 20085 course_code and course_section identify a course, session_cd is an academic session, e. g. 20085, 20091, 20079. The courses (stored in another table) have associated values for [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Jason Foster</strong> asks:</p>
<blockquote>
<p>We have a table of student registrations: </p>
<table class="excel">
<caption>Students</caption>
<tr>
<th>student_code</th>
<th>course_code</th>
<th>course_section</th>
<th>session_cd</th>
</tr>
<tr>
<td>987654321</td>
<td>ESC102H1</td>
<td>Y</td>
<td>20085</td>
</tr>
<tr>
<td>998766543</td>
<td>ELEE203H</td>
<td>F</td>
<td>20085</td>
</tr>
</table>
<p><code>course_code</code> and <code>course_section</code> identify a course, <code>session_cd</code> is an academic session, e. g. <strong>20085</strong>, <strong>20091</strong>, <strong>20079</strong>.</p>
<p>The courses (stored in another table) have associated values for <q>engineering design</q>, <q>complementary studies</q>, etc., like that:</p>
<table class="excel">
<caption>Courses</caption>
<tr>
<th>course_code</th>
<th>course_section</th>
<th>start_session</th>
<th>end_session</th>
<th>design</th>
<th>science</th>
<th>studies</th>
</tr>
<tr>
<td>ESC102H1</td>
<td>F</td>
<td>20071</td>
<td>20099</td>
<td>10</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>AER201Y1</td>
<td>Y</td>
<td>20059</td>
<td>NULL</td>
<td>0</td>
<td>0</td>
<td>30</td>
</tr>
</table>
<p>, or like that:</p>
<table class="excel">
<caption>In-house courses</caption>
<tr>
<th>course_code</th>
<th>course_section</th>
<th>student_code</th>
<th>design</th>
<th>science</th>
<th>studies</th>
</tr>
<tr>
<td>ESC102H1</td>
<td>F</td>
<td>998766543</td>
<td>10</td>
<td>0</td>
<td>0</td>
</tr>
</table>
<p>We are required by an external accreditation body to add up all of the units of <q>engineering design</q>, <q>complementary studies</q>, etc., taken by an individual student.</p>
<p>Where it gets really messy is that we have multiple data feeds for the associated values of courses. For example we have a set from the <strong>Registrar&#8217;s Office</strong>, the <strong>Civil Department</strong>, our <strong>In-House</strong> version, etc.</p>
<p>The rule is that <strong>In-House</strong> beats <strong>Civil</strong> beats the <strong>Registrar&#8217;s Office</strong> in the case of any duplication within the overlapping intervals.</p>
<p>The <code>session_cd</code> is of the form <code>YYYY{1,5,9}</code>.</p>
</blockquote>
<p>Basically, we have three sets here.</p>
<p>To get the course hours for a given student, we should find a record for that student in the in-house set or, failing that, check whether the session falls within a range from one of the external sets (<strong>Civil</strong> or <strong>Registrar</strong>). If both ranges contain the academic session in which the student took the course, <strong>Civil</strong> should be taken.</p>
<p>The first part is quite simple: we just <code>LEFT JOIN</code> students with the in-house courses and get the hours for the courses which are filled. The real problem is the next part: searching for the ranges containing a given value.</p>
<p>As I mentioned in previous posts, relational databases are generally not that efficient for queries like this. It&#8217;s easy to use an index to find a value of a column within a given range, but <strong>B-Tree</strong> indexes are of little help in searching for a range of two columns containing a given value.</p>
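<p>As a rough illustration of the difference (a sketch only: it uses the sample tables created further down, and the index names <code>ix_student_session</code> and <code>ix_civil_range</code> are hypothetical and not part of the original schema), compare the two kinds of search:</p>
<pre class="brush: sql">
-- Hypothetical indexes, for illustration only
CREATE INDEX ix_student_session ON t_student (session);
CREATE INDEX ix_civil_range ON t_civil (session_start, session_end);

-- Column within a constant range: the matching entries are contiguous
-- in ix_student_session, so one index range scan can locate them
SELECT  *
FROM    t_student
WHERE   session BETWEEN 200001 AND 200109;

-- Constant within a range stored in two columns: the index can bound
-- session_start only, session_end has to be checked entry by entry
SELECT  *
FROM    t_civil
WHERE   session_start &lt;= 200105
        AND session_end &gt;= 200105;
</pre>
<p>Which plan is actually chosen depends on the planner and the data; on tables this small a sequential scan is likely either way.</p>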
<p>However, in this case, the <a href="http://en.wikipedia.org/wiki/Data_domain">data domain</a> of <code>session_cd</code> is quite a limited set. For each pair of <code>session_start</code> and <code>session_end</code> it is easy to create a set of <em>all</em> possible values between <code>session_start</code> and <code>session_end</code>.</p>
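<p>Here is a minimal sketch of that enumeration, assuming the five-digit <code>YYYY{1,5,9}</code> codes from the question. Each code is mapped to a consecutive ordinal (three per year), the ordinals are enumerated with <code>generate_series</code>, and each ordinal is mapped back to a code. The constants <strong>10</strong> and <strong>3</strong> follow from that format; the sample tables below store sessions with a multiplier of <strong>100</strong> instead, so the arithmetic in the final query differs slightly.</p>
<pre class="brush: sql">
-- Enumerate every session between 20085 and 20091, inclusive
SELECT  cs / 3 * 10 + (ARRAY[1, 5, 9])[cs % 3 + 1] AS session_cd
FROM    generate_series(
                -- code to ordinal: three sessions per year, suffix 1, 5, 9 maps to 0, 1, 2
                20085 / 10 * 3 + CASE 20085 % 10 WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END,
                20091 / 10 * 3 + CASE 20091 % 10 WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END
        ) cs;
</pre>
<p>The ordinals here are <strong>6025</strong> to <strong>6027</strong>, so the query returns the three sessions <strong>20085</strong>, <strong>20089</strong> and <strong>20091</strong>.</p>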
<p>The overlapping parts of the session ranges from the two sets will yield two records for each of the sessions belonging to both ranges. Of these two records, we need to take the relevant one (that is, <strong>Civil</strong>) by using <code>DISTINCT ON</code> with additional sorting on the source (<strong>Civil</strong> goes first).</p>
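<p>The priority rule can be tried in isolation on a few made-up rows (the values and column names below are illustrative only, with source <strong>2</strong> standing for <strong>Civil</strong> and <strong>3</strong> for <strong>Registrar</strong>):</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (session_cd, course) *
FROM    (
        VALUES
        (2, 20085, 1, 10),      -- Civil
        (3, 20085, 1, 12),      -- Registrar, same session and course: loses to Civil
        (3, 20089, 1, 12)       -- Registrar only: kept
        ) q (source, session_cd, course, hours)
ORDER BY
        session_cd, course, source;
</pre>
<p><code>DISTINCT ON</code> keeps the first row of each <code>(session_cd, course)</code> group in the <code>ORDER BY</code> order, so this returns the <strong>Civil</strong> row for <strong>20085</strong> and the <strong>Registrar</strong> row for <strong>20089</strong>.</p>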
<p>Then we just join the relevant records to the subset of the <code>students</code> which does not have corresponding records in the in-house version.</p>
<p>Finally, we need to union this with the in-house recordset.<br />
<span id="more-4663"></span></p>
<h3>Pictures</h3>
<p>Here&#8217;s the same thing in pictures:</p>
<ol>
<li>
<p>Within each source, courses are defined by a single record holding the start and end session:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/lines.png" alt="" title="Lines" width="300" height="600" class="aligncenter size-full wp-image-4676 noborder" /></p>
</li>
<li>
<p>To find the superposition of the courses (respecting the priority), we split each range into a number of records, each corresponding to a single session, then combine these records into a single recordset, ordered by session, then by source:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/records.png" alt="" title="Records" width="300" height="450" class="aligncenter size-full wp-image-4682 noborder" /></p>
</li>
<li>
<p>For each session, we take the single record with the highest priority and use it to join with the students table:</p>
<p><img src="http://explainextended.com/wp-content/uploads/2010/04/bricks.png" alt="" title="Bricks" width="300" height="600" class="aligncenter size-full wp-image-4675 noborder" /></p>
</li>
</ol>
<h3>Query</h3>
<p>Now, let&#8217;s create some sample tables and see how it works:</p>
<p><a href="#" onclick="xcollapse('X8397');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X8397" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_inhouse
        (
        course INT NOT NULL,
        student INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL,
        PRIMARY KEY (course, student)
        );

CREATE TABLE t_civil
        (
        id INT NOT NULL PRIMARY KEY,
        session_start INT NOT NULL, session_end INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL
        );

CREATE TABLE t_registrar
        (
        id INT NOT NULL PRIMARY KEY,
        session_start INT NOT NULL, session_end INT NOT NULL,
        hours1 INT NOT NULL, hours2 INT NOT NULL, hours3 INT NOT NULL
        );

CREATE TABLE t_student
        (
        id INT NOT NULL,
        course INT NOT NULL,
        session INT NOT NULL,
        PRIMARY KEY (id, course, session)
        );

SELECT  SETSEED(0.20100407);

INSERT
INTO    t_civil
SELECT  n,
        ((e / 3)) * 100 + (ARRAY[1, 5, 9])[e % 3 + 1],
        (((e + l) / 3)) * 100 + (ARRAY[1, 5, 9])[(e + l) % 3 + 1],
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n,
                6000 + CEILING(RANDOM() * 10)::INTEGER AS e,
                CEILING(RANDOM() * 20)::INTEGER AS l
        FROM    generate_series(1, 200) n
        ) q;

INSERT
INTO    t_registrar
SELECT  n,
        ((e / 3)) * 100 + (ARRAY[1, 5, 9])[e % 3 + 1],
        (((e + l) / 3)) * 100 + (ARRAY[1, 5, 9])[(e + l) % 3 + 1],
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n,
                6000 + CEILING(RANDOM() * 10)::INTEGER AS e,
                CEILING(RANDOM() * 20)::INTEGER AS l
        FROM    generate_series(1, 200) n
        ) q;

INSERT
INTO    t_inhouse
SELECT  n,
        s,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER,
        CEILING(RANDOM() * 50)::INTEGER
FROM    (
        SELECT  n, generate_series(1, 100) s
        FROM    (
                SELECT  n
                FROM    generate_series(1, 200) n
                ) q
        ) q;

INSERT
INTO    t_student
SELECT  i, c,
        ((s / 3)) * 100 + (ARRAY[1, 5, 9])[s % 3 + 1]
FROM    (
        SELECT  *, 6000 + CEILING(RANDOM() * 30)::INTEGER AS s,
                RANDOM() AS rnd
        FROM    (
                SELECT  *,
                        generate_series(1, 200) c
                FROM    (
                        SELECT  i
                        FROM    generate_series(1, 500) i
                        ) q
                ) q
        ) q
WHERE   rnd &lt; 0.1
</pre>
</div>
<p>The tables contain <strong>500</strong> students, random civil and registrar ranges for <strong>200</strong> courses, and <strong>20,000</strong> in-house records for the first <strong>100</strong> students.</p>
<p>And here&#8217;s the query (limited to the first <strong>10</strong> records for the sake of readability):</p>
<pre class="brush: sql">
SELECT  1 AS sourse, s.id AS student, i.course AS course, hours1, hours2, hours3
FROM    t_student s
JOIN    t_inhouse i
ON      i.student = s.id
        AND i.course = s.course
UNION ALL
SELECT  source, s.id, q.id, hours1, hours2, hours3
FROM    t_student s
JOIN    (
        SELECT  DISTINCT ON (current_session, id) *
        FROM    (
                SELECT  *,
                        ((cs / 3)) * 100 + (ARRAY[1, 5, 9])[cs % 3 + 1] AS current_session
                FROM    (
                        SELECT  *,
                                generate_series(e, l) AS cs
                        FROM    (
                                SELECT  *,
                                        session_start * 3 / 100 + CASE (session_start % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS e,
                                        session_end * 3 / 100 + CASE (session_end % 100) WHEN 1 THEN 0 WHEN 5 THEN 1 ELSE 2 END AS l
                                FROM    (
                                        SELECT  2 AS source, *
                                        FROM    t_civil
                                        UNION ALL
                                        SELECT  3 AS source, *
                                        FROM    t_registrar
                                        ) q
                                ) q
                        ) q
                ) q
        ORDER BY
                current_session, id, source
        ) q
ON      q.current_session = s.session
        AND q.id = s.course
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    t_inhouse ih
        WHERE   ih.student = s.id
                AND ih.course = s.course
        )
ORDER BY
        student, course
LIMIT 10
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>sourse</th>
<th>student</th>
<th>course</th>
<th>hours1</th>
<th>hours2</th>
<th>hours3</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">21</td>
<td class="int4">25</td>
<td class="int4">44</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">18</td>
<td class="int4">49</td>
<td class="int4">49</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">26</td>
<td class="int4">12</td>
<td class="int4">26</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">27</td>
<td class="int4">32</td>
<td class="int4">38</td>
<td class="int4">22</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">39</td>
<td class="int4">39</td>
<td class="int4">27</td>
<td class="int4">36</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">44</td>
<td class="int4">32</td>
<td class="int4">26</td>
<td class="int4">30</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">51</td>
<td class="int4">3</td>
<td class="int4">21</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">54</td>
<td class="int4">7</td>
<td class="int4">10</td>
<td class="int4">34</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">57</td>
<td class="int4">46</td>
<td class="int4">34</td>
<td class="int4">10</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">63</td>
<td class="int4">50</td>
<td class="int4">18</td>
<td class="int4">7</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0008s (0.0968s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=2323.47..2323.50 rows=10 width=20)
  -&gt;  Sort  (cost=2323.47..2328.52 rows=2019 width=20)
        Sort Key: s.id, i.course
        -&gt;  Append  (cost=817.31..2279.84 rows=2019 width=20)
              -&gt;  Merge Join  (cost=817.31..1837.75 rows=1979 width=20)
                    Merge Cond: ((i.course = s.course) AND (i.student = s.id))
                    -&gt;  Index Scan using t_inhouse_pkey on t_inhouse i  (cost=0.00..825.94 rows=20000 width=20)
                    -&gt;  Sort  (cost=817.21..842.18 rows=9986 width=8)
                          Sort Key: s.course, s.id
                          -&gt;  Seq Scan on t_student s  (cost=0.00..153.86 rows=9986 width=8)
              -&gt;  Nested Loop Anti Join  (cost=362.94..421.91 rows=40 width=24)
                    -&gt;  Hash Join  (cost=362.94..404.44 rows=50 width=28)
                          Hash Cond: ((((((q.cs / 3) * 100) + (&#39;{1,5,9}&#39;::integer[])[((q.cs % 3) + 1)])) = s.session) AND (q.id = s.course))
                          -&gt;  Unique  (cost=59.29..62.29 rows=200 width=40)
                                -&gt;  Sort  (cost=59.29..60.29 rows=400 width=40)
                                      Sort Key: ((((q.cs / 3) * 100) + (&#39;{1,5,9}&#39;::integer[])[((q.cs % 3) + 1)])), q.id, q.source
                                      -&gt;  Subquery Scan q  (cost=0.00..42.00 rows=400 width=40)
                                            -&gt;  Result  (cost=0.00..33.00 rows=400 width=56)
                                                  -&gt;  Append  (cost=0.00..8.00 rows=400 width=56)
                                                        -&gt;  Seq Scan on t_civil  (cost=0.00..4.00 rows=200 width=56)
                                                        -&gt;  Seq Scan on t_registrar  (cost=0.00..4.00 rows=200 width=56)
                          -&gt;  Hash  (cost=153.86..153.86 rows=9986 width=12)
                                -&gt;  Seq Scan on t_student s  (cost=0.00..153.86 rows=9986 width=12)
                    -&gt;  Index Scan using t_inhouse_pkey on t_inhouse ih  (cost=0.00..0.35 rows=1 width=8)
                          Index Cond: ((ih.course = s.course) AND (ih.student = s.id))
</pre>
<p>The query returns the required hours and the source of these hours for each of the courses a student attended.</p>
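<p>The accreditation requirement itself is a per-student total, which is one more aggregation on top of this result. A sketch, assuming the query above (without <code>ORDER BY</code> and <code>LIMIT</code>) has been saved as a view under the hypothetical name <code>v_student_hours</code>:</p>
<pre class="brush: sql">
-- v_student_hours is a hypothetical view wrapping the query above
SELECT  student,
        SUM(hours1) AS total_hours1,
        SUM(hours2) AS total_hours2,
        SUM(hours3) AS total_hours3
FROM    v_student_hours
GROUP BY
        student
ORDER BY
        student;
</pre>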
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/04/07/date-ranges-overlapping-with-priority/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: using recursive functions in nested sets</title>
		<link>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/</link>
		<comments>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 20:00:45 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4528</guid>
		<description><![CDATA[In the previous article, I discussed a way to improve nested sets model in PostgreSQL. The approach shown in the article used an analytical function to filter all immediate children of a node in a recursive CTE. This allowed us to filter a node&#8217;s children on the level more efficiently than R-Tree or B-Tree approaches [...]]]></description>
			<content:encoded><![CDATA[<p>In the previous article, I discussed <a href="/2010/03/01/postgresql-nested-sets-and-r-tree/">a way to improve nested sets model in <strong>PostgreSQL</strong></a>.</p>
<p>The approach shown in the article used an analytical function to filter all immediate children of a node in a recursive <strong>CTE</strong>.</p>
<p>This allowed us to filter a node&#8217;s children on the level more efficiently than <strong>R-Tree</strong> or <strong>B-Tree</strong> approaches do (since they rely on <code>COUNT(*)</code>).</p>
<p>That solution was pure <strong>SQL</strong> and it was quite fast, but not optimal.</p>
<p>The drawback of that solution is that it still needs to fetch all children of a node to apply the analytic function to them. This can take a long time for the top of the hierarchy. And since the top of the hierarchy is what is usually shown on the start page, it would be very nice to improve this query a little more.</p>
<p>We can do it by creating and using a simple recursive <strong>SQL</strong> function. This function does not even require <strong>PL/pgSQL</strong> to be enabled.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4528"></span><br />
<a href="#" onclick="xcollapse('X9881');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X9881" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_hierarchy (
        id INT NOT NULL,
        parent INT NOT NULL,
        lft INT NOT NULL,
        rgt INT NOT NULL,
        data VARCHAR(100) NOT NULL,
        stuffing VARCHAR(100) NOT NULL
);

INSERT
INTO    t_hierarchy
WITH RECURSIVE
        ini AS
        (
        SELECT  8 AS level, 5 AS children
        ),
        range AS
        (
        SELECT  level, children,
                (
                SELECT  SUM(POW(children, n)::INTEGER * ((n &lt; level)::INTEGER + 1))
                FROM    generate_series(level, 0, -1) n
                ) width
        FROM    ini
        ),
        q AS
        (
        SELECT  s AS id, 0 AS parent, level, children,
                1 + width * (s - 1) AS lft,
                1 + width * s - 1 AS rgt,
                width / children AS width
        FROM    (
                SELECT  r.*, generate_series(1, children) s
                FROM    range r
                ) q2
        UNION ALL
        SELECT  id * children + position, id, level - 1, children,
                1 + lft + width * (position - 1),
                1 + lft + width * position - 1,
                width / children
        FROM    (
                SELECT  generate_series(1, children) AS position, q.*
                FROM    q
                ) q2
        WHERE   level &gt; 0
        )
SELECT  id, parent, lft, rgt, &#039;Value &#039; || id, RPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    q;

ALTER TABLE t_hierarchy ADD CONSTRAINT pk_hierarchy_id PRIMARY KEY (id);
CREATE UNIQUE INDEX ux_hierarchy_lft ON t_hierarchy (lft);
CREATE UNIQUE INDEX ux_hierarchy_rgt ON t_hierarchy (rgt);
CREATE INDEX ix_hierarchy_parent ON t_hierarchy (parent);
CREATE INDEX ix_hierarchy_sets ON t_hierarchy USING GIST(POLYGON(BOX(POINT(-1, lft), POINT(1, rgt))));
</pre>
</div>
<p>If we run the query introduced in the previous article to fetch all children up to level <strong>2</strong> from a top-level node, we get the following results:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, lft, rgt, 1 AS lvl
        FROM    t_hierarchy
        WHERE   id = 1
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt, lvl + 1
                FROM    q
                JOIN    t_hierarchy hc
                ON      hc.lft &gt; q.lft
                        AND hc.lft &lt; q.rgt
                WHERE   lvl &lt;= 2
                ORDER BY
                        MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft), hc.lft
                ) q2
        )
SELECT  *
FROM    q
</pre>
<p><a href="#" onclick="xcollapse('X3758');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X3758" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>lvl</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">585937</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">2</td>
<td class="int4">117188</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">117189</td>
<td class="int4">234375</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">234376</td>
<td class="int4">351562</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">351563</td>
<td class="int4">468749</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">468750</td>
<td class="int4">585936</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">31</td>
<td class="int4">3</td>
<td class="int4">23439</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">32</td>
<td class="int4">23440</td>
<td class="int4">46876</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">46877</td>
<td class="int4">70313</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">34</td>
<td class="int4">70314</td>
<td class="int4">93750</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">35</td>
<td class="int4">93751</td>
<td class="int4">117187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">36</td>
<td class="int4">117190</td>
<td class="int4">140626</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">37</td>
<td class="int4">140627</td>
<td class="int4">164063</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">38</td>
<td class="int4">164064</td>
<td class="int4">187500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">39</td>
<td class="int4">187501</td>
<td class="int4">210937</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">40</td>
<td class="int4">210938</td>
<td class="int4">234374</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">41</td>
<td class="int4">234377</td>
<td class="int4">257813</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">43</td>
<td class="int4">281251</td>
<td class="int4">304687</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">44</td>
<td class="int4">304688</td>
<td class="int4">328124</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">45</td>
<td class="int4">328125</td>
<td class="int4">351561</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">46</td>
<td class="int4">351564</td>
<td class="int4">375000</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">47</td>
<td class="int4">375001</td>
<td class="int4">398437</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">48</td>
<td class="int4">398438</td>
<td class="int4">421874</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">49</td>
<td class="int4">421875</td>
<td class="int4">445311</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">50</td>
<td class="int4">445312</td>
<td class="int4">468748</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">51</td>
<td class="int4">468751</td>
<td class="int4">492187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">52</td>
<td class="int4">492188</td>
<td class="int4">515624</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">53</td>
<td class="int4">515625</td>
<td class="int4">539061</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">54</td>
<td class="int4">539062</td>
<td class="int4">562498</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">55</td>
<td class="int4">562499</td>
<td class="int4">585935</td>
<td class="int4">3</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0169s (14.7499s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on q  (cost=3923687.62..4086447.04 rows=8137971 width=16)
  CTE q
    -&gt;  Recursive Union  (cost=0.00..3923687.62 rows=8137971 width=16)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy  (cost=0.00..8.54 rows=1 width=12)
                Index Cond: (id = 1)
          -&gt;  Subquery Scan q2  (cost=363885.01..376091.97 rows=813797 width=16)
                -&gt;  Unique  (cost=363885.01..367954.00 rows=813797 width=20)
                      -&gt;  Sort  (cost=363885.01..365919.50 rows=813797 width=20)
                            Sort Key: (max(hc.rgt) OVER (?)), hc.lft
                            -&gt;  WindowAgg  (cost=265682.87..283993.30 rows=813797 width=20)
                                  -&gt;  Sort  (cost=265682.87..267717.36 rows=813797 width=20)
                                        Sort Key: q.id, hc.lft
                                        -&gt;  Nested Loop  (cost=5335.66..185791.16 rows=813797 width=20)
                                              -&gt;  WorkTable Scan on q  (cost=0.00..0.22 rows=3 width=16)
                                                    Filter: (lvl &lt;= 2)
                                              -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5335.66..57861.32 rows=271266 width=12)
                                                    Recheck Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
                                                    -&gt;  Bitmap Index Scan on ux_hierarchy_lft  (cost=0.00..5267.84 rows=271266 width=0)
                                                          Index Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
</pre>
</div>
<p>This runs for almost <strong>15 seconds</strong>: too much.</p>
<p>This can be improved by exploiting these two properties of the nested sets model:</p>
<ol>
<li>
<p>The first immediate child of a node is the node holding the first <code>lft</code> next to the node&#8217;s <code>lft</code></p>
</li>
<li>
<p>The next sibling of a node is the node holding the first <code>lft</code> next to the node&#8217;s <code>rgt</code></p>
</li>
</ol>
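<p>Both properties translate into a single index lookup on <code>lft</code>. A sketch against the sample table, using node <strong>1</strong> and its child <strong>6</strong> as examples (the aliases are arbitrary):</p>
<pre class="brush: sql">
-- First child of node 1: the smallest lft inside the node's (lft, rgt) interval
SELECT  (
        SELECT  hc.id
        FROM    t_hierarchy hc
        WHERE   hc.lft &gt; hp.lft
                AND hc.lft &lt; hp.rgt
        ORDER BY
                hc.lft
        LIMIT 1
        ) AS first_child
FROM    t_hierarchy hp
WHERE   hp.id = 1;

-- Next sibling of node 6: the smallest lft after the node's rgt,
-- but still inside the parent's interval
SELECT  (
        SELECT  hs.id
        FROM    t_hierarchy hs
        WHERE   hs.lft &gt; h.rgt
                AND hs.lft &lt; p.rgt
        ORDER BY
                hs.lft
        LIMIT 1
        ) AS next_sibling
FROM    t_hierarchy h
JOIN    t_hierarchy p
ON      p.id = h.parent
WHERE   h.id = 6;
</pre>
<p>With the <code>lft</code> and <code>rgt</code> values listed in the query details above, these should return nodes <strong>6</strong> and <strong>7</strong>, respectively. For the last sibling, the second query returns <code>NULL</code>, which is what stops the sibling iteration.</p>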
<p>If we recursively traverse through the nodes, we can find the first child as well as all of its siblings. This is enough to build a hierarchy, and the level filter can be implemented merely by limiting the recursion depth.</p>
<p>However, recursive <strong>CTE</strong>s only allow one recursion level. We cannot nest the <code>WITH</code> clause.</p>
<p>To work around that, we can use <strong>PostgreSQL</strong>&#8217;s ability to run set-returning functions recursively. We will use the function-based recursion to iterate the parent-child axis, and the <strong>CTE</strong>-based recursion to iterate the sibling axis.</p>
<p>We need to create a function that would take a node&#8217;s id on input and return a set of its children on output, with the function recursively applied to each of the children. To find a set of children, we will implement a recursive <strong>CTE</strong> that finds the first child in the anchor part and the next sibling in the recursive part.</p>
<p>Here&#8217;s the function:</p>
<pre class="brush: sql">
CREATE OR REPLACE FUNCTION fn_get_children(id INT, level INT)
RETURNS SETOF INT[] AS
$$
        WITH    RECURSIVE q AS
                (
                SELECT  (hc).id, (hc).lft, (hc).rgt, prgt
                FROM    (
                        SELECT  (
                                SELECT  hc
                                FROM    t_hierarchy hc
                                WHERE   hc.lft &gt; hp.lft
                                        AND hc.lft &lt; hp.rgt
                                ORDER BY
                                        hc.lft
                                LIMIT 1
                                ) hc,
                                rgt AS prgt
                        FROM    t_hierarchy hp
                        WHERE   hp.id = $1
                        ) q2
                UNION ALL
                SELECT  (hc).id, (hc).lft, (hc).rgt, prgt
                FROM    (
                        SELECT  (
                                SELECT  hc
                                FROM    t_hierarchy hc
                                WHERE   hc.lft &gt; q.rgt
                                        AND hc.lft &lt; q.prgt
                                ORDER BY
                                        hc.lft
                                LIMIT 1
                                ) hc,
                                prgt
                        FROM    q
                        WHERE   q.lft IS NOT NULL
                        ) q2
                )
        SELECT  CASE which
                WHEN 1 THEN ARRAY[q.id, $2]
                ELSE fn_get_children(q.id, $2 - 1)
                END
        FROM    (
                VALUES (1), (2)
                ) vals(which)
        CROSS JOIN
                q
        WHERE   q.id IS NOT NULL
                AND $2 &gt; 0
        ORDER BY
                id, which;
$$ LANGUAGE sql;
</pre>
<p>The function accepts a node&#8217;s <code>id</code> and a level as input, and returns a set of arrays, each holding the <code>id</code> of one of the node&#8217;s descendants and its level. The level returned by the function decreases with depth: it represents not the level as such, but the number of levels left before the depth limit is reached. But since the initial level is set by the caller, it is easy to convert it to the actual level.</p>
<p>Let&#8217;s run the function:</p>
<pre class="brush: sql">
SELECT  c[1], 3 - c[2]
FROM    fn_get_children(1, 2) c;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>c</th>
<th>?column?</th>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">34</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">35</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">31</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">32</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">38</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">40</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">39</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">36</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">37</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">41</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">43</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">44</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">45</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">47</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">50</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">46</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">48</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">49</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">53</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">55</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">51</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">52</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">54</td>
<td class="int4">2</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0023s (0.0658s)</td>
</tr>
</table>
</div>
<pre>
Function Scan on fn_get_children c  (cost=0.00..262.50 rows=1000 width=32)
</pre>
<p>As we can see, the function returned all children and grandchildren of node <strong>1</strong> along with their levels, and did it in only <strong>65 ms</strong>.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/03/02/postgresql-using-recursive-functions-in-nested-sets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: nested sets and R-Tree</title>
		<link>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/</link>
		<comments>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 20:00:04 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4506</guid>
		<description><![CDATA[A feedback on one of my previous articles comparing adjacency list and nested sets models for PostgreSQL. Jay writes: In your series on adjacency lists vs nested sets, you discuss geometric types and R-Tree indexes in MySQL, but you don&#8217;t discuss them when discussing the same subject with PostgreSQL, which also has geometric types and [...]]]></description>
			<content:encoded><![CDATA[<p>Feedback on one of my previous articles comparing <a href="/2009/09/24/adjacency-list-vs-nested-sets-postgresql/">adjacency list and nested sets models for <strong>PostgreSQL</strong></a>.</p>
<p><strong>Jay</strong> writes:</p>
<blockquote>
<p>In your series on adjacency lists vs nested sets, you discuss <a href="/2009/09/29/adjacency-list-vs-nested-sets-mysql/">geometric types and <strong>R-Tree</strong> indexes in <strong>MySQL</strong></a>, but you don&#8217;t discuss them when discussing the same subject with <strong>PostgreSQL</strong>, which also has geometric types and <strong>R-Tree</strong> indexing (mostly available through <a href="http://www.postgresql.org/docs/8.4/static/gist-examples.html"><strong>GiST</strong> indexes</a>).</p>
<p>To make it simple I added the following line after the data insertion part of the script at Nested Sets &#8211; Postgresql:</p>
<pre class="brush: sql">
ALTER TABLE t_hierarchy ADD COLUMN sets POLYGON;
UPDATE t_hierarchy SET sets = POLYGON(BOX(POINT(-1,lft), POINT(1, rgt)));
</pre>
<p>It needed to be a <code>POLYGON</code> instead of a <code>BOX</code> since there is a <code>@>(POLYGON,POLYGON)</code> function but no <code>@>(BOX,BOX)</code> function, and the polygon was cast from the box to create the rectangle shape required.</p>
<p>It outperforms the adjacency list on <q>all descendants</q>; outperforms it on <q>all ancestors</q> (not by much); performs reasonably well on <q>all descendants up to a certain level</q> on items with few descendants (e. g. <strong>31415</strong>) and badly on items with many descendants (e. g. <strong>42</strong>).</p>
<p>It still completes in less than <strong>20</strong> seconds though, which is an improvement over <strong>1</strong> minute.</p>
</blockquote>
<p><strong>PostgreSQL</strong> does indeed support <strong>R-Tree</strong> indexes (through the <strong>GiST</strong> interface), and they can be used to improve the efficiency of the nested sets model.</p>
<p>Let&#8217;s create a sample table and try some of the queries that <strong>Jay</strong> proposed:<br />
<span id="more-4506"></span><br />
<a href="#" onclick="xcollapse('X9066');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X9066" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_hierarchy (
        id INT NOT NULL,
        parent INT NOT NULL,
        lft INT NOT NULL,
        rgt INT NOT NULL,
        data VARCHAR(100) NOT NULL,
        stuffing VARCHAR(100) NOT NULL
);

INSERT
INTO    t_hierarchy
WITH RECURSIVE
        ini AS
        (
        SELECT  8 AS level, 5 AS children
        ),
        range AS
        (
        SELECT  level, children,
                (
                SELECT  SUM(POW(children, n)::INTEGER * ((n &lt; level)::INTEGER + 1))
                FROM    generate_series(level, 0, -1) n
                ) width
        FROM    ini
        ),
        q AS
        (
        SELECT  s AS id, 0 AS parent, level, children,
                1 + width * (s - 1) AS lft,
                1 + width * s - 1 AS rgt,
                width / children AS width
        FROM    (
                SELECT  r.*, generate_series(1, children) s
                FROM    range r
                ) q2
        UNION ALL
        SELECT  id * children + position, id, level - 1, children,
                1 + lft + width * (position - 1),
                1 + lft + width * position - 1,
                width / children
        FROM    (
                SELECT  generate_series(1, children) AS position, q.*
                FROM    q
                ) q2
        WHERE   level &gt; 0
        )
SELECT  id, parent, lft, rgt, &#039;Value &#039; || id, RPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    q;

ALTER TABLE t_hierarchy ADD CONSTRAINT pk_hierarchy_id PRIMARY KEY (id);
CREATE INDEX ix_hierarchy_lft ON t_hierarchy (lft);
CREATE INDEX ix_hierarchy_rgt ON t_hierarchy (rgt);
CREATE INDEX ix_hierarchy_parent ON t_hierarchy (parent);
CREATE INDEX ix_hierarchy_sets ON t_hierarchy USING GIST(POLYGON(BOX(POINT(-1, lft), POINT(1, rgt))));
</pre>
</div>
<p>To make the management of the table easier, I didn&#8217;t create an additional column with the geometric representation of the nested sets, but instead just defined an index on a derived expression, so that updating <code>lft</code> and <code>rgt</code> columns would be enough to update the set.</p>
<p>Now, let&#8217;s see how these queries perform.</p>
<h3>All descendants</h3>
<pre class="brush: sql">
SELECT  SUM(LENGTH(hc.stuffing)), COUNT(*)
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
</pre>
<p><a href="#" onclick="xcollapse('X3393');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X3393" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
<th>count</th>
</tr>
<tr>
<td class="int8">1953100</td>
<td class="int8">19531</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0003s (0.2139s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=8253.58..8253.60 rows=1 width=101)
  -&gt;  Nested Loop  (cost=136.32..8241.37 rows=2441 width=101)
        -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
              Index Cond: (id = 42)
        -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=136.32..8129.10 rows=2441 width=109)
              Recheck Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
              -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                    Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
</pre>
</div>
<p>Quite fast, <strong>213 ms</strong>.</p>
<h3>All ancestors</h3>
<pre class="brush: sql">
SELECT  hc.id, hc.lft, hc.rgt, hc.parent
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) @&gt; POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
</pre>
<p><a href="#" onclick="xcollapse('X4471');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4471" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>parent</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">585937</td>
<td class="int4">0</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">234376</td>
<td class="int4">351562</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">8</td>
</tr>
<tr class="statusbar">
<td colspan="100">3 rows fetched in 0.0007s (0.0127s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=136.32..8241.37 rows=2441 width=16)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 42)
  -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=136.32..8129.10 rows=2441 width=16)
        Recheck Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) @&gt; polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
        -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
              Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) @&gt; polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
</pre>
</div>
<p>Extremely fast: only <strong>13 ms</strong>.</p>
<h3>All descendants up to a certain level</h3>
<pre class="brush: sql">
SELECT  hc.id, hc.lft, hc.rgt, hc.parent
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt)))
WHERE   hp.id = 42
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hcp
        WHERE   POLYGON(BOX(POINT(-1, hc.lft), POINT(1, hc.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hcp.lft), POINT(1, hcp.rgt)))
        ) -
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hpp
        WHERE   POLYGON(BOX(POINT(-1, hp.lft), POINT(1, hp.rgt))) &lt;@ POLYGON(BOX(POINT(-1, hpp.lft), POINT(1, hpp.rgt)))
        ) &lt;= 2
</pre>
<p><a href="#" onclick="xcollapse('X2169');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X2169" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>parent</th>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="int4">213</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="int4">212</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="int4">211</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="int4">42</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="int4">214</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="int4">215</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="int4">215</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0039s (20.2523s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=0.03..40113216.41 rows=814 width=16)
  Join Filter: (((SubPlan 1) - (SubPlan 2)) &lt;= 2)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 42)
  -&gt;  Index Scan using ix_hierarchy_sets on t_hierarchy hc  (cost=0.03..9692.12 rows=2441 width=16)
        Index Cond: (polygon(box(point((-1)::double precision, (hc.lft)::double precision), point(1::double precision, (hc.rgt)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (hp.lft)::double precision), point(1::double precision, (hp.rgt)::double precision))))
  SubPlan 1
    -&gt;  Aggregate  (cost=8214.53..8214.54 rows=1 width=0)
          -&gt;  Bitmap Heap Scan on t_hierarchy hcp  (cost=136.32..8208.43 rows=2441 width=0)
                Recheck Cond: (polygon(box(point((-1)::double precision, ($0)::double precision), point(1::double precision, ($1)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
                -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                      Index Cond: (polygon(box(point((-1)::double precision, ($0)::double precision), point(1::double precision, ($1)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
  SubPlan 2
    -&gt;  Aggregate  (cost=8214.53..8214.54 rows=1 width=0)
          -&gt;  Bitmap Heap Scan on t_hierarchy hpp  (cost=136.32..8208.43 rows=2441 width=0)
                Recheck Cond: (polygon(box(point((-1)::double precision, ($2)::double precision), point(1::double precision, ($3)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
                -&gt;  Bitmap Index Scan on ix_hierarchy_sets  (cost=0.00..135.71 rows=2441 width=0)
                      Index Cond: (polygon(box(point((-1)::double precision, ($2)::double precision), point(1::double precision, ($3)::double precision))) &lt;@ polygon(box(point((-1)::double precision, (lft)::double precision), point(1::double precision, (rgt)::double precision))))
</pre>
</div>
<p>This, exactly as <strong>Jay</strong> mentioned, is much faster than using a <strong>B-Tree</strong> index, but at <strong>20</strong> seconds it is still too slow.</p>
<h3>Analysis</h3>
<p>The nested sets model, improved by using the <strong>R-Tree</strong> indexes, provides a way to tell if two records are in the same ancestry chain.</p>
<p>However, even with the <strong>R-Tree</strong>, the model provides no simple means to tell how deeply a record is nested.</p>
<p>To find it out, an <strong>R-Tree</strong> index scan has to be made that returns all of the record&#8217;s ancestors, and then the number of the ancestors has to be compared with that of the parent node.</p>
<p>For a record with lots of descendants (which was the case for the record <strong>42</strong> we used in the test queries), this means that thousands of records have to be checked in a nested loop, out of which only a few dozen will be returned.</p>
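<p>As a minimal sketch of that depth check, written against plain <code>lft</code> and <code>rgt</code> comparisons rather than the polygon expression (the alias <code>ha</code> and the column name <code>depth</code> are made up for illustration), the depth of a record can be computed by counting the records that enclose it:</p>
<pre class="brush: sql">
-- Count every record whose range encloses that of record 42 (42 itself included)
SELECT  COUNT(*) AS depth
FROM    t_hierarchy hp
JOIN    t_hierarchy ha
ON      ha.lft &lt;= hp.lft
        AND ha.rgt &gt;= hp.rgt
WHERE   hp.id = 42
</pre>
<p>On the sample table this should return <strong>3</strong>, matching the three rows of the <q>all ancestors</q> query above. The level-filtered query has to repeat such a count for every candidate descendant, which is where the time goes.</p>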
<p>Ironically, in real-world models this is the most frequently used type of query, and it is most often run against the records with lots of descendants.</p>
<p>Usually, when hierarchical data are stored in a database, they are presented to a user in the form of a tree view. When the user opens the catalog, the first-level entries are shown; when the user clicks on the <q>expand</q> button of an entry, all immediate children of the entry should be shown.</p>
<p>Since users usually start browsing from the beginning, clicking the expand buttons on the first-level or second-level entries is what happens most often. And, unfortunately, it takes the most time to execute these queries.</p>
<p>The adjacency list model provides a nearly constant-time solution to this problem, since fetching all immediate children of an entry requires a single index scan, no matter how many descendants the entry has.</p>
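<p>For comparison, here is a sketch of that lookup on the same sample table, assuming the planner picks up the <code>ix_hierarchy_parent</code> index from the creation script:</p>
<pre class="brush: sql">
-- Immediate children of record 42: one equality lookup on the parent column
SELECT  hc.id, hc.lft, hc.rgt, hc.parent
FROM    t_hierarchy hc
WHERE   hc.parent = 42
</pre>
<p>Judging by the result sets above, this should return the five records <strong>211</strong> to <strong>215</strong>.</p>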
<p>A user can also click on <q>expand all</q> which should just return all children of the given entry.</p>
<p>However, clicking on <q>expand all</q> on a high-level entry will return too many records, so the time needed to download them or render them in the GUI will be much greater than the time required to fetch them from the table. A properly written GUI usually limits the level of the records returned so that it remains responsive, which, in its turn, brings back the same problem of filtering on level.</p>
<p>The low-level entries (for which it makes sense to implement <q>expand all</q> without any limitations) can be queried for their descendants with the <strong>R-Tree</strong> query in the nested sets model or with a recursive query in the adjacency list model almost equally fast, since low-level entries contain few records.</p>
<p>The same applies to selecting all ancestors. Despite the fact that the nested sets model slightly outperforms the adjacency list model on this type of query, the absolute numbers are very small and the times that both queries take are almost imperceptible to the naked eye. A hierarchy is seldom more than a dozen levels deep, and fetching each ancestor even with a recursive query requires but one unique index scan per ancestor.</p>
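<p>A sketch of such a recursive query for the adjacency list model might look like this (the CTE name <code>q</code> is arbitrary; the query relies on the first-level records having <code>parent = 0</code>, which matches no <code>id</code> in the sample table and so stops the recursion):</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        -- Start from the record itself
        SELECT  id, lft, rgt, parent
        FROM    t_hierarchy
        WHERE   id = 42
        UNION ALL
        -- Climb one level per step with a primary key lookup
        SELECT  hp.id, hp.lft, hp.rgt, hp.parent
        FROM    q
        JOIN    t_hierarchy hp
        ON      hp.id = q.parent
        )
SELECT  id, lft, rgt, parent
FROM    q
</pre>
<p>On the sample table it should return the same three records as the nested sets query (<strong>42</strong>, <strong>8</strong> and <strong>1</strong>), only starting from the record itself.</p>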
<p>However, one may still be forced to use the nested sets model. This may be the way an ORM stores its data in the database; a heavily used legacy schema too old and scary to touch; or just some obscure model which mostly requires fetching all descendants fast with an occasional need to filter on the level.</p>
<p>Here are some methods to deal with it.</p>
<h3>Analytic functions</h3>
<p>Though there is no efficient way to filter all descendants on the level, there still is a way to fetch all <em>immediate</em> children of a record.</p>
<p>If we select all records within <code>lft</code> and <code>rgt</code> of a given entry and order them by <code>lft</code>, the first record will be the first immediate child of the entry.</p>
<p>All descendants of the first child will be returned before the second child and have <code>rgt</code> less than that of the first child.</p>
<p>This means that if we record the <code>MAX(rgt)</code> fetched so far, it will be that of the last immediate child of the entry fetched so far:</p>
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>MAX(rgt)</th>
</tr>
<tr>
<td>  2</td>
<td>2</td>
<td>11</td>
<td>11</td>
</tr>
<tr>
<td>    3</td>
<td>3</td>
<td>4</td>
<td>11</td>
</tr>
<tr>
<td>    4</td>
<td>5</td>
<td>8</td>
<td>11</td>
</tr>
<tr>
<td>      5</td>
<td>6</td>
<td>7</td>
<td>11</td>
</tr>
<tr>
<td>    6</td>
<td>9</td>
<td>10</td>
<td>11</td>
</tr>
<tr>
<td>  7</td>
<td>12</td>
<td>15</td>
<td>15</td>
</tr>
<tr>
<td>    8</td>
<td>13</td>
<td>14</td>
<td>15</td>
</tr>
</table>
<p>This means that each value of <code>MAX(rgt)</code> will correspond to exactly one immediate child; and the first entry in the recordset holding a given value of <code>MAX(rgt)</code> will be that child.</p>
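<p>To see the running maximum on the sample table, a query along these lines can be used (the alias <code>max_rgt</code> and the <code>LIMIT</code> are only there for illustration):</p>
<pre class="brush: sql">
-- Descendants of record 42 in lft order, with the greatest rgt seen so far
SELECT  hc.id, hc.lft, hc.rgt,
        MAX(hc.rgt) OVER (ORDER BY hc.lft) AS max_rgt
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft &gt; hp.lft
        AND hc.lft &lt; hp.rgt
WHERE   hp.id = 42
ORDER BY
        hc.lft
LIMIT 10
</pre>
<p>Since the first descendant in <code>lft</code> order is the first immediate child, every row in this sample should show the same <code>max_rgt</code>, that of the first child; the value should only change when the scan reaches the second child.</p>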
<p>Several methods exist to <a href="/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/">select records holding group-wise maximum in PostgreSQL</a>. In this case, it will be best to use <strong>PostgreSQL</strong>&#8217;s <code>DISTINCT ON</code>.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (MAX(hc.rgt) OVER (ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft &gt; hp.lft
        AND hc.lft &lt; hp.rgt
WHERE   hp.id = 42
ORDER BY
        MAX(hc.rgt) OVER (ORDER BY hc.lft), hc.lft
</pre>
<p><a href="#" onclick="xcollapse('X5039');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X5039" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
</tr>
<tr class="statusbar">
<td colspan="100">5 rows fetched in 0.0008s (0.1642s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=116073.33..117429.66 rows=271267 width=12)
  -&gt;  Sort  (cost=116073.33..116751.50 rows=271267 width=12)
        Sort Key: (max(hc.rgt) OVER (?)), hc.lft
        -&gt;  WindowAgg  (cost=86845.19..91592.36 rows=271267 width=12)
              -&gt;  Sort  (cost=86845.19..87523.35 rows=271267 width=12)
                    Sort Key: hc.lft
                    -&gt;  Nested Loop  (cost=5761.00..62364.22 rows=271267 width=12)
                          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
                                Index Cond: (id = 42)
                          -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5761.00..58286.67 rows=271267 width=12)
                                Recheck Cond: ((hc.lft &gt; hp.lft) AND (hc.lft &lt; hp.rgt))
                                -&gt;  Bitmap Index Scan on ix_hierarchy_lft  (cost=0.00..5693.19 rows=271267 width=0)
                                      Index Cond: ((hc.lft &gt; hp.lft) AND (hc.lft &lt; hp.rgt))
</pre>
</div>
<p>This is reasonably fast (only <strong>164 ms</strong>).</p>
<p>Using the recursive queries introduced in <strong>PostgreSQL 8.4</strong>, this approach can be extended to select the descendants up to any level (provided as a parameter to the query).</p>
<p>Here&#8217;s the query to select all children and grandchildren:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, lft, rgt, 1 AS lvl
        FROM    t_hierarchy
        WHERE   id = 42
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft)) hc.id, hc.lft, hc.rgt, lvl + 1
                FROM    q
                JOIN    t_hierarchy hc
                ON      hc.lft &gt; q.lft
                        AND hc.lft &lt; q.rgt
                WHERE   lvl &lt;= 2
                ORDER BY
                        MAX(hc.rgt) OVER (PARTITION BY q.id ORDER BY hc.lft), hc.lft
                ) q2
        )
SELECT  *
FROM    q
</pre>
<p><a href="#" onclick="xcollapse('X4763');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4763" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>lft</th>
<th>rgt</th>
<th>lvl</th>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="int4">3</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0042s (0.4342s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on q  (cost=3923702.09..4086462.51 rows=8138021 width=16)
  CTE q
    -&gt;  Recursive Union  (cost=0.00..3923702.09 rows=8138021 width=16)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy  (cost=0.00..8.54 rows=1 width=12)
                Index Cond: (id = 42)
          -&gt;  Subquery Scan q2  (cost=363886.28..376093.31 rows=813802 width=16)
                -&gt;  Unique  (cost=363886.28..367955.29 rows=813802 width=20)
                      -&gt;  Sort  (cost=363886.28..365920.79 rows=813802 width=20)
                            Sort Key: (max(hc.rgt) OVER (?)), hc.lft
                            -&gt;  WindowAgg  (cost=265683.50..283994.05 rows=813802 width=20)
                                  -&gt;  Sort  (cost=265683.50..267718.01 rows=813802 width=20)
                                        Sort Key: q.id, hc.lft
                                        -&gt;  Nested Loop  (cost=5335.67..185791.26 rows=813802 width=20)
                                              -&gt;  WorkTable Scan on q  (cost=0.00..0.22 rows=3 width=16)
                                                    Filter: (lvl &lt;= 2)
                                              -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5335.67..57861.34 rows=271267 width=12)
                                                    Recheck Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
                                                    -&gt;  Bitmap Index Scan on ix_hierarchy_lft  (cost=0.00..5267.85 rows=271267 width=0)
                                                          Index Cond: ((hc.lft &gt; q.lft) AND (hc.lft &lt; q.rgt))
</pre>
</div>
<p>This is also reasonably fast: only <strong>432 ms</strong>. It is slower than the same query over an adjacency list (which completes in several milliseconds), but still much faster than the <strong>R-Tree</strong> solution and, of course, than the least efficient <strong>B-Tree</strong> solutions involving <code>COUNT(*)</code>. It can make your life easier if you have to deal with a nested sets model.</p>
<p><strong>To be continued.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/03/01/postgresql-nested-sets-and-r-tree/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Six degrees of separation</title>
		<link>http://explainextended.com/2010/02/27/six-degrees-of-separation/</link>
		<comments>http://explainextended.com/2010/02/27/six-degrees-of-separation/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 20:00:45 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4477</guid>
		<description><![CDATA[Answering questions asked on the site. Kathy asks: I am developing a social network site in PostgreSQL and want to find out if two people are no more than 6 friends apart. If your site grows popular, most probably, they are not. But we better check. On most social networks, friendship is a symmetric relationship [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Kathy</strong> asks:</p>
<blockquote>
<p>I am developing a social network site in <strong>PostgreSQL</strong> and want to find out if two people are no more than <strong>6</strong> friends apart.</p>
</blockquote>
<p>If your site grows popular, then most probably <a href="http://en.wikipedia.org/wiki/Six_degrees_of_separation">they are not any further apart than that</a>. But we&#8217;d better check.</p>
<p>On most social networks, friendship is a symmetric relationship (<a href="http://livejournal.com">LiveJournal</a> is a notable exception). This means that if Alice is a friend of Bob, then Bob is a friend of Alice as well.</p>
<p>The friendship relationship is best stored in a many-to-many link table with a <code>PRIMARY KEY</code> on both link fields and an additional check constraint: the friend with the smaller id should be stored in the first column. This avoids storing a relationship twice: the <code>PRIMARY KEY</code> won&#8217;t be violated if the same record is inserted with the columns swapped, but the check constraint will be. The check constraint also forbids storing a person as their own friend.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-4477"></span><br />
<a href="#" onclick="xcollapse('X5926');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X5926" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE friends (
        orestes INT NOT NULL,
        pylades INT NOT NULL,
        CHECK (orestes &lt; pylades)
);

SELECT  SETSEED(0.20100227);

INSERT
INTO    friends
SELECT  o, p
FROM    (
        SELECT  o, SUM(FLOOR(RANDOM() * 100000) + 1) OVER (PARTITION BY o ORDER BY n) AS p
        FROM    (
                SELECT  o, generate_series(1, 20) n
                FROM    generate_series(1, 1000000) o
                ) q
        ) q2
WHERE   o &lt; p
        AND p &lt;= 1000000;

ALTER TABLE friends ADD CONSTRAINT pk_friends_op PRIMARY KEY (orestes, pylades);

CREATE UNIQUE INDEX ux_friends_po ON friends (pylades, orestes);
</pre>
</div>
<p>This table stores records for <strong>1,000,000</strong> people having <strong>20</strong> friends each on average. The first column is named <code>orestes</code> and the second one <code>pylades</code>.</p>
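<p>With this layout, a friendship has to be written and looked up with the smaller id first. A minimal sketch, assuming two hypothetical person ids, <code>42</code> and <code>7</code>, that are not friends yet:</p>
<pre class="brush: sql">
-- Hypothetical ids 42 and 7: LEAST / GREATEST put the smaller id first,
-- so the pair satisfies the CHECK constraint whichever way it arrives
INSERT
INTO    friends (orestes, pylades)
VALUES  (LEAST(42, 7), GREATEST(42, 7));

-- The same normalization is needed when testing for a direct friendship
SELECT  EXISTS
        (
        SELECT  NULL
        FROM    friends
        WHERE   orestes = LEAST(42, 7)
                AND pylades = GREATEST(42, 7)
        ) AS are_friends;
</pre>
<p>If the pair were already stored, the <code>PRIMARY KEY</code> would reject the duplicate regardless of the order in which the two ids were supplied.</p>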
<p>With the new <strong>PostgreSQL 8.4</strong> it is easy to write a recursive query that traverses the relationship graph up to a given level and stops on the first match.</p>
<p>However, each recursion step requires a join, and as the number of records in the input recordset for the recursion grows with each level, the joins become less and less efficient. The number of records grows exponentially, and on level <strong>6</strong> there will be about <strong>20 ^ 6 = 64,000,000</strong> records on input. This is just too much for a join with a <strong>20,000,000</strong>-record table.</p>
<p>As the chain length increases, the tree diverges, the number of the records grows and it becomes more and more costly to join them with the table.</p>
<p>To work around this, we should use Bogdan the tunnel builder&#8217;s algorithm.</p>
<blockquote><p>The British and French governments put out a tender to build a tunnel under the Channel. Many companies apply, all demanding years of time and billions of pounds, so their offers are refused.</p>
<p>One day, Bogdan drops in and offers his services.</p>
<p><q>How much money would you demand for your work?</q>, the official asks. <q>Me and my brother Roman are good eaters, so we will need to buy a decent meal every day. 50 pounds a day will be OK.</q></p>
<p><q>That&#8217;s pretty cheap; and how much time will you need?</q> <q>Me and my brother Roman are fast diggers; one mile a day I think we will dig.</q></p>
<p><q>Oh, that&#8217;s pretty fast! But how will you be able to work on such a low budget in such a short time?</q>, the official asks out of curiosity.</p>
<p><q>That&#8217;s simple</q>, Bogdan answers, <q>I start digging from the British coast, my brother Roman starts digging from the French coast; the moment we meet, the work is over</q>.</p>
<p><q>OK</q>, says the official, <q>but what if you don&#8217;t meet?</q>.</p>
<p><q>No worries then: you get two tunnels for the price of one</q>
</p></blockquote>
<p>We should do a similar thing here. Instead of traversing the <strong>6</strong> levels from the beginning, we will traverse just <strong>3</strong> levels from each side, then join the resulting recordsets and hope some matching records will be found.</p>
<p>Traversing only <strong>3</strong> levels will be quite fast; and the resulting recordsets will be of moderate size so joining them will be easy.</p>
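<p>A back-of-the-envelope comparison shows why this helps. It assumes, as a simplification, that every person has exactly <strong>20</strong> friends and that no person is reached twice:</p>
<pre class="brush: sql">
-- Rows produced on the last level when digging 6 levels from one side,
-- versus the last levels of two 3-level traversals taken together
SELECT  POWER(20, 6) AS one_sided_rows,
        2 * POWER(20, 3) AS two_sided_rows;
</pre>
<p>Under these assumptions, that is <strong>64,000,000</strong> records against <strong>16,000</strong>.</p>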
<p>As an extra, we will return the shortest path from one person to the other. To do this, we will need to record the friendship chain in an array. <strong>PostgreSQL</strong> does not offer an easy way to reverse an array, so in the first recordset, we will <em>append</em> the friends to the array, while in the second one we will <em>prepend</em> them. This way, we only need to concatenate the resulting <del>tunnels</del>arrays.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        q1 (person, chain, lvl) AS
        (
        SELECT  123456, ARRAY[123456], 1
        UNION ALL
        SELECT  friend, chain || friend, lvl + 1
        FROM    (
                SELECT  q1.*,
                        friend
                FROM    q1
                JOIN    (
                        SELECT  orestes AS me, pylades AS friend
                        FROM    friends
                        UNION ALL
                        SELECT  pylades AS me, orestes AS friend
                        FROM    friends
                        ) f
                ON      person = me
                WHERE   lvl &lt;= 3
                ) qo
        ),
        q2 (person, chain, lvl) AS
        (
        SELECT  654321, ARRAY[654321], 1
        UNION ALL
        SELECT  friend, friend || chain, lvl + 1
        FROM    (
                SELECT  q2.*,
                        friend
                FROM    q2
                JOIN    (
                        SELECT  orestes AS me, pylades AS friend
                        FROM    friends
                        UNION ALL
                        SELECT  pylades AS me, orestes AS friend
                        FROM    friends
                        ) f
                ON      person = me
                WHERE   lvl &lt;= 3
                ) qo
        )
SELECT  (q1.chain || q2.chain[2:q2.lvl])::TEXT AS chain
FROM    q1
JOIN    q2
ON      q2.person = q1.person
ORDER BY
        q1.lvl + q2.lvl
LIMIT 1
</pre>
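<p>The array handling used in the query can be tried in isolation. This is only a sketch on literal values (the ids are sample ones), showing appending, prepending, and dropping the duplicated meeting point with a slice before concatenation:</p>
<pre class="brush: sql">
SELECT  ARRAY[123456, 890237] || 278175 AS appended,   -- {123456,890237,278175}
        278175 || ARRAY[654321] AS prepended,          -- {278175,654321}
        -- The meeting point 278175 is the last element of the first chain
        -- and the first element of the second one, so the slice [2:2] skips it
        (ARRAY[123456, 890237, 278175] || (ARRAY[278175, 654321])[2:2])::TEXT AS chain;
</pre>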
<p><a href="#" onclick="xcollapse('X5520');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X5520" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>chain</th>
</tr>
<tr>
<td class="text">{123456,890237,278175,654321}</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0003s (0.5313s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=1138629880.05..1138629880.05 rows=1 width=72)
  CTE q1
    -&gt;  Recursive Union  (cost=0.00..71529.77 rows=2753901 width=40)
          -&gt;  Result  (cost=0.00..0.01 rows=1 width=0)
          -&gt;  Nested Loop  (cost=0.00..1645.17 rows=275390 width=40)
                Join Filter: (q1.person = &quot;20100227_friends&quot;.friends.orestes)
                -&gt;  WorkTable Scan on q1  (cost=0.00..0.22 rows=3 width=40)
                      Filter: (lvl &lt;= 3)
                -&gt;  Append  (cost=0.00..88.96 rows=30 width=8)
                      -&gt;  Index Scan using pk_friends_op on friends  (cost=0.00..37.55 rows=17 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.orestes = q1.person)
                      -&gt;  Index Scan using ux_friends_po on friends  (cost=0.00..51.40 rows=13 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.pylades = q1.person)
  CTE q2
    -&gt;  Recursive Union  (cost=0.00..71529.77 rows=2753901 width=40)
          -&gt;  Result  (cost=0.00..0.01 rows=1 width=0)
          -&gt;  Nested Loop  (cost=0.00..1645.17 rows=275390 width=40)
                Join Filter: (q2.person = &quot;20100227_friends&quot;.friends.orestes)
                -&gt;  WorkTable Scan on q2  (cost=0.00..0.22 rows=3 width=40)
                      Filter: (lvl &lt;= 3)
                -&gt;  Append  (cost=0.00..88.96 rows=30 width=8)
                      -&gt;  Index Scan using pk_friends_op on friends  (cost=0.00..37.55 rows=17 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.orestes = q2.person)
                      -&gt;  Index Scan using ux_friends_po on friends  (cost=0.00..51.40 rows=13 width=8)
                            Index Cond: (&quot;20100227_friends&quot;.friends.pylades = q2.person)
  -&gt;  Sort  (cost=1138486820.51..1233286454.49 rows=37919853589 width=72)
        Sort Key: ((q1.lvl + q2.lvl))
        -&gt;  Merge Join  (cost=849904.33..948887552.57 rows=37919853589 width=72)
              Merge Cond: (q1.person = q2.person)
              -&gt;  Sort  (cost=424952.16..431836.92 rows=2753901 width=40)
                    Sort Key: q1.person
                    -&gt;  CTE Scan on q1  (cost=0.00..55078.02 rows=2753901 width=40)
              -&gt;  Materialize  (cost=424952.16..459375.93 rows=2753901 width=40)
                    -&gt;  Sort  (cost=424952.16..431836.92 rows=2753901 width=40)
                          Sort Key: q2.person
                          -&gt;  CTE Scan on q2  (cost=0.00..55078.02 rows=2753901 width=40)
</pre>
</div>
<p>Note that the recursive reference to the <strong>CTE</strong> cannot appear more than once in the recursive part of the query. To work around that, we joined it to a derived table (a <code>UNION ALL</code> of two copies of the table with the columns swapped) instead of writing two separate joins. <strong>PostgreSQL</strong>&#8217;s optimizer was smart enough to push the join predicate into the derived table and distribute it over both parts, so that each part uses the corresponding index efficiently. This helps to traverse the tree and build the recordsets from both ends.</p>
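<p>The symmetric derived table could also be wrapped into a view to keep the recursive parts shorter. This is a sketch, not something the query above relies on: the view name <code>v_friends_both</code> is hypothetical, and it is an assumption that the predicate gets pushed into the view the same way it does for the inline derived table, so the plan should be checked with <code>EXPLAIN</code>:</p>
<pre class="brush: sql">
-- Hypothetical view: every friendship is visible from both sides
CREATE VIEW v_friends_both AS
SELECT  orestes AS me, pylades AS friend
FROM    friends
UNION ALL
SELECT  pylades AS me, orestes AS friend
FROM    friends;

-- All direct friends of a single person
SELECT  friend
FROM    v_friends_both
WHERE   me = 123456;
</pre>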
<p>Each of the recordsets has only about <strong>8,000</strong> records, so scanning and joining them is very fast.</p>
<p>The whole query takes just a little longer than <strong>0.5</strong> seconds.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/27/six-degrees-of-separation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sargability of monotonic functions: example</title>
		<link>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/</link>
		<comments>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 20:00:25 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4407</guid>
		<description><![CDATA[In my previous article I presented a proposal to add sargability of monotonic functions into the SQL engines. In a nutshell: a monotonic function is a function that preserves the order of the argument so that it gives the larger results for the larger values of the argument. It is easy to prove that a [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous article I presented a proposal to add <a href="/2010/02/19/things-sql-needs-sargability-of-monotonic-functions/">sargability of monotonic functions</a> into the <strong>SQL</strong> engines.</p>
<p>In a nutshell: a monotonic function is a function that preserves the order of its arguments, so that larger arguments give larger results. It is easy to prove that a <strong>B-Tree</strong> with each key replaced by the result of the function remains a valid <strong>B-Tree</strong> and hence can be used to search for ranges of function results just like it is used to search for ranges of values.</p>
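<p>For a strictly increasing function with a known inverse, the rewrite the optimizer would have to make is simple. A sketch using <code>EXP()</code> and its inverse <code>LN()</code> over a generated set of values (there is no index here, the two queries only illustrate that the conditions are equivalent):</p>
<pre class="brush: sql">
-- Condition on the function result: cannot use an index on the argument
SELECT  COUNT(*)
FROM    generate_series(1, 1000) s
WHERE   EXP(s::DOUBLE PRECISION / 100) BETWEEN 2 AND 3;

-- Same condition with the inverse applied to the bounds:
-- a plain range on the argument, which is sargable
SELECT  COUNT(*)
FROM    generate_series(1, 1000) s
WHERE   s::DOUBLE PRECISION / 100
        BETWEEN LN(2::DOUBLE PRECISION) AND LN(3::DOUBLE PRECISION);
</pre>
<p>Both queries should return the same count, but only the second form could be served by a range scan if the argument were an indexed column.</p>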
<p>With a little effort, a <strong>B-Tree</strong> can also be used to search for ranges of piecewise monotonic functions: those whose domain can be split into a number of contiguous pieces with the function being monotonic within each piece (though it may be neither monotonic nor continuous across the pieces).</p>
<p>In this article, I&#8217;ll demonstrate the algorithm to do that (implemented in pure <strong>SQL</strong> on <strong>PostgreSQL</strong>).</p>
<p>I will show how the performance of a simple query </p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<p>could be improved if the sargability of monotonic functions were implemented in the optimizer.<br />
<span id="more-4407"></span><br />
To do this, I will create a sample table:</p>
<p><a href="#" onclick="xcollapse('X1002');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X1002" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_sine (
        id INT NOT NULL PRIMARY KEY,
        value DOUBLE PRECISION NOT NULL
);

CREATE INDEX ix_sine_value ON t_sine (value);

SELECT  SETSEED(0.20100223);

INSERT
INTO    t_sine
SELECT  num, num / 10000.00 + RANDOM()
FROM    generate_series(1, 1000000) num;

ANALYZE t_sine;
</pre>
</div>
<p>This table contains <strong>1,000,000</strong> records with <code>value</code> randomly distributed from <strong>0</strong> to <strong>101</strong>.</p>
<p>To select the records we need, we can use a very simple and straightforward query:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<p><a href="#" onclick="xcollapse('X1272');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1272" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>value</th>
</tr>
<tr>
<td class="int4">3663</td>
<td class="float8">0.46150185738802</td>
</tr>
<tr>
<td class="int4">19263</td>
<td class="float8">2.68015610060766</td>
</tr>
<tr>
<td class="int4">23783</td>
<td class="float8">2.68013202354237</td>
</tr>
<tr>
<td class="int4">86110</td>
<td class="float8">8.963312032599</td>
</tr>
<tr>
<td class="int4">128053</td>
<td class="float8">13.0278523004308</td>
</tr>
<tr>
<td class="int4">150339</td>
<td class="float8">15.2465362691633</td>
</tr>
<tr>
<td class="int4">185849</td>
<td class="float8">19.310986539682</td>
</tr>
<tr>
<td class="int4">186788</td>
<td class="float8">19.3110526885197</td>
</tr>
<tr>
<td class="int4">191391</td>
<td class="float8">19.3110088731334</td>
</tr>
<tr>
<td class="int4">210841</td>
<td class="float8">21.5297331408583</td>
</tr>
<tr>
<td class="int4">212511</td>
<td class="float8">21.529697893659</td>
</tr>
<tr>
<td class="int4">247639</td>
<td class="float8">25.5941842560224</td>
</tr>
<tr>
<td class="int4">373019</td>
<td class="float8">38.1605504324339</td>
</tr>
<tr>
<td class="int4">373416</td>
<td class="float8">38.1606072025172</td>
</tr>
<tr>
<td class="int4">458141</td>
<td class="float8">46.6624391236514</td>
</tr>
<tr>
<td class="int4">462683</td>
<td class="float8">46.6623900452435</td>
</tr>
<tr>
<td class="int4">462704</td>
<td class="float8">46.662440645238</td>
</tr>
<tr>
<td class="int4">463233</td>
<td class="float8">46.6624209528782</td>
</tr>
<tr>
<td class="int4">520118</td>
<td class="float8">52.945639446865</td>
</tr>
<tr>
<td class="int4">522686</td>
<td class="float8">52.9456708737895</td>
</tr>
<tr>
<td class="int4">561721</td>
<td class="float8">57.0100855686799</td>
</tr>
<tr>
<td class="int4">686886</td>
<td class="float8">69.5764806582652</td>
</tr>
<tr>
<td class="int4">711952</td>
<td class="float8">71.7951245983548</td>
</tr>
<tr>
<td class="int4">716508</td>
<td class="float8">71.7952263388403</td>
</tr>
<tr>
<td class="int4">778116</td>
<td class="float8">78.0783171531357</td>
</tr>
<tr>
<td class="int4">877138</td>
<td class="float8">88.4260205388732</td>
</tr>
<tr>
<td class="int4">903050</td>
<td class="float8">90.6446926224232</td>
</tr>
<tr>
<td class="int4">942345</td>
<td class="float8">94.7092782433405</td>
</tr>
<tr>
<td class="int4">946181</td>
<td class="float8">94.7092303088069</td>
</tr>
<tr>
<td class="int4">966815</td>
<td class="float8">96.9279387165383</td>
</tr>
<tr>
<td class="int4">999931</td>
<td class="float8">100.992433886895</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0022s (3.2500s)</td>
</tr>
</table>
</div>
<pre>
Seq Scan on t_sine  (cost=0.00..25406.00 rows=5000 width=12)
  Filter: ((sin(value) &gt;= 0.4452::double precision) AND (sin(value) &lt;= 0.4453::double precision))
</pre>
</div>
<p>which returns <strong>31</strong> records in <strong>3.25</strong> seconds.</p>
<p>The query uses a full table scan with the filter applied to each record.</p>
<p>Let&#8217;s try to improve it.</p>
<h3>Function description</h3>
<p>According to the notation I proposed in the previous article, the monotonicity of the function <code>SIN()</code> should be described like this:</p>
<p><code>SIN(arg FLOAT) MONOTONIC PIECEWISE<br />
DEFINED BY FLOOR(arg / PI() + 0.5)<br />
CASE PIECE % 2<br />
WHEN 0 THEN INCREASING INVERSE PIECE * PI() + ASIN(RESULT)<br />
ELSE DECREASING INVERSE PIECE * PI() - ASIN(RESULT)<br />
END</code></p>
<p>This means that:</p>
<ol>
<li>
<p>The function is piecewise monotonic.</p>
</li>
<li>
<p>The pieces are defined by the function <code>FLOOR(arg / PI() + 0.5)</code> (which essentially returns the number of the half-wave the argument belongs to).</p>
</li>
<li>
<p>The direction of monotonicity depends on the piece.</p>
</li>
<li>
<p>On even pieces (such as piece <strong>0</strong>, which spans the arguments from <strong>-&pi;/2</strong> to <strong>&pi;/2</strong>), the function increases.</p>
</li>
<li>
<p>On odd pieces, the function decreases.</p>
</li>
<li>
<p>A single inverse expression is provided for each direction.</p>
</li>
</ol>
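<p>The piece number and the inverse expressions can be checked on a few sample arguments. This is a sketch on arbitrary values; the column names are made up for illustration:</p>
<pre class="brush: sql">
SELECT  arg,
        piece,
        CASE WHEN piece::INT % 2 = 0 THEN 'increasing' ELSE 'decreasing' END AS direction,
        -- PIECE * PI() + ASIN(RESULT) on even pieces,
        -- PIECE * PI() - ASIN(RESULT) on odd ones
        piece * PI()
        + CASE WHEN piece::INT % 2 = 0 THEN 1 ELSE -1 END * ASIN(SIN(arg)) AS inverse
FROM    (
        SELECT  arg, FLOOR(arg / PI() + 0.5) AS piece
        FROM    (
                VALUES  (0.5::DOUBLE PRECISION), (2.0), (4.0), (6.0)
                ) v (arg)
        ) q;
</pre>
<p>For each row, <code>inverse</code> should match <code>arg</code> up to rounding errors: the arguments <strong>0.5</strong> and <strong>6.0</strong> fall into the even pieces <strong>0</strong> and <strong>2</strong>, while <strong>2.0</strong> and <strong>4.0</strong> both fall into the odd piece <strong>1</strong>.</p>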
<p>Note that mathematically the function is strictly monotonic on each of its pieces. However, due to rounding errors, different arguments can yield the same function result, so a function value may map back to a range of arguments rather than to a single value.</p>
<p>In theory, it is possible to write a single expression which would map the function&#8217;s result to the pair of values defining the beginning and the end of such a range. However, the expression would be quite complex. So for illustration purposes I&#8217;ll make do with a single inverse function that yields an approximation of the back mapping. To find the exact range, some extra effort will be required.</p>
<h3>Building the pieces</h3>
<p>The function is piecewise monotonic and the pieces are defined by a function. For the pieces to be contiguous, the function that defines them should itself be monotonic over its whole domain.</p>
<p>The function that defines the pieces is <code>FLOOR(arg / PI() + 0.5)</code>.</p>
<p>It is a composition of three functions:</p>
<ul>
<li>
<p><code>OPERATOR_DIVISION(arg1 FLOAT, arg2 FLOAT)<br />
MONOTONIC OVER (arg1)<br />
CASE WHEN arg2 &gt; 0 THEN INCREASING INVERSE RESULT * arg2<br />
WHEN arg2 = 0 THEN UNDEFINED<br />
WHEN arg2 &lt; 0 THEN DECREASING INVERSE RESULT * arg2<br />
END</code></p>
</li>
<li>
<p><code>OPERATOR_PLUS(arg1 FLOAT, arg2 FLOAT) MONOTONIC<br />
OVER (arg1) STRICTLY INCREASING INVERSE RESULT - arg2,<br />
OVER (arg2) STRICTLY INCREASING INVERSE RESULT - arg1</code></p>
</li>
<li>
<p><code>FLOOR(arg FLOAT) MONOTONIC INCREASING<br />
INVERSE<br />
FROM RESULT EXACT<br />
TO RESULT + 1 EXACT EXCLUDE</code></p>
</li>
</ul>
<p>All three are increasing over their first argument, given the values of the constants supplied as the second arguments (<code>PI()</code> and <code>0.5</code>). As we know from math, a composition of monotonic functions is also monotonic.</p>
<p>Each function is defined with a single inverse expression which maps the result of the function back to a value <em>near</em> the range of arguments yielding that result. The exact range has to be found with an index seek over (hopefully) not too many records.</p>
<p>Sequentially applying the inverse expressions of each of the constituent functions to the result of the piece defining function, we get the following inverse expression for the latter:</p>
<ol>
<li><code>FLOOR(OPERATOR_PLUS(OPERATOR_DIVISION(arg, PI()), 0.5)) = PIECE</code></li>
<li><code>OPERATOR_PLUS(OPERATOR_DIVISION(arg, PI()), 0.5) ∈ [ PIECE, PIECE + 1 )</code></li>
<li><code>OPERATOR_DIVISION(arg, PI()) ∈ [ ≈(PIECE - 0.5), ≈((PIECE + 1) - 0.5) ]</code></li>
<li><code>arg ∈ [ ≈((PIECE - 0.5) * PI()), ≈(((PIECE + 1) - 0.5) * PI()) ]</code></li>
</ol>
<p>For each piece, we now have a pair of values <em>approximately</em> defining the range of values belonging to the piece.</p>
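<p>These approximate bounds are easy to tabulate. A sketch for the first few pieces (the column names are made up for illustration):</p>
<pre class="brush: sql">
-- Approximate range of arguments for each piece:
-- from (PIECE - 0.5) * PI() to ((PIECE + 1) - 0.5) * PI()
SELECT  piece,
        (piece - 0.5) * PI() AS approx_lower,
        ((piece + 1) - 0.5) * PI() AS approx_upper
FROM    generate_series(0, 3) piece;
</pre>
<p>For piece <strong>1</strong>, for instance, this gives a range from about <strong>1.5708</strong> to about <strong>4.7124</strong>.</p>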
<p>To find out the exact bounds, we need to do the following:</p>
<ol>
<li>
<p>Calculate the piece for the minimal <code>value</code>.</p>
</li>
<li>
<p>Find the approximate upper bound for the piece.</p>
</li>
<li>
<p>Scanning the keys to the left, find the <strong>rightmost</strong> key to the <strong>left of the upper bound</strong> that belongs to the current (or previous) piece.</p>
</li>
<li>
<p>Scanning the keys to the right, find the <strong>first</strong> key of the <strong>next</strong> piece.</p>
</li>
<li>
<p>Scanning a single key to the left, find the <strong>last</strong> key of the <strong>current</strong> piece.</p>
</li>
<li>
<p>Recursively repeat steps <strong>1</strong> to <strong>5</strong>, taking the first key of the next piece (found in step <strong>4</strong>) as the seed for step <strong>1</strong>, until step <strong>4</strong> fails (which means that there are no more pieces).</p>
</li>
</ol>
<p>This procedure guarantees that we always get the correct bounds even with an inexact inverse value, since it correctly handles both overflow and underflow of the inverse value, as shown in the pictures below:</p>
<h4>Overflow</h4>
<p><img src="http://explainextended.com/wp-content/uploads/2010/02/overflow.png" alt="" title="Overflow" width="700" height="500" class="size-full wp-image-4420 noborder" /></p>
<h4>Underflow</h4>
<p><img src="http://explainextended.com/wp-content/uploads/2010/02/underflow.png" alt="" title="Underflow" width="700" height="500" class="aligncenter size-full wp-image-4419 noborder" /></p>
<p>Here's a query that selects the first and the last key of each piece:</p>
<p><a href="#" onclick="xcollapse('X822');return false;"><strong>View the query</strong></a><br />
</p>
<div id="X822" style="display: none; ">
<pre class="brush: sql">
WITH    RECURSIVE
        d AS (
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  minv, FLOOR(minv / PI() + 0.5) AS piece
                        FROM    (
                                SELECT  MIN(value) AS minv
                                FROM    t_sine
                                ) q
                        ) q2
                ) q3
        UNION ALL
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  nextv AS minv, nextpiece AS piece
                        FROM    d
                        WHERE   nextpiece IS NOT NULL
                        ) q2
                ) q3
        )
SELECT  *
FROM    d
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>piece</th>
<th>minv</th>
<th>maxv</th>
<th>nextv</th>
<th>nextpiece</th>
</tr>
<tr>
<td class="float8">0</td>
<td class="float8">0.0172837972298265</td>
<td class="float8">1.57060804706216</td>
<td class="float8">1.57081433883309</td>
<td class="float8">1</td>
</tr>
<tr>
<td class="float8">1</td>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">4.71268981658742</td>
<td class="float8">2</td>
</tr>
<tr>
<td class="float8">2</td>
<td class="float8">4.71268981658742</td>
<td class="float8">7.85390641669333</td>
<td class="float8">7.85409013534784</td>
<td class="float8">3</td>
</tr>
<tr>
<td class="float8">3</td>
<td class="float8">7.85409013534784</td>
<td class="float8">10.9955484666698</td>
<td class="float8">10.9956333589859</td>
<td class="float8">4</td>
</tr>
<tr>
<td class="float8">4</td>
<td class="float8">10.9956333589859</td>
<td class="float8">14.1371444690436</td>
<td class="float8">14.1372372000463</td>
<td class="float8">5</td>
</tr>
<tr class="break">
<td colspan="100"/></tr>
<tr>
<td class="float8">31</td>
<td class="float8">95.8185816861346</td>
<td class="float8">98.9601405610755</td>
<td class="float8">98.9601765494391</td>
<td class="float8">32</td>
</tr>
<tr>
<td class="float8">32</td>
<td class="float8">98.9601765494391</td>
<td class="float8">100.992433886895</td>
<td class="float8"></td>
<td class="float8"></td>
</tr>
<tr class="statusbar">
<td colspan="100">33 rows fetched in 0.0057s (0.0484s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on d  (cost=65.97..67.99 rows=101 width=40)
  CTE d
    -&gt;  Recursive Union  (cost=0.08..65.97 rows=101 width=16)
          -&gt;  Subquery Scan q  (cost=0.08..0.80 rows=1 width=8)
                InitPlan 5 (returns $5)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 4 (returns $4)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                      InitPlan 10 (returns $8)
                        -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                              -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                    Filter: (value IS NOT NULL)
                SubPlan 3
                  -&gt;  Limit  (cost=0.22..0.26 rows=1 width=8)
                        InitPlan 2 (returns $3)
                          -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                                InitPlan 1 (returns $2)
                                  -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $2)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($3)[1])
                SubPlan 7
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 6 (returns $6)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $6)
                SubPlan 9
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 8 (returns $7)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $7)
          -&gt;  WorkTable Scan on d  (cost=0.04..6.31 rows=10 width=16)
                Filter: (d.nextpiece IS NOT NULL)
                InitPlan 15 (returns $13)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 14 (returns $12)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                SubPlan 13
                  -&gt;  Limit  (cost=0.19..0.23 rows=1 width=8)
                        InitPlan 12 (returns $11)
                          -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                                InitPlan 11 (returns $10)
                                  -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $10)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($11)[1])
                SubPlan 17
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 16 (returns $14)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $14)
                SubPlan 19
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 18 (returns $15)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $15)
</pre>
</div>
<p>Despite its size, the query is very efficient and completes in only <strong>48 ms</strong>.</p>
<h3>Locating values within the pieces</h3>
<p>Now that we have the exact bounds of each piece, we need to locate the matching records within each piece.</p>
<p>Since we don't have an exact inverse function here, the basic idea is the same as above: given the approximate inverse, locate the exact bound iteratively:</p>
<ol>
<li>Locate the first key to the left of the inverse whose function value is less than the value sought, going no further than the first key of the piece. If this search fails, the first key of the piece is the lower bound.</li>
<li>Locate the first key to the right of the one found in the previous step whose function value is equal to or greater than the value sought, going no further than the last key of the piece. Return <code>NULL</code> if there is none.</li>
</ol>
<p>Since we have an inclusive range here, we don't need the third step (final scan to the left to find the rightmost least value) that we used when searching for the pieces.</p>
<p>This algorithm searches for the lower bound; to search for the upper bound, we just need to invert both the directions and the tests (<q>left</q> becomes <q>right</q>, <q>less</q> becomes <q>greater</q> etc).</p>
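<p>As a minimal sketch of the two steps, here is the lower bound search for one increasing piece taken in isolation. The inline row <code>p</code> and its literal values are assumptions made for illustration only (in the real query the piece number and its bounds come from the recursive <abbr>CTE</abbr>), and <code>0.4452</code> is the lower end of the range sought for:</p>
<pre class="brush: sql">
SELECT  (
        SELECT  value
        FROM    t_sine
        WHERE   value &gt;=
                COALESCE(
                (
                -- Step 1: first key to the left of the approximate inverse
                -- whose sine is still below the value sought for
                SELECT  value
                FROM    t_sine
                WHERE   value &lt;= LEAST(p.piece * PI() + ASIN(0.4452), p.maxv)
                        AND value &gt;= p.minv
                        AND SIN(value) &lt; 0.4452
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                -- Step 1 found nothing: start from the first key of the piece
                p.minv
                )
                -- Step 2: first key to the right that reaches the value sought for
                AND value &lt;= p.maxv
                AND SIN(value) &gt;= 0.4452
        ORDER BY
                value
        LIMIT 1
        ) AS llimit
FROM    (
        -- Hypothetical piece: number 0 with illustrative bounds
        VALUES  (0::DOUBLE PRECISION, 0.0172837972298265::DOUBLE PRECISION, 1.57060804706216::DOUBLE PRECISION)
        ) AS p (piece, minv, maxv)
</pre>
<p>Both subqueries are ordered on <code>value</code> and limited to one row, so each of them can be served by a short scan on the index on <code>value</code>.</p>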
<p>Since the monotonicity of the function varies from piece to piece, we should take this into account. For the pieces where the function is decreasing, we should swap the order of the bounds: the upper bound of the expression becomes the lower bound of the range of values and vice versa. This can be handled simply by putting the conditions into the very same <code>CASE</code> expression that defines the monotonicity.</p>
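<p>To illustrate how the parity of the piece number drives both the direction and the inverse, here is a small self-contained sketch. It does not touch <code>t_sine</code>; the piece numbers come from <code>generate_series</code>, and the column names are made up for this example:</p>
<pre class="brush: sql">
SELECT  piece,
        -- Even pieces are increasing, odd pieces are decreasing
        CASE piece % 2 WHEN 0 THEN 'increasing' ELSE 'decreasing' END AS direction,
        inverse,
        -- Should give back approximately 0.4452 on every piece
        SIN(inverse) AS sin_check
FROM    (
        SELECT  piece,
                CASE piece % 2
                WHEN 0 THEN piece * PI() + ASIN(0.4452)
                ELSE piece * PI() - ASIN(0.4452)
                END AS inverse
        FROM    generate_series(0, 3) AS piece
        ) q
</pre>
<p>On the odd pieces the sign in front of <code>ASIN</code> flips, which is why a value that serves as the lower bound of the expression ends up on the right side of the range of keys.</p>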
<p>Once we have located the upper and the lower bounds for each piece, we just join <code>t_sine</code> on the following condition:</p>
<pre class="brush: sql">
ON value BETWEEN llimit AND ulimit
</pre>
<p>It can happen that the lower bound found by the algorithm exceeds the upper bound. This is a perfectly normal situation: it means that no keys match the condition and the two searches passed each other. The <code>BETWEEN</code> predicate handles this, since it is never true for an inverted range.</p>
<p>It can also happen that one of the bounds is <code>NULL</code>. This is also a valid situation, meaning that no value within the piece exceeds the lower bound (or falls short of the upper one). <code>BETWEEN</code> takes care of this as well: with a <code>NULL</code> bound it never evaluates to true, so the join returns no rows for that piece.</p>
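<p>Both cases are easy to check on constants. This is only a quick sketch with made-up numbers, not a part of the query:</p>
<pre class="brush: sql">
SELECT  5 BETWEEN 7 AND 3 AS inverted_range,    -- false: the lower bound exceeds the upper one
        5 BETWEEN NULL AND 10 AS null_lower,    -- NULL: 5 &gt;= NULL is unknown
        5 BETWEEN 1 AND NULL AS null_upper      -- NULL: 5 &lt;= NULL is unknown
</pre>
<p>A join condition keeps a row only when it evaluates to true, so neither <code>FALSE</code> nor <code>NULL</code> lets any key through.</p>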
<h3>Final query</h3>
<p>And here's the final query. Be ready, you'll have to spin your mouse wheel a lot:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        d AS (
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  minv, FLOOR(minv / PI() + 0.5) AS piece
                        FROM    (
                                SELECT  MIN(value) AS minv
                                FROM    t_sine
                                ) q
                        ) q2
                ) q3
        UNION ALL
        SELECT  piece,
                minv,
                COALESCE(
                (
                SELECT  value
                FROM    t_sine
                WHERE   value &lt; nv[1]
                ORDER BY
                        value DESC
                LIMIT 1
                ),
                (
                SELECT  MAX(value)
                FROM    t_sine
                )
                ) AS maxv,
                nv[1] AS nextv,
                nv[2] AS nextpiece
        FROM    (
                SELECT  minv, piece,
                        (
                        SELECT  ARRAY[value, FLOOR(value / PI() + 0.5)]
                        FROM    t_sine
                        WHERE   value &gt;
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= ((piece + 1) - 0.5) * PI()
                                        AND FLOOR(value / PI() + 0.5) &lt;= piece
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                )
                        ORDER BY
                                value
                        LIMIT 1
                        ) nv
                FROM    (
                        SELECT  nextv AS minv, nextpiece AS piece
                        FROM    d
                        WHERE   nextpiece IS NOT NULL
                        ) q2
                ) q3
        )
SELECT  l.*, s.*, SIN(value)
FROM    (
        SELECT  minv, maxv,
                CASE piece::INTEGER % 2
                WHEN 0 THEN
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &gt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= LEAST(piece * PI() + ASIN(0.4452), maxv)
                                        AND value &gt;= minv
                                        AND SIN(value) &lt; 0.4452
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                ),
                                minv
                                )
                                AND value &lt;= maxv
                                AND SIN(value) &gt;= 0.4452
                        ORDER BY
                                value
                        LIMIT 1
                        )
                ELSE
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &gt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &lt;= LEAST(piece * PI() - ASIN(0.4453), maxv)
                                        AND value &gt;= minv
                                        AND SIN(value) &gt; 0.4453
                                ORDER BY
                                        value DESC
                                LIMIT 1
                                ),
                                minv
                                )
                                AND value &lt;= maxv
                                AND SIN(value) &lt;= 0.4453
                        ORDER BY
                                value
                        LIMIT 1
                        )
                END AS llimit,
                CASE piece::INTEGER % 2
                WHEN 0 THEN
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &lt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &gt;= GREATEST(piece * PI() + ASIN(0.4453), minv)
                                        AND value &lt;= maxv
                                        AND SIN(value) &gt; 0.4453
                                ORDER BY
                                        value
                                LIMIT 1
                                ),
                                maxv
                                )
                                AND value &gt;= minv
                                AND SIN(value) &lt;= 0.4453
                        ORDER BY
                                value DESC
                        LIMIT 1
                        )
                ELSE
                        (
                        SELECT  value
                        FROM    t_sine
                        WHERE   value &lt;=
                                COALESCE(
                                (
                                SELECT  value
                                FROM    t_sine
                                WHERE   value &gt;= GREATEST(piece * PI() - ASIN(0.4452), minv)
                                        AND value &lt;= maxv
                                        AND SIN(value) &lt; 0.4452
                                ORDER BY
                                        value
                                LIMIT 1
                                ),
                                maxv
                                )
                                AND value &gt;= minv
                                AND SIN(value) &gt;= 0.4452
                        ORDER BY
                                value DESC
                        LIMIT 1
                        )
                END AS ulimit
        FROM    d
        ) l
JOIN    t_sine s
ON      value BETWEEN llimit AND ulimit
</pre>
<p><a href="#" onclick="xcollapse('X1409');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X1409" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>minv</th>
<th>maxv</th>
<th>llimit</th>
<th>ulimit</th>
<th>id</th>
<th>value</th>
<th>sin</th>
</tr>
<tr>
<td class="float8">0.0172837972298265</td>
<td class="float8">1.57060804706216</td>
<td class="float8">0.46150185738802</td>
<td class="float8">0.46150185738802</td>
<td class="int4">3663</td>
<td class="float8">0.46150185738802</td>
<td class="float8">0.44529334884391</td>
</tr>
<tr>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">2.68013202354237</td>
<td class="float8">2.68015610060766</td>
<td class="int4">23783</td>
<td class="float8">2.68013202354237</td>
<td class="float8">0.445256434133825</td>
</tr>
<tr>
<td class="float8">1.57081433883309</td>
<td class="float8">4.7123523916252</td>
<td class="float8">2.68013202354237</td>
<td class="float8">2.68015610060766</td>
<td class="int4">19263</td>
<td class="float8">2.68015610060766</td>
<td class="float8">0.445234875325918</td>
</tr>
<tr>
<td class="float8">7.85409013534784</td>
<td class="float8">10.9955484666698</td>
<td class="float8">8.963312032599</td>
<td class="float8">8.963312032599</td>
<td class="int4">86110</td>
<td class="float8">8.963312032599</td>
<td class="float8">0.445261178083286</td>
</tr>
<tr>
<td class="float8">10.9956333589859</td>
<td class="float8">14.1371444690436</td>
<td class="float8">13.0278523004308</td>
<td class="float8">13.0278523004308</td>
<td class="int4">128053</td>
<td class="float8">13.0278523004308</td>
<td class="float8">0.445275287664458</td>
</tr>
<tr>
<td class="float8">14.1372372000463</td>
<td class="float8">17.2787474542275</td>
<td class="float8">15.2465362691633</td>
<td class="float8">15.2465362691633</td>
<td class="int4">150339</td>
<td class="float8">15.2465362691633</td>
<td class="float8">0.445226320346027</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">185849</td>
<td class="float8">19.310986539682</td>
<td class="float8">0.445229561181329</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">191391</td>
<td class="float8">19.3110088731334</td>
<td class="float8">0.445249558810258</td>
</tr>
<tr>
<td class="float8">17.2790936637506</td>
<td class="float8">20.4199075459354</td>
<td class="float8">19.310986539682</td>
<td class="float8">19.3110526885197</td>
<td class="int4">186788</td>
<td class="float8">19.3110526885197</td>
<td class="float8">0.44528879096528</td>
</tr>
<tr>
<td class="float8">20.4204805635758</td>
<td class="float8">23.561747650259</td>
<td class="float8">21.529697893659</td>
<td class="float8">21.5297331408583</td>
<td class="int4">212511</td>
<td class="float8">21.529697893659</td>
<td class="float8">0.445247526124325</td>
</tr>
<tr>
<td class="float8">20.4204805635758</td>
<td class="float8">23.561747650259</td>
<td class="float8">21.529697893659</td>
<td class="float8">21.5297331408583</td>
<td class="int4">210841</td>
<td class="float8">21.5297331408583</td>
<td class="float8">0.445215965240229</td>
</tr>
<tr>
<td class="float8">23.5619455649868</td>
<td class="float8">26.7032224601433</td>
<td class="float8">25.5941842560224</td>
<td class="float8">25.5941842560224</td>
<td class="int4">247639</td>
<td class="float8">25.5941842560224</td>
<td class="float8">0.445240672513922</td>
</tr>
<tr>
<td class="float8">36.1283162861556</td>
<td class="float8">39.2698307414278</td>
<td class="float8">38.1605504324339</td>
<td class="float8">38.1606072025172</td>
<td class="int4">373019</td>
<td class="float8">38.1605504324339</td>
<td class="float8">0.445236698722671</td>
</tr>
<tr>
<td class="float8">36.1283162861556</td>
<td class="float8">39.2698307414278</td>
<td class="float8">38.1605504324339</td>
<td class="float8">38.1606072025172</td>
<td class="int4">373416</td>
<td class="float8">38.1606072025172</td>
<td class="float8">0.445287530670759</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">462683</td>
<td class="float8">46.6623900452435</td>
<td class="float8">0.445291469623201</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">463233</td>
<td class="float8">46.6624209528782</td>
<td class="float8">0.445263795157205</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">458141</td>
<td class="float8">46.6624391236514</td>
<td class="float8">0.445247524983561</td>
</tr>
<tr>
<td class="float8">45.5530965244733</td>
<td class="float8">48.694549004443</td>
<td class="float8">46.6623900452435</td>
<td class="float8">46.662440645238</td>
<td class="int4">462704</td>
<td class="float8">46.662440645238</td>
<td class="float8">0.445246162542927</td>
</tr>
<tr>
<td class="float8">51.8363581438176</td>
<td class="float8">54.9774493159346</td>
<td class="float8">52.945639446865</td>
<td class="float8">52.9456708737895</td>
<td class="int4">520118</td>
<td class="float8">52.945639446865</td>
<td class="float8">0.445234079463435</td>
</tr>
<tr>
<td class="float8">51.8363581438176</td>
<td class="float8">54.9774493159346</td>
<td class="float8">52.945639446865</td>
<td class="float8">52.9456708737895</td>
<td class="int4">522686</td>
<td class="float8">52.9456708737895</td>
<td class="float8">0.445205939128705</td>
</tr>
<tr>
<td class="float8">54.9779645633645</td>
<td class="float8">58.1193389566675</td>
<td class="float8">57.0100855686799</td>
<td class="float8">57.0100855686799</td>
<td class="int4">561721</td>
<td class="float8">57.0100855686799</td>
<td class="float8">0.445218087206929</td>
</tr>
<tr>
<td class="float8">67.5442983367741</td>
<td class="float8">70.6858307610475</td>
<td class="float8">69.5764806582652</td>
<td class="float8">69.5764806582652</td>
<td class="int4">686886</td>
<td class="float8">69.5764806582652</td>
<td class="float8">0.445240002733585</td>
</tr>
<tr>
<td class="float8">70.6861408688866</td>
<td class="float8">73.8273966856673</td>
<td class="float8">71.7951245983548</td>
<td class="float8">71.7952263388403</td>
<td class="int4">711952</td>
<td class="float8">71.7951245983548</td>
<td class="float8">0.445297446856164</td>
</tr>
<tr>
<td class="float8">70.6861408688866</td>
<td class="float8">73.8273966856673</td>
<td class="float8">71.7951245983548</td>
<td class="float8">71.7952263388403</td>
<td class="int4">716508</td>
<td class="float8">71.7952263388403</td>
<td class="float8">0.445206347880861</td>
</tr>
<tr>
<td class="float8">76.9690608614191</td>
<td class="float8">80.1104563690938</td>
<td class="float8">78.0783171531357</td>
<td class="float8">78.0783171531357</td>
<td class="int4">778116</td>
<td class="float8">78.0783171531357</td>
<td class="float8">0.445290957467673</td>
</tr>
<tr>
<td class="float8">86.3938749629512</td>
<td class="float8">89.5353266705379</td>
<td class="float8">88.4260205388732</td>
<td class="float8">88.4260205388732</td>
<td class="int4">877138</td>
<td class="float8">88.4260205388732</td>
<td class="float8">0.445225639446173</td>
</tr>
<tr>
<td class="float8">89.5354929595716</td>
<td class="float8">92.6769711233795</td>
<td class="float8">90.6446926224232</td>
<td class="float8">90.6446926224232</td>
<td class="int4">903050</td>
<td class="float8">90.6446926224232</td>
<td class="float8">0.445286610427917</td>
</tr>
<tr>
<td class="float8">92.6770082124449</td>
<td class="float8">95.8184746547915</td>
<td class="float8">94.7092303088069</td>
<td class="float8">94.7092782433405</td>
<td class="int4">946181</td>
<td class="float8">94.7092303088069</td>
<td class="float8">0.445247543713327</td>
</tr>
<tr>
<td class="float8">92.6770082124449</td>
<td class="float8">95.8184746547915</td>
<td class="float8">94.7092303088069</td>
<td class="float8">94.7092782433405</td>
<td class="int4">942345</td>
<td class="float8">94.7092782433405</td>
<td class="float8">0.44529046414363</td>
</tr>
<tr>
<td class="float8">95.8185816861346</td>
<td class="float8">98.9601405610755</td>
<td class="float8">96.9279387165383</td>
<td class="float8">96.9279387165383</td>
<td class="int4">966815</td>
<td class="float8">96.9279387165383</td>
<td class="float8">0.445232181707114</td>
</tr>
<tr>
<td class="float8">98.9601765494391</td>
<td class="float8">100.992433886895</td>
<td class="float8">100.992433886895</td>
<td class="float8">100.992433886895</td>
<td class="int4">999931</td>
<td class="float8">100.992433886895</td>
<td class="float8">0.445263903548165</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0071s (0.1153s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=67.09..25557916.04 rows=11222222 width=36)
  CTE d
    -&gt;  Recursive Union  (cost=0.08..65.97 rows=101 width=16)
          -&gt;  Subquery Scan q  (cost=0.08..0.80 rows=1 width=8)
                InitPlan 5 (returns $5)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 4 (returns $4)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                      InitPlan 10 (returns $8)
                        -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                              -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                    Filter: (value IS NOT NULL)
                SubPlan 3
                  -&gt;  Limit  (cost=0.22..0.26 rows=1 width=8)
                        InitPlan 2 (returns $3)
                          -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                                InitPlan 1 (returns $2)
                                  -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $2)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($3)[1])
                SubPlan 7
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 6 (returns $6)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $6)
                SubPlan 9
                  -&gt;  Limit  (cost=0.18..0.22 rows=1 width=8)
                        InitPlan 8 (returns $7)
                          -&gt;  Limit  (cost=0.02..0.18 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.02..17938.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= (((floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)) + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= floor((($1 / 3.14159265358979::double precision) + 0.5::double precision)))
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $7)
          -&gt;  WorkTable Scan on d  (cost=0.04..6.31 rows=10 width=16)
                Filter: (d.nextpiece IS NOT NULL)
                InitPlan 15 (returns $13)
                  -&gt;  Result  (cost=0.03..0.04 rows=1 width=0)
                        InitPlan 14 (returns $12)
                          -&gt;  Limit  (cost=0.00..0.03 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..33799.01 rows=1000000 width=8)
                                      Filter: (value IS NOT NULL)
                SubPlan 13
                  -&gt;  Limit  (cost=0.19..0.23 rows=1 width=8)
                        InitPlan 12 (returns $11)
                          -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                                InitPlan 11 (returns $10)
                                  -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                              Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                              Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                                      Index Cond: (value &gt; $10)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..12104.70 rows=333333 width=8)
                              Index Cond: (value &lt; ($11)[1])
                SubPlan 17
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 16 (returns $14)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $14)
                SubPlan 19
                  -&gt;  Limit  (cost=0.15..0.19 rows=1 width=8)
                        InitPlan 18 (returns $15)
                          -&gt;  Limit  (cost=0.01..0.15 rows=1 width=8)
                                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..15438.04 rows=111111 width=8)
                                      Index Cond: (value &lt;= ((($9 + 1::double precision) - 0.5::double precision) * 3.14159265358979::double precision))
                                      Filter: (floor(((value / 3.14159265358979::double precision) + 0.5::double precision)) &lt;= $9)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..14604.70 rows=333333 width=8)
                              Index Cond: (value &gt; $15)
  -&gt;  CTE Scan on d  (cost=0.00..2.02 rows=101 width=24)
  -&gt;  Index Scan using ix_sine_value on t_sine s  (cost=1.12..2570.38 rows=111111 width=12)
        Index Cond: ((s.value &gt;= CASE ((d.piece)::integer % 2) WHEN 0 THEN (SubPlan 30) ELSE (SubPlan 32) END) AND (s.value &lt;= CASE ((d.piece)::integer % 2) WHEN 0 THEN (SubPlan 34) ELSE (SubPlan 36) END))
        SubPlan 30
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 29 (returns $24)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) + 0.461397604523314::double precision), $18)) AND (value &gt;= $19))
                              Filter: (sin(value) &lt; 0.4452::double precision)
                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &gt;= COALESCE($24, $19)) AND (value &lt;= $18))
                      Filter: (sin(value) &gt;= 0.4452::double precision)
        SubPlan 32
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 31 (returns $25)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) - 0.461509285667814::double precision), $18)) AND (value &gt;= $19))
                              Filter: (sin(value) &gt; 0.4453::double precision)
                -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &gt;= COALESCE($25, $19)) AND (value &lt;= $18))
                      Filter: (sin(value) &lt;= 0.4453::double precision)
        SubPlan 34
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 33 (returns $26)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) + 0.461509285667814::double precision), $19)) AND (value &lt;= $18))
                              Filter: (sin(value) &gt; 0.4453::double precision)
                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &lt;= COALESCE($26, $18)) AND (value &gt;= $19))
                      Filter: (sin(value) &lt;= 0.4453::double precision)
        SubPlan 36
          -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
                InitPlan 35 (returns $27)
                  -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                        -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                              Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) - 0.461397604523314::double precision), $19)) AND (value &lt;= $18))
                              Filter: (sin(value) &lt; 0.4452::double precision)
                -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                      Index Cond: ((value &lt;= COALESCE($27, $18)) AND (value &gt;= $19))
                      Filter: (sin(value) &gt;= 0.4452::double precision)
  SubPlan 22
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 21 (returns $20)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) + 0.461397604523314::double precision), $18)) AND (value &gt;= $19))
                        Filter: (sin(value) &lt; 0.4452::double precision)
          -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &gt;= COALESCE($20, $19)) AND (value &lt;= $18))
                Filter: (sin(value) &gt;= 0.4452::double precision)
  SubPlan 24
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 23 (returns $21)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &lt;= LEAST((($17 * 3.14159265358979::double precision) - 0.461509285667814::double precision), $18)) AND (value &gt;= $19))
                        Filter: (sin(value) &gt; 0.4453::double precision)
          -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &gt;= COALESCE($21, $19)) AND (value &lt;= $18))
                Filter: (sin(value) &lt;= 0.4453::double precision)
  SubPlan 26
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 25 (returns $22)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) + 0.461509285667814::double precision), $19)) AND (value &lt;= $18))
                        Filter: (sin(value) &gt; 0.4453::double precision)
          -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &lt;= COALESCE($22, $18)) AND (value &gt;= $19))
                Filter: (sin(value) &lt;= 0.4453::double precision)
  SubPlan 28
    -&gt;  Limit  (cost=0.14..0.28 rows=1 width=8)
          InitPlan 27 (returns $23)
            -&gt;  Limit  (cost=0.01..0.14 rows=1 width=8)
                  -&gt;  Index Scan using ix_sine_value on t_sine  (cost=0.01..225.76 rows=1667 width=8)
                        Index Cond: ((value &gt;= GREATEST((($17 * 3.14159265358979::double precision) - 0.461397604523314::double precision), $19)) AND (value &lt;= $18))
                        Filter: (sin(value) &lt; 0.4452::double precision)
          -&gt;  Index Scan Backward using ix_sine_value on t_sine  (cost=0.00..225.75 rows=1667 width=8)
                Index Cond: ((value &lt;= COALESCE($23, $18)) AND (value &gt;= $19))
                Filter: (sin(value) &gt;= 0.4452::double precision)
</pre>
</div>
<p>This <strong>200</strong>-line monster completes in only <strong>110 ms</strong>, which is <strong>30 times</strong> as fast as the original <strong>3</strong>-liner, and yields the same results:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_sine
WHERE   SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<h3>Summary</h3>
<p>This example demonstrates that <strong>B-Tree</strong> indexes can be used to search on predicates involving piecewise-monotonic functions, and shows the performance gain this achieves.</p>
<p>The performance gain is over <strong>30</strong> times for a table that fits completely into the cache, and it will grow as the number of cache misses increases.</p>
<p>The <strong>SQL</strong> implementation of the algorithm is in fact not optimal, since the iterative searches for the value boundaries are implemented as subqueries. Each subquery requires reentering the <strong>B-Tree</strong> and traversing it starting from the root. A native algorithm working within the optimizer could avoid this by caching the key position in the index and issuing <code>next_key</code> / <code>prev_key</code> commands, which would improve the performance even further.</p>
<p>Sargability of monotonic functions, as shown above, could make queries like the one described in this article much more legible, maintainable and efficient.</p>
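<p>As a minimal sketch of the underlying idea (not the full query shown above), the rewrite can be done by hand for individual monotonic pieces. It assumes that only the pieces <code>[-pi/2, pi/2]</code> and <code>[pi/2, 3*pi/2]</code> are of interest and ignores rounding at the very edges of the ranges; the original predicate is kept as a recheck:</p>
<pre class="brush: sql">
-- Piece 0: SIN is increasing on [-pi/2, pi/2],
-- so the bounds on SIN(value) map directly to bounds on value
SELECT  *
FROM    t_sine
WHERE   value BETWEEN ASIN(0.4452) AND ASIN(0.4453)
        AND SIN(value) BETWEEN 0.4452 AND 0.4453
UNION ALL
-- Piece 1: SIN is decreasing on [pi/2, 3*pi/2],
-- so the bounds are mirrored around pi and swap places
SELECT  *
FROM    t_sine
WHERE   value BETWEEN PI() - ASIN(0.4453) AND PI() - ASIN(0.4452)
        AND SIN(value) BETWEEN 0.4452 AND 0.4453
</pre>
<p>The range condition on <code>value</code> is what the <strong>B-Tree</strong> index can serve. The same alternation between the two kinds of bounds is visible in the plan above as the <code>CASE</code> on the parity of the piece number.</p>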
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/23/sargability-of-monotonic-functions-example/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Searching for arbitrary portions of a date</title>
		<link>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/</link>
		<comments>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 20:00:13 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=4132</guid>
		<description><![CDATA[From Stack Overflow: I have a Ruby on Rails application with a PostgreSQL database; several tables have created_at and updated_at timestamp attributes. When displayed, those dates are formatted in the user&#8217;s locale; for example, the timestamp 2009-10-15 16:30:00.435 becomes the string 15.10.2009 &#8211; 16:30 (the date format for this example being dd.mm.yyyy - hh.mm). The [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://stackoverflow.com/questions/2175844/is-it-possible-to-search-for-dates-as-strings-in-a-database-agnostic-way"><strong>Stack Overflow</strong></a>:</p>
<blockquote><p>I have a <strong>Ruby on Rails</strong> application with a <strong>PostgreSQL</strong> database; several tables have <code>created_at</code> and <code>updated_at</code> timestamp attributes.</p>
<p>When displayed, those dates are formatted in the user&#8217;s locale; for example, the timestamp <strong>2009-10-15 16:30:00.435</strong> becomes the string <strong>15.10.2009 &#8211; 16:30</strong> (the date format for this example being <code>dd.mm.yyyy - hh.mm</code>).</p>
<p>The requirement is that the user must be able to search for records by date, as if they were strings formatted in the current locale.</p>
<p>For example, searching for <strong>15.10.2009</strong> would return records with dates on <strong>October 15th 2009</strong>; searching for <strong>15.10</strong> would return records with dates on <strong>October 15th</strong> of any year, searching for <strong>15</strong> would return all dates that match <strong>15</strong> (be it day, month or year).</p></blockquote>
<p>The simplest solution would be to retrieve the locale format string from the client, format the dates according to that string and search them using the <code>LIKE</code> or <code>~</code> operators (the latter, as we all know, matches against <a href="http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-POSIX-TABLE"><strong>POSIX</strong> regular expressions</a>).</p>
<p>However, this would not be very efficient.</p>
<p>Let&#8217;s create a sample table and see:<br />
<span id="more-4132"></span><br />
<a href="#" onclick="xcollapse('X10740');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X10740" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_dates (
        id INT NOT NULL PRIMARY KEY,
        date TIMESTAMP NOT NULL,
        name VARCHAR(20) NOT NULL,
        stuffing VARCHAR(200) NOT NULL
);

CREATE INDEX ix_dates_date ON t_dates (date);

CREATE INDEX ix_dates_parts ON t_dates
USING GIN((
        ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
        ));

SELECT  SETSEED(0.20100202);

INSERT
INTO    t_dates (id, date, name, stuffing)
SELECT  id,
        TO_TIMESTAMP(&#039;2010-02-02&#039;, &#039;YYYY-MM-DD&#039;) -
        (id || &#039; hour&#039;)::INTERVAL +
        (FLOOR(RANDOM() * 1800) || &#039; second&#039;)::INTERVAL,
        &#039;Date &#039; || id,
        RPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) id;
</pre>
</div>
<p>This table contains <strong>1,000,000</strong> records with random timestamps spanning more than <strong>114 years</strong>.</p>
<p>Assuming that the client&#8217;s date format is set to <code>dd.mm.yy hh24.mi.ss</code>, let&#8217;s count the records that match this string: <code>'20.12'</code>. We also assume that the beginning and the end of the string act as field separators:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
</tr>
<tr>
<td class="int8">4235</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (7.9138s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=47262.58..47262.59 rows=1 width=0)
  -&gt;  Seq Scan on t_dates  (cost=0.00..47258.97 rows=1444 width=0)
        Filter: (to_char(date, &#39;dd.mm.yy hh24.mi.ss&#39;::text) ~ &#39;(^|[^\\d])20\\.12([^\\d]|$)&#39;::text)
</pre>
<p>This query runs for almost <strong>8 seconds</strong>. Let&#8217;s see which values it returns:</p>
<pre class="brush: sql">
SELECT  id, date
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
ORDER BY
        MD5(id::TEXT)
LIMIT 10
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>date</th>
</tr>
<tr>
<td class="int4">622924</td>
<td class="timestamp">1939-01-10 20:12:31</td>
</tr>
<tr>
<td class="int4">781217</td>
<td class="timestamp">1920-12-20 07:11:26</td>
</tr>
<tr>
<td class="int4">501772</td>
<td class="timestamp">1952-11-05 20:12:03</td>
</tr>
<tr>
<td class="int4">956539</td>
<td class="timestamp">1900-12-20 04:57:58</td>
</tr>
<tr>
<td class="int4">523679</td>
<td class="timestamp">1950-05-08 01:20:12</td>
</tr>
<tr>
<td class="int4">141308</td>
<td class="timestamp">1993-12-20 04:07:37</td>
</tr>
<tr>
<td class="int4">648220</td>
<td class="timestamp">1936-02-21 20:12:30</td>
</tr>
<tr>
<td class="int4">236980</td>
<td class="timestamp">1983-01-20 20:12:29</td>
</tr>
<tr>
<td class="int4">413051</td>
<td class="timestamp">1962-12-20 13:20:54</td>
</tr>
<tr>
<td class="int4">323566</td>
<td class="timestamp">1973-03-06 02:20:12</td>
</tr>
</table>
</div>
<p>We see the matches on <strong>day-month</strong>, <strong>hour-minute</strong> and <strong>minute-second</strong>. There are no <strong>month-year</strong> or <strong>year-hour</strong> matches, since there is no <strong>20th</strong> month and the year-hour separator is not a period.</p>
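<p>The behavior of the regular expression can be checked on a few literal strings. This is only an illustrative sketch: the sample strings below are made up and merely follow the <code>dd.mm.yy hh24.mi.ss</code> format:</p>
<pre class="brush: sql">
SELECT  s, s ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039; AS matches
FROM    (
        VALUES
        (&#039;20.12.20 07.11.26&#039;),  -- day-month, at the beginning of the string
        (&#039;10.01.39 20.12.31&#039;),  -- hour-minute, preceded by a space
        (&#039;08.05.50 01.20.12&#039;),  -- minute-second, at the end of the string
        (&#039;01.01.20 12.00.00&#039;),  -- year-hour: separated by a space, no match
        (&#039;20.01.12 05.06.07&#039;)   -- 20 and 12 are not adjacent, no match
        ) v (s)
</pre>
<p>The first three strings match and the last two do not.</p>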
<p>The query seems to return correct values but is quite slow.</p>
<p>To improve this query we can use <strong>PostgreSQL</strong>&#8216;s <code>GIN</code> indexing abilities.</p>
<p>A <a href="http://www.postgresql.org/docs/8.4/static/textsearch-indexes.html"><code>GIN</code> index</a> is a way to index one record with several keys.</p>
<p>A plain index is a <strong>B-Tree</strong> structure that stores the pointers to the records (the <code>ctid</code>&#8216;s) in the leaf nodes. Such an index only accepts a single expression as a key and builds a single sort order over these expressions, so each record can be pointed to at most once. There is a one-to-many mapping between keys and records: a key can point to many records, but a record can be pointed to by at most one key.</p>
<p>A <code>GIN</code> index, on the other hand, accepts an array of expressions as a parameter and uses each element of the array as a key. This way, the mapping becomes many-to-many: each key can point to many records, and each record can be pointed to by many keys.</p>
<p>Usually, <code>GIN</code> indexes are used for <code>FULLTEXT</code> indexing: the piece of text stored in a record is split into the separate words and each word is indexed separately so that search for any word can be performed using the index.</p>
<p>However, in <strong>PostgreSQL</strong>, <code>GIN</code> indexes support integer arrays as well. And we can use this support to improve our query.</p>
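<p>The array operators involved are <code>&lt;@</code> (is contained by) and <code>@&gt;</code> (contains). A small sketch on literal arrays (the numbers are arbitrary sample values) shows that neither the order of the elements nor duplicates matter:</p>
<pre class="brush: sql">
SELECT  -- both 20 and 12 are present on the right side: true
        ARRAY[20, 12] &lt;@ ARRAY[1920, 20, 12, 20, 7, 11, 26] AS both_present,
        -- 13 is missing on the right side: false
        ARRAY[20, 13] &lt;@ ARRAY[1920, 20, 12, 20, 7, 11, 26] AS one_missing,
        -- the same test written with the operands swapped: true
        ARRAY[1920, 20, 12, 20, 7, 11, 26] @&gt; ARRAY[12, 20] AS order_ignored
</pre>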
<p>As many of you may have noted, I created a <code>GIN</code> index in the table creation script. Here&#8217;s how it looks:</p>
<pre class="brush: sql">
CREATE INDEX ix_dates_parts ON t_dates
USING GIN((
        ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
        ));
</pre>
<p>Each record is split into an array of <strong>8</strong> integers, each representing a portion of the date that is commonly used in date formatting options. <strong>6</strong> of them are plain date parts, and the two extra integers represent a <strong>2</strong>-digit year and a <strong>12</strong>-hour clock hour.</p>
<p>This way, any record in year <strong>2010</strong> gets indexed with both <strong>2010</strong> and <strong>10</strong>, and any record with hour <strong>19</strong> gets indexed with both <strong>19</strong> and <strong>7</strong>.</p>
<p>This covers most formatting options, and can be extended to include the less common ones.</p>
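<p>When adapting the array to other formats, it helps to check each derived part on its own before putting it into the index. Here is a minimal sketch for the 12-hour part, assuming the conventional mapping in which hour <strong>19</strong> is shown as <strong>7</strong> and hours <strong>0</strong> and <strong>12</strong> are both shown as <strong>12</strong>. The <code>+ 11</code> offset is an assumption of this sketch and is not copied from the index definition above:</p>
<pre class="brush: sql">
-- List every 24-hour value next to its 12-hour counterpart
SELECT  h AS hour24,
        (h + 11) % 12 + 1 AS hour12
FROM    generate_series(0, 23) h
</pre>
<p>Whichever expression is chosen, it has to be written identically in the index definition and in the query.</p>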
<p>To make use of this index we should provide an additional predicate in the <code>WHERE</code> clause. This predicate will take a user-provided array as an input and search for the records that contain <em>all</em> elements of the user-provided array in the date parts.</p>
<p>Here&#8217;s how our query looks now:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
WHERE   TO_CHAR(date, &#039;dd.mm.yy hh24.mi.ss&#039;) ~ E&#039;(^|[^\\d])20\\.12([^\\d]|$)&#039;
        AND ARRAY[20, 12] &lt;@ ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
</tr>
<tr>
<td class="int8">4235</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.2366s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=3595.90..3595.91 rows=1 width=0)
  -&gt;  Bitmap Heap Scan on t_dates  (cost=104.76..3595.90 rows=1 width=0)
        Recheck Cond: (&#39;{20,12}&#39;::integer[] &lt;@ ARRAY[(date_part(&#39;year&#39;::text, date))::integer, ((date_part(&#39;year&#39;::text, date))::integer % 100), (date_part(&#39;month&#39;::text, date))::integer, (date_part(&#39;day&#39;::text, date))::integer, (date_part(&#39;hour&#39;::text, date))::integer, ((((date_part(&#39;hour&#39;::text, date))::integer + 1) % 12) + 1), (date_part(&#39;minute&#39;::text, date))::integer, (date_part(&#39;second&#39;::text, date))::integer])
        Filter: (to_char(date, &#39;dd.mm.yy hh24.mi.ss&#39;::text) ~ &#39;(^|[^\\d])20\\.12([^\\d]|$)&#39;::text)
        -&gt;  Bitmap Index Scan on ix_dates_parts  (cost=0.00..104.76 rows=1000 width=0)
              Index Cond: (&#39;{20,12}&#39;::integer[] &lt;@ ARRAY[(date_part(&#39;year&#39;::text, date))::integer, ((date_part(&#39;year&#39;::text, date))::integer % 100), (date_part(&#39;month&#39;::text, date))::integer, (date_part(&#39;day&#39;::text, date))::integer, (date_part(&#39;hour&#39;::text, date))::integer, ((((date_part(&#39;hour&#39;::text, date))::integer + 1) % 12) + 1), (date_part(&#39;minute&#39;::text, date))::integer, (date_part(&#39;second&#39;::text, date))::integer])
</pre>
<p>This returns the same records but does it more than <strong>30</strong> times as fast (<strong>0.24 s</strong> instead of <strong>7.9 s</strong>), since the <code>GIN</code> index is used for coarse filtering and the regular expression is applied to the selected rows only.</p>
<p>For the index to be used, the expression from the index definition should be provided verbatim on the right side of the <code>&lt;@</code> (is contained by) operator.</p>
<p>This solution, however, does not cover all possible conditions: a client can use less obvious formats. With a proper design this should not be a problem: the client application should just ignore the parts which are formatted in a way not suitable for the index and not put them into the array (but of course leave them in the regular expression).</p>
<p>This makes the index less selective but still usable, and the query performance will still be improved greatly.</p>
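<p>As a sketch of such a case, suppose the client format is <code>dd Mon yyyy</code> and the user searches for <code>20 Dec</code> (both the format and the search string are hypothetical). The month name cannot be looked up in the integer array, so only the day goes there, while the regular expression still checks the whole string:</p>
<pre class="brush: sql">
SELECT  COUNT(*)
FROM    t_dates
        -- fine filter: the complete search string, month name included
WHERE   TO_CHAR(date, &#039;dd Mon yyyy&#039;) ~ E&#039;(^|[^\\d])20 Dec( |$)&#039;
        -- coarse filter: only the numeric part the index knows about
        AND ARRAY[20] &lt;@ ARRAY[
        DATE_PART(&#039;year&#039;, date)::INTEGER,
        DATE_PART(&#039;year&#039;, date)::INTEGER % 100,
        DATE_PART(&#039;month&#039;, date)::INTEGER,
        DATE_PART(&#039;day&#039;, date)::INTEGER,
        DATE_PART(&#039;hour&#039;, date)::INTEGER,
        (DATE_PART(&#039;hour&#039;, date)::INTEGER + 1) % 12 + 1,
        DATE_PART(&#039;minute&#039;, date)::INTEGER,
        DATE_PART(&#039;second&#039;, date)::INTEGER
        ]
</pre>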
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2010/02/02/searching-for-arbitrary-portions-of-a-date/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: Selecting records holding group-wise maximum</title>
		<link>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/</link>
		<comments>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/#comments</comments>
		<pubDate>Thu, 26 Nov 2009 20:00:25 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3777</guid>
		<description><![CDATA[Continuing the series on selecting records holding group-wise maximums: How do I select the whole records, grouped on grouper and holding a group-wise maximum (or minimum) on other column? In this article, I&#8217;ll describe several ways to do this in PostgreSQL 8.4. PostgreSQL 8.4 syntax is much richer than that of MySQL. The former can [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing the series on <a href="/2009/11/24/mysql-selecting-records-holding-group-wise-maximum-on-a-unique-column/">selecting records holding group-wise maximums</a>:</p>
<blockquote><p>How do I select the <em>whole</em> records, grouped on <code>grouper</code> and holding a group-wise maximum (or minimum) on other column?</p></blockquote>
<p>In this article, I&#8217;ll describe several ways to do this in <strong>PostgreSQL 8.4</strong>.</p>
<p><strong>PostgreSQL 8.4</strong> syntax is much richer than that of <strong>MySQL</strong>. The former can use the analytic functions, recursive <strong>CTE</strong>&#8216;s and proprietary syntax extensions, all of which can be used for this task.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-3777"></span><br />
<a href="#" onclick="xcollapse('X10125');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X10125" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_distinct (
      id INT NOT NULL PRIMARY KEY,
      orderer INT NOT NULL,
      glow INT NOT NULL,
      ghigh INT NOT NULL,
      stuffing VARCHAR(200) NOT NULL
);

CREATE INDEX ix_distinct_glow_id ON t_distinct (glow, id);
CREATE INDEX ix_distinct_ghigh_id ON t_distinct (ghigh, id);
CREATE INDEX ix_distinct_glow_orderer_id ON t_distinct (glow, orderer, id);
CREATE INDEX ix_distinct_ghigh_orderer_id ON t_distinct (ghigh, orderer, id);

SELECT  SETSEED(0.20091126);

INSERT
INTO    t_distinct (id, orderer, glow, ghigh, stuffing)
SELECT  id, FLOOR(RANDOM() * 9) + 1,
        (id - 1) % 10 + 1,
        (id - 1) % 10000 + 1,
        LPAD(&#039;&#039;, 200, &#039;*&#039;)
FROM    generate_series(1, 1000000) id;
</pre>
</div>
<p>This table has <strong>1,000,000</strong> records:</p>
<ul>
<li><code>id</code> is the <code>PRIMARY KEY</code></li>
<li><code>orderer</code> is filled with random values from <strong>1</strong> to <strong>9</strong></li>
<li><code>glow</code> is a low cardinality grouping field (<strong>10</strong> distinct values)</li>
<li><code>ghigh</code> is a high cardinality grouping field (<strong>10,000</strong> distinct values)</li>
<li><code>stuffing</code> is an asterisk-filled <code>VARCHAR(200)</code> column added to emulate payload of the actual tables</li>
</ul>
<h3>Analytic functions</h3>
<p><strong>PostgreSQL 8.4</strong> supports analytic functions. These functions extend the aggregate abilities: they work on groups rather than on individual records, but return their values to each individual record instead of shrinking the set. For instance, <code>ROW_NUMBER</code> enumerates the records within the group according to the ordering condition, and <code>DENSE_RANK</code> enumerates the distinct values of the ordering column (it assigns the same number to the records with the same value of the ordering column).</p>
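<p>The difference between the two functions is easiest to see on a tiny inline set. In this sketch the column names <code>grouper</code> and <code>val</code> and the values themselves are made up for illustration:</p>
<pre class="brush: sql">
SELECT  grouper, val,
        -- unique sequential number within each group
        ROW_NUMBER() OVER (PARTITION BY grouper ORDER BY val) AS rn,
        -- equal values of val share the same rank, without gaps
        DENSE_RANK() OVER (PARTITION BY grouper ORDER BY val) AS dr
FROM    (
        VALUES
        (1, 10),
        (1, 10),
        (1, 20),
        (2, 5)
        ) v (grouper, val)
</pre>
<p>For group <strong>1</strong>, <code>rn</code> takes the values <strong>1, 2, 3</strong> while <code>dr</code> takes <strong>1, 1, 2</strong>: the two tied records get different row numbers but the same dense rank.</p>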
<p>Let&#8217;s make a query to select the records holding the group-wise minimums of <code>id</code> (for the maximums, the window would be ordered by <code>id DESC</code> instead). Since <code>id</code> is a <code>PRIMARY KEY</code> we don&#8217;t have to worry about ties.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
SELECT  id, orderer, glow, ghigh
FROM    (
        SELECT  *, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY id) AS rn
        FROM    t_distinct
        ) q
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X4310');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4310" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">4</td>
<td class="int4">9</td>
<td class="int4">4</td>
<td class="int4">4</td>
</tr>
<tr>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">5</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">4</td>
<td class="int4">6</td>
<td class="int4">6</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">6</td>
<td class="int4">7</td>
<td class="int4">7</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">8</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">7</td>
<td class="int4">9</td>
<td class="int4">9</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">2</td>
<td class="int4">10</td>
<td class="int4">10</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0507s (11.7804s)</td>
</tr>
</table>
</div>
<pre>
Subquery Scan q  (cost=246866.84..279366.84 rows=5000 width=16)
  Filter: (q.rn = 1)
  -&gt;  WindowAgg  (cost=246866.84..266866.84 rows=1000000 width=220)
        -&gt;  Sort  (cost=246866.84..249366.84 rows=1000000 width=220)
              Sort Key: t_distinct.glow, t_distinct.id
              -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=220)
</pre>
</div>
<p>This works, but is <em>very</em> inefficient (almost <strong>12</strong> seconds). <strong>PostgreSQL</strong> chooses to sort the whole table in this case, and it is not very good at sorting tables with large rows.</p>
<p>This can be improved by making <strong>PostgreSQL</strong> rank only the narrow <code>(glow, id)</code> pairs, which are covered by the index, and then join the table back on <code>id</code>:</p>
<pre class="brush: sql">
SELECT  di.id, di.orderer, di.glow, di.ghigh
FROM    (
        SELECT  id, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY id) AS rn
        FROM    t_distinct d
        ) dd
JOIN    t_distinct di
ON      di.id = dd.id
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X4873');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4873" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">1</td>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
<td class="int4">2</td>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">3</td>
<td class="int4">3</td>
</tr>
<tr>
<td class="int4">4</td>
<td class="int4">9</td>
<td class="int4">4</td>
<td class="int4">4</td>
</tr>
<tr>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">5</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">4</td>
<td class="int4">6</td>
<td class="int4">6</td>
</tr>
<tr>
<td class="int4">7</td>
<td class="int4">6</td>
<td class="int4">7</td>
<td class="int4">7</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">8</td>
<td class="int4">8</td>
</tr>
<tr>
<td class="int4">9</td>
<td class="int4">7</td>
<td class="int4">9</td>
<td class="int4">9</td>
</tr>
<tr>
<td class="int4">10</td>
<td class="int4">2</td>
<td class="int4">10</td>
<td class="int4">10</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.9997s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=140907.84..205920.84 rows=5000 width=16)
  -&gt;  Subquery Scan dd  (cost=140907.84..173407.84 rows=5000 width=4)
        Filter: (dd.rn = 1)
        -&gt;  WindowAgg  (cost=140907.84..160907.84 rows=1000000 width=8)
              -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=8)
                    Sort Key: d.glow, d.id
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=8)
  -&gt;  Index Scan using t_distinct_pkey on t_distinct di  (cost=0.00..6.49 rows=1 width=16)
        Index Cond: (di.id = dd.id)
</pre>
</div>
<p>This is much faster (<strong>4 s</strong>) but there is still much room for improvement.</p>
<p>If we wish to order by <code>orderer</code> we need to define a method to resolve ties.</p>
<p>Using the same approach we can return all records with ties:</p>
<pre class="brush: sql">
SELECT  COUNT(*), SUM(id)
FROM    (
        SELECT  *, DENSE_RANK() OVER (PARTITION BY glow ORDER BY orderer) AS dr
        FROM    t_distinct d
        ) dd
WHERE   dr = 1
</pre>
<p><a href="#" onclick="xcollapse('X2727');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X2727" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>count</th>
<th>sum</th>
</tr>
<tr>
<td class="int8">111058</td>
<td class="int8">55543096995</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0078s (16.9997s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=279391.85..279391.86 rows=1 width=4)
  -&gt;  Subquery Scan dd  (cost=246866.84..279366.84 rows=5000 width=4)
        Filter: (dd.dr = 1)
        -&gt;  WindowAgg  (cost=246866.84..266866.84 rows=1000000 width=220)
              -&gt;  Sort  (cost=246866.84..249366.84 rows=1000000 width=220)
                    Sort Key: d.glow, d.orderer
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=220)
</pre>
</div>
<p>, or resolve ties by returning the record with the maximum <code>id</code> among those holding the minimum value of <code>orderer</code>:</p>
<pre class="brush: sql">
SELECT  di.id, di.orderer, di.glow, di.ghigh
FROM    (
        SELECT  id, ROW_NUMBER() OVER (PARTITION BY glow ORDER BY orderer, id DESC) AS rn
        FROM    t_distinct d
        ) dd
JOIN    t_distinct di
ON      di.id = dd.id
WHERE   rn = 1
</pre>
<p><a href="#" onclick="xcollapse('X7172');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X7172" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (4.8593s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=140907.84..208420.84 rows=5000 width=16)
  -&gt;  Subquery Scan dd  (cost=140907.84..175907.84 rows=5000 width=4)
        Filter: (dd.rn = 1)
        -&gt;  WindowAgg  (cost=140907.84..163407.84 rows=1000000 width=12)
              -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=12)
                    Sort Key: d.glow, d.orderer, d.id
                    -&gt;  Seq Scan on t_distinct d  (cost=0.00..41250.00 rows=1000000 width=12)
  -&gt;  Index Scan using t_distinct_pkey on t_distinct di  (cost=0.00..6.49 rows=1 width=16)
        Index Cond: (di.id = dd.id)
</pre>
</div>
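<p>To make the difference between the two ranking functions concrete, here&#8217;s a minimal sketch on a few inline rows (the names <code>grp</code> and <code>val</code> and the values themselves are made up for illustration and are not part of the sample table):</p>
<pre class="brush: sql">
-- Hypothetical inline data: two rows tie on the minimum val within grp = 1
SELECT  grp, val, id,
        DENSE_RANK() OVER (PARTITION BY grp ORDER BY val) AS dr,
        ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val, id DESC) AS rn
FROM    (
        VALUES
        (1, 5, 101),
        (1, 5, 102),
        (1, 7, 103)
        ) AS v (grp, val, id)
</pre>
<p>Both rows with <code>val = 5</code> get <code>dr = 1</code>, so filtering on <code>dr = 1</code> keeps all ties, while only the row with <code>id = 102</code> gets <code>rn = 1</code>.</p>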
<p>As you can see, all these queries are elegant but rather inefficient.</p>
<h3>Using DISTINCT ON</h3>
<p><strong>PostgreSQL</strong> implements another way to return the whole records holding group-wise maximums or minimums.</p>
<p>By using a special clause, <a href="http://www.postgresql.org/docs/8.4/interactive/queries-select-lists.html"><code>DISTINCT ON</code></a>, we can return only one record for each distinct value of certain columns. For this to work correctly, one needs to define an <code>ORDER BY</code> condition in addition to <code>DISTINCT ON</code>, with the leading expressions being the same as those used in <code>DISTINCT ON</code>. This guarantees that all the records belonging to each group are consecutive in the sorted input.</p>
<p><code>DISTINCT ON</code> is applied after the <code>ORDER BY</code> condition. It just returns the first record from each group, skipping the others. This is very easy to do: return a record if the grouping expression changed from the previous row; don&#8217;t return if it didn&#8217;t.</p>
<p>This is quite similar to <strong>MySQL</strong>&#8217;s extension for <code>GROUP BY</code>, but, unlike <strong>MySQL</strong>, this solution guarantees that the first record in the specified order is returned and that all values returned are taken from a single record.</p>
<p>This query cannot be used to return all records with ties (since only one record is returned per distinct value of the grouping column), but it will work if ties are impossible (as in selecting a maximum <code>id</code>), or if a correct condition for resolving ties is provided.</p>
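<p>The behavior described above can be sketched on a few inline rows (the names <code>grp</code> and <code>val</code> are made up for illustration):</p>
<pre class="brush: sql">
-- Hypothetical inline data; rows are sorted by grp, then by val descending,
-- and only the first row of each grp survives
SELECT  DISTINCT ON (grp) grp, val
FROM    (
        VALUES
        (1, 10),
        (1, 20),
        (2, 5),
        (2, 7)
        ) AS v (grp, val)
ORDER BY
        grp, val DESC
</pre>
<p>This returns <code>(1, 20)</code> and <code>(2, 7)</code>: the rows holding the group-wise maximum of <code>val</code>.</p>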
<p>Here&#8217;s the query to return records holding <code>MAX(id)</code>:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (glow) id, orderer, glow, ghigh
FROM    t_distinct
ORDER BY
        glow, id DESC
</pre>
<p><a href="#" onclick="xcollapse('X9913');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X9913" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999991</td>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">9991</td>
</tr>
<tr>
<td class="int4">999992</td>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">9992</td>
</tr>
<tr>
<td class="int4">999993</td>
<td class="int4">6</td>
<td class="int4">3</td>
<td class="int4">9993</td>
</tr>
<tr>
<td class="int4">999994</td>
<td class="int4">5</td>
<td class="int4">4</td>
<td class="int4">9994</td>
</tr>
<tr>
<td class="int4">999995</td>
<td class="int4">4</td>
<td class="int4">5</td>
<td class="int4">9995</td>
</tr>
<tr>
<td class="int4">999996</td>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">9996</td>
</tr>
<tr>
<td class="int4">999997</td>
<td class="int4">3</td>
<td class="int4">7</td>
<td class="int4">9997</td>
</tr>
<tr>
<td class="int4">999998</td>
<td class="int4">2</td>
<td class="int4">8</td>
<td class="int4">9998</td>
</tr>
<tr>
<td class="int4">999999</td>
<td class="int4">8</td>
<td class="int4">9</td>
<td class="int4">9999</td>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">6</td>
<td class="int4">10</td>
<td class="int4">10000</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.3593s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=140907.84..145907.84 rows=10 width=16)
  -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=16)
        Sort Key: glow, id
        -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=16)
</pre>
</div>
<p>And here&#8217;s the one to return the <code>MAX(id)</code> within the <code>MIN(orderer)</code>:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (glow) id, orderer, glow, ghigh
FROM    t_distinct
ORDER BY
        glow, orderer, id DESC
</pre>
<p><a href="#" onclick="xcollapse('X9326');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X9326" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0006s (3.9530s)</td>
</tr>
</table>
</div>
<pre>
Unique  (cost=140907.84..145907.84 rows=10 width=16)
  -&gt;  Sort  (cost=140907.84..143407.84 rows=1000000 width=16)
        Sort Key: glow, orderer, id
        -&gt;  Seq Scan on t_distinct  (cost=0.00..41250.00 rows=1000000 width=16)
</pre>
</div>
<p>This is more efficient than the window functions. However, both queries still take between <strong>3 and 4 seconds</strong>. This is almost <strong>40 times</strong> as long as the same queries take in <strong>MySQL</strong>, even without any improvements.</p>
<p>Unlike <strong>MySQL</strong>, <strong>PostgreSQL</strong> does not implement loose index scan, which would allow jumping directly from one distinct index key to the next. However, it can be emulated using recursive <strong>CTE</strong>&#8217;s.</p>
<h3>Recursive CTE&#8217;s to emulate loose index scan</h3>
<p>The main idea here is simple:</p>
<ul>
<li>In the anchor part of the <strong>CTE</strong> take the lowest (or the highest) value of the key</li>
<li>In the recursive part of the <strong>CTE</strong> take the next value of the key by using <code>&gt;</code> or <code>&lt;</code> operators along with the <code>ORDER BY</code> and <code>LIMIT 1</code></li>
</ul>
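<p>In its simplest form, assuming we only need the distinct key values rather than whole records, the two steps above could be sketched like this (the <strong>CTE</strong> name <code>keys</code> is arbitrary):</p>
<pre class="brush: sql">
WITH    RECURSIVE keys AS
        (
        -- Anchor part: the lowest value of the key
        SELECT  MIN(glow) AS glow
        FROM    t_distinct
        UNION ALL
        -- Recursive part: the next value of the key, or NULL when there are no more
        SELECT  (
                SELECT  glow
                FROM    t_distinct
                WHERE   glow &gt; k.glow
                ORDER BY
                        glow
                LIMIT 1
                )
        FROM    keys k
        WHERE   k.glow IS NOT NULL
        )
SELECT  glow
FROM    keys
WHERE   glow IS NOT NULL
</pre>
<p>Each step needs only one index seek, so the number of seeks is proportional to the number of distinct keys rather than to the number of rows. The query is only a sketch of the idea, not a measured solution.</p>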
<p><strong>PostgreSQL</strong>&#8217;s syntax allows compacting a whole table record into a single field (which can be exploded later). This will allow us to avoid joins by placing the whole recursive part into a subquery which will use the index efficiently.</p>
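<p>As a minimal sketch of that syntax (assuming only the <code>t_distinct</code> table used throughout the article), the table alias can be selected as a single composite value and its fields extracted later with the <code>(value).field</code> notation:</p>
<pre class="brush: sql">
SELECT  (q.d).id, (q.d).glow    -- explode the record back into fields
FROM    (
        SELECT  d               -- the whole row as one composite value
        FROM    t_distinct d
        LIMIT 1
        ) q
</pre>
<p>The parentheses around <code>q.d</code> are required: without them, <code>q.d.id</code> would be parsed as schema, table and column names.</p>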
<p>Here&#8217;s the query to return records holding group-wise <code>MAX(id)</code>:</p>
<pre class="brush: sql">
WITH    RECURSIVE rows AS
        (
        SELECT  d
        FROM    (
                SELECT  d
                FROM    t_distinct d
                ORDER BY
                        glow DESC, id DESC
                LIMIT 1
                ) q
        UNION ALL
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow &lt; (r.d).glow
                ORDER BY
                        di.glow DESC, di.id DESC
                LIMIT 1
                )
        FROM    rows r
        WHERE   d IS NOT NULL
        )
SELECT  (d).id, (d).orderer, (d).glow, (d).ghigh
FROM    rows
WHERE   d IS NOT NULL
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">6</td>
<td class="int4">10</td>
<td class="int4">10000</td>
</tr>
<tr>
<td class="int4">999999</td>
<td class="int4">8</td>
<td class="int4">9</td>
<td class="int4">9999</td>
</tr>
<tr>
<td class="int4">999998</td>
<td class="int4">2</td>
<td class="int4">8</td>
<td class="int4">9998</td>
</tr>
<tr>
<td class="int4">999997</td>
<td class="int4">3</td>
<td class="int4">7</td>
<td class="int4">9997</td>
</tr>
<tr>
<td class="int4">999996</td>
<td class="int4">8</td>
<td class="int4">6</td>
<td class="int4">9996</td>
</tr>
<tr>
<td class="int4">999995</td>
<td class="int4">4</td>
<td class="int4">5</td>
<td class="int4">9995</td>
</tr>
<tr>
<td class="int4">999994</td>
<td class="int4">5</td>
<td class="int4">4</td>
<td class="int4">9994</td>
</tr>
<tr>
<td class="int4">999993</td>
<td class="int4">6</td>
<td class="int4">3</td>
<td class="int4">9993</td>
</tr>
<tr>
<td class="int4">999992</td>
<td class="int4">3</td>
<td class="int4">2</td>
<td class="int4">9992</td>
</tr>
<tr>
<td class="int4">999991</td>
<td class="int4">5</td>
<td class="int4">1</td>
<td class="int4">9991</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0005s (0.0049s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=232.94..234.96 rows=100 width=32)
  Filter: (d IS NOT NULL)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..232.94 rows=101 width=32)
          -&gt;  Subquery Scan q  (cost=0.00..2.24 rows=1 width=32)
                -&gt;  Limit  (cost=0.00..2.23 rows=1 width=40)
                      -&gt;  Index Scan Backward using ix_distinct_glow_id on t_distinct d  (cost=0.00..2231423.39 rows=1000000 width=40)
          -&gt;  WorkTable Scan on rows r  (cost=0.00..22.87 rows=10 width=32)
                Filter: (r.d IS NOT NULL)
                SubPlan 1
                  -&gt;  Limit  (cost=0.00..2.27 rows=1 width=40)
                        -&gt;  Index Scan Backward using ix_distinct_glow_id on t_distinct di  (cost=0.00..755578.45 rows=333333 width=40)
                              Index Cond: (glow &lt; ($1).glow)
</pre>
<p>As you can see, this query takes only <strong>5 ms</strong>, which is next to instant. This is because on each iteration step, the whole record can be returned in a single index seek: a backward scan of <code>ix_distinct_glow_id</code> stops at the first entry whose key is less than the previous value.</p>
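<p>A single iteration step can be looked at in isolation. In this sketch, the literal <code>10</code> stands in for the key value found on the previous step:</p>
<pre class="brush: sql">
-- One step of the recursion: the top record of the next lower group
SELECT  id, orderer, glow, ghigh
FROM    t_distinct
WHERE   glow &lt; 10
ORDER BY
        glow DESC, id DESC
LIMIT 1
</pre>
<p>On the sample data, this should return the record with <code>glow = 9</code> shown in the results above.</p>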
<p>If we wanted to resolve the ties with more complex conditions, the query would become a little more complex too.</p>
<p>Let&#8217;s consider the query to resolve ties by selecting <code>MAX(id)</code> within the <code>MIN(orderer)</code>, just like in the previous example.</p>
<p>The indexes we created order all columns in the same directions: <code>(glow ASC, orderer ASC, id ASC)</code>. Of course, the whole index could be used as well if <em>all</em> directions were reversed: <code>(glow DESC, orderer DESC, id DESC)</code>.</p>
<p>However, if only some of the directions are reversed, like in <code>(orderer DESC, id ASC)</code> (which is what we need here), the index cannot be used for ordering anymore.</p>
<p>The same problem was mentioned in one of the previous articles on <a href="http://explainextended.com/2009/11/24/mysql-selecting-records-holding-group-wise-maximum-on-a-unique-column/">selecting records holding group-wise maximums in <strong>MySQL</strong></a>. This is the reason for <code>MAX(id)</code> being less efficient than <code>MIN(id)</code> with a loose index scan (which is described in more detail in that article). However, <strong>MySQL</strong> deals with it automatically, while we need to implement this with our own hands.</p>
<p>To do this, we need to use the same trick as we did in <strong>MySQL</strong>: select the <code>MIN(orderer)</code> and the <code>MAX(id)</code> within this <code>orderer</code> in two different queries which use two different index seeks, each in the appropriate direction.</p>
<p>Here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE groups AS
        (
        SELECT  d
        FROM    (
                SELECT  d
                FROM    t_distinct d
                ORDER BY
                        glow, orderer
                LIMIT 1
                ) q
        UNION ALL
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow &gt; (g.d).glow
                ORDER BY
                        di.glow, di.orderer
                LIMIT 1
                )
        FROM    groups g
        WHERE   d IS NOT NULL
        ),
        rows AS
        (
        SELECT  (
                SELECT  di
                FROM    t_distinct di
                WHERE   di.glow = (g.d).glow
                        AND di.orderer = (g.d).orderer
                ORDER BY
                        id DESC
                LIMIT 1
                ) di
        FROM    groups g
        WHERE   d IS NOT NULL
        )
SELECT  (di).id, (di).orderer, (di).glow, (di).ghigh
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>orderer</th>
<th>glow</th>
<th>ghigh</th>
</tr>
<tr>
<td class="int4">999881</td>
<td class="int4">1</td>
<td class="int4">1</td>
<td class="int4">9881</td>
</tr>
<tr>
<td class="int4">999892</td>
<td class="int4">1</td>
<td class="int4">2</td>
<td class="int4">9892</td>
</tr>
<tr>
<td class="int4">999923</td>
<td class="int4">1</td>
<td class="int4">3</td>
<td class="int4">9923</td>
</tr>
<tr>
<td class="int4">999984</td>
<td class="int4">1</td>
<td class="int4">4</td>
<td class="int4">9984</td>
</tr>
<tr>
<td class="int4">999955</td>
<td class="int4">1</td>
<td class="int4">5</td>
<td class="int4">9955</td>
</tr>
<tr>
<td class="int4">999936</td>
<td class="int4">1</td>
<td class="int4">6</td>
<td class="int4">9936</td>
</tr>
<tr>
<td class="int4">999827</td>
<td class="int4">1</td>
<td class="int4">7</td>
<td class="int4">9827</td>
</tr>
<tr>
<td class="int4">999848</td>
<td class="int4">1</td>
<td class="int4">8</td>
<td class="int4">9848</td>
</tr>
<tr>
<td class="int4">999829</td>
<td class="int4">1</td>
<td class="int4">9</td>
<td class="int4">9829</td>
</tr>
<tr>
<td class="int4">999930</td>
<td class="int4">1</td>
<td class="int4">10</td>
<td class="int4">9930</td>
</tr>
<tr class="statusbar">
<td colspan="100">10 rows fetched in 0.0005s (0.0058s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=588.47..590.47 rows=100 width=32)
  CTE groups
    -&gt;  Recursive Union  (cost=0.00..243.79 rows=101 width=32)
          -&gt;  Subquery Scan q  (cost=0.00..2.35 rows=1 width=32)
                -&gt;  Limit  (cost=0.00..2.34 rows=1 width=40)
                      -&gt;  Index Scan using ix_distinct_glow_orderer_id on t_distinct d  (cost=0.00..2343052.63 rows=1000000 width=40)
          -&gt;  WorkTable Scan on groups g  (cost=0.00..23.94 rows=10 width=32)
                Filter: (g.d IS NOT NULL)
                SubPlan 1
                  -&gt;  Limit  (cost=0.00..2.37 rows=1 width=40)
                        -&gt;  Index Scan using ix_distinct_glow_orderer_id on t_distinct di  (cost=0.00..791389.47 rows=333333 width=40)
                              Index Cond: (glow &gt; ($1).glow)
  CTE rows
    -&gt;  CTE Scan on groups g  (cost=0.00..344.68 rows=100 width=32)
          Filter: (d IS NOT NULL)
          SubPlan 3
            -&gt;  Limit  (cost=0.00..3.43 rows=1 width=36)
                  -&gt;  Index Scan Backward using ix_distinct_glow_orderer_id on t_distinct di  (cost=0.00..38073.13 rows=11111 width=36)
                        Index Cond: ((glow = ($3).glow) AND (orderer = ($3).orderer))
</pre>
<p>We see both <code>Index Scan</code> and <code>Index Scan Backward</code> in the plan above. The first one finds the <code>MIN(orderer)</code>, the second one finds the <code>MAX(id)</code> within the previously found value of the <code>orderer</code>.</p>
<p>Note that unlike <strong>MySQL</strong>, in <strong>PostgreSQL</strong> it&#8217;s enough to use just a single <code>ORDER BY id DESC</code> condition in the subquery which selects the top <code>id</code> within the records with the lowest <code>orderer</code>. <strong>PostgreSQL</strong>&#8217;s optimizer is smart enough to pick the correct index (that is, the index on <code>(glow, orderer, id)</code>) to serve this query.</p>
<p>This query also takes only <strong>5 ms</strong>.</p>
<h4>Summary</h4>
<p>Unlike <strong>MySQL</strong>, <strong>PostgreSQL</strong> implements several clean and documented ways to select the records holding group-wise maximums, including window functions and <code>DISTINCT ON</code>.</p>
<p>However, due to the lack of loose index scan support in <strong>PostgreSQL</strong>&#8217;s optimizer and the less efficient usage of indexes in <strong>PostgreSQL</strong>, the queries using these features take too long.</p>
<p>To work around these problems and improve the queries against low cardinality grouping conditions, the solution described in the article should be used.</p>
<p>This solution uses recursive <strong>CTE</strong>&#8217;s to emulate loose index scan and is very efficient if the grouping columns have low cardinality.</p>
<p><strong>To be continued.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/11/26/postgresql-selecting-records-holding-group-wise-maximum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recursive CTE&#8217;s: PostgreSQL</title>
		<link>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/</link>
		<comments>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/#comments</comments>
		<pubDate>Mon, 23 Nov 2009 20:00:24 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3721</guid>
		<description><![CDATA[In the previous article on recursive CTE&#8216;s in SQL Server I demonstrated that they are not really set-based. SQL Server implements the recursive CTE&#8216;s syntax, but forbids all operations that do not distribute over UNION ALL, and each recursive step sees only a single record from the previous step. Now, let&#8217;s check the same operations [...]]]></description>
			<content:encoded><![CDATA[<p>In the previous article on <a href="/2009/11/18/sql-server-are-the-recursive-ctes-really-set-based/">recursive <strong>CTE</strong>&#8217;s in <strong>SQL Server</strong></a> I demonstrated that they are not really set-based.</p>
<p><strong>SQL Server</strong> implements the recursive <strong>CTE</strong>&#8217;s syntax, but forbids all operations that do not distribute over <code>UNION ALL</code>, and each recursive step sees only a single record from the previous step.</p>
<p>Now, let&#8217;s check the same operations in <strong>PostgreSQL 8.4</strong>.</p>
<p>To do this, we will write a query that selects only the very first branch of a tree: that is, each item would be the first child of its parent. To do this, we should select the item that is the first child of the root, then select the first child of that item, etc.</p>
<p>This is a set-based operation.</p>
<p><strong>Oracle</strong>&#8217;s <code>CONNECT BY</code> syntax, despite not being truly set-based, offers some limited set-based capabilities: you can use the <code>ORDER SIBLINGS BY</code> clause to define the order in which the siblings are returned. However, this would require some additional work to efficiently return only the first branch.</p>
<p>In a true set-based system, this is much simpler.</p>
<p>Let&#8217;s create a sample table:<br />
<span id="more-3721"></span></p>
<pre class="brush: sql">
CREATE TABLE t_recursive (
        id INT NOT NULL PRIMARY KEY,
        parent INT NOT NULL,
        orderer INT NOT NULL,
        data VARCHAR(100) NOT NULL
        );

CREATE INDEX ix_recursive_parent_orderer ON t_recursive (parent, orderer);

SELECT  SETSEED(0.20091123);

INSERT
INTO    t_recursive
SELECT  s, (s - 1) / 5, FLOOR(RANDOM() * 10000), &#039;Item &#039; || s
FROM    generate_series(1, 1000000) s;
</pre>
<p>This table contains <strong>1,000,000</strong> records and implements an <a href="/2009/09/24/adjacency-list-vs-nested-sets-postgresql/">adjacency tree hierarchy</a>.</p>
<p>Each item has at most <strong>5</strong> children and a randomly filled column, <code>orderer</code>, which defines its order among its siblings.</p>
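<p>The parent of each item is computed with integer division. A small sketch of the same expression over the first few values shows how the items are distributed:</p>
<pre class="brush: sql">
-- Same parent expression as in the INSERT above, on the first 11 items only
SELECT  s AS id, (s - 1) / 5 AS parent
FROM    generate_series(1, 11) s
</pre>
<p>Items <strong>1</strong> to <strong>5</strong> get parent <strong>0</strong> (the root, which is not stored in the table), items <strong>6</strong> to <strong>10</strong> get parent <strong>1</strong>, and item <strong>11</strong> gets parent <strong>2</strong>.</p>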
<p>Now, let&#8217;s try to make a query that would select the first branch, the order being defined by the value provided in <code>orderer</code>.</p>
<p>To do this, we should make the anchor step to return the first child of the root item, and the recursive steps to return the first child of the previously returned item.</p>
<p>To return the first child of a given parent in the <code>orderer</code> order, we can use <strong>PostgreSQL</strong>&#8217;s <code>DISTINCT ON</code> functionality.</p>
<p>This is the same as <code>DISTINCT</code>, but can return the whole row rather than just the columns <code>DISTINCT</code> is applied to.</p>
<p>If two rows share the value of the column <code>DISTINCT ON</code> is being applied to, only one of the rows will be returned. Which row it will be is defined by the <code>ORDER BY</code> clause.</p>
<p>This solves problems like <q>return a single row that holds group-wise maximum</q>.</p>
<p>This query:</p>
<pre class="brush: sql">
SELECT  DISTINCT ON (parent) *
FROM    t_recursive
ORDER BY
        parent, orderer
</pre>
<p>is the same as this one:</p>
<pre class="brush: sql">
SELECT  (q.r).*
FROM    (
        SELECT  r, ROW_NUMBER() OVER (PARTITION BY parent ORDER BY orderer) AS rn
        FROM    t_recursive r
        ) q
WHERE   rn = 1
</pre>
<p>, but more legible and in some cases more efficient.</p>
<p>To apply this clause to the task we need, we should just use <code>DISTINCT ON (parent)</code> in both anchor and recursive parts:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (parent) *
                FROM    t_recursive
                WHERE   parent = 0
                ORDER BY
                        parent, orderer
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  DISTINCT ON (c.parent) c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.parent, c.orderer
                ) q2
        )
SELECT  *
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">0</td>
<td class="int4">1686</td>
<td class="varchar">Item 3</td>
</tr>
<tr>
<td class="int4">19</td>
<td class="int4">3</td>
<td class="int4">3370</td>
<td class="varchar">Item 19</td>
</tr>
<tr>
<td class="int4">98</td>
<td class="int4">19</td>
<td class="int4">42</td>
<td class="varchar">Item 98</td>
</tr>
<tr>
<td class="int4">492</td>
<td class="int4">98</td>
<td class="int4">1762</td>
<td class="varchar">Item 492</td>
</tr>
<tr>
<td class="int4">2464</td>
<td class="int4">492</td>
<td class="int4">2295</td>
<td class="varchar">Item 2464</td>
</tr>
<tr>
<td class="int4">12322</td>
<td class="int4">2464</td>
<td class="int4">2050</td>
<td class="varchar">Item 12322</td>
</tr>
<tr>
<td class="int4">61614</td>
<td class="int4">12322</td>
<td class="int4">768</td>
<td class="varchar">Item 61614</td>
</tr>
<tr>
<td class="int4">308074</td>
<td class="int4">61614</td>
<td class="int4">1925</td>
<td class="varchar">Item 308074</td>
</tr>
<tr class="statusbar">
<td colspan="100">8 rows fetched in 0.0004s (0.0042s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=3188.19..3207.83 rows=982 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..3188.19 rows=982 width=230)
          -&gt;  Unique  (cost=0.00..15.46 rows=2 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Unique  (cost=313.84..314.33 rows=98 width=23)
                -&gt;  Sort  (cost=313.84..314.08 rows=98 width=23)
                      Sort Key: c.parent, c.orderer
                      -&gt;  Nested Loop  (cost=0.00..310.60 rows=98 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.40 rows=20 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>This gives us a single branch containing just the records we need: each one being the first child of its parent.</p>
<p>Now, what if we wanted to return a list of records, each being the <em>second</em> child to its parent?</p>
<p>Since each recursive part takes only one record as an input (and returns one record as an output), we can just replace <code>DISTINCT ON</code> (which returns the first child of each group) with an <code>OFFSET 1 LIMIT 1</code>:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_recursive
                WHERE   parent = 0
                ORDER BY
                        orderer
                OFFSET 1 LIMIT 1
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.orderer
                OFFSET 1 LIMIT 1
                ) q2
        )
SELECT  *
FROM    rows
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">0</td>
<td class="int4">2540</td>
<td class="varchar">Item 1</td>
</tr>
<tr>
<td class="int4">6</td>
<td class="int4">1</td>
<td class="int4">3405</td>
<td class="varchar">Item 6</td>
</tr>
<tr>
<td class="int4">33</td>
<td class="int4">6</td>
<td class="int4">2884</td>
<td class="varchar">Item 33</td>
</tr>
<tr>
<td class="int4">166</td>
<td class="int4">33</td>
<td class="int4">3084</td>
<td class="varchar">Item 166</td>
</tr>
<tr>
<td class="int4">833</td>
<td class="int4">166</td>
<td class="int4">1848</td>
<td class="varchar">Item 833</td>
</tr>
<tr>
<td class="int4">4169</td>
<td class="int4">833</td>
<td class="int4">993</td>
<td class="varchar">Item 4169</td>
</tr>
<tr>
<td class="int4">20850</td>
<td class="int4">4169</td>
<td class="int4">3126</td>
<td class="varchar">Item 20850</td>
</tr>
<tr>
<td class="int4">104251</td>
<td class="int4">20850</td>
<td class="int4">3021</td>
<td class="varchar">Item 104251</td>
</tr>
<tr>
<td class="int4">521256</td>
<td class="int4">104251</td>
<td class="int4">5492</td>
<td class="varchar">Item 521256</td>
</tr>
<tr class="statusbar">
<td colspan="100">9 rows fetched in 0.0005s (0.0043s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows  (cost=1564.44..1564.66 rows=11 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=3.09..1564.44 rows=11 width=230)
          -&gt;  Limit  (cost=3.09..6.18 rows=1 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Limit  (cost=155.79..155.79 rows=1 width=23)
                -&gt;  Sort  (cost=155.79..155.91 rows=49 width=23)
                      Sort Key: c.orderer
                      -&gt;  Nested Loop  (cost=0.00..155.30 rows=49 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.20 rows=10 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>This query returns a whole branch of the items that are second children to their parents.</p>
<p>Both these queries are available in <strong>SQL Server 2005</strong>: despite the fact they do not distribute over <code>UNION ALL</code>, they can be rewritten using a <code>ROW_NUMBER()</code>.</p>
<p>As was shown in the <a href="/2009/11/18/sql-server-are-the-recursive-ctes-really-set-based/">previous article</a>, for some strange reason, <strong>SQL Server 2005</strong> does not forbid this function, but rather implements it in an incorrect way. This, however, allows writing the query we&#8217;re after.</p>
<p>Now, let&#8217;s make a query that would recursively return the first <em>two children</em>.</p>
<p>This requires a set-based solution, since the first two children can come from different parents.</p>
<p>A grandchild that is third to its grandfather won&#8217;t count, even if it was a first child to the first child of the grandfather.</p>
<p>A first child to the third child of the grandfather won&#8217;t count either, even if it was the first grandchild to its grandfather.</p>
<p>To do this right, on each step the query should accept <strong>2</strong> items, return <strong>2</strong> items and process the items accepted <em>at once</em>.</p>
<p>This is also easy in a true set-based recursive <strong>CTE</strong>:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        rows AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_recursive r
                WHERE   parent = 0
                ORDER BY
                        orderer
                LIMIT 2
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  c.*
                FROM    rows r
                JOIN    t_recursive c
                ON      c.parent = r.id
                ORDER BY
                        c.orderer
                LIMIT 2
                ) q
        )
SELECT  *
FROM    rows r
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>orderer</th>
<th>data</th>
</tr>
<tr>
<td class="int4">3</td>
<td class="int4">0</td>
<td class="int4">1686</td>
<td class="varchar">Item 3</td>
</tr>
<tr>
<td class="int4">1</td>
<td class="int4">0</td>
<td class="int4">2540</td>
<td class="varchar">Item 1</td>
</tr>
<tr>
<td class="int4">8</td>
<td class="int4">1</td>
<td class="int4">2181</td>
<td class="varchar">Item 8</td>
</tr>
<tr>
<td class="int4">19</td>
<td class="int4">3</td>
<td class="int4">3370</td>
<td class="varchar">Item 19</td>
</tr>
<tr>
<td class="int4">98</td>
<td class="int4">19</td>
<td class="int4">42</td>
<td class="varchar">Item 98</td>
</tr>
<tr>
<td class="int4">99</td>
<td class="int4">19</td>
<td class="int4">1351</td>
<td class="varchar">Item 99</td>
</tr>
<tr>
<td class="int4">497</td>
<td class="int4">99</td>
<td class="int4">1245</td>
<td class="varchar">Item 497</td>
</tr>
<tr>
<td class="int4">496</td>
<td class="int4">99</td>
<td class="int4">1255</td>
<td class="varchar">Item 496</td>
</tr>
<tr>
<td class="int4">2486</td>
<td class="int4">497</td>
<td class="int4">205</td>
<td class="varchar">Item 2486</td>
</tr>
<tr>
<td class="int4">2484</td>
<td class="int4">496</td>
<td class="int4">362</td>
<td class="varchar">Item 2484</td>
</tr>
<tr>
<td class="int4">12431</td>
<td class="int4">2486</td>
<td class="int4">29</td>
<td class="varchar">Item 12431</td>
</tr>
<tr>
<td class="int4">12423</td>
<td class="int4">2484</td>
<td class="int4">311</td>
<td class="varchar">Item 12423</td>
</tr>
<tr>
<td class="int4">62119</td>
<td class="int4">12423</td>
<td class="int4">1113</td>
<td class="varchar">Item 62119</td>
</tr>
<tr>
<td class="int4">62120</td>
<td class="int4">12423</td>
<td class="int4">1121</td>
<td class="varchar">Item 62120</td>
</tr>
<tr>
<td class="int4">310602</td>
<td class="int4">62120</td>
<td class="int4">341</td>
<td class="varchar">Item 310602</td>
</tr>
<tr>
<td class="int4">310605</td>
<td class="int4">62120</td>
<td class="int4">468</td>
<td class="varchar">Item 310605</td>
</tr>
<tr class="statusbar">
<td colspan="100">16 rows fetched in 0.0007s (0.0043s)</td>
</tr>
</table>
</div>
<pre>
CTE Scan on rows r  (cost=3122.65..3123.09 rows=22 width=230)
  CTE rows
    -&gt;  Recursive Union  (cost=0.00..3122.65 rows=22 width=230)
          -&gt;  Limit  (cost=0.00..6.18 rows=2 width=23)
                -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive r  (cost=0.00..15.45 rows=5 width=23)
                      Index Cond: (parent = 0)
          -&gt;  Limit  (cost=311.58..311.58 rows=2 width=23)
                -&gt;  Sort  (cost=311.58..311.82 rows=98 width=23)
                      Sort Key: c.orderer
                      -&gt;  Nested Loop  (cost=0.00..310.60 rows=98 width=23)
                            -&gt;  WorkTable Scan on rows r  (cost=0.00..0.40 rows=20 width=4)
                            -&gt;  Index Scan using ix_recursive_parent_orderer on t_recursive c  (cost=0.00..15.45 rows=5 width=23)
                                  Index Cond: (c.parent = r.id)
</pre>
<p>On each step, this query takes all children of the rows returned on the previous step and keeps the first two of them in <code>orderer</code> order.</p>
<p>We see that <strong>PostgreSQL</strong>, unlike <strong>SQL Server</strong>, implements recursive <strong>CTE</strong>&#8217;s in a truly set-based way.</p>
<p>The recursive part of the query can accept a set on input, return a set on output and do the set-based operations on the set received.</p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/11/23/recursive-ctes-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shuffling rows: PostgreSQL</title>
		<link>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/</link>
		<comments>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 19:00:08 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3342</guid>
		<description><![CDATA[Answering questions asked on the site. Josh asks: I am building a music application and need to create a playlist of arbitrary length from the tracks stored in the database. This playlist should be shuffled and a track can repeat only after at least 10 other tracks had been played. Is it possible to do [...]]]></description>
			<content:encoded><![CDATA[<p>Answering questions asked on the site.</p>
<p><strong>Josh</strong> asks:</p>
<blockquote><p>
I am building a music application and need to create a playlist of arbitrary length from the tracks stored in the database.</p>
<p>This playlist should be shuffled and a track can repeat only after at least <strong>10</strong> other tracks had been played.</p>
<p>Is it possible to do this with a single <strong>SQL</strong> query or I need to create a cursor?</p>
<p>This is in <strong>PostgreSQL 8.4</strong>
</p></blockquote>
<p><strong>PostgreSQL 8.4</strong> is a wise choice, since it introduces some new features that ease this task.</p>
<p>To do this we just need to keep a running set that would hold the previous <strong>10</strong> tracks so that we could filter on them.</p>
<p><strong>PostgreSQL 8.4</strong> supports recursive <strong>CTE</strong>&#8217;s that allow iterating over the resultsets, and arrays that can easily be used to keep the set of the <strong>10</strong> latest tracks.</p>
<p>Here&#8217;s what we should do to build the playlist:</p>
<ol>
<li>We make a recursive <strong>CTE</strong> that would generate as many records as we need and just use <strong>LIMIT</strong> to limit the number</li>
<li>The base part of the <strong>CTE</strong> is just a random record (fetched with <code>ORDER BY RANDOM() LIMIT 1</code>)</li>
<li>The base part also defines the <strong>queue</strong>. This is an array which holds the <strong>10</strong> latest records selected. It is initialized in the base part with the <code>id</code> of the random track just selected</li>
<li>The recursive part of the <strong>CTE</strong> joins the previous record with the table, making sure that no record from the latest <strong>10</strong> will be selected on this step. To do this, we just use the array operator <code>&lt;@</code> (<em>contained by</em>)</li>
<li>The recursive part adds the newly selected record to the queue. The queue should be no more than 10 records long, that&#8217;s why we apply the array slicing operator to it (<code>[1:10]</code>)</li>
</ol>
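<p>The two array operations used in the last two steps can be tried on their own. This is just a minimal sketch with made-up literal values (a queue of three items instead of ten):</p>
<pre class="brush: sql">
-- ARRAY[3] &lt;@ queue checks whether track 3 is already in the queue;
-- (4 || queue)[1:3] prepends track 4 and keeps only the first three items
SELECT  ARRAY[3] &lt;@ ARRAY[1, 2, 3] AS is_queued,
        (4 || ARRAY[1, 2, 3])[1:3] AS new_queue
</pre>
<p>Here <code>is_queued</code> is true and <code>new_queue</code> is <code>{4,1,2}</code>: the oldest item (<strong>3</strong>) has been pushed out of the queue.</p>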
<p>Let&#8217;s create a sample table:<br />
<span id="more-3342"></span><br />
<a href="#" onclick="xcollapse('X179');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X179" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_track
        (
        id INT NOT NULL PRIMARY KEY,
        name VARCHAR(20) NOT NULL
        );

INSERT
INTO    t_track
SELECT  s, &#039;Track &#039; || s
FROM    generate_series(1, 1000) s;

ANALYZE t_track;
</pre>
</div>
<p>This table is quite simple: it just contains <strong>1,000</strong> tracks with generated names.</p>
<p>And here&#8217;s the query:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        shuffle AS
        (
        SELECT  *
        FROM    (
                SELECT  id, name, ARRAY[id] AS queue
                FROM    t_track
                ORDER BY
                        RANDOM()
                LIMIT 1
                ) q
        UNION ALL
        SELECT  *
        FROM    (
                SELECT  t.id, t.name, (t.id || s.queue)[1:10]
                FROM    shuffle s
                JOIN    t_track t
                ON      NOT ARRAY[t.id] &lt;@ s.queue
                ORDER BY
                        RANDOM()
                LIMIT 1
                ) q
        )
SELECT  id, name, queue::VARCHAR
FROM    shuffle
LIMIT 30
</pre>
<p><a href="#" onclick="xcollapse('X7857');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X7857" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>name</th>
<th>queue</th>
</tr>
<tr>
<td class="int4">739</td>
<td class="varchar">Track 739</td>
<td class="varchar">{739}</td>
</tr>
<tr>
<td class="int4">811</td>
<td class="varchar">Track 811</td>
<td class="varchar">{811,739}</td>
</tr>
<tr>
<td class="int4">216</td>
<td class="varchar">Track 216</td>
<td class="varchar">{216,811,739}</td>
</tr>
<tr>
<td class="int4">192</td>
<td class="varchar">Track 192</td>
<td class="varchar">{192,216,811,739}</td>
</tr>
<tr>
<td class="int4">286</td>
<td class="varchar">Track 286</td>
<td class="varchar">{286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">287</td>
<td class="varchar">Track 287</td>
<td class="varchar">{287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">856</td>
<td class="varchar">Track 856</td>
<td class="varchar">{856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">371</td>
<td class="varchar">Track 371</td>
<td class="varchar">{371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">336</td>
<td class="varchar">Track 336</td>
<td class="varchar">{336,371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">558</td>
<td class="varchar">Track 558</td>
<td class="varchar">{558,336,371,856,287,286,192,216,811,739}</td>
</tr>
<tr>
<td class="int4">99</td>
<td class="varchar">Track 99</td>
<td class="varchar">{99,558,336,371,856,287,286,192,216,811}</td>
</tr>
<tr>
<td class="int4">462</td>
<td class="varchar">Track 462</td>
<td class="varchar">{462,99,558,336,371,856,287,286,192,216}</td>
</tr>
<tr>
<td class="int4">653</td>
<td class="varchar">Track 653</td>
<td class="varchar">{653,462,99,558,336,371,856,287,286,192}</td>
</tr>
<tr>
<td class="int4">682</td>
<td class="varchar">Track 682</td>
<td class="varchar">{682,653,462,99,558,336,371,856,287,286}</td>
</tr>
<tr>
<td class="int4">329</td>
<td class="varchar">Track 329</td>
<td class="varchar">{329,682,653,462,99,558,336,371,856,287}</td>
</tr>
<tr>
<td class="int4">365</td>
<td class="varchar">Track 365</td>
<td class="varchar">{365,329,682,653,462,99,558,336,371,856}</td>
</tr>
<tr>
<td class="int4">72</td>
<td class="varchar">Track 72</td>
<td class="varchar">{72,365,329,682,653,462,99,558,336,371}</td>
</tr>
<tr>
<td class="int4">841</td>
<td class="varchar">Track 841</td>
<td class="varchar">{841,72,365,329,682,653,462,99,558,336}</td>
</tr>
<tr>
<td class="int4">159</td>
<td class="varchar">Track 159</td>
<td class="varchar">{159,841,72,365,329,682,653,462,99,558}</td>
</tr>
<tr>
<td class="int4">521</td>
<td class="varchar">Track 521</td>
<td class="varchar">{521,159,841,72,365,329,682,653,462,99}</td>
</tr>
<tr>
<td class="int4">736</td>
<td class="varchar">Track 736</td>
<td class="varchar">{736,521,159,841,72,365,329,682,653,462}</td>
</tr>
<tr>
<td class="int4">759</td>
<td class="varchar">Track 759</td>
<td class="varchar">{759,736,521,159,841,72,365,329,682,653}</td>
</tr>
<tr>
<td class="int4">142</td>
<td class="varchar">Track 142</td>
<td class="varchar">{142,759,736,521,159,841,72,365,329,682}</td>
</tr>
<tr>
<td class="int4">607</td>
<td class="varchar">Track 607</td>
<td class="varchar">{607,142,759,736,521,159,841,72,365,329}</td>
</tr>
<tr>
<td class="int4">331</td>
<td class="varchar">Track 331</td>
<td class="varchar">{331,607,142,759,736,521,159,841,72,365}</td>
</tr>
<tr>
<td class="int4">957</td>
<td class="varchar">Track 957</td>
<td class="varchar">{957,331,607,142,759,736,521,159,841,72}</td>
</tr>
<tr>
<td class="int4">985</td>
<td class="varchar">Track 985</td>
<td class="varchar">{985,957,331,607,142,759,736,521,159,841}</td>
</tr>
<tr>
<td class="int4">702</td>
<td class="varchar">Track 702</td>
<td class="varchar">{702,985,957,331,607,142,759,736,521,159}</td>
</tr>
<tr>
<td class="int4">914</td>
<td class="varchar">Track 914</td>
<td class="varchar">{914,702,985,957,331,607,142,759,736,521}</td>
</tr>
<tr>
<td class="int4">569</td>
<td class="varchar">Track 569</td>
<td class="varchar">{569,914,702,985,957,331,607,142,759,736}</td>
</tr>
<tr class="statusbar">
<td colspan="100">30 rows fetched in 0.0012s (0.1246s)</td>
</tr>
</table>
</div>
<pre>
Limit  (cost=3444.86..3445.13 rows=11 width=94)
  CTE shuffle
    -&gt;  Recursive Union  (cost=23.50..3444.86 rows=11 width=94)
          -&gt;  Subquery Scan q  (cost=23.50..23.51 rows=1 width=94)
                -&gt;  Limit  (cost=23.50..23.50 rows=1 width=13)
                      -&gt;  Sort  (cost=23.50..26.00 rows=1000 width=13)
                            Sort Key: (random())
                            -&gt;  Seq Scan on t_track  (cost=0.00..18.50 rows=1000 width=13)
          -&gt;  Subquery Scan q  (cost=342.10..342.11 rows=1 width=94)
                -&gt;  Limit  (cost=342.10..342.10 rows=1 width=45)
                      -&gt;  Sort  (cost=342.10..367.08 rows=9990 width=45)
                            Sort Key: (random())
                            -&gt;  Nested Loop  (cost=17.00..292.15 rows=9990 width=45)
                                  Join Filter: (NOT (ARRAY[t.id] &lt;@ s.queue))
                                  -&gt;  WorkTable Scan on shuffle s  (cost=0.00..0.20 rows=10 width=32)
                                  -&gt;  Materialize  (cost=17.00..27.00 rows=1000 width=13)
                                        -&gt;  Seq Scan on t_track t  (cost=0.00..16.00 rows=1000 width=13)
  -&gt;  CTE Scan on shuffle  (cost=0.00..0.28 rows=11 width=94)
</pre>
</div>
<p>This query selects the first <strong>30</strong> records, but the <code>LIMIT</code> clause can be changed to select an arbitrary number of records (including a number exceeding <strong>1,000</strong>), since the recursive part of the query has no termination condition of its own.</p>
<p>Normally, the <code>queue</code> would be hidden but I left it just to illustrate what&#8217;s going on. As you can see, the <code>queue</code> holds the <code>id</code>&#8217;s of the last <strong>10</strong> records.</p>
<p>The query runs for <strong>120 ms</strong>, which is quite fast, but it could be improved further using the approaches described in <a href="/2009/07/18/postgresql-8-4-sampling-random-rows/"><strong>PostgreSQL 8.4: sampling random rows</strong></a>. However, this would make the query too hard to read, and <code>ORDER BY RANDOM()</code> is just fine to demonstrate the principle.</p>
<p>Hope that helps.</p>
<hr/>
<p>I&#8217;m always glad to answer the questions regarding database queries.</p>
<p><a href="/ask-a-question"><strong>Ask me a question</strong></a></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/10/06/shuffling-rows-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adjacency list vs. nested sets: PostgreSQL</title>
		<link>http://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/</link>
		<comments>http://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 19:00:35 +0000</pubDate>
		<dc:creator>Quassnoi</dc:creator>
				<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://explainextended.com/?p=3183</guid>
		<description><![CDATA[This series of articles is inspired by numerous questions asked on the site and on Stack Overflow. What is better to store hierarchical data: nested sets model or adjacency list (parent-child) model? First, let&#8217;s explain what all this means. Adjacency list Hierarchical relations (not to be confused with hierarchical data model) are 0-1:0-N transitive relations [...]]]></description>
			<content:encoded><![CDATA[<p>This series of articles is inspired by numerous questions asked on the site and on <a href="http://stackoverflow.com"><strong>Stack Overflow</strong></a>.</p>
<blockquote><p>What is better to store hierarchical data: <strong>nested sets</strong> model or <strong>adjacency list</strong> (parent-child) model?
</p></blockquote>
<p>First, let&#8217;s explain what all this means.</p>
<h3>Adjacency list</h3>
<p>Hierarchical relations (not to be confused with <a href="http://explainextended.com/2009/08/23/what-is-a-relational-database/">hierarchical data model</a>) are <code>0-1:0-N</code> transitive relations between entities of the same domain.</p>
<p>For instance, ancestor-descendant relation is:</p>
<ul>
<li>Transitive:
<ul>
<li>If <strong>A</strong> is an ancestor of <strong>B</strong> and <strong>B</strong> is an ancestor of <strong>C</strong>, then <strong>A</strong> is an ancestor of <strong>C</strong></li>
</ul>
</li>
<li>Asymmetric (and therefore irreflexive):
<ul>
<li>If <strong>A</strong> is an ancestor of <strong>B</strong>, then <strong>B</strong> is never an ancestor of <strong>A</strong></li>
</ul>
</li>
<li><code>0-1:0-N</code>
<ul>
<li><strong>A</strong> can have zero, one or many children. <strong>A</strong> can have zero or one parents.</li>
</ul>
</li>
</ul>
<p>These relations can be represented by an ordered directed <a href="http://en.wikipedia.org/wiki/Tree_%28graph_theory%29">tree</a>.</p>
<p>A tree is a simple directed graph (one with at most one directed edge between any two different vertices), and the relational model has the means to represent simple graphs.</p>
<p>Two vertices are considered related (and therefore their primary keys form a row in the table) if and only if they are connected with an edge.</p>
<p>This table, along with the table defining the vertices, identifies a graph completely by defining pairs of vertices connected by the edges. Each record in the table defines a pair of adjacent vertices, that&#8217;s why this representation is called <strong>adjacency list</strong>.</p>
<p>Adjacency lists can represent any simple directed graphs, not only hierarchy trees. But due to the fact that this structure is most commonly used to define the parent-child relationships, the terms <q>parent-child model</q> and <q>adjacency list model</q> have almost become synonymous. However, they are not: the adjacency list model is much wider and the parent-child model is one of its implementations.</p>
<p>Now, since we have a tree here which implies a <code>0-1:0-N</code> relationship between the vertices, we can define the relation as a <em>self-relation</em>: the table defines both the entity and the relationship. Parent is just one attribute among the others, with a <code>FOREIGN KEY</code> reference to the table itself.</p>
<p>Since multiple items can have no parents (and therefore be the roots of their trees), it&#8217;s sometimes useful to convert this tree into an <a href="http://en.wikipedia.org/wiki/Arborescence_%28graph_theory%29">arborescence</a>: make a single fake root that is considered the parent of all entries that have no actual parent.</p>
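<p>A self-referencing table of this kind might be declared as follows. This is only a sketch: the table name <code>t_node</code>, its columns and the sample rows are made up for illustration and are not used elsewhere in this article:</p>
<pre class="brush: sql">
CREATE TABLE t_node
        (
        id INT NOT NULL PRIMARY KEY,
        -- The parent is just another attribute that references the same table.
        -- NULL means no parent: in this sketch only the fake root has it
        parent INT REFERENCES t_node (id),
        name VARCHAR(50) NOT NULL
        );

-- The fake root comes first, then the entries that used to be roots
INSERT INTO t_node VALUES (0, NULL, &#039;Fake root&#039;);
INSERT INTO t_node VALUES (1, 0, &#039;First tree&#039;);
INSERT INTO t_node VALUES (2, 0, &#039;Second tree&#039;);
INSERT INTO t_node VALUES (3, 1, &#039;Child of the first tree&#039;);
</pre>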
<p>This is a nice and elegant model, but until recently it had one drawback: it could not be used with <strong>SQL</strong>.<br />
<span id="more-3183"></span><br />
<strong>SQL</strong>, as we all know, deals with relational tables which can be transformed by the means of relational algebra.</p>
<p>It provides a way to do a number of operations including relational multiplication, projection, sum etc.</p>
<p>However, earlier versions of <strong>SQL</strong> lacked recursion which is required to do certain operations efficiently. Namely, recursion is required to operate upon the adjacency list structure.</p>
<p>The most common operations are:</p>
<ul>
<li>Find all descendants of a given node</li>
<li>Find all ancestors of a given node</li>
<li>Find all descendants of a given node up to a certain depth</li>
</ul>
<p>The first two operations require recursion.</p>
<p>The third one does too if the depth level should serve as a parameter.</p>
<p>To work around this, the <a href="http://en.wikipedia.org/wiki/Nested_set_model">nested sets model</a> was proposed by <a href="http://www.kamfonas.com/id3.html">Michael Kamfonas</a> and popularized by <a href="http://en.wikipedia.org/wiki/Joe_Celko">Joe Celko</a>.</p>
<h3>Nested sets</h3>
<p>The idea of nested sets is quite simple and can be illustrated using one of the most popular ways to store hierarchical data, <strong>XML</strong>.</p>
<p>How would we store the hierarchies in <strong>XML</strong>?</p>
<p>We would just use one tag to describe every item and nest it accordingly, like this:</p>
<pre class="brush: xml">
&lt;item id=&quot;0&quot;&gt;
 &lt;item id=&quot;1&quot;&gt;
  &lt;item id=&quot;2&quot;/&gt;
  &lt;item id=&quot;3&quot;&gt;
   &lt;item id=&quot;4&quot;/&gt;
  &lt;/item&gt;
  &lt;item id=&quot;5&quot;&gt;
   &lt;item id=&quot;6&quot;&gt;
    &lt;item id=&quot;7&quot;/&gt;
   &lt;/item&gt;
  &lt;/item&gt;
 &lt;/item&gt;
 &lt;item id=&quot;8&quot;/&gt;
&lt;/item&gt;
</pre>
<p>This is fine, but how to use this structure in a relational table?</p>
<p>As you can see, opening and closing tags of each node are contained on their own lines here. The opening and closing tags may (or may not) be the same for the nodes containing no children (this is not important).</p>
<p>We see that the node ranges never partially overlap: given two node ranges, their intersection is always either the whole range of one of the nodes or an empty set.</p>
<p>In other words, a node opened within another node should also be closed within it.</p>
<p>If and only if this holds for each and every node, the nodes make a valid <strong>XML</strong> file and a valid hierarchy.</p>
<p>Now, the nested set model can be described in one sentence:</p>
<div class="plainnote">
To store a hierarchy in a nested set model, we just store the <strong>line numbers</strong> of the opening and closing tags of each node as if it were an <strong>XML</strong> file.
</div>
<p>Each set of nodes having a common ancestor is nested within the node of this ancestor. That&#8217;s why this model is called <q>nested sets</q>.</p>
<p>Historically, these line numbers are stored in columns named <code>lft</code> and <code>rgt</code> (since <code>LEFT</code> and <code>RIGHT</code> are reserved words in most <strong>SQL</strong> dialects).</p>
<p>That&#8217;s how this hierarchy would look in a nested sets model:</p>
<table class="excel">
<tr>
<th>ID</th>
<th>LFT</th>
<th>RGT</th>
</tr>
<tr>
<td class="integer">0</td>
<td class="integer">1</td>
<td class="integer">14</td>
</tr>
<tr>
<td class="integer">1</td>
<td class="integer">2</td>
<td class="integer">12</td>
</tr>
<tr>
<td class="integer">2</td>
<td class="integer">3</td>
<td class="integer">3</td>
</tr>
<tr>
<td class="integer">3</td>
<td class="integer">4</td>
<td class="integer">6</td>
</tr>
<tr>
<td class="integer">4</td>
<td class="integer">5</td>
<td class="integer">5</td>
</tr>
<tr>
<td class="integer">5</td>
<td class="integer">7</td>
<td class="integer">11</td>
</tr>
<tr>
<td class="integer">6</td>
<td class="integer">8</td>
<td class="integer">10</td>
</tr>
<tr>
<td class="integer">7</td>
<td class="integer">9</td>
<td class="integer">9</td>
</tr>
<tr>
<td class="integer">8</td>
<td class="integer">13</td>
<td class="integer">13</td>
</tr>
</table>
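<p>The nesting rule can be checked with a query. The sketch below keeps the nine rows from the table above in an inline <code>VALUES</code> list (the name <code>nodes</code> and the column aliases are made up) and looks for pairs of ranges that overlap only partially:</p>
<pre class="brush: sql">
WITH    nodes (id, lft, rgt) AS
        (
        VALUES
        (0, 1, 14), (1, 2, 12), (2, 3, 3),
        (3, 4, 6), (4, 5, 5), (5, 7, 11),
        (6, 8, 10), (7, 9, 9), (8, 13, 13)
        )
SELECT  a.id AS opened_first, b.id AS opened_second
FROM    nodes a
JOIN    nodes b
        -- b opens inside a but closes after it: a partial overlap
ON      b.lft &gt; a.lft
        AND b.lft &lt;= a.rgt
        AND b.rgt &gt; a.rgt
</pre>
<p>For the hierarchy above this returns no rows, which is what we expect from a properly built nested sets table.</p>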
<p>This model is more <strong>SQL</strong> friendly, since the tasks described above can be performed in <strong>SQL</strong> without using recursion.</p>
<p>To find out all ancestors of a given node, we just select all nodes that <strong>contain</strong> its <code>LFT</code> boundary (which in a properly built hierarchy implies containing the <code>RGT</code> boundary too):</p>
<pre class="brush: sql">
SELECT  hp.*
FROM    t_hierarchy hc
JOIN    t_hierarchy hp
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hc.id = ?
</pre>
<p>And to find the descendants, we should just reverse the condition, i. e. find all nodes that <strong>are contained</strong> between the current node&#8217;s boundaries:</p>
<pre class="brush: sql">
SELECT  hc.*
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = ?
</pre>
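<p>Both queries can be tried on the small sample hierarchy without creating any tables. This is a sketch that, as an assumption, keeps the nine sample rows in an inline <code>VALUES</code> list named <code>nodes</code> instead of <code>t_hierarchy</code>:</p>
<pre class="brush: sql">
WITH    nodes (id, lft, rgt) AS
        (
        VALUES
        (0, 1, 14), (1, 2, 12), (2, 3, 3),
        (3, 4, 6), (4, 5, 5), (5, 7, 11),
        (6, 8, 10), (7, 9, 9), (8, 13, 13)
        )
-- Nodes whose range contains the line on which node 7 opens
SELECT  &#039;ancestors of 7&#039; AS what, hp.id
FROM    nodes hc
JOIN    nodes hp
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hc.id = 7
UNION ALL
-- Nodes that open within the range of node 5
SELECT  &#039;descendants of 5&#039;, hc.id
FROM    nodes hp
JOIN    nodes hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = 5
</pre>
<p>The first part returns nodes <strong>0</strong>, <strong>1</strong>, <strong>5</strong>, <strong>6</strong> and <strong>7</strong>, the second one returns <strong>5</strong>, <strong>6</strong> and <strong>7</strong>. Note that with these conditions the node itself is a part of both results.</p>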
<p>Ironically, selecting all descendants up to a given depth (which is the least problem for the adjacency list as long as the depth is known at design time) is the hardest task for the nested set model. Even obtaining the list of immediate parents and immediate children is not so simple.</p>
<p>However, this is solvable. This is the query to get all descendants up to the third generation (that is, the node itself, all its children and grandchildren):</p>
<pre class="brush: sql">
SELECT  hc.*
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = ?
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hn
        WHERE   hc.lft BETWEEN hn.lft AND hn.rgt
                AND hn.lft BETWEEN hp.lft AND hp.rgt
        ) &lt;= 3
</pre>
<p>Unlike in the adjacency list model, the depth level can be parametrized in this query, which makes it possible to use a single query for all depth levels.</p>
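<p>The same counting trick gives the immediate children. This is a sketch derived from the query above rather than a tuned solution: a node is an immediate child if exactly two nodes within the branch contain it, namely the branch root and the node itself:</p>
<pre class="brush: sql">
SELECT  hc.*
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = ?
        AND
        (
        -- Number of nodes on the path from the branch root down to hc
        SELECT  COUNT(*)
        FROM    t_hierarchy hn
        WHERE   hc.lft BETWEEN hn.lft AND hn.rgt
                AND hn.lft BETWEEN hp.lft AND hp.rgt
        ) = 2
</pre>
<p>In the sample hierarchy, running this for node <strong>1</strong> would return nodes <strong>2</strong>, <strong>3</strong> and <strong>5</strong>.</p>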
<p>Nested set model can be relatively easily queried for, but it&#8217;s extremely hard to manage.</p>
<p>To insert a new child into a node or make an existing node a child in an adjacency list, all we need is to provide the new value of its parent column. With a single update we can move a whole branch.</p>
<p>To add a new node into a nested set model, we should do exactly the same as if we were adding a new node into an <strong>XML</strong> file: all subsequent nodes are moved several lines further. Since the boundaries in the <strong>SQL</strong> table represent the line numbers, we should do the same: calculate the offset and make a batch update to all nodes to the right of the updated or inserted one. This is very hard to implement and very inefficient.</p>
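<p>To give an idea of what such a batch update involves, here is a sketch that adds a new node <strong>9</strong> as the last child of node <strong>5</strong> in the sample hierarchy. The temporary table <code>t_nested_demo</code> is made up for this example, and the sketch assumes that the parent already has children (that is, its opening and closing tags are on different lines):</p>
<pre class="brush: sql">
CREATE TEMPORARY TABLE t_nested_demo
        (
        id INT NOT NULL PRIMARY KEY,
        lft INT NOT NULL,
        rgt INT NOT NULL
        );

INSERT
INTO    t_nested_demo
VALUES  (0, 1, 14), (1, 2, 12), (2, 3, 3),
        (3, 4, 6), (4, 5, 5), (5, 7, 11),
        (6, 8, 10), (7, 9, 9), (8, 13, 13);

-- Node 5 closes on line 11. Free this line by moving
-- every boundary on line 11 or below it one line down
UPDATE  t_nested_demo
SET     lft = CASE WHEN lft &gt;= 11 THEN lft + 1 ELSE lft END,
        rgt = rgt + 1
WHERE   rgt &gt;= 11;

-- The new childless node takes the freed line
INSERT
INTO    t_nested_demo
VALUES  (9, 11, 11);
</pre>
<p>After this, node <strong>5</strong> spans lines <strong>7</strong> to <strong>12</strong>. Four existing rows (<strong>0</strong>, <strong>1</strong>, <strong>5</strong> and <strong>8</strong>) had to be rewritten to add a single node, and in a large table the share of rewritten rows can be just as high.</p>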
<p>Now, the good news.</p>
<p>Three of four major systems (that is <strong>SQL Server</strong>, <strong>Oracle</strong> and <strong>PostgreSQL 8.4</strong>) now support recursion natively.</p>
<p>The fourth one (<strong>MySQL</strong>) does not support it, but it can be emulated to the extent required to run queries against the hierarchical data modelled according to adjacency list model:</p>
<ul>
<li><a href="http://explainextended.com/2009/03/17/hierarchical-queries-in-mysql/"><strong>Hierarchical queries in MySQL</strong></a></li>
</ul>
<p>In this series of articles, we will compare efficiency of the <strong>adjacency list</strong> model to that of the <strong>nested sets</strong> model.</p>
<p><strong>PostgreSQL 8.4</strong> is the system we begin with.</p>
<h3>Analysis</h3>
<p><strong>PostgreSQL 8.4</strong> supports recursive queries by means of so-called <strong>hierarchical CTE&#8217;s</strong>.</p>
<p>A hierarchical CTE is an analog of this query:</p>
<pre class="brush: sql">
SELECT  *
FROM    t_hierarchy h1
WHERE   … 

UNION ALL

SELECT  *
FROM    t_hierarchy h1
JOIN    t_hierarchy h2
ON    …
WHERE   … 

UNION ALL

SELECT  *
FROM    t_hierarchy h1
JOIN    t_hierarchy h2
ON    …
JOIN    t_hierarchy h3
ON    …
WHERE   …

…
</pre>
<p>with a theoretically unlimited number of <code>UNION ALL</code>&#8217;s built at runtime and the results of each query cached and called recursively.</p>
<p>To define such a construct, one uses the <code>WITH RECURSIVE</code> clause.</p>
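<p>The shape of the clause is easiest to see on a query that has nothing to do with hierarchies. This minimal sketch (the names <code>counter</code> and <code>n</code> are made up) has an anchor part that returns a single row and a recursive part that is run against the rows produced on the previous step until it returns nothing:</p>
<pre class="brush: sql">
WITH    RECURSIVE
        counter (n) AS
        (
        -- Anchor part: runs once
        SELECT  1
        UNION ALL
        -- Recursive part: runs against the result of the previous step
        SELECT  n + 1
        FROM    counter
        WHERE   n &lt; 5
        )
SELECT  n
FROM    counter
</pre>
<p>This returns the numbers from <strong>1</strong> to <strong>5</strong>. The recursion stops when the recursive part returns an empty set, which happens on the step that receives <strong>5</strong>.</p>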
<p>To compare both methods, we will create a sample table which combines both data models. Each node will have both <code>parent</code> and the boundaries (<code>lft</code> and <code>rgt</code>) defined. Then we will run the three most important queries, which, again, are:</p>
<ul>
<li>Find all descendants of a given node</li>
<li>Find all ancestors of a given node</li>
<li>Find all descendants of a given node up to a certain depth</li>
</ul>
<p>Here&#8217;s the script to create a sample table:</p>
<p><a href="#" onclick="xcollapse('X6736');return false;"><strong>Table creation details</strong></a><br />
</p>
<div id="X6736" style="display: none; ">
<pre class="brush: sql">
CREATE TABLE t_hierarchy (
        id INT NOT NULL,
        parent INT NOT NULL,
        lft INT NOT NULL,
        rgt INT NOT NULL,
        data VARCHAR(100) NOT NULL,
        stuffing VARCHAR(100) NOT NULL
);

INSERT
INTO    t_hierarchy
WITH RECURSIVE
        ini AS
        (
        SELECT  8 AS level, 5 AS children
        ),
        range AS
        (
        SELECT  level, children,
                (
                SELECT  SUM(POW(children, n)::INTEGER * ((n &lt; level)::INTEGER + 1))
                FROM    generate_series(level, 0, -1) n
                ) width
        FROM    ini
        ),
        q AS
        (
        SELECT  s AS id, 0 AS parent, level, children,
                1 + width * (s - 1) AS lft,
                1 + width * s - 1 AS rgt,
                width / children AS width
        FROM    (
                SELECT  r.*, generate_series(1, children) s
                FROM    range r
                ) q2
        UNION ALL
        SELECT  id * children + position, id, level - 1, children,
                1 + lft + width * (position - 1),
                1 + lft + width * position - 1,
                width / children
        FROM    (
                SELECT  generate_series(1, children) AS position, q.*
                FROM    q
                ) q2
        WHERE   level &gt; 0
        )
SELECT  id, parent, lft, rgt, &#039;Value &#039; || id, RPAD(&#039;&#039;, 100, &#039;*&#039;)
FROM    q;

ALTER TABLE t_hierarchy ADD CONSTRAINT pk_hierarchy_id PRIMARY KEY (id);
CREATE INDEX ix_hierarchy_lft ON t_hierarchy (lft);
CREATE INDEX ix_hierarchy_rgt ON t_hierarchy (rgt);
CREATE INDEX ix_hierarchy_parent ON t_hierarchy (parent);

ANALYZE t_hierarchy;
</pre>
</div>
<p>The table contains <strong>5</strong> top-level nodes and <strong>8</strong> levels of descendants below them, with each node except the leaves having <strong>5</strong> immediate children. This makes the table <strong>2,441,405</strong> records long.</p>
<p>Each record has a <strong>100</strong>-byte long field <code>stuffing</code> which emulates the payload in actual tables.</p>
<p>The fields <code>parent</code>, <code>lft</code> and <code>rgt</code> are indexed.</p>
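<p>The record count can be cross-checked with a bit of arithmetic. The script above starts with <strong>5</strong> top-level nodes and multiplies the number of nodes by <strong>5</strong> on each of the <strong>8</strong> subsequent steps, so the total is the sum of the powers of <strong>5</strong> from the first to the ninth:</p>
<pre class="brush: sql">
-- 5 + 25 + 125 + ... + 1953125
SELECT  SUM(POW(5, n)::INTEGER) AS records
FROM    generate_series(1, 9) n
</pre>
<p>This gives <strong>2441405</strong>, the same number that <code>SELECT COUNT(*) FROM t_hierarchy</code> should return once the table is filled.</p>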
<h3>All descendants</h3>
<p>There are lots of descendants, that&#8217;s why we will select and aggregate the lengths of their <code>stuffing</code> fields. Since that field is not indexed, it will emulate selection of all values from an actual table rather well.</p>
<h4>Nested sets</h4>
<pre class="brush: sql">
SELECT  SUM(LENGTH(hc.stuffing))
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = 42
</pre>
<p><a href="#" onclick="xcollapse('X6982');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X6982" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">1953100</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0559s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=63042.65..63042.66 rows=1 width=101)
  -&gt;  Nested Loop  (cost=5761.08..62364.46 rows=271274 width=101)
        -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
              Index Cond: (id = 42)
        -&gt;  Bitmap Heap Scan on t_hierarchy hc  (cost=5761.08..58286.82 rows=271274 width=105)
              Recheck Cond: ((hc.lft &gt;= hp.lft) AND (hc.lft &lt;= hp.rgt))
              -&gt;  Bitmap Index Scan on ix_hierarchy_lft  (cost=0.00..5693.26 rows=271274 width=0)
                    Index Cond: ((hc.lft &gt;= hp.lft) AND (hc.lft &lt;= hp.rgt))
</pre>
</div>
<p>The nested sets model is particularly good for this kind of query, since it requires a single range scan on the index on <code>lft</code>.</p>
<p>This query runs for <strong>50 ms</strong>.</p>
<h4>Adjacency list</h4>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, stuffing
        FROM    t_hierarchy h
        WHERE   id = 42
        UNION ALL
        SELECT  hc.id, hc.stuffing
        FROM    q
        JOIN    t_hierarchy hc
        ON      hc.parent = q.id
        )
SELECT  SUM(LENGTH(stuffing))
FROM    q
</pre>
<p><a href="#" onclick="xcollapse('X2033');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X2033" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>sum</th>
</tr>
<tr>
<td class="int8">1953100</td>
</tr>
<tr class="statusbar">
<td colspan="100">1 row fetched in 0.0001s (0.0985s)</td>
</tr>
</table>
</div>
<pre>
Aggregate  (cost=915.24..915.26 rows=1 width=218)
  CTE q
    -&gt;  Recursive Union  (cost=0.00..898.34 rows=751 width=105)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy h  (cost=0.00..8.54 rows=1 width=105)
                Index Cond: (id = 42)
          -&gt;  Nested Loop  (cost=0.00..87.48 rows=75 width=105)
                -&gt;  WorkTable Scan on q  (cost=0.00..0.20 rows=10 width=4)
                -&gt;  Index Scan using ix_hierarchy_parent on t_hierarchy hc  (cost=0.00..8.64 rows=7 width=109)
                      Index Cond: (hc.parent = q.id)
  -&gt;  CTE Scan on q  (cost=0.00..15.02 rows=751 width=218)
</pre>
</div>
<p>This query is slightly less efficient, since it requires several index scans instead of a single one. However, the resulting set of rows is of course the same (because the values returned are the same).</p>
<p>This query completes in <strong>98 ms</strong>, which is less than twice as long as the nested sets one.</p>
<h3>All ancestors</h3>
<h4>Nested sets</h4>
<pre class="brush: sql">
SELECT  hp.id, hp.parent, hp.lft, hp.rgt, hp.data
FROM    t_hierarchy hc
JOIN    t_hierarchy hp
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hc.id = 1000000
ORDER BY
        hp.lft
</pre>
<p><a href="#" onclick="xcollapse('X6127');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X6127" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">0</td>
<td class="int4">585938</td>
<td class="int4">1171874</td>
<td class="varchar">Value 2</td>
</tr>
<tr>
<td class="int4">12</td>
<td class="int4">2</td>
<td class="int4">703126</td>
<td class="int4">820312</td>
<td class="varchar">Value 12</td>
</tr>
<tr>
<td class="int4">63</td>
<td class="int4">12</td>
<td class="int4">750001</td>
<td class="int4">773437</td>
<td class="varchar">Value 63</td>
</tr>
<tr>
<td class="int4">319</td>
<td class="int4">63</td>
<td class="int4">764063</td>
<td class="int4">768749</td>
<td class="varchar">Value 319</td>
</tr>
<tr>
<td class="int4">1599</td>
<td class="int4">319</td>
<td class="int4">766875</td>
<td class="int4">767811</td>
<td class="varchar">Value 1599</td>
</tr>
<tr>
<td class="int4">7999</td>
<td class="int4">1599</td>
<td class="int4">767437</td>
<td class="int4">767623</td>
<td class="varchar">Value 7999</td>
</tr>
<tr>
<td class="int4">39999</td>
<td class="int4">7999</td>
<td class="int4">767549</td>
<td class="int4">767585</td>
<td class="varchar">Value 39999</td>
</tr>
<tr>
<td class="int4">199999</td>
<td class="int4">39999</td>
<td class="int4">767571</td>
<td class="int4">767577</td>
<td class="varchar">Value 199999</td>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">199999</td>
<td class="int4">767576</td>
<td class="int4">767576</td>
<td class="varchar">Value 1000000</td>
</tr>
<tr class="statusbar">
<td colspan="100">9 rows fetched in 0.0006s (1.8281s)</td>
</tr>
</table>
</div>
<pre>
Sort  (cost=109888.38..110566.56 rows=271274 width=29)
  Sort Key: hp.lft
  -&gt;  Nested Loop  (cost=15239.64..85406.72 rows=271274 width=29)
        Join Filter: (hc.lft &gt;= hp.lft)
        -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hc  (cost=0.00..8.54 rows=1 width=4)
              Index Cond: (id = 1000000)
        -&gt;  Bitmap Heap Scan on t_hierarchy hp  (cost=15239.64..73190.86 rows=813822 width=29)
              Recheck Cond: (hc.lft &lt;= hp.rgt)
              -&gt;  Bitmap Index Scan on ix_hierarchy_rgt  (cost=0.00..15036.18 rows=813822 width=0)
                    Index Cond: (hc.lft &lt;= hp.rgt)
</pre>
</div>
<p>This query returns far fewer rows than the previous one had to process (only <strong>9</strong> rows instead of almost <strong>20,000</strong>), but due to its nature it is much slower and takes almost <strong>2 seconds</strong>.</p>
<p>This is because we search the other way round in this case: instead of looking for an indexed value within a range of constants, we need to match a constant against a list of ranges.</p>
<p>Ranges cannot be efficiently indexed using <strong>B-Tree</strong> indexes. That&#8217;s why <strong>PostgreSQL</strong> uses only one part of the condition (<code>hc.lft &lt;= hp.rgt</code>) for the index, builds a bitmap on it, scans the table using this bitmap and then filters the rows using the second part of the condition (<code>hc.lft &gt;= hp.lft</code>).</p>
<p>This is quite a costly operation, since it requires an index scan (which <strong>PostgreSQL</strong> is not very good at) that returns almost half of all rows.</p>
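<p>To make the difference concrete, here is a minimal sketch of both kinds of search with the join replaced by constants. The values are taken from the result tables in this article (<code>257814</code> and <code>281250</code> are the <code>lft</code> and <code>rgt</code> of node <strong>42</strong>, <code>767576</code> is the <code>lft</code> of node <strong>1,000,000</strong>); on a different data set they would have to be looked up first:</p>
<pre class="brush: sql">
-- Descendants of node 42: an indexed column between two constants.
-- A single range scan on the index on lft can satisfy this.
SELECT  SUM(LENGTH(stuffing))
FROM    t_hierarchy
WHERE   lft BETWEEN 257814 AND 281250;

-- Ancestors of node 1,000,000: a constant between two columns.
-- A B-Tree index can serve only one of the two comparisons;
-- the other one has to be checked row by row.
SELECT  id, parent, lft, rgt, data
FROM    t_hierarchy
WHERE   767576 BETWEEN lft AND rgt
ORDER BY
        lft;
</pre>
<p>Assuming the same <code>t_hierarchy</code> table, these should return the same values as the join-based nested sets queries, only without the lookup of the starting node.</p>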
<h4>Adjacency list</h4>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  h.*, 1 AS level
        FROM    t_hierarchy h
        WHERE   id = 1000000
        UNION ALL
        SELECT  hp.*, level + 1
        FROM    q
        JOIN    t_hierarchy hp
        ON      hp.id = q.parent
        )
SELECT  id, parent, lft, rgt, data
FROM    q
ORDER BY
        level DESC
</pre>
<p><a href="#" onclick="xcollapse('X4855');return false;"><strong>View query details</strong></a><br />
</p>
<div id="X4855" style="display: none; ">
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">2</td>
<td class="int4">0</td>
<td class="int4">585938</td>
<td class="int4">1171874</td>
<td class="varchar">Value 2</td>
</tr>
<tr>
<td class="int4">12</td>
<td class="int4">2</td>
<td class="int4">703126</td>
<td class="int4">820312</td>
<td class="varchar">Value 12</td>
</tr>
<tr>
<td class="int4">63</td>
<td class="int4">12</td>
<td class="int4">750001</td>
<td class="int4">773437</td>
<td class="varchar">Value 63</td>
</tr>
<tr>
<td class="int4">319</td>
<td class="int4">63</td>
<td class="int4">764063</td>
<td class="int4">768749</td>
<td class="varchar">Value 319</td>
</tr>
<tr>
<td class="int4">1599</td>
<td class="int4">319</td>
<td class="int4">766875</td>
<td class="int4">767811</td>
<td class="varchar">Value 1599</td>
</tr>
<tr>
<td class="int4">7999</td>
<td class="int4">1599</td>
<td class="int4">767437</td>
<td class="int4">767623</td>
<td class="varchar">Value 7999</td>
</tr>
<tr>
<td class="int4">39999</td>
<td class="int4">7999</td>
<td class="int4">767549</td>
<td class="int4">767585</td>
<td class="varchar">Value 39999</td>
</tr>
<tr>
<td class="int4">199999</td>
<td class="int4">39999</td>
<td class="int4">767571</td>
<td class="int4">767577</td>
<td class="varchar">Value 199999</td>
</tr>
<tr>
<td class="int4">1000000</td>
<td class="int4">199999</td>
<td class="int4">767576</td>
<td class="int4">767576</td>
<td class="varchar">Value 1000000</td>
</tr>
<tr class="statusbar">
<td colspan="100">9 rows fetched in 0.0006s (0.0044s)</td>
</tr>
</table>
</div>
<pre>
Sort  (cost=872.98..873.23 rows=101 width=238)
  Sort Key: q.level
  CTE q
    -&gt;  Recursive Union  (cost=0.00..867.59 rows=101 width=134)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy h  (cost=0.00..8.54 rows=1 width=130)
                Index Cond: (id = 1000000)
          -&gt;  Nested Loop  (cost=0.00..85.70 rows=10 width=134)
                -&gt;  WorkTable Scan on q  (cost=0.00..0.20 rows=10 width=8)
                -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=130)
                      Index Cond: (hp.id = q.parent)
  -&gt;  CTE Scan on q  (cost=0.00..2.02 rows=101 width=238)
</pre>
</div>
<p>Now this query is practically instant: only <strong>4 ms</strong>, which is within the time measurement error range.</p>
<p>Recursion does a very good job here: since the depth of the hierarchy is limited, traversing the tree upwards takes only <strong>9</strong> index lookups on the <code>PRIMARY KEY</code> followed by a sort of <strong>9</strong> values. Both operations are very simple and complete in no time.</p>
<h3>Descendants up to a given level</h3>
<h4>Nested sets</h4>
<p>We will run two queries: one with a node close to the root, the second one with a node far from the root.</p>
<pre class="brush: sql">
SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = ?
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hn
        WHERE   hc.lft BETWEEN hn.lft AND hn.rgt
                AND hn.lft BETWEEN hp.lft AND hp.rgt
        ) &lt;= 3
</pre>
<p><a href="#" onclick="xcollapse('X8533');return false;"><strong>View query details for node 42</strong></a><br />
</p>
<div id="X8533" style="display: none; ">
<pre class="brush: sql">
SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = 42
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hn
        WHERE   hc.lft BETWEEN hn.lft AND hn.rgt
                AND hn.lft BETWEEN hp.lft AND hp.rgt
        ) &lt;= 3
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">8</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="varchar">Value 42</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">42</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="varchar">Value 211</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">211</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="varchar">Value 1056</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">211</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="varchar">Value 1057</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">211</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="varchar">Value 1058</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">211</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="varchar">Value 1059</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">211</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="varchar">Value 1060</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">42</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="varchar">Value 212</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">212</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="varchar">Value 1061</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">212</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="varchar">Value 1062</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">212</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="varchar">Value 1063</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">212</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="varchar">Value 1064</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">212</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="varchar">Value 1065</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">42</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="varchar">Value 213</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">213</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="varchar">Value 1066</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">213</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="varchar">Value 1067</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">213</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="varchar">Value 1068</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">213</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="varchar">Value 1069</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">213</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="varchar">Value 1070</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">42</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="varchar">Value 214</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">214</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="varchar">Value 1071</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">214</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="varchar">Value 1072</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">214</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="varchar">Value 1073</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">214</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="varchar">Value 1074</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">214</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="varchar">Value 1075</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">42</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="varchar">Value 215</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">215</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="varchar">Value 1076</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">215</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="varchar">Value 1077</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">215</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="varchar">Value 1078</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">215</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="varchar">Value 1079</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">215</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="varchar">Value 1080</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0156s (120.6055s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=0.00..6875628456.64 rows=90425 width=29)
  Join Filter: ((SubPlan 1) &lt;= 3)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 42)
  -&gt;  Index Scan using ix_hierarchy_lft on t_hierarchy hc  (cost=0.00..535180.52 rows=271274 width=29)
        Index Cond: ((hc.lft &gt;= hp.lft) AND (hc.lft &lt;= hp.rgt))
  SubPlan 1
    -&gt;  Aggregate  (cost=25343.70..25343.71 rows=1 width=0)
          -&gt;  Index Scan using ix_hierarchy_lft on t_hierarchy hn  (cost=0.00..25333.52 rows=4069 width=0)
                Index Cond: (($0 &gt;= lft) AND (lft &gt;= $1) AND (lft &lt;= $2))
                Filter: ($0 &lt;= rgt)
</pre>
</div>
<p><a href="#" onclick="xcollapse('X854');return false;"><strong>View query details for node 31,415</strong></a><br />
</p>
<div id="X854" style="display: none; ">
<pre class="brush: sql">
SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = 31415
        AND
        (
        SELECT  COUNT(*)
        FROM    t_hierarchy hn
        WHERE   hc.lft BETWEEN hn.lft AND hn.rgt
                AND hn.lft BETWEEN hp.lft AND hp.rgt
        ) &lt;= 3
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">31415</td>
<td class="int4">6282</td>
<td class="int4">445651</td>
<td class="int4">445687</td>
<td class="varchar">Value 31415</td>
</tr>
<tr>
<td class="int4">157076</td>
<td class="int4">31415</td>
<td class="int4">445652</td>
<td class="int4">445658</td>
<td class="varchar">Value 157076</td>
</tr>
<tr>
<td class="int4">785381</td>
<td class="int4">157076</td>
<td class="int4">445653</td>
<td class="int4">445653</td>
<td class="varchar">Value 785381</td>
</tr>
<tr>
<td class="int4">785382</td>
<td class="int4">157076</td>
<td class="int4">445654</td>
<td class="int4">445654</td>
<td class="varchar">Value 785382</td>
</tr>
<tr>
<td class="int4">785383</td>
<td class="int4">157076</td>
<td class="int4">445655</td>
<td class="int4">445655</td>
<td class="varchar">Value 785383</td>
</tr>
<tr>
<td class="int4">785384</td>
<td class="int4">157076</td>
<td class="int4">445656</td>
<td class="int4">445656</td>
<td class="varchar">Value 785384</td>
</tr>
<tr>
<td class="int4">785385</td>
<td class="int4">157076</td>
<td class="int4">445657</td>
<td class="int4">445657</td>
<td class="varchar">Value 785385</td>
</tr>
<tr>
<td class="int4">157077</td>
<td class="int4">31415</td>
<td class="int4">445659</td>
<td class="int4">445665</td>
<td class="varchar">Value 157077</td>
</tr>
<tr>
<td class="int4">785386</td>
<td class="int4">157077</td>
<td class="int4">445660</td>
<td class="int4">445660</td>
<td class="varchar">Value 785386</td>
</tr>
<tr>
<td class="int4">785387</td>
<td class="int4">157077</td>
<td class="int4">445661</td>
<td class="int4">445661</td>
<td class="varchar">Value 785387</td>
</tr>
<tr>
<td class="int4">785388</td>
<td class="int4">157077</td>
<td class="int4">445662</td>
<td class="int4">445662</td>
<td class="varchar">Value 785388</td>
</tr>
<tr>
<td class="int4">785389</td>
<td class="int4">157077</td>
<td class="int4">445663</td>
<td class="int4">445663</td>
<td class="varchar">Value 785389</td>
</tr>
<tr>
<td class="int4">785390</td>
<td class="int4">157077</td>
<td class="int4">445664</td>
<td class="int4">445664</td>
<td class="varchar">Value 785390</td>
</tr>
<tr>
<td class="int4">157078</td>
<td class="int4">31415</td>
<td class="int4">445666</td>
<td class="int4">445672</td>
<td class="varchar">Value 157078</td>
</tr>
<tr>
<td class="int4">785391</td>
<td class="int4">157078</td>
<td class="int4">445667</td>
<td class="int4">445667</td>
<td class="varchar">Value 785391</td>
</tr>
<tr>
<td class="int4">785392</td>
<td class="int4">157078</td>
<td class="int4">445668</td>
<td class="int4">445668</td>
<td class="varchar">Value 785392</td>
</tr>
<tr>
<td class="int4">785393</td>
<td class="int4">157078</td>
<td class="int4">445669</td>
<td class="int4">445669</td>
<td class="varchar">Value 785393</td>
</tr>
<tr>
<td class="int4">785394</td>
<td class="int4">157078</td>
<td class="int4">445670</td>
<td class="int4">445670</td>
<td class="varchar">Value 785394</td>
</tr>
<tr>
<td class="int4">785395</td>
<td class="int4">157078</td>
<td class="int4">445671</td>
<td class="int4">445671</td>
<td class="varchar">Value 785395</td>
</tr>
<tr>
<td class="int4">157079</td>
<td class="int4">31415</td>
<td class="int4">445673</td>
<td class="int4">445679</td>
<td class="varchar">Value 157079</td>
</tr>
<tr>
<td class="int4">785396</td>
<td class="int4">157079</td>
<td class="int4">445674</td>
<td class="int4">445674</td>
<td class="varchar">Value 785396</td>
</tr>
<tr>
<td class="int4">785397</td>
<td class="int4">157079</td>
<td class="int4">445675</td>
<td class="int4">445675</td>
<td class="varchar">Value 785397</td>
</tr>
<tr>
<td class="int4">785398</td>
<td class="int4">157079</td>
<td class="int4">445676</td>
<td class="int4">445676</td>
<td class="varchar">Value 785398</td>
</tr>
<tr>
<td class="int4">785399</td>
<td class="int4">157079</td>
<td class="int4">445677</td>
<td class="int4">445677</td>
<td class="varchar">Value 785399</td>
</tr>
<tr>
<td class="int4">785400</td>
<td class="int4">157079</td>
<td class="int4">445678</td>
<td class="int4">445678</td>
<td class="varchar">Value 785400</td>
</tr>
<tr>
<td class="int4">157080</td>
<td class="int4">31415</td>
<td class="int4">445680</td>
<td class="int4">445686</td>
<td class="varchar">Value 157080</td>
</tr>
<tr>
<td class="int4">785401</td>
<td class="int4">157080</td>
<td class="int4">445681</td>
<td class="int4">445681</td>
<td class="varchar">Value 785401</td>
</tr>
<tr>
<td class="int4">785402</td>
<td class="int4">157080</td>
<td class="int4">445682</td>
<td class="int4">445682</td>
<td class="varchar">Value 785402</td>
</tr>
<tr>
<td class="int4">785403</td>
<td class="int4">157080</td>
<td class="int4">445683</td>
<td class="int4">445683</td>
<td class="varchar">Value 785403</td>
</tr>
<tr>
<td class="int4">785404</td>
<td class="int4">157080</td>
<td class="int4">445684</td>
<td class="int4">445684</td>
<td class="varchar">Value 785404</td>
</tr>
<tr>
<td class="int4">785405</td>
<td class="int4">157080</td>
<td class="int4">445685</td>
<td class="int4">445685</td>
<td class="varchar">Value 785405</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0017s (0.0523s)</td>
</tr>
</table>
</div>
<pre>
Nested Loop  (cost=0.00..6875628456.64 rows=90425 width=29)
  Join Filter: ((SubPlan 1) &lt;= 3)
  -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hp  (cost=0.00..8.54 rows=1 width=8)
        Index Cond: (id = 31415)
  -&gt;  Index Scan using ix_hierarchy_lft on t_hierarchy hc  (cost=0.00..535180.52 rows=271274 width=29)
        Index Cond: ((hc.lft &gt;= hp.lft) AND (hc.lft &lt;= hp.rgt))
  SubPlan 1
    -&gt;  Aggregate  (cost=25343.70..25343.71 rows=1 width=0)
          -&gt;  Index Scan using ix_hierarchy_lft on t_hierarchy hn  (cost=0.00..25333.52 rows=4069 width=0)
                Index Cond: (($0 &gt;= lft) AND (lft &gt;= $1) AND (lft &lt;= $2))
                Filter: ($0 &lt;= rgt)
</pre>
</div>
<p>We see that the second query is reasonably fast (completes in <strong>50 ms</strong>).</p>
<p>However, the first query (which is in fact more often used) takes <strong>120.6 seconds</strong>, or more than <strong>2 minutes</strong>!</p>
<p>This is because the query has to count the ancestors of every descendant of the given node before it can filter any of them out.</p>
<p>It&#8217;s fast for the nodes that are further from the root (since they don&#8217;t have lots of descendants), but it may become a real problem when trying to obtain, say, children and grandchildren of a root node.</p>
<p>And this is the task most online catalogs begin their work with: they need to show first-level categories and subcategories. <strong>2 minutes</strong> is way too much for this.</p>
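<p>To see what the correlated subquery does, it can be run on its own for a single descendant. The sketch below assumes the same <code>t_hierarchy</code> table and picks node <strong>1056</strong> (a grandchild of node <strong>42</strong> in the results above) purely for illustration:</p>
<pre class="brush: sql">
-- Relative level of node 1056 inside the subtree of node 42:
-- count the nodes that contain 1056 and are themselves within 42
SELECT  COUNT(*) AS relative_level
FROM    t_hierarchy hp
JOIN    t_hierarchy hc
ON      hc.lft BETWEEN hp.lft AND hp.rgt
JOIN    t_hierarchy hn
ON      hc.lft BETWEEN hn.lft AND hn.rgt
        AND hn.lft BETWEEN hp.lft AND hp.rgt
WHERE   hp.id = 42
        AND hc.id = 1056
</pre>
<p>With the data shown above this should count three nodes (<strong>42</strong>, <strong>211</strong> and <strong>1056</strong>). The original query repeats this count for every descendant of the starting node before it can discard the ones that are too deep, which is why the cost grows so quickly for nodes close to the root.</p>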
<h4>Adjacency list</h4>
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, parent, lft, rgt, data, ARRAY[id] AS level
        FROM    t_hierarchy hc
        WHERE   id = ?
        UNION ALL
        SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data, q.level || hc.id
        FROM    q
        JOIN    t_hierarchy hc
        ON      hc.parent = q.id
        WHERE   array_upper(level, 1) &lt; 3
        )
SELECT  id, parent, lft, rgt, data
FROM    q
ORDER BY
        level
</pre>
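<p>Note the <code>level</code> array and the <code>ORDER BY</code> on it. They are intended to preserve the tree-like ordering: arrays are ordered lexicographically in <strong>PostgreSQL</strong>, and each <code>level</code> contains the <q>breadcrumbs</q> from the starting node to the current node. Its length, returned by <code>array_upper(level, 1)</code>, is the depth of the node relative to the starting node, which is what the recursion limit checks.</p>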
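<p>The ordering rule is easy to check in isolation. A minimal, self-contained sketch (the array values are node ids from this article&#8217;s data; the column aliases are made up for illustration):</p>
<pre class="brush: sql">
SELECT  -- a parent sorts before its own child: equal prefix, shorter array first
        ARRAY[42, 211] &lt; ARRAY[42, 211, 1056] AS parent_before_child,
        -- a whole subtree sorts before the next sibling: 211 &lt; 212 decides
        ARRAY[42, 211, 1060] &lt; ARRAY[42, 212] AS subtree_before_next_sibling
</pre>
<p>Both comparisons should evaluate to true, which is why sorting on <code>level</code> yields each node immediately followed by its descendants.</p>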
<p>Here are the query results:</p>
<p><a href="#" onclick="xcollapse('X3995');return false;"><strong>View query details for node 42</strong></a><br />
</p>
<div id="X3995" style="display: none; ">
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, parent, lft, rgt, data, ARRAY[id] AS level
        FROM    t_hierarchy hc
        WHERE   id = 42
        UNION ALL
        SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data, q.level || hc.id
        FROM    q
        JOIN    t_hierarchy hc
        ON      hc.parent = q.id
        WHERE   array_upper(level, 1) &lt; 3
        )
SELECT  id, parent, lft, rgt, data
FROM    q
ORDER BY
        level
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">42</td>
<td class="int4">8</td>
<td class="int4">257814</td>
<td class="int4">281250</td>
<td class="varchar">Value 42</td>
</tr>
<tr>
<td class="int4">211</td>
<td class="int4">42</td>
<td class="int4">257815</td>
<td class="int4">262501</td>
<td class="varchar">Value 211</td>
</tr>
<tr>
<td class="int4">1056</td>
<td class="int4">211</td>
<td class="int4">257816</td>
<td class="int4">258752</td>
<td class="varchar">Value 1056</td>
</tr>
<tr>
<td class="int4">1057</td>
<td class="int4">211</td>
<td class="int4">258753</td>
<td class="int4">259689</td>
<td class="varchar">Value 1057</td>
</tr>
<tr>
<td class="int4">1058</td>
<td class="int4">211</td>
<td class="int4">259690</td>
<td class="int4">260626</td>
<td class="varchar">Value 1058</td>
</tr>
<tr>
<td class="int4">1059</td>
<td class="int4">211</td>
<td class="int4">260627</td>
<td class="int4">261563</td>
<td class="varchar">Value 1059</td>
</tr>
<tr>
<td class="int4">1060</td>
<td class="int4">211</td>
<td class="int4">261564</td>
<td class="int4">262500</td>
<td class="varchar">Value 1060</td>
</tr>
<tr>
<td class="int4">212</td>
<td class="int4">42</td>
<td class="int4">262502</td>
<td class="int4">267188</td>
<td class="varchar">Value 212</td>
</tr>
<tr>
<td class="int4">1061</td>
<td class="int4">212</td>
<td class="int4">262503</td>
<td class="int4">263439</td>
<td class="varchar">Value 1061</td>
</tr>
<tr>
<td class="int4">1062</td>
<td class="int4">212</td>
<td class="int4">263440</td>
<td class="int4">264376</td>
<td class="varchar">Value 1062</td>
</tr>
<tr>
<td class="int4">1063</td>
<td class="int4">212</td>
<td class="int4">264377</td>
<td class="int4">265313</td>
<td class="varchar">Value 1063</td>
</tr>
<tr>
<td class="int4">1064</td>
<td class="int4">212</td>
<td class="int4">265314</td>
<td class="int4">266250</td>
<td class="varchar">Value 1064</td>
</tr>
<tr>
<td class="int4">1065</td>
<td class="int4">212</td>
<td class="int4">266251</td>
<td class="int4">267187</td>
<td class="varchar">Value 1065</td>
</tr>
<tr>
<td class="int4">213</td>
<td class="int4">42</td>
<td class="int4">267189</td>
<td class="int4">271875</td>
<td class="varchar">Value 213</td>
</tr>
<tr>
<td class="int4">1066</td>
<td class="int4">213</td>
<td class="int4">267190</td>
<td class="int4">268126</td>
<td class="varchar">Value 1066</td>
</tr>
<tr>
<td class="int4">1067</td>
<td class="int4">213</td>
<td class="int4">268127</td>
<td class="int4">269063</td>
<td class="varchar">Value 1067</td>
</tr>
<tr>
<td class="int4">1068</td>
<td class="int4">213</td>
<td class="int4">269064</td>
<td class="int4">270000</td>
<td class="varchar">Value 1068</td>
</tr>
<tr>
<td class="int4">1069</td>
<td class="int4">213</td>
<td class="int4">270001</td>
<td class="int4">270937</td>
<td class="varchar">Value 1069</td>
</tr>
<tr>
<td class="int4">1070</td>
<td class="int4">213</td>
<td class="int4">270938</td>
<td class="int4">271874</td>
<td class="varchar">Value 1070</td>
</tr>
<tr>
<td class="int4">214</td>
<td class="int4">42</td>
<td class="int4">271876</td>
<td class="int4">276562</td>
<td class="varchar">Value 214</td>
</tr>
<tr>
<td class="int4">1071</td>
<td class="int4">214</td>
<td class="int4">271877</td>
<td class="int4">272813</td>
<td class="varchar">Value 1071</td>
</tr>
<tr>
<td class="int4">1072</td>
<td class="int4">214</td>
<td class="int4">272814</td>
<td class="int4">273750</td>
<td class="varchar">Value 1072</td>
</tr>
<tr>
<td class="int4">1073</td>
<td class="int4">214</td>
<td class="int4">273751</td>
<td class="int4">274687</td>
<td class="varchar">Value 1073</td>
</tr>
<tr>
<td class="int4">1074</td>
<td class="int4">214</td>
<td class="int4">274688</td>
<td class="int4">275624</td>
<td class="varchar">Value 1074</td>
</tr>
<tr>
<td class="int4">1075</td>
<td class="int4">214</td>
<td class="int4">275625</td>
<td class="int4">276561</td>
<td class="varchar">Value 1075</td>
</tr>
<tr>
<td class="int4">215</td>
<td class="int4">42</td>
<td class="int4">276563</td>
<td class="int4">281249</td>
<td class="varchar">Value 215</td>
</tr>
<tr>
<td class="int4">1076</td>
<td class="int4">215</td>
<td class="int4">276564</td>
<td class="int4">277500</td>
<td class="varchar">Value 1076</td>
</tr>
<tr>
<td class="int4">1077</td>
<td class="int4">215</td>
<td class="int4">277501</td>
<td class="int4">278437</td>
<td class="varchar">Value 1077</td>
</tr>
<tr>
<td class="int4">1078</td>
<td class="int4">215</td>
<td class="int4">278438</td>
<td class="int4">279374</td>
<td class="varchar">Value 1078</td>
</tr>
<tr>
<td class="int4">1079</td>
<td class="int4">215</td>
<td class="int4">279375</td>
<td class="int4">280311</td>
<td class="varchar">Value 1079</td>
</tr>
<tr>
<td class="int4">1080</td>
<td class="int4">215</td>
<td class="int4">280312</td>
<td class="int4">281248</td>
<td class="varchar">Value 1080</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0017s (0.0054s)</td>
</tr>
</table>
</div>
<pre>
Sort  (cost=290.87..291.42 rows=221 width=266)
  Sort Key: q.level
  CTE q
    -&gt;  Recursive Union  (cost=0.00..277.84 rows=221 width=61)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hc  (cost=0.00..8.54 rows=1 width=29)
                Index Cond: (id = 42)
          -&gt;  Nested Loop  (cost=0.00..26.49 rows=22 width=61)
                -&gt;  WorkTable Scan on q  (cost=0.00..0.25 rows=3 width=36)
                      Filter: (array_upper(level, 1) &lt; 3)
                -&gt;  Index Scan using ix_hierarchy_parent on t_hierarchy hc  (cost=0.00..8.64 rows=7 width=29)
                      Index Cond: (hc.parent = q.id)
  -&gt;  CTE Scan on q  (cost=0.00..4.42 rows=221 width=266)
</pre>
</div>
<p><a href="#" onclick="xcollapse('X1706');return false;"><strong>View query details for node 31,415</strong></a><br />
</p>
<div id="X1706" style="display: none; ">
<pre class="brush: sql">
WITH    RECURSIVE
        q AS
        (
        SELECT  id, parent, lft, rgt, data, ARRAY[id] AS level
        FROM    t_hierarchy hc
        WHERE   id = 31415
        UNION ALL
        SELECT  hc.id, hc.parent, hc.lft, hc.rgt, hc.data, q.level || hc.id
        FROM    q
        JOIN    t_hierarchy hc
        ON      hc.parent = q.id
        WHERE   array_upper(level, 1) &lt; 3
        )
SELECT  id, parent, lft, rgt, data
FROM    q
ORDER BY
        level
</pre>
<div class="terminal">
<table class="terminal">
<tr>
<th>id</th>
<th>parent</th>
<th>lft</th>
<th>rgt</th>
<th>data</th>
</tr>
<tr>
<td class="int4">31415</td>
<td class="int4">6282</td>
<td class="int4">445651</td>
<td class="int4">445687</td>
<td class="varchar">Value 31415</td>
</tr>
<tr>
<td class="int4">157076</td>
<td class="int4">31415</td>
<td class="int4">445652</td>
<td class="int4">445658</td>
<td class="varchar">Value 157076</td>
</tr>
<tr>
<td class="int4">785381</td>
<td class="int4">157076</td>
<td class="int4">445653</td>
<td class="int4">445653</td>
<td class="varchar">Value 785381</td>
</tr>
<tr>
<td class="int4">785382</td>
<td class="int4">157076</td>
<td class="int4">445654</td>
<td class="int4">445654</td>
<td class="varchar">Value 785382</td>
</tr>
<tr>
<td class="int4">785383</td>
<td class="int4">157076</td>
<td class="int4">445655</td>
<td class="int4">445655</td>
<td class="varchar">Value 785383</td>
</tr>
<tr>
<td class="int4">785384</td>
<td class="int4">157076</td>
<td class="int4">445656</td>
<td class="int4">445656</td>
<td class="varchar">Value 785384</td>
</tr>
<tr>
<td class="int4">785385</td>
<td class="int4">157076</td>
<td class="int4">445657</td>
<td class="int4">445657</td>
<td class="varchar">Value 785385</td>
</tr>
<tr>
<td class="int4">157077</td>
<td class="int4">31415</td>
<td class="int4">445659</td>
<td class="int4">445665</td>
<td class="varchar">Value 157077</td>
</tr>
<tr>
<td class="int4">785386</td>
<td class="int4">157077</td>
<td class="int4">445660</td>
<td class="int4">445660</td>
<td class="varchar">Value 785386</td>
</tr>
<tr>
<td class="int4">785387</td>
<td class="int4">157077</td>
<td class="int4">445661</td>
<td class="int4">445661</td>
<td class="varchar">Value 785387</td>
</tr>
<tr>
<td class="int4">785388</td>
<td class="int4">157077</td>
<td class="int4">445662</td>
<td class="int4">445662</td>
<td class="varchar">Value 785388</td>
</tr>
<tr>
<td class="int4">785389</td>
<td class="int4">157077</td>
<td class="int4">445663</td>
<td class="int4">445663</td>
<td class="varchar">Value 785389</td>
</tr>
<tr>
<td class="int4">785390</td>
<td class="int4">157077</td>
<td class="int4">445664</td>
<td class="int4">445664</td>
<td class="varchar">Value 785390</td>
</tr>
<tr>
<td class="int4">157078</td>
<td class="int4">31415</td>
<td class="int4">445666</td>
<td class="int4">445672</td>
<td class="varchar">Value 157078</td>
</tr>
<tr>
<td class="int4">785391</td>
<td class="int4">157078</td>
<td class="int4">445667</td>
<td class="int4">445667</td>
<td class="varchar">Value 785391</td>
</tr>
<tr>
<td class="int4">785392</td>
<td class="int4">157078</td>
<td class="int4">445668</td>
<td class="int4">445668</td>
<td class="varchar">Value 785392</td>
</tr>
<tr>
<td class="int4">785393</td>
<td class="int4">157078</td>
<td class="int4">445669</td>
<td class="int4">445669</td>
<td class="varchar">Value 785393</td>
</tr>
<tr>
<td class="int4">785394</td>
<td class="int4">157078</td>
<td class="int4">445670</td>
<td class="int4">445670</td>
<td class="varchar">Value 785394</td>
</tr>
<tr>
<td class="int4">785395</td>
<td class="int4">157078</td>
<td class="int4">445671</td>
<td class="int4">445671</td>
<td class="varchar">Value 785395</td>
</tr>
<tr>
<td class="int4">157079</td>
<td class="int4">31415</td>
<td class="int4">445673</td>
<td class="int4">445679</td>
<td class="varchar">Value 157079</td>
</tr>
<tr>
<td class="int4">785396</td>
<td class="int4">157079</td>
<td class="int4">445674</td>
<td class="int4">445674</td>
<td class="varchar">Value 785396</td>
</tr>
<tr>
<td class="int4">785397</td>
<td class="int4">157079</td>
<td class="int4">445675</td>
<td class="int4">445675</td>
<td class="varchar">Value 785397</td>
</tr>
<tr>
<td class="int4">785398</td>
<td class="int4">157079</td>
<td class="int4">445676</td>
<td class="int4">445676</td>
<td class="varchar">Value 785398</td>
</tr>
<tr>
<td class="int4">785399</td>
<td class="int4">157079</td>
<td class="int4">445677</td>
<td class="int4">445677</td>
<td class="varchar">Value 785399</td>
</tr>
<tr>
<td class="int4">785400</td>
<td class="int4">157079</td>
<td class="int4">445678</td>
<td class="int4">445678</td>
<td class="varchar">Value 785400</td>
</tr>
<tr>
<td class="int4">157080</td>
<td class="int4">31415</td>
<td class="int4">445680</td>
<td class="int4">445686</td>
<td class="varchar">Value 157080</td>
</tr>
<tr>
<td class="int4">785401</td>
<td class="int4">157080</td>
<td class="int4">445681</td>
<td class="int4">445681</td>
<td class="varchar">Value 785401</td>
</tr>
<tr>
<td class="int4">785402</td>
<td class="int4">157080</td>
<td class="int4">445682</td>
<td class="int4">445682</td>
<td class="varchar">Value 785402</td>
</tr>
<tr>
<td class="int4">785403</td>
<td class="int4">157080</td>
<td class="int4">445683</td>
<td class="int4">445683</td>
<td class="varchar">Value 785403</td>
</tr>
<tr>
<td class="int4">785404</td>
<td class="int4">157080</td>
<td class="int4">445684</td>
<td class="int4">445684</td>
<td class="varchar">Value 785404</td>
</tr>
<tr>
<td class="int4">785405</td>
<td class="int4">157080</td>
<td class="int4">445685</td>
<td class="int4">445685</td>
<td class="varchar">Value 785405</td>
</tr>
<tr class="statusbar">
<td colspan="100">31 rows fetched in 0.0017s (0.0054s)</td>
</tr>
</table>
</div>
<pre>
Sort  (cost=290.87..291.42 rows=221 width=266)
  Sort Key: q.level
  CTE q
    -&gt;  Recursive Union  (cost=0.00..277.84 rows=221 width=61)
          -&gt;  Index Scan using pk_hierarchy_id on t_hierarchy hc  (cost=0.00..8.54 rows=1 width=29)
                Index Cond: (id = 31415)
          -&gt;  Nested Loop  (cost=0.00..26.49 rows=22 width=61)
                -&gt;  WorkTable Scan on q  (cost=0.00..0.25 rows=3 width=36)
                      Filter: (array_upper(level, 1) &lt; 3)
                -&gt;  Index Scan using ix_hierarchy_parent on t_hierarchy hc  (cost=0.00..8.64 rows=7 width=29)
                      Index Cond: (hc.parent = q.id)
  -&gt;  CTE Scan on q  (cost=0.00..4.42 rows=221 width=266)
</pre>
</div>
<p>As we can see, both queries complete in a little more than <strong>5 ms</strong> (practically instantly), and this time does not depend on how close the node is to the root.</p>
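<p>The plan above has the shape of a depth-limited recursive <strong>CTE</strong>: an index lookup for the starting node, then a nested loop over the <strong>parent</strong> index that stops once the accumulated path is three elements long. A minimal sketch of that shape follows. The table <strong>t_hierarchy_demo</strong>, its index and its five rows are assumptions made up for illustration, and the <strong>lft</strong> and <strong>rgt</strong> columns are left out because this query does not use them:</p>
<pre>
-- Hypothetical miniature table, not the article's t_hierarchy
CREATE TEMPORARY TABLE t_hierarchy_demo
        (
        id INT NOT NULL PRIMARY KEY,
        parent INT NOT NULL,
        data VARCHAR(100) NOT NULL
        );

CREATE INDEX ix_hierarchy_demo_parent ON t_hierarchy_demo (parent);

-- Node 1 is the root; node 5 sits four levels deep (1, 2, 4, 5)
INSERT
INTO    t_hierarchy_demo (id, parent, data)
VALUES  (1, 0, 'Value 1'),
        (2, 1, 'Value 2'),
        (3, 1, 'Value 3'),
        (4, 2, 'Value 4'),
        (5, 4, 'Value 5');

WITH RECURSIVE q AS
        (
        -- Anchor part: the starting node, with a one-element path
        SELECT  hc.id, hc.parent, hc.data, ARRAY[hc.id] AS level
        FROM    t_hierarchy_demo hc
        WHERE   hc.id = 1
        UNION ALL
        -- Recursive part: children of the rows found so far;
        -- rows whose path already has 3 elements are not expanded
        SELECT  hc.id, hc.parent, hc.data, q.level || hc.id
        FROM    q
        JOIN    t_hierarchy_demo hc
        ON      hc.parent = q.id
        WHERE   array_upper(q.level, 1) &lt; 3
        )
SELECT  id, parent, data
FROM    q
ORDER BY
        level;
-- Returns ids 1, 2, 4, 3: node 5 lies one level too deep
</pre>
<p>Since each recursion step only looks up the children of the rows found on the previous step, the amount of work is bounded by the size of the returned subtree rather than by the total number of descendants.</p>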
<h3>Summary</h3>
<p>We have compared the three most common queries that are usually issued against hierarchical data:</p>
<ol>
<li>Find all descendants of a given node</li>
<li>Find all ancestors of a given node</li>
<li>Find all descendants of a given node up to a certain depth</li>
</ol>
<p>The <strong>nested sets</strong> model is the fastest for the first query (<strong>0.05 s</strong>). However, the <strong>adjacency list</strong> is very fast too: selecting <strong>200,000</strong> rows takes less than <strong>0.1 second</strong>.</p>
<p>For the second query, the <strong>adjacency list</strong> is much faster, though <strong>nested sets</strong> are still usable.</p>
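<p>To illustrate why the ancestors query suits the adjacency list so well, here is a minimal sketch that walks up the <strong>parent</strong> chain, one primary key lookup per level. The table <strong>t_ancestors_demo</strong> and its rows are hypothetical and exist only for this example:</p>
<pre>
-- Hypothetical miniature table, not the article's t_hierarchy
CREATE TEMPORARY TABLE t_ancestors_demo
        (
        id INT NOT NULL PRIMARY KEY,
        parent INT NOT NULL,
        data VARCHAR(100) NOT NULL
        );

INSERT
INTO    t_ancestors_demo (id, parent, data)
VALUES  (1, 0, 'Value 1'),
        (2, 1, 'Value 2'),
        (3, 1, 'Value 3'),
        (4, 2, 'Value 4'),
        (5, 4, 'Value 5');

WITH RECURSIVE q AS
        (
        -- Anchor part: the node whose ancestors we want
        SELECT  h.id, h.parent, h.data, 1 AS depth
        FROM    t_ancestors_demo h
        WHERE   h.id = 5
        UNION ALL
        -- Recursive part: the parent of the row found on the previous step
        SELECT  hp.id, hp.parent, hp.data, q.depth + 1
        FROM    q
        JOIN    t_ancestors_demo hp
        ON      hp.id = q.parent
        )
SELECT  id, parent, data
FROM    q
ORDER BY
        depth DESC;
-- Returns ids 1, 2, 4, 5: the chain from the root down to the node itself
</pre>
<p>The recursion stops by itself at the root, because no row has an <strong>id</strong> equal to the root's <strong>parent</strong> value (<strong>0</strong> in this sketch).</p>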
<p>Finally, for the third query, the performance of the <strong>nested sets</strong> model depends on the node.</p>
<p>For a node that has few descendants, the query is rather fast. For a node close to the root (to say nothing of the root itself), however, it is intolerably slow.</p>
<p>Adjacency list shows superb performance on both nodes.</p>
<h3>Conclusion</h3>
<p>Given the above, and taking into account that the nested sets model is much harder to manage, we can conclude that the <strong>adjacency list model</strong> should be used to manage hierarchical data in <strong>PostgreSQL 8.4</strong>.</p>
<p>The nested sets model was a very smart invention for managing hierarchical data in an environment that allowed no recursion. But now that recursive queries are finally available, the <strong>adjacency list model</strong> is just better.</p>
<p>It yields excellent performance on all three types of queries, outperforms nested sets in two of the three most used queries, and is extremely simple to manage.</p>
<p><strong>To be continued.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
