EXPLAIN EXTENDED

How to create fast database queries

PostgreSQL: emulating ROW_NUMBER

Comments enabled. I *really* need your comment

Note: this article concerns PostgreSQL 8.3 and below.

PostgreSQL 8.4 introduces window functions.

Window function ROW_NUMBER() implements the functionality in question more efficiently.

In one of the previous articles:

, I described emulating Oracle's pseudocolumn ROWNUM in PostgreSQL.

Now, we'll extend this query to emulate ROW_NUMBER.

A quick reminder: ROW_NUMBER is an analytical function in ANSI SQL 2003 supported by Oracle and MS SQL Server.

It enumerates each row in a resultset, but, unlike ROWNUM, may take two additional parameters: PARTITION BY and ORDER BY.

PARTITION BY splits a rowset into several partitions, each of them being numbered with its own sequence starting from 1.

ORDER BY defines the order the rows are numbered within each partition. This order may differ from the order the rows are returned in.

This function helps building queries which allow to select N rows for each partition.

Let's create a sample table and see how we do it in PostgreSQL:

CREATE TABLE t_limiter (
id INT NOT NULL PRIMARY KEY,
grouper INT NOT NULL,
value VARCHAR(50) NOT NULL
);

CREATE INDEX ix_limiter_grouper_id ON t_limiter (grouper, id);

BEGIN TRANSACTION;

INSERT
INTO    t_limiter (id, grouper, value)
SELECT  s, (s - 1) % 15 + 1, 'Value ' || s
FROM    generate_series(1, 10000) s;

COMMIT;

ANALYZE t_limiter;

This table has 10,000 rows and 15 DISTINCT values in column grouper. We will select 2 rows for each grouper, ordered by id in descending order.

The principle is similar to the one we used to emulate ROWNUM:

  1. Select DISTINCT grouper's
  2. For each grouper, build an array of table rows within this grouper.
  3. Join this array with a sequence of numbers of same length returned by generate_series. This numbers will serve as ROW_NUMBER's
  4. For each ROW_NUMBER, return the corresponding array's member and convert it back into the tuple.
  5. Filter the values with ROW_NUMBER <= 2

The query will look like this:

SELECT  (arr[row_number]).*
FROM    (
SELECT  grouper, arr,
generate_series(1, array_upper(arr, 1)) AS row_number
FROM    (
SELECT  grouper,
ARRAY(
SELECT  li
FROM    t_limiter li
WHERE   li.grouper = lo.grouper
ORDER BY
li.grouper DESC, li.id DESC
) AS arr
FROM    (
SELECT  grouper
FROM    t_limiter l
GROUP BY
grouper
ORDER BY
grouper
) lo
) lo2
) lo3
WHERE   row_number &lt;= 2
id grouper value
9991 1 Value 9991
9976 1 Value 9976
9992 2 Value 9992
9977 2 Value 9977
9993 3 Value 9993
9978 3 Value 9978
9994 4 Value 9994
9979 4 Value 9979
9995 5 Value 9995
9980 5 Value 9980
9996 6 Value 9996
9981 6 Value 9981
9997 7 Value 9997
9982 7 Value 9982
9998 8 Value 9998
9983 8 Value 9983
9999 9 Value 9999
9984 9 Value 9984
10000 10 Value 10000
9985 10 Value 9985
9986 11 Value 9986
9971 11 Value 9971
9987 12 Value 9987
9972 12 Value 9972
9988 13 Value 9988
9973 13 Value 9973
9989 14 Value 9989
9974 14 Value 9974
9990 15 Value 9990
9975 15 Value 9975
30 rows fetched in 0.0011s (12.3588s)
Subquery Scan lo3  (cost=189.44..3991.30 rows=5 width=36)
  Filter: (lo3.row_number <= 2)
  ->  Subquery Scan lo  (cost=189.44..3991.11 rows=15 width=4)
        ->  Sort  (cost=189.44..189.48 rows=15 width=4)
              Sort Key: l.grouper
              ->  HashAggregate  (cost=189.00..189.15 rows=15 width=4)
                    ->  Seq Scan on t_limiter l  (cost=0.00..164.00 rows=10000 width=4)
        SubPlan
          ->  Sort  (cost=125.05..126.71 rows=667 width=40)
                Sort Key: li.id
                ->  Bitmap Heap Scan on t_limiter li  (cost=21.42..93.76 rows=667 width=40)
                      Recheck Cond: (grouper = $0)
                      ->  Bitmap Index Scan on ix_limiter_grouper_id  (cost=0.00..21.25 rows=667 width=0)
                            Index Cond: (grouper = $0)
          ->  Sort  (cost=125.05..126.71 rows=667 width=40)
                Sort Key: li.id
                ->  Bitmap Heap Scan on t_limiter li  (cost=21.42..93.76 rows=667 width=40)
                      Recheck Cond: (grouper = $0)
                      ->  Bitmap Index Scan on ix_limiter_grouper_id  (cost=0.00..21.25 rows=667 width=0)
                            Index Cond: (grouper = $0)

This query works, but is quite inefficient, since it takes 12 seconds to complete.

This is because PostgreSQL is quite slow on handling large arrays. Subscripting an array in PostgreSQL takes exponential time.

We can optimize this query a little by pushing the predicate into the inner subquery. Instead of constructing the large array and then filtering the results, we will make the query so that no more than two rows will ever get into the array.

This is made possible by PostgreSQL's ability to use correlated values in a LIMIT clause of a subquery:

SELECT  (arr[row_number]).*
FROM    (
SELECT  grouper, arr,
generate_series(1, array_upper(arr, 1)) AS row_number
FROM    (
SELECT  grouper,
ARRAY(
SELECT  li
FROM    t_limiter li
WHERE   li.grouper = lo.grouper
ORDER BY
li.grouper DESC, li.id DESC
LIMIT lim
) AS arr
FROM    (
SELECT  2 AS lim, grouper
FROM    t_limiter l
GROUP BY
grouper
ORDER BY
grouper
) lo
) lo2
) lo3

id grouper value
9991 1 Value 9991
9976 1 Value 9976
9992 2 Value 9992
9977 2 Value 9977
9993 3 Value 9993
9978 3 Value 9978
9994 4 Value 9994
9979 4 Value 9979
9995 5 Value 9995
9980 5 Value 9980
9996 6 Value 9996
9981 6 Value 9981
9997 7 Value 9997
9982 7 Value 9982
9998 8 Value 9998
9983 8 Value 9983
9999 9 Value 9999
9984 9 Value 9984
10000 10 Value 10000
9985 10 Value 9985
9986 11 Value 9986
9971 11 Value 9971
9987 12 Value 9987
9972 12 Value 9972
9988 13 Value 9988
9973 13 Value 9973
9989 14 Value 9989
9974 14 Value 9974
9990 15 Value 9990
9975 15 Value 9975
30 rows fetched in 0.0015s (0.0097s)
Subquery Scan lo3  (cost=189.44..1043.96 rows=15 width=36)
  ->  Subquery Scan lo  (cost=189.44..1043.81 rows=15 width=8)
        ->  Sort  (cost=189.44..189.48 rows=15 width=4)
              Sort Key: l.grouper
              ->  HashAggregate  (cost=189.00..189.15 rows=15 width=4)
                    ->  Seq Scan on t_limiter l  (cost=0.00..164.00 rows=10000 width=4)
        SubPlan
          ->  Limit  (cost=0.00..28.47 rows=67 width=40)
                ->  Index Scan Backward using ix_limiter_grouper_id on t_limiter li  (cost=0.00..283.43 rows=667 width=40)
                      Index Cond: (grouper = $0)
          ->  Limit  (cost=0.00..28.47 rows=67 width=40)
                ->  Index Scan Backward using ix_limiter_grouper_id on t_limiter li  (cost=0.00..283.43 rows=667 width=40)
                      Index Cond: (grouper = $0)

Since the array operations are performed on the smaller arrays, the overall query time is as low as 10 ms.

We can even use dynamic conditions on ROW_NUMBER.

Let's build the query so that it returns 2 rows for odd grouper's, and 1 row for even ones:

SELECT  row_number, (arr[row_number]).*
FROM    (
SELECT  grouper, arr,
generate_series(1, array_upper(arr, 1)) AS row_number
FROM    (
SELECT  grouper,
ARRAY(
SELECT  li
FROM    t_limiter li
WHERE   li.grouper = lo.grouper
ORDER BY
li.grouper DESC, li.id DESC
LIMIT lim
) AS arr
FROM    (
SELECT  CASE grouper % 2 WHEN 0 THEN 1 ELSE 2 END AS lim, grouper
FROM    t_limiter l
GROUP BY
grouper
ORDER BY
grouper
) lo
) lo2
) lo3

row_number id grouper value
1 9991 1 Value 9991
2 9976 1 Value 9976
1 9992 2 Value 9992
1 9993 3 Value 9993
2 9978 3 Value 9978
1 9994 4 Value 9994
1 9995 5 Value 9995
2 9980 5 Value 9980
1 9996 6 Value 9996
1 9997 7 Value 9997
2 9982 7 Value 9982
1 9998 8 Value 9998
1 9999 9 Value 9999
2 9984 9 Value 9984
1 10000 10 Value 10000
1 9986 11 Value 9986
2 9971 11 Value 9971
1 9987 12 Value 9987
1 9988 13 Value 9988
2 9973 13 Value 9973
1 9989 14 Value 9989
1 9990 15 Value 9990
2 9975 15 Value 9975
23 rows fetched in 0.0010s (0.0095s)
Subquery Scan lo3  (cost=189.52..1044.04 rows=15 width=36)
  ->  Subquery Scan lo  (cost=189.52..1043.89 rows=15 width=8)
        ->  Sort  (cost=189.52..189.56 rows=15 width=4)
              Sort Key: l.grouper
              ->  HashAggregate  (cost=189.00..189.22 rows=15 width=4)
                    ->  Seq Scan on t_limiter l  (cost=0.00..164.00 rows=10000 width=4)
        SubPlan
          ->  Limit  (cost=0.00..28.47 rows=67 width=40)
                ->  Index Scan Backward using ix_limiter_grouper_id on t_limiter li  (cost=0.00..283.43 rows=667 width=40)
                      Index Cond: (grouper = $0)
          ->  Limit  (cost=0.00..28.47 rows=67 width=40)
                ->  Index Scan Backward using ix_limiter_grouper_id on t_limiter li  (cost=0.00..283.43 rows=667 width=40)
                      Index Cond: (grouper = $0)

The LIMIT clause of the correlated subquery takes a parameter from the outer subquery and limits the number of returned rows accordingly.

Written by Quassnoi

May 11th, 2009 at 11:00 pm

Posted in PostgreSQL

Leave a Reply