EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: 5 Claims About SQL, Explained. You're welcome to read and comment on it.

Oracle: ROW_NUMBER vs ROWNUM

with 2 comments

From Stack Overflow:

I have an SQL query that looks something like this:

SELECT  *
FROM    (
        SELECT  t.*, row_number() OVER (ORDER BY ID) rn
        FROM    mytable t
        )
WHERE   rn BETWEEN :start and :end

Essentially, it's the ORDER BY part that's slowing things down. If I were to remove it, the EXPLAIN cost goes down by an order of magnitude (over 1,000 times).

I've tried this:

SELECT  t.*, row_number() OVER (ORDER BY ID) rn
FROM    mytable t
WHERE   rownum BETWEEN :start and :end

, but this doesn't give correct results.

Is there any easy way to speed this up? Or will I have to spend some more time with the EXPLAIN tool?

First, just a quick reminder on how ROWNUM works. From Oracle's documentation:

Conditions testing for ROWNUM values greater than a positive integer are always false. For example, this query returns no rows:

SELECT  *
FROM    employees
WHERE   ROWNUM > 1

The first row fetched is assigned a ROWNUM of 1 and makes the condition false. The second row to be fetched is now the first row and is also assigned a ROWNUM of 1 and makes the condition false. All rows subsequently fail to satisfy the condition, so no rows are returned.

That's why the second query should look like this:

SELECT  *
FROM    (
        SELECT  t.*, ROWNUM AS rn
        FROM    mytable t
        ORDER BY
                paginator, id
        )
WHERE   rn BETWEEN :start and :end

Now, let's see the performance. To do this, we'll create a sample table:

CREATE TABLE mytable (
        id NUMBER(10) NOT NULL,
        paginator NUMBER(10) NOT NULL,
        value VARCHAR2(50)
)
/
ALTER TABLE mytable
ADD CONSTRAINT pk_mytable_id PRIMARY KEY (id)
/

CREATE INDEX ix_mytable_paginator_id ON mytable(paginator, id)
/

INSERT
INTO    mytable(id, paginator, value)
SELECT  level, level / 10000, 'Value ' || level
FROM    dual
CONNECT BY
        level <= 1000000
/

COMMIT
/

BEGIN
        DBMS_STATS.gather_schema_stats('"20090506_rownum"');
END;
/

This query has 1,000,000 records and an index on (paginator, id).

I deliberately made the paginator non-UNIQUE to demonstrate that the paging query should always include a unique column set into the sort. Otherwise, a record can be selected twice on two different pages, or not selected at all.

Let's query the table with both queries, returning 10 values from 900,001 to 900,010, and see which one performs better.

First, the ROW_NUMBER():

SELECT  *
FROM    (
        SELECT  t.*, ROW_NUMBER() OVER (ORDER BY paginator, id) AS rn
        FROM    mytable t
        )
WHERE   rn BETWEEN 900001 AND 900010

ID PAGINATOR VALUE RN
900001 90 Value 900001 900001
900002 90 Value 900002 900002
900003 90 Value 900003 900003
900004 90 Value 900004 900004
900005 90 Value 900005 900005
900006 90 Value 900006 900006
900007 90 Value 900007 900007
900008 90 Value 900008 900008
900009 90 Value 900009 900009
900010 90 Value 900010 900010
10 rows fetched in 0.0005s (0.8594s)
SELECT STATEMENT 
 VIEW 
  WINDOW NOSORT STOPKEY
   TABLE ACCESS BY INDEX ROWID, 20090506_rownum.MYTABLE
    INDEX FULL SCAN, 20090506_rownum.IX_MYTABLE_PAGINATOR_ID

And the ROWNUM:

SELECT  *
FROM    (
        SELECT  t.*, ROWNUM AS rn
        FROM    (
                SELECT  *
                FROM    mytable
                ORDER BY
                        paginator, id
                ) t
        )
WHERE   rn BETWEEN 900001 AND 900010


ID PAGINATOR VALUE RN
900001 90 Value 900001 900001
900002 90 Value 900002 900002
900003 90 Value 900003 900003
900004 90 Value 900004 900004
900005 90 Value 900005 900005
900006 90 Value 900006 900006
900007 90 Value 900007 900007
900008 90 Value 900008 900008
900009 90 Value 900009 900009
900010 90 Value 900010 900010
10 rows fetched in 0.0005s (0.7058)
SELECT STATEMENT 
 VIEW 
  COUNT 
   VIEW 
    TABLE ACCESS BY INDEX ROWID, 20090506_rownum.MYTABLE
     INDEX FULL SCAN, 20090506_rownum.IX_MYTABLE_PAGINATOR_ID

We can see that ROW_NUMBER() works for 850 ms, while the ROWNUM one for only 700 ms. ROWNUM is slightly more efficient.

But let's look closer into the plan for ROW_NUMBER(). We see that Oracle is smart enough, first, to avoid sorting on (paginator, id) (since an index is available), and, second, to use a STOPKEY condition which ceases scanning as soon as it finds enough values.

The ROWNUM query uses the index too, but it does not employ STOPKEY condition, it just counts.

How can we enable STOPKEY for a ROWNUM query?

Unfortunately, Oracle's optimizer cannot understand in this case that RN is an alias for ROWNUM in the inner subquery. That's why we'll need to rewrite the query a little so that we will have a limiting condition on ROWNUM the outer subquery and STOPKEY will became usable:

SELECT  *
FROM    (
        SELECT  t.*, ROWNUM AS rn
        FROM    (
                SELECT  *
                FROM    mytable
                ORDER BY
                        paginator, id
                ) t
        )
WHERE   rn >= 900001
        AND rownum <= 10

ID PAGINATOR VALUE RN
900001 90 Value 900001 900001
900002 90 Value 900002 900002
900003 90 Value 900003 900003
900004 90 Value 900004 900004
900005 90 Value 900005 900005
900006 90 Value 900006 900006
900007 90 Value 900007 900007
900008 90 Value 900008 900008
900009 90 Value 900009 900009
900010 90 Value 900010 900010
10 rows fetched in 0.0005s (0.4714s)
SELECT STATEMENT 
 COUNT STOPKEY
  VIEW 
   COUNT 
    VIEW 
     TABLE ACCESS BY INDEX ROWID, 20090506_rownum.MYTABLE
      INDEX FULL SCAN, 20090506_rownum.IX_MYTABLE_PAGINATOR_ID

We now make lower filtering on RN (which is an alias for ROWNUM from the inner subquery), and upper filtering on ROWNUM in the outer subquery, which counts not the number of total rows returned, but the number of rows satisfying the first condition.

In this case Oracle will use the STOPKEY, and the query now runs for only 471 ms, twice as fast as the original one.

But if ROW_NUMBER and ROWNUM use essentially the same plan, why the latter one is so much faster?

This is because Oracle is very, very old.

ROWNUM was introduced in Oracle 6 that was released in 1988. You remember 1988? You needed your PC to be IBM compatible to run Oracle.

And by that time it already had ROWNUM, so says its ARJ-compressed manual (you remember ARJ?).

Large HTML online manuals for PostgreSQL and MySQL, by the way, still don't mention any ROWNUM, it needs to be emulated (read my yesterday's article PostgreSQL: row numbers to see how to do it).

OK. So ROWNUM is 1988, that means it's both simple and efficient. Just a pure counter.

Analytic functions, on the other hand, were introduced in Oracle 9i in 2001. Y2K was already a history by that time, there was nothing to be afraid of, and ROW_NUMBER was implemented in not so efficient way.

It is of course powerful and flexible, but slow for simple tasks like pagination.

Written by Quassnoi

May 6th, 2009 at 11:00 pm

Posted in Oracle

2 Responses to 'Oracle: ROW_NUMBER vs ROWNUM'

Subscribe to comments with RSS or TrackBack to 'Oracle: ROW_NUMBER vs ROWNUM'.

  1. I really enjoyed reading this article.

    Very well written….

    Thanks for sharing…

    Leo James

    5 Nov 13 at 23:35

  2. Thanks a lot. Very pure article.

    Zaim AKINÖZ

    1 Feb 14 at 01:18

Leave a Reply