EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: Happy New Year!. You're welcome to read and comment on it.

Selecting lowest value

From Stack Overflow:

Say I have the following data:

Name Value
Small 10
Medium 100
Large 1000

Imagine that these represent the volumes of boxes.

I have some items that I want to put in the boxes, and I want the smallest box possible.

I need an SQL query that will:

  1. Return the row with the smallest row greater than my query parameter
  2. If there is no such row, then return the largest row

I’m on Oracle 11g, so any special Oracle goodness is OK.

There certainly is a special Oracle goodness, and I’ll cover it in this article.

Let’s create the sample table and fill it with 1,000,000 rows to illustrate the point:

Table creation details

Now, what do we have here?

As we need to find the least row greater than a given value, it’s a good work for an INDEX SCAN along with a COUNT STOPKEY.

We can make it with ORDER BY and rownum = 1:

SELECT  *
FROM    (
        SELECT  *
        FROM    t_orderer
        WHERE   value > 314159
        ORDER BY
                value
        )
WHERE   rownum = 1
ID VALUE
314160 314160
1 row fetched in 0.0001s (0.0009s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
    INDEX RANGE SCAN, 20090424_lowest.IX_ORDERER_VALUE

This, though, will not return anything if there are no rows with value greater than the parameter provided.

If it is the case, we’ll just need to select the row with a maximal value:

SELECT  *
FROM    t_orderer
WHERE   rownum = 1
ORDER BY
        value DESC
ID VALUE
1000000 1000000
1 row fetched in 0.0001s (0.0006s)
SELECT STATEMENT
 COUNT STOPKEY
  TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
   INDEX FULL SCAN DESCENDING, 20090424_lowest.IX_ORDERER_VALUE

As you can see, both these queries use the index on value and are virtually instant.

Now, we need to unite them.

Unfortunately, Oracle does not allow using ORDER BY inside a UNION ALL.

That’s why we’ll need to do the following:

  1. Wrap these queries into inline views
  2. UNION ALL these views
  3. Wrap the result into a subquery
  4. Select the first row from the UNION ALL

This will return the lowest value greater than a given parameter if the former exists; or the maximal value if it does not.

Here’s the result:

WITH    lowest AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                WHERE   value > 314159
                ORDER BY
                        value
                )
        WHERE   rownum = 1
        ),
        maximal AS (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                ORDER BY
                        value DESC
                )
        WHERE   rownum = 1
        )
SELECT  *
FROM    (
        SELECT  *
        FROM    lowest
        UNION ALL
        SELECT  *
        FROM    maximal
        )
WHERE   rownum = 1
ID VALUE
314160 314160
1 row fetched in 0.0001s (0.0007s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX RANGE SCAN, 20090424_lowest.IX_ORDERER_VALUE
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX FULL SCAN DESCENDING, 20090424_lowest.IX_ORDERER_VALUE

We see that the inner query plans remained the same and just one final UNION-ALL with a COUNT STOPKEY is added to the top.

Let’s check how this query works for a parameter greater than the MAX(value):

WITH    lowest AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                WHERE   value > 31415926
                ORDER BY
                        value
                )
        WHERE   rownum = 1
        ),
        maximal AS (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                ORDER BY
                        value DESC
                )
        WHERE   rownum = 1
        )
SELECT  *
FROM    (
        SELECT  *
        FROM    lowest
        UNION ALL
        SELECT  *
        FROM    maximal
        )
WHERE   rownum = 1
ID VALUE
1000000 1000000
1 row fetched in 0.0001s (0.0006s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX RANGE SCAN, 20090424_lowest.IX_ORDERER_VALUE
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX FULL SCAN DESCENDING, 20090424_lowest.IX_ORDERER_VALUE

Right answer and it’s instant again.

One more side note: Oracle is smart enough to avoid the second query execution if the first one succeeds.

Let’s change the second subquery (which selects the maximal value) so that index becomes unusable, and run the query with a big parameter (which makes the maximal value to be returned from the second subquery):

WITH    lowest AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                WHERE   value > 31415926
                ORDER BY
                        value
                )
        WHERE   rownum = 1
        ),
        maximal AS (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                ORDER BY
                        (value + 1) DESC
                        -- The ORDER BY condition above makes the index unusable
                )
        WHERE   rownum = 1
        )
SELECT  *
FROM    (
        SELECT  *
        FROM    lowest
        UNION ALL
        SELECT  *
        FROM    maximal
        )
WHERE   rownum = 1
ID VALUE
1000000 1000000
1 row fetched in 0.0001s (0.5781s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX RANGE SCAN, 20090424_lowest.IX_ORDERER_VALUE
    VIEW
     COUNT STOPKEY
      VIEW
       SORT ORDER BY STOPKEY
        TABLE ACCESS FULL, 20090424_lowest.T_ORDERER

The plan for the second subquery now includes SORT, that’s why it runs for more than 500 ms.

But if we change the parameter to a small one (which is satisfied by the first subquery):

WITH    lowest AS
        (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                WHERE   value > 314159
                ORDER BY
                        value
                )
        WHERE   rownum = 1
        ),
        maximal AS (
        SELECT  *
        FROM    (
                SELECT  *
                FROM    t_orderer
                ORDER BY
                        (value + 1) DESC
                        -- The ORDER BY condition above makes the index unusable
                )
        WHERE   rownum = 1
        )
SELECT  *
FROM    (
        SELECT  *
        FROM    lowest
        UNION ALL
        SELECT  *
        FROM    maximal
        )
WHERE   rownum = 1
ID VALUE
314160 314160
1 row fetched in 0.0001s (0.0007s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    VIEW
     COUNT STOPKEY
      VIEW
       TABLE ACCESS BY INDEX ROWID, 20090424_lowest.T_ORDERER
        INDEX RANGE SCAN, 20090424_lowest.IX_ORDERER_VALUE
    VIEW
     COUNT STOPKEY
      VIEW
       SORT ORDER BY STOPKEY
        TABLE ACCESS FULL, 20090424_lowest.T_ORDERER

, the first subquery returns the result instantly, and the second subquery doesn’t even gets a change to be run.

That’s why the whole query is as fast as the initial one.

Written by Quassnoi

April 24th, 2009 at 11:00 pm

Posted in Oracle

Comments are closed.