EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: Shared Plan and Algorithm Network Cache (SPANC). You're welcome to read and comment on it.

Ordering NULLs

From Stack Overflow:

In our product we have a generic search engine, and trying to optimize the search performance.

A lot of the tables used in the queries allow NULL values.

Should we redesign our table to disallow NULL values for optimization or not?

NULL values surely affect some optimizer decisions in Oracle.

It’s widely known that Oracle won’t include a column into index if all key values are NULL.

This means that if we don’t apply a filter which would imply non-NULL condition on the column (like, any comparison operator in the WHERE clause, or a comparison operation in the ON condition of a JOIN clause), the index is not guaranteed to return all values, and a FULL SCAN will be required on the table.

It is less evident that ORDER BY queries can be affected too.

Let’s create a sample table and see how:

CREATE TABLE t_null (
        id NUMBER(10) NOT NULL,
        orderer NUMBER(10)
        )
/
ALTER TABLE t_null ADD CONSTRAINT pk_null_id PRIMARY KEY (id)
/
CREATE INDEX t_null_orderer ON t_null (orderer)
/
INSERT
INTO    t_null
SELECT  level, DECODE(level, 1, NULL, level)
FROM    dual
CONNECT BY
        level <= 1000000
/
COMMIT
/

This table has 1,000,000 rows.

The column orderer, which is indexed, allows NULL‘s.

The values of orderer are equal to ones of id, except for the very first row (one with id of 1), for which the value of orderer is NULL.

If we run this query:

SELECT  orderer
FROM    (
        SELECT  orderer
        FROM    t_null
        ORDER BY
                orderer NULLS FIRST
        )
WHERE   rownum <= 10
ORDERER
NULL
2
3
4
5
6
7
8
9
10
10 rows fetched in 0.0003s (0.3957s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   SORT ORDER BY STOPKEY
    TABLE ACCESS FULL, 20090619_null.T_NULL

, we see that it runs for almost 0.4 seconds.

It takes so long because we select every row from the table.

The index is not scanned because an index scan would not select the NULL row, since it isn’t even mentioned somewhere in the index. That’s why Oracle needs to use full table scan and sort the values.

If we change the query a little:

SELECT  orderer
FROM    (
        SELECT  orderer
        FROM    t_null
        WHERE   orderer IS NOT NULL
        ORDER BY
                orderer
        )
WHERE   rownum <= 10
ORDERER
2
3
4
5
6
7
8
9
10
11
10 rows fetched in 0.0002s (0.0006s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER

, the query becomes instant.

This is because we added an additional condition here which tells Oracle we don’t need NULL values.

Oracle gets the hint and starts to use the index since it sees now that the index contains all values we are interested at.

But what if we still need the NULL values?

We can do the following trick:

  1. Write a query to select the NULL values (unordered)
  2. Write a query to select the non-NULL values (ordered)
  3. UNION ALL these queries, putting the NULL query first if we want NULLS FIRST, or second if we want NULLS LAST

This will use index to retrieve the values sorted on non-NULL values, and then just append or prepend the NULL‘s to the resultset. No SORT will be used.

Let’s try it:

SELECT  orderer
FROM    (
        SELECT  orderer
        FROM    t_null
        WHERE   orderer IS NULL
        UNION ALL
        SELECT  orderer
        FROM    (
                SELECT  orderer
                FROM    t_null
                WHERE   orderer IS NOT NULL
                ORDER BY
                        orderer
                )
        )
WHERE   rownum <= 10
ORDERER
NULL
2
3
4
5
6
7
8
9
10
10 rows fetched in 0.0003s (0.2215s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    TABLE ACCESS FULL, 20090619_null.T_NULL
    VIEW
     INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER

This is the synonym for the first query (with NULLS FIRST), but works for only 0.22 seconds, twice as fast as the original query.

And what if we want NULLS LAST?

Then we just change the order of the queries in UNION ALL:

SELECT  orderer
FROM    (
        SELECT  orderer
        FROM    (
                SELECT  orderer
                FROM    t_null
                WHERE   orderer IS NOT NULL
                ORDER BY
                        orderer
                )
        UNION ALL
        SELECT  orderer
        FROM    t_null
        WHERE   orderer IS NULL
        )
WHERE   rownum <= 10
ORDERER
2
3
4
5
6
7
8
9
10
11
10 rows fetched in 0.0003s (0.0006s)
SELECT STATEMENT
 COUNT STOPKEY
  VIEW
   UNION-ALL
    VIEW
     INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER
    TABLE ACCESS FULL, 20090619_null.T_NULL

This query is instant because we have a condition to select only first 10 rows, and COUNT STOPKEY is chosen by the optimizer to serve this condition.

In a UNION ALL, Oracle will not evaluate the second query unless the first one is emptied out.

When we emulated NULLS FIRST, Oracle had to perform full table scan to make sure all NULL values are selected. This takes quite long.

But when we emulate NULLS LAST, Oracle just needs to select first 10 values out of the index and the stop.

This is almost instant and makes the whole query complete without even peeking into the table itself, since these 10 values are everything we need and they can be taken from the index.

Written by Quassnoi

June 19th, 2009 at 11:00 pm

Posted in Oracle

Comments are closed.