EXPLAIN EXTENDED

How to create fast database queries

Ordering NULLs

Comments enabled. I *really* need your comment

From Stack Overflow:

In our product we have a generic search engine, and trying to optimize the search performance.

A lot of the tables used in the queries allow NULL values.

Should we redesign our table to disallow NULL values for optimization or not?

NULL values surely affect some optimizer decisions in Oracle.

It's widely known that Oracle won't include a column into index if all key values are NULL.

This means that if we don't apply a filter which would imply non-NULL condition on the column (like, any comparison operator in the WHERE clause, or a comparison operation in the ON condition of a JOIN clause), the index is not guaranteed to return all values, and a FULL SCAN will be required on the table.

It is less evident that ORDER BY queries can be affected too.

Let's create a sample table and see how:

CREATE TABLE t_null (
id NUMBER(10) NOT NULL,
orderer NUMBER(10)
)
/
ALTER TABLE t_null ADD CONSTRAINT pk_null_id PRIMARY KEY (id)
/
CREATE INDEX t_null_orderer ON t_null (orderer)
/
INSERT
INTO t_null
SELECT level, DECODE(level, 1, NULL, level)
FROM dual
CONNECT BY
level <= 1000000 / COMMIT / [/sourcecode] This table has 1,000,000 rows.

The column orderer, which is indexed, allows NULL's.

The values of orderer are equal to ones of id, except for the very first row (one with id of 1), for which the value of orderer is NULL.

If we run this query:

SELECT orderer
FROM (
SELECT orderer
FROM t_null
ORDER BY
orderer NULLS FIRST
)
WHERE rownum <= 10 [/sourcecode]

ORDERER
NULL
2
3
4
5
6
7
8
9
10
10 rows fetched in 0.0003s (0.3957s)
SELECT STATEMENT 
 COUNT STOPKEY
  VIEW 
   SORT ORDER BY STOPKEY
    TABLE ACCESS FULL, 20090619_null.T_NULL

, we see that it runs for almost 0.4 seconds.

It takes so long because we select every row from the table.

The index is not scanned because an index scan would not select the NULL row, since it isn't even mentioned somewhere in the index. That's why Oracle needs to use full table scan and sort the values.

If we change the query a little:

SELECT orderer
FROM (
SELECT orderer
FROM t_null
WHERE orderer IS NOT NULL
ORDER BY
orderer
)
WHERE rownum <= 10 [/sourcecode]

ORDERER
2
3
4
5
6
7
8
9
10
11
10 rows fetched in 0.0002s (0.0006s)
SELECT STATEMENT 
 COUNT STOPKEY
  VIEW 
   INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER

, the query becomes instant.

This is because we added an additional condition here which tells Oracle we don't need NULL values.

Oracle gets the hint and starts to use the index since it sees now that the index contains all values we are interested at.

But what if we still need the NULL values?

We can do the following trick:

  1. Write a query to select the NULL values (unordered)
  2. Write a query to select the non-NULL values (ordered)
  3. UNION ALL these queries, putting the NULL query first if we want NULLS FIRST, or second if we want NULLS LAST

This will use index to retrieve the values sorted on non-NULL values, and then just append or prepend the NULL's to the resultset. No SORT will be used.

Let's try it:

SELECT orderer
FROM (
SELECT orderer
FROM t_null
WHERE orderer IS NULL
UNION ALL
SELECT orderer
FROM (
SELECT orderer
FROM t_null
WHERE orderer IS NOT NULL
ORDER BY
orderer
)
)
WHERE rownum <= 10 [/sourcecode]

ORDERER
NULL
2
3
4
5
6
7
8
9
10
10 rows fetched in 0.0003s (0.2215s)
SELECT STATEMENT 
 COUNT STOPKEY
  VIEW 
   UNION-ALL 
    TABLE ACCESS FULL, 20090619_null.T_NULL
    VIEW 
     INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER

This is the synonym for the first query (with NULLS FIRST), but works for only 0.22 seconds, twice as fast as the original query.

And what if we want NULLS LAST?

Then we just change the order of the queries in UNION ALL:

SELECT orderer
FROM (
SELECT orderer
FROM (
SELECT orderer
FROM t_null
WHERE orderer IS NOT NULL
ORDER BY
orderer
)
UNION ALL
SELECT orderer
FROM t_null
WHERE orderer IS NULL
)
WHERE rownum <= 10 [/sourcecode]

ORDERER
2
3
4
5
6
7
8
9
10
11
10 rows fetched in 0.0003s (0.0006s)
SELECT STATEMENT 
 COUNT STOPKEY
  VIEW 
   UNION-ALL 
    VIEW 
     INDEX FULL SCAN, 20090619_null.T_NULL_ORDERER
    TABLE ACCESS FULL, 20090619_null.T_NULL

This query is instant because we have a condition to select only first 10 rows, and COUNT STOPKEY is chosen by the optimizer to serve this condition.

In a UNION ALL, Oracle will not evaluate the second query unless the first one is emptied out.

When we emulated NULLS FIRST, Oracle had to perform full table scan to make sure all NULL values are selected. This takes quite long.

But when we emulate NULLS LAST, Oracle just needs to select first 10 values out of the index and the stop.

This is almost instant and makes the whole query complete without even peeking into the table itself, since these 10 values are everything we need and they can be taken from the index.

Written by Quassnoi

June 19th, 2009 at 11:00 pm

Posted in Oracle

Leave a Reply