EXPLAIN EXTENDED

How to create fast database queries

NVL vs. COALESCE

with 2 comments

From Stack Overflow:

Are there non obvious differences between NVL and COALESCE in Oracle?

The obvious differences are that COALESCE will return the first non-NULL item in its parameter list whereas NVL only takes two parameters and returns the first if it is not NULL, otherwise it returns the second.

It seems that NVL may just be a base case version of COALESCE.

Am I missing something?

In case of two agruments of same type, these functions are basically the same: they both return first non-NULL values.

However, they are implemented differently:

  • NVL always evaluates both arguments
  • COALESCE doesn't evaluate anything after it finds the first non-NULL argument

This behavior can be utilized.

Here's a real world example from a restaurant management system I developed about 10 years ago.

A bill that is presented to a customer consists of the items he or she ordered.

Majority of the dishes ordered are plain menu items: something that has its price printed in the menu.

However, there are some special things which are not included to the menu: a customer may want to order something exclusive and pay for it.

The manager should estimate the cost of ingredients, time and effort to make this dish, then calculate the price and name it: everything in 30 seconds.

These things are of course come overpriced, but the customers that make such orders don't seem to care much about money.

In the bill that appears as special order. When a manager enters it into the computer, she should also enter the price she had calculated.

These prices then get back to the server and are stored in a table (not the one where regular menu prices are stored).

At the end of month, a report to the month's receipts should be generated.

Here are the tables containing the data relevant to the reports. They of course are simplified, but the principle remains the same.

CREATE TABLE t_regular
        (
        item NUMBER(10) NOT NULL,
        price NUMBER(10, 4) NOT NULL
        )
/
ALTER TABLE t_regular
ADD CONSTRAINT pk_regular_item
PRIMARY KEY (item)
/
CREATE TABLE t_bill
        (
        id NUMBER(10) NOT NULL,
        item NUMBER(10),
        quantity NUMBER(10) NOT NULL
        )
/
ALTER TABLE t_bill
ADD CONSTRAINT pk_bill_id
PRIMARY KEY (id)
/
CREATE TABLE t_special
        (
        bill NUMBER(10) NOT NULL,
        price NUMBER(10, 4) NOT NULL
        )
/
ALTER TABLE t_special
ADD CONSTRAINT pk_special_order
PRIMARY KEY (bill)
/
BEGIN
        DBMS_RANDOM.seed(20090620);
END;
/
INSERT
INTO    t_regular (item, price)
SELECT  level, (DBMS_RANDOM.value() * 99 + 1) * 10
FROM    dual
CONNECT BY
        level <= 100
/
INSERT
INTO    t_bill (id, item, quantity)
SELECT  id,
        CASE WHEN special < 0.0001 THEN NULL ELSE FLOOR(DBMS_RANDOM.value() * 99) + 1 END,
        CASE WHEN special < 0.0001 THEN 1 ELSE FLOOR(DBMS_RANDOM.value() * 9) + 1 END
FROM    (
        SELECT  /*+ NO_MERGE */
                level AS id,
                DBMS_RANDOM.value() AS special
        FROM    dual
        CONNECT BY
                level <= 1000000
        )
/
INSERT
INTO    t_special (bill, price)
SELECT  id, (FLOOR(DBMS_RANDOM.value() * 500) + 499) * 10
FROM    t_bill
WHERE   item IS NULL
/
COMMIT
/

Table t_bill contains the items biiled. There are 1,000,000 rows in this table.

If item is not NULL, it's a menu item, otherwise it's a special item.

The type of the item in each row is chosen randomly, and there turned out to be 102 special items in the table (marked as NULL's).

Table t_regular contains the prices for the menu items. There are 100 rows in this table.

Table t_special contains the prices for the special items. There are 102 rows in this table (that's as many rows as there are special entries in the bills).

To build a report, we can use this following query:

SELECT  SUM(NVL(r.price * b.quantity, s.price))
FROM    t_bill b
LEFT JOIN
        t_regular r
ON      r.item = b.item
LEFT JOIN
        t_special s
ON      s.bill = b.id

SUM(NVL(R.PRICE*B.QUANTITY,S.PRICE))
2794357806,0261
1 row fetched in 0.0001s (1.2125s)
SELECT STATEMENT 
 SORT AGGREGATE
  HASH JOIN RIGHT OUTER
   TABLE ACCESS FULL, 20090620_price.T_SPECIAL
   HASH JOIN RIGHT OUTER
    TABLE ACCESS FULL, 20090620_price.T_REGULAR
    TABLE ACCESS FULL, 20090620_price.T_BILL

Depending on the item's speciality, it takes either a regular price from t_regular, or the special price for this very bill from t_special.

This uses HASH JOIN over the two tables and runs for 1.2 seconds.

Since there are only few entries in the t_special, we could probably benefit if we could query it only if we need to. There is no need in probing each of 1,000,000 rows against t_special: instead, we just need to probe only those rows with a NULL in item.

We can rewrite the query using a subquery in SELECT clause instead of a JOIN:

SELECT  SUM(
        NVL(
        r.price * b.quantity,
        (
        SELECT  price
        FROM    t_special s
        WHERE   s.bill = b.id
        )
        )
        ) AS total
FROM    t_bill b
LEFT JOIN
        t_regular r
ON      r.item = b.item

TOTAL
2794357806,0261
1 row fetched in 0.0001s (1.6718s)
SELECT STATEMENT 
 TABLE ACCESS BY INDEX ROWID, 20090620_price.T_SPECIAL
  INDEX UNIQUE SCAN, 20090620_price.PK_SPECIAL_ORDER
 SORT AGGREGATE
  HASH JOIN RIGHT OUTER
   TABLE ACCESS FULL, 20090620_price.T_REGULAR
   TABLE ACCESS FULL, 20090620_price.T_BILL

But this is even less efficient: it runs for almost 1.7 seconds.

This is because NVL always evaluates both arguments: it runs the subquery for each row in the resultset, even if the first argument is not a NULL.

This effectively turns the query into a plain NESTED LOOPS JOIN which is leff efficient for an unfiltered query.

But if we just replace NVL with COALESCE in the query above:

SELECT  SUM(
        COALESCE(
        r.price * b.quantity,
        (
        SELECT  price
        FROM    t_special s
        WHERE   s.bill = b.id
        )
        )
        ) AS total
FROM    t_bill b
LEFT JOIN
        t_regular r
ON      r.item = b.item

TOTAL
2794357806,0261
1 row fetched in 0.0001s (0.4562s)
SELECT STATEMENT 
 TABLE ACCESS BY INDEX ROWID, 20090620_price.T_SPECIAL
  INDEX UNIQUE SCAN, 20090620_price.PK_SPECIAL_ORDER
 SORT AGGREGATE
  HASH JOIN RIGHT OUTER
   TABLE ACCESS FULL, 20090620_price.T_REGULAR
   TABLE ACCESS FULL, 20090620_price.T_BILL

, we see that despite the plan remaining the same as for NVL, the query time decreased significantly: only 0.45 seconds: 4 times as fast as NVL, 3 times as fast as JOIN.

This is because the subquery is run not every time, but only when the first argument to COALESCE is a NULL.

This doesn't happen often, and the query completes much faster.

Written by Quassnoi

June 20th, 2009 at 11:00 pm

Posted in Oracle

2 Responses to 'NVL vs. COALESCE'

Subscribe to comments with RSS

  1. I like this answer but it makes me wonder, is there a case where NVL is better than COALESCE or is it just “easier” to understand and shorter to type?

    Hogan

    18 Feb 19 at 20:30

  2. @Hogan: the Oracle optimizer can expand the NVL into an OR condition so if evaluation of the arguments is not expensive NVL can be much faster. Something like mytable.field = NVL(:param, value) can be expanded into (:param IS NULL AND mytable.field = value) OR (mytable.field = :param) which might in fact work better.

    Quassnoi

    18 Feb 19 at 20:57

Leave a Reply