EXPLAIN EXTENDED

How to create fast database queries

Oracle: nested SUM

with one comment

From Stack Overflow:

Suppose I have this table

A B C D

Datatypes are not important.

I want to do this:

SELECT  a AS a1, b AS b1, c AS c1,
(
SELECT  SUM(d)
FROM    source
WHERE   a = a1
AND b = b1
) AS total
FROM    source
GROUP BY
a, b, c

, but I can't find a way (SQL Developer keeps complaining with FROM clause not found.)

Is there a way? Is it possible?

This is of course possible if we just alias the query and prepend the alias to the field:

SELECT  a, b, c,
(
SELECT  SUM(d)
FROM    source si
WHERE   si.a = so.a
AND si.b = so.b
CONNECT BY
16 >= level
)
FROM    source so
GROUP BY
a, b, c

This works well on this sample set of data:

WITH    source AS
(
SELECT  FLOOR(MOD(level - 1, 8) / 4) + 1 AS a,
FLOOR(MOD(level - 1, 4) / 2) + 1 AS b,
MOD(level - 1, 2) + 1 AS c,
level AS d
FROM    dual
CONNECT BY
16 >= level
)
SELECT  a, b, c,
(
SELECT  SUM(d)
FROM    source si
WHERE   si.a = so.a
AND si.b = so.b
)
FROM    source so
GROUP BY
a, b, c

 
View query details

, but it needs to reevaluate the subquery for each group.

In Oracle, there is a better way to do this query: nesting an aggregate SUM inside an analytical SUM.

What this query does is in fact the following:

  • Calculate the groupwise SUM(d) for each (a, b)
  • Retrieve the distinct values of a, b, c
  • For each (a, b, c), return the SUM(d) for the corresponding (a, b)

This means that total will be the same for each (a, b), thous the values of c may differ.

However, the SUM is associative, that is sum of groupwise sums is the same as the sum of separate values.

To calculate the SUM(d) for a given (a, b) we can calculate the partial sums for (a, b, c) and add them together. Then we need return this value for each (a, b) (which can be multiple).

To return an aggregate value along with each of the records that contribute to it, the analytical functions are used (those with OVER clause).

It is widely known that analytical functions, as well as aggregate functions, cannot be nested.

However, one can nest an aggregate function within an analytical one. This is possible because aggregate functions work over the source-level values before GROUP BY, while analytical ones work with SELECT-level expressions (after the GROUP BY)

Here's the query to do this:

WITH    source AS
(
SELECT  FLOOR(MOD(level - 1, 8) / 4) + 1 AS a,
FLOOR(MOD(level - 1, 4) / 2) + 1 AS b,
MOD(level - 1, 2) + 1 AS c,
level AS d
FROM    dual
CONNECT BY
level <= 16
        )
SELECT  a, b, c, SUM(SUM(d)) OVER (PARTITION BY a, b)
FROM    source
GROUP BY
        a, b, c

View query details

This query returns the same results.

The innermost SUMs are calculated (a, b, c)-wise (like in GROUP BY), while the outermost ones are calculated (a, b)-wise (like in PARTITION BY).

Since ordering by (a, b, c) implies ordering by (a, b), both these groupings are performed with a single SORT.

Written by Quassnoi

October 8th, 2009 at 11:00 pm

Posted in Oracle

One Response to 'Oracle: nested SUM'

Subscribe to comments with RSS

  1. Excellent! I stumbled across this example (3 years after it was published) and it was exactly what I needed!

    Paul

    12 Feb 13 at 03:25

Leave a Reply