Oracle: nested SUM
From Stack Overflow:
Suppose I have this table
A B C D Datatypes are not important.
I want to do this:
SELECT a AS a1, b AS b1, c AS c1, ( SELECT SUM(d) FROM source WHERE a = a1 AND b = b1 ) AS total FROM source GROUP BY a, b, c, but I can't find a way (SQL Developer keeps complaining with
.)FROM
clause not foundIs there a way? Is it possible?
This is of course possible if we just alias the query and prepend the alias to the field:
SELECT a, b, c, ( SELECT SUM(d) FROM source si WHERE si.a = so.a AND si.b = so.b CONNECT BY 16 >= level ) FROM source so GROUP BY a, b, c
This works well on this sample set of data:
WITH source AS ( SELECT FLOOR(MOD(level - 1, 8) / 4) + 1 AS a, FLOOR(MOD(level - 1, 4) / 2) + 1 AS b, MOD(level - 1, 2) + 1 AS c, level AS d FROM dual CONNECT BY 16 >= level ) SELECT a, b, c, ( SELECT SUM(d) FROM source si WHERE si.a = so.a AND si.b = so.b ) FROM source so GROUP BY a, b, c
, but it needs to reevaluate the subquery for each group.
In Oracle, there is a better way to do this query: nesting an aggregate SUM
inside an analytical SUM
.
What this query does is in fact the following:
- Calculate the groupwise
SUM(d)
for each(a, b)
- Retrieve the distinct values of
a
,b
,c
- For each
(a, b, c)
, return theSUM(d)
for the corresponding(a, b)
This means that total
will be the same for each (a, b)
, thous the values of c
may differ.
However, the SUM
is associative, that is sum of groupwise sums is the same as the sum of separate values.
To calculate the SUM(d)
for a given (a, b)
we can calculate the partial sums for (a, b, c)
and add them together. Then we need return this value for each (a, b)
(which can be multiple).
To return an aggregate value along with each of the records that contribute to it, the analytical functions are used (those with OVER
clause).
It is widely known that analytical functions, as well as aggregate functions, cannot be nested.
However, one can nest an aggregate function within an analytical one. This is possible because aggregate functions work over the source-level values before GROUP BY
, while analytical ones work with SELECT
-level expressions (after the GROUP BY
)
Here's the query to do this:
WITH source AS ( SELECT FLOOR(MOD(level - 1, 8) / 4) + 1 AS a, FLOOR(MOD(level - 1, 4) / 2) + 1 AS b, MOD(level - 1, 2) + 1 AS c, level AS d FROM dual CONNECT BY level <= 16 ) SELECT a, b, c, SUM(SUM(d)) OVER (PARTITION BY a, b) FROM source GROUP BY a, b, c
This query returns the same results.
The innermost SUM
s are calculated (a, b, c)
-wise (like in GROUP BY
), while the outermost ones are calculated (a, b)
-wise (like in PARTITION BY
).
Since ordering by (a, b, c)
implies ordering by (a, b)
, both these groupings are performed with a single SORT
.
Excellent! I stumbled across this example (3 years after it was published) and it was exactly what I needed!
Paul
12 Feb 13 at 03:25