GROUP_CONCAT in Oracle 10g
MySQL has a nice aggregate function called GROUP_CONCAT
, which concatenates all strings in a group in given order, separating them with a given separator.
In one of the previous articles, I wrote about emulating this function in PostgreSQL.
Now, we'll try to emulate this function in Oracle 10g.
There are numerous solutions using Oracle's hierarchical function SYS_CONNECT_BY_PATH
, but they're bad in the following ways:
- Not all queries can be rewritten using
CONNECT BY
- Even if they can,
CONNECT BY
performs poorly on some conditions
Instead, we will emulate this function using Oracle's MODEL
clause.
MODEL
clause is a nice feature of Oracle 10g that allows you to work with a resultset as if it were an Excel workbook. You can basically do things like this:
SELECT value, FORMULA('=R[-3]C[-1]') AS value_three_rows_back FROM mytable
Note: the code above is not a real query, this is just a human-readable illustration of what can be done, to give you the idea. The real syntax is more complex.
Now, let's generate some sample data. I will not create and fill real tables, instead I'll just use a generated resultset to demonstrate this solution's ability to work on subqueries as well.
WITH q AS ( SELECT grouper, val, ROW_NUMBER() OVER (PARTITION BY grouper ORDER BY rno) AS rn FROM ( SELECT 'test ' || level AS val, FLOOR((level - 1) / 5) AS grouper, rownum AS rno FROM dual CONNECT BY level <= 100 ) qi ) SELECT * FROM q
GROUPER | VAL | RN | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | test 1 | 1 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
0 | test 2 | 2 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
0 | test 3 | 3 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
0 | test 4 | 4 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
0 | test 5 | 5 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 | test 6 | 1 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 | test 7 | 2 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
18 | test 94 | 4 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
18 | test 95 | 5 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
19 | test 96 | 1 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
19 | test 97 | 2 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
19 | test 98 | 3 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
19 | test 99 | 4 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
19 | test 100 | 5 |
We have a set of grouped and ordered text values which we need to join using a given separator.
Here's the Excel algorithm to do it. For the principle to be better understood, we are treating our resultset as an Excel workbook.
- Partition the result into several worksheets, putting values from each group into the first and the second columns of the group's worksheet.
A B C text 1 text 1 text 2 text 2 text 3 text 3 text 4 text 4 text 5 text 5 A B C text 6 text 6 text 7 text 7 text 8 text 8 text 9 text 9 text 10 text 10 - On each worksheet, put a concatenation formula to all but the first row in the second column:
A B C text 1 text1 text 2 =R[-1]C & ', ' & RC[-1] text 3 =R[-1]C & ', ' & RC[-1] text 4 =R[-1]C & ', ' & RC[-1] text 5 =R[-1]C & ', ' & RC[-1] - Finally, put a formula to each row of the third column that check if it's a last row of the worksheet:
A B C text 1 text1 =ISBLANK(R[1]C[-1]) text 2 =R[-1]C & ', ' & RC[-1] =ISBLANK(R[1]C[-1]) text 3 =R[-1]C & ', ' & RC[-1] =ISBLANK(R[1]C[-1]) text 4 =R[-1]C & ', ' & RC[-1] =ISBLANK(R[1]C[-1]) text 5 =R[-1]C & ', ' & RC[-1] =ISBLANK(R[1]C[-1])
From all this, we get the following values:
A | B | C |
---|---|---|
text 1 | text1 | 1 |
text 2 | text1, text2 | 1 |
text 3 | text1, text2, text3 | 1 |
text 4 | text1, text2, text3, text4 | 1 |
text 5 | text1, text2, text3, text4, text5 | 0 |
Now, we just need to select the last row from each worksheet (it's marked with a FALSE
in the third column).
So, how do we do it in Oracle?
We use the Oracle's MODEL
clause.
PARTITION
clause in it defines worksheets, DIMENSION BY
defines row numbers, MEASURES
defines columns.
If Oracle encounters a row with grouper = 5
and rn = 3
, it puts the corresponding values into the 5th worksheet on the 3rd row: val
to the first column (called val
), val
again to the second column (called group_concat
), and a 0
to the third one (called mark
).
RULES UPDATE
define the actions we described in the steps above. They are quite self-explanatory, and can be seen in detail in the Excel tables above.
Finally, we select group_concat
(which contains the concatenated values) from the rows that have mark = 1
(that is, last row of each worksheet).
Here's the query and the result:
WITH q AS ( SELECT qi.*, ROW_NUMBER() OVER (PARTITION BY grouper ORDER BY rno) AS rn FROM ( SELECT 'test ' || level AS val, FLOOR((level - 1) / 5) AS grouper, rownum AS rno FROM dual CONNECT BY level <= 100 ) qi ) SELECT * FROM ( SELECT * FROM q MODEL PARTITION BY (grouper) DIMENSION BY (rn) MEASURES (val, val AS group_concat, 0 AS mark) RULES UPDATE ( group_concat[rn > 1] ORDER BY rn = group_concat[CV() - 1] || ', ' || val[CV()], mark[ANY] ORDER BY rn = PRESENTV(mark[CV() + 1], 0, 1) ) ) WHERE mark = 1 ORDER BY grouper
GROUPER | GROUP_CONCAT |
---|---|
0 | test 1, test 2, test 3, test 4, test 5 |
1 | test 6, test 7, test 8, test 9, test 10 |
2 | test 11, test 12, test 13, test 14, test 15 |
3 | test 16, test 17, test 18, test 19, test 20 |
4 | test 21, test 22, test 23, test 24, test 25 |
5 | test 26, test 27, test 28, test 29, test 30 |
6 | test 31, test 32, test 33, test 34, test 35 |
7 | test 36, test 37, test 38, test 39, test 40 |
8 | test 41, test 42, test 43, test 44, test 45 |
9 | test 46, test 47, test 48, test 49, test 50 |
10 | test 51, test 52, test 53, test 54, test 55 |
11 | test 56, test 57, test 58, test 59, test 60 |
12 | test 61, test 62, test 63, test 64, test 65 |
13 | test 66, test 67, test 68, test 69, test 70 |
14 | test 71, test 72, test 73, test 74, test 75 |
15 | test 76, test 77, test 78, test 79, test 80 |
16 | test 81, test 82, test 83, test 84, test 85 |
17 | test 86, test 87, test 88, test 89, test 90 |
18 | test 91, test 92, test 93, test 94, test 95 |
19 | test 96, test 97, test 98, test 99, test 100 |
The query returns the records concatenated group-wise, just like MySQL's GROUP_CONCAT
does.
This is awesome. Brilliant!
kftgr
25 Jan 13 at 05:34