EXPLAIN EXTENDED

How to create fast database queries

GROUP_CONCAT in SQL Server

with one comment

I'm finally back from my vacation. Tunisia's great: dates, Carthage, sea and stuff.

Now, to the questions.

Mahen asks:

Create a table called Group:

Group
id prodname
1 X
1 Y
1 Z
2 A
2 B
2 C

The resultset should look like this:

id prodname
1 X,Y,Z
2 A,B,C

Can you please help me to solve the above problem using a recursive CTE?

This is out good old friend, GROUP_CONCAT. It's an aggregate function that returns all strings within a group, concatenated. It's somewhat different from the other aggregate functions, because, first, dealing with the concatenated string can be quite a tedious task for the groups with lots of records (large strings tend to overflow), and, second, the result depends on the order of the arguments (which is normally not the case for the aggregate functions). It's not a part of a standard SQL and as for now is implemented only by MySQL with some extra vendor-specific keywords (like ORDER BY within the argument list).

This functionality, however, is often asked for and I have written some articles about implementing this in PostgreSQL and Oracle.

Now, let's see how to do it in SQL Server.

Usually, SQL Server's FOR XML clause is exploited to concatenate the strings. To do this, we obtain a list of group identifiers and for each group, retrieve all it's product names with a subquery appended with FOR XML PATH(''). This makes a single XML column out of the recordset:

WITH    q (id, prodname) AS
        (
        SELECT  1, 'X'
        UNION ALL
        SELECT  1, 'Y'
        UNION ALL
        SELECT  1, 'Z'
        UNION ALL
        SELECT  2, 'A'
        UNION ALL
        SELECT  2, 'B'
        UNION ALL
        SELECT  2, 'C'
        )
SELECT  *
FROM    (
        SELECT  DISTINCT id
        FROM    q
        ) qo
CROSS APPLY
        (
        SELECT  CASE ROW_NUMBER() OVER(ORDER BY prodname) WHEN 1 THEN '' ELSE ', ' END + qi.prodname
        FROM    q qi
        WHERE   qi.id = qo.id
        ORDER BY
                prodname
        FOR XML PATH ('')
        ) qi(r)
id r
1 X, Y, Z
2 A, B, C

This solution works, but converting to and from XML is not the best way to deal with the strings: things like ampersands, angle brackets, line feeds etc. get mangled and require some additional effort to cope with.

However, this functionality can really be implemented using a recursive CTE.

To do this, we need to do the following:

  1. Assign a group-wise row number and a group-wise count to each record (in required order)
  2. Select the first record (that with the row number 1) from each group in the anchor part of the CTE
  3. Recursively append the next record to the previous record. The next record can be obtained by joining on rn = rn + 1
  4. Finally, select the only last record from each group (whose row number is equal to the group-wise count). It will contain the accumulated string.

Here's how we do it:

WITH    q (id, prodname) AS
        (
        SELECT  1, 'X'
        UNION ALL
        SELECT  1, 'Y'
        UNION ALL
        SELECT  1, 'Z'
        UNION ALL
        SELECT  2, 'A'
        UNION ALL
        SELECT  2, 'B'
        UNION ALL
        SELECT  2, 'C'
        ),
        qs(id, prodname, rn, cnt) AS
        (
        SELECT  id, prodname,
                ROW_NUMBER() OVER (PARTITION BY id ORDER BY prodname),
                COUNT(*) OVER (PARTITION BY id)
        FROM    q
        ),
        t (id, prodname, gc, rn, cnt) AS
        (
        SELECT  id, prodname,
                CAST(prodname AS NVARCHAR(MAX)), rn, cnt
        FROM    qs
        WHERE   rn = 1
        UNION ALL
        SELECT  qs.id, qs.prodname,
                CAST(t.gc + ', ' + qs.prodname AS NVARCHAR(MAX)),
                qs.rn, qs.cnt
        FROM    t
        JOIN    qs
        ON      qs.id = t.id
                AND qs.rn = t.rn + 1
        )
SELECT  id, gc
FROM    t
WHERE   rn = cnt
OPTION (MAXRECURSION 0)
id gc
2 A, B, C
1 X, Y, Z

As we can see, this only deals with native NVARCHAR and is free from XML conversions.

Hope that helps.


I'm always glad to answer the questions regarding database queries.

Ask me a question

Written by Quassnoi

June 21st, 2010 at 11:00 pm

Posted in SQL Server

One Response to 'GROUP_CONCAT in SQL Server'

Subscribe to comments with RSS

  1. what if i have the follwing count of emp_id for job_id ??

    and wat to display all emp_id

    job_id employee
    AC_MGR 1
    AD_ASST 1
    AD_PRES 1
    AD_VP 2
    FI_MGR 1
    HR_REP 1
    IT_PROG 5
    MK_MAN 1
    MK_REP 1
    PR_REP 1
    PU_CLERK 5
    PU_MAN 1
    SA_MAN 5
    SA_REP 30
    SH_CLERK 20
    ST_CLERK 19
    ST_MAN 5

    pravin

    17 Sep 15 at 19:34

Leave a Reply