Some feedback on yesterday's article on selecting records holding group-wise maximums in SQL Server.
Gonçalo asks:
Regarding the post I mention on the subject, wouldn't it be easier to obtain the result you're after using the SQL Server specific SELECT TOP x WITH TIES?
This clause (which is indeed specific to SQL Server) returns ties. TOP x WITH TIES means: return the TOP x records, plus all records holding the same values of the ORDER BY expressions as the last (x'th) record returned by the TOP.
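To illustrate, here is a minimal sketch using a hypothetical table variable @goods (not part of this article's test schema):
DECLARE @goods TABLE (id INT, price INT)
INSERT INTO @goods VALUES (1, 10)
INSERT INTO @goods VALUES (2, 20)
INSERT INTO @goods VALUES (3, 20)
INSERT INTO @goods VALUES (4, 30)
INSERT INTO @goods VALUES (5, 40)
-- Plain TOP 2 would return rows 1 and 2. WITH TIES also returns row 3,
-- since it holds the same price as the last (2nd) record returned.
SELECT TOP 2 WITH TIES *
FROM @goods
ORDER BY price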
With a little effort this can be used for one of the queries I wrote about yesterday, namely the one that returns all records holding MIN(orderer) within each group.
However, the efficiency of this solution varies greatly depending on the cardinality of both ordering and grouping columns.
Let's create a sample table:
CREATE SCHEMA [20091201_ties]
CREATE TABLE t_distinct (
id INT NOT NULL PRIMARY KEY,
orderer INT NOT NULL,
lorderer INT NOT NULL,
glow INT NOT NULL,
ghigh INT NOT NULL,
stuffing VARCHAR(200) NOT NULL
)
GO
CREATE INDEX ix_distinct_glow_id ON [20091201_ties].t_distinct (glow);
CREATE INDEX ix_distinct_ghigh_id ON [20091201_ties].t_distinct (ghigh);
CREATE INDEX ix_distinct_glow_orderer_id ON [20091201_ties].t_distinct (glow, orderer);
CREATE INDEX ix_distinct_ghigh_orderer_id ON [20091201_ties].t_distinct (ghigh, orderer);
CREATE INDEX ix_distinct_glow_lorderer_id ON [20091201_ties].t_distinct (glow, lorderer);
CREATE INDEX ix_distinct_ghigh_lorderer_id ON [20091201_ties].t_distinct (ghigh, lorderer);
BEGIN TRANSACTION
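-- Seed the random number generator so that the generated data is reproducible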
SELECT RAND(20091201)
DECLARE @cnt INT
SET @cnt = 1
WHILE @cnt <= 1000000
BEGIN
INSERT
INTO [20091201_ties].t_distinct (id, orderer, lorderer, glow, ghigh, stuffing)
VALUES (
@cnt,
FLOOR(RAND() * 10) + 1,
FLOOR(RAND() * 10000) + 1,
(@cnt - 1) % 10 + 1,
(@cnt - 1) % 10000 + 1,
REPLICATE('*', 200)
)
SET @cnt = @cnt + 1
END
COMMIT
GO
Just like in the previous articles, this table contains 1,000,000 records with the following fields:
- id is the PRIMARY KEY
- orderer is filled with random values from 1 to 10
- glow is a low cardinality grouping field (10 distinct values)
- ghigh is a high cardinality grouping field (10,000 distinct values)
- stuffing is an asterisk-filled VARCHAR(200) column added to emulate the payload of the actual tables
Additionally, there is one more field: lorderer. Just like orderer, it is filled with random values, but has much greater cardinality: the values range from 1 to 10,000.
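Before running the queries, a quick sanity check (my addition, not from the original article) confirms the cardinalities described above:
SELECT COUNT(DISTINCT orderer) AS orderer_values, -- expected: 10
       COUNT(DISTINCT lorderer) AS lorderer_values, -- expected: ~10,000
       COUNT(DISTINCT glow) AS glow_values, -- expected: 10
       COUNT(DISTINCT ghigh) AS ghigh_values -- expected: 10,000
FROM [20091201_ties].t_distinct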
Now, let's try to select the records holding MIN(orderer) using the two approaches.
Low cardinality orderers
Using DENSE_RANK
SELECT COUNT(*) AS cnt, SUM(LEN(stuffing)) AS psum
FROM (
SELECT d.*, DENSE_RANK() OVER (PARTITION BY glow ORDER BY orderer) AS dr
FROM [20091201_ties].t_distinct d
) dd
WHERE dr = 1
cnt   | psum
------+---------
99558 | 19911600

1 row fetched in 0.0000s (41.0302s)
Table 't_distinct'. Scan count 3, logical reads 34423, physical reads 141, read-ahead reads 31072, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5765 ms, elapsed time = 41031 ms.
|--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[globalagg1007],0), [Expr1004]=CASE WHEN [globalagg1009]=(0) THEN NULL ELSE [globalagg1011] END))
|--Stream Aggregate(DEFINE:([globalagg1007]=SUM([partialagg1006]), [globalagg1009]=SUM([partialagg1008]), [globalagg1011]=SUM([partialagg1010])))
|--Parallelism(Gather Streams)
|--Stream Aggregate(DEFINE:([partialagg1006]=Count(*), [partialagg1008]=COUNT_BIG([Expr1005]), [partialagg1010]=SUM([Expr1005])))
|--Filter(WHERE:([Expr1002]=(1)))
|--Compute Scalar(DEFINE:([Expr1005]=len([test].[20091201_ties].[t_distinct].[stuffing] as [d].[stuffing])))
|--Parallelism(Distribute Streams, RoundRobin Partitioning)
|--Sequence Project(DEFINE:([Expr1002]=dense_rank))
|--Segment
|--Segment
|--Parallelism(Gather Streams, ORDER BY:([d].[glow] ASC, [d].[orderer] ASC))
|--Sort(ORDER BY:([d].[glow] ASC, [d].[orderer] ASC))
|--Clustered Index Scan(OBJECT:([test].[20091201_ties].[t_distinct].[PK__t_distinct__7EB7AD3A] AS [d]))
As in yesterday's test, this query runs for 41 seconds. The plan used by the engine involves partitioning, sorting and gathering the streams, but the real problem is that the table is large and a full scan is used: there is just too much data to read.
Now, let's try the same query using CROSS APPLY / TOP … WITH TIES.
Using CROSS APPLY / TOP … WITH TIES
SELECT COUNT(*) AS cnt, SUM(LEN(stuffing)) AS psum
FROM (
SELECT DISTINCT glow
FROM [20091201_ties].t_distinct d
) dd
CROSS APPLY
(
SELECT TOP 1 WITH TIES
stuffing
FROM [20091201_ties].t_distinct di
WHERE di.glow = dd.glow
ORDER BY
orderer
) ds
cnt   | psum
------+---------
99558 | 19911600

1 row fetched in 0.0000s (116.9189s)
Table 't_distinct'. Scan count 11, logical reads 456326, physical reads 1281, read-ahead reads 67626, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1687 ms, elapsed time = 116924 ms.
|--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[Expr1015],0), [Expr1005]=CASE WHEN [Expr1016]=(0) THEN NULL ELSE [Expr1017] END))
|--Stream Aggregate(DEFINE:([Expr1015]=Count(*), [Expr1016]=COUNT_BIG([Expr1006]), [Expr1017]=SUM([Expr1006])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([d].[glow]))
|--Stream Aggregate(GROUP BY:([d].[glow]))
| |--Index Scan(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_id] AS [d]), ORDERED FORWARD)
|--Top(TOP EXPRESSION:((1)))
|--Compute Scalar(DEFINE:([Expr1006]=len([test].[20091201_ties].[t_distinct].[stuffing] as [di].[stuffing])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([di].[id], [Expr1014]) WITH ORDERED PREFETCH)
|--Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_orderer_id] AS [di]), SEEK:([di].[glow]=[test].[20091201_ties].[t_distinct].[glow] as [d].[glow]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[PK__t_distinct__7EB7AD3A] AS [di]), SEEK:([di].[id]=[test].[20091201_ties].[t_distinct].[id] as [di].[id]) LOOKUP ORDERED FORWARD)
This query runs for… um, 117 seconds. Thrice as slow.
Again, there are too many records. The TOP uses the index to return the first rows, which is normally fast, but not when there are 10,000 records per group to return. Using the index implies the need to look up the values of stuffing (which we use in this query) in the actual table. You can see it in the plan above as the Nested Loops branch which joins the index (Index Seek) with the actual table (Clustered Index Seek). The overhead of joining the index and the table by far outweighs all benefits of having preordered data.
However, this is only true for the orderers with low cardinality.
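A side note on the lookup overhead just described: if the payload column were covered by the index, the plan would lose the Clustered Index Seek entirely. This is a hypothetical variation that I haven't benchmarked here, and it trades the lookups for duplicating the 200-byte payload in the index leaf pages:
-- A hypothetical covering index (not part of this article's test setup).
-- INCLUDE stores stuffing in the index leaf pages, so the
-- TOP ... WITH TIES query no longer needs per-row lookups into the table.
CREATE INDEX ix_distinct_glow_orderer_cover
ON [20091201_ties].t_distinct (glow, orderer)
INCLUDE (stuffing);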
High cardinality orderers
Let's try the same approaches, but this time use them to return the records holding MIN(lorderer). This field has much higher cardinality, so there are only a few records per group that hold the MIN(lorderer) value.
Using DENSE_RANK
SELECT COUNT(*) AS cnt, SUM(LEN(stuffing)) AS psum
FROM (
SELECT d.*, DENSE_RANK() OVER (PARTITION BY glow ORDER BY lorderer) AS dr
FROM [20091201_ties].t_distinct d
) dd
WHERE dr = 1
cnt | psum
----+------
103 | 20600

1 row fetched in 0.0000s (40.7021s)
Table 't_distinct'. Scan count 3, logical reads 34423, physical reads 32, read-ahead reads 26743, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 6249 ms, elapsed time = 40703 ms.
|--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[globalagg1007],0), [Expr1004]=CASE WHEN [globalagg1009]=(0) THEN NULL ELSE [globalagg1011] END))
|--Stream Aggregate(DEFINE:([globalagg1007]=SUM([partialagg1006]), [globalagg1009]=SUM([partialagg1008]), [globalagg1011]=SUM([partialagg1010])))
|--Parallelism(Gather Streams)
|--Stream Aggregate(DEFINE:([partialagg1006]=Count(*), [partialagg1008]=COUNT_BIG([Expr1005]), [partialagg1010]=SUM([Expr1005])))
|--Filter(WHERE:([Expr1002]=(1)))
|--Compute Scalar(DEFINE:([Expr1005]=len([test].[20091201_ties].[t_distinct].[stuffing] as [d].[stuffing])))
|--Parallelism(Distribute Streams, RoundRobin Partitioning)
|--Sequence Project(DEFINE:([Expr1002]=dense_rank))
|--Segment
|--Segment
|--Parallelism(Gather Streams, ORDER BY:([d].[glow] ASC, [d].[lorderer] ASC))
|--Sort(ORDER BY:([d].[glow] ASC, [d].[lorderer] ASC))
|--Clustered Index Scan(OBJECT:([test].[20091201_ties].[t_distinct].[PK__t_distinct__7EB7AD3A] AS [d]))
40 seconds.
The performance is just as poor. Since we have exactly the same full table scan with exactly the same million rows to sort, this is quite predictable.
Using CROSS APPLY / TOP … WITH TIES
SELECT COUNT(*) AS cnt, SUM(LEN(stuffing)) AS psum
FROM (
SELECT DISTINCT glow
FROM [20091201_ties].t_distinct d
) dd
CROSS APPLY
(
SELECT TOP 1 WITH TIES
stuffing
FROM [20091201_ties].t_distinct di
WHERE di.glow = dd.glow
ORDER BY
lorderer
) ds
cnt | psum
----+------
103 | 20600

1 row fetched in 0.0002s (0.2499s)
Table 't_distinct'. Scan count 11, logical reads 3506, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 250 ms, elapsed time = 250 ms.
|--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[Expr1015],0), [Expr1005]=CASE WHEN [Expr1016]=(0) THEN NULL ELSE [Expr1017] END))
|--Stream Aggregate(DEFINE:([Expr1015]=Count(*), [Expr1016]=COUNT_BIG([Expr1006]), [Expr1017]=SUM([Expr1006])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([d].[glow]))
|--Stream Aggregate(GROUP BY:([d].[glow]))
| |--Index Scan(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_id] AS [d]), ORDERED FORWARD)
|--Top(TOP EXPRESSION:((1)))
|--Compute Scalar(DEFINE:([Expr1006]=len([test].[20091201_ties].[t_distinct].[stuffing] as [di].[stuffing])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([di].[id], [Expr1014]) WITH ORDERED PREFETCH)
|--Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_lorderer_id] AS [di]), SEEK:([di].[glow]=[test].[20091201_ties].[t_distinct].[glow] as [d].[glow]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[PK__t_distinct__7EB7AD3A] AS [di]), SEEK:([di].[id]=[test].[20091201_ties].[t_distinct].[id] as [di].[id]) LOOKUP ORDERED FORWARD)
Now it's a completely different story. With this ordering condition, we have about 10 ties per group instead of 10,000. The index is a real benefit now, and the query completes in only 250 ms.
If we replace DISTINCT with a recursive CTE (the approach described in the previous article), the query can be improved a little more:
WITH rows AS
(
SELECT MIN(glow) AS glow
FROM [20091201_ties].t_distinct
UNION ALL
SELECT glow
FROM (
SELECT di.glow, ROW_NUMBER() OVER (ORDER BY di.glow) AS rn
FROM rows r
JOIN [20091201_ties].t_distinct di
ON di.glow > r.glow
WHERE r.glow IS NOT NULL
) q
WHERE rn = 1
)
SELECT COUNT(*) AS cnt, SUM(LEN(stuffing)) AS psum
FROM rows r
CROSS APPLY
(
SELECT TOP 1 WITH TIES
stuffing
FROM [20091201_ties].t_distinct di
WHERE di.glow = r.glow
ORDER BY
lorderer
) ds
cnt | psum
----+------
103 | 20600

1 row fetched in 0.0002s (0.0051s)
Table 't_distinct'. Scan count 11, logical reads 388, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 2, logical reads 62, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't_distinct'. Scan count 10, logical reads 65, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 5 ms.
|--Compute Scalar(DEFINE:([Expr1011]=CONVERT_IMPLICIT(int,[Expr1030],0), [Expr1012]=CASE WHEN [Expr1031]=(0) THEN NULL ELSE [Expr1032] END))
|--Stream Aggregate(DEFINE:([Expr1030]=Count(*), [Expr1031]=COUNT_BIG([Expr1014]), [Expr1032]=SUM([Expr1014])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Recr1008]))
|--Index Spool(WITH STACK)
| |--Concatenation
| |--Compute Scalar(DEFINE:([Expr1025]=(0)))
| | |--Stream Aggregate(DEFINE:([Expr1003]=MIN([test].[20091201_ties].[t_distinct].[glow])))
| | |--Top(TOP EXPRESSION:((1)))
| | |--Index Scan(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_id]), ORDERED FORWARD)
| |--Assert(WHERE:(CASE WHEN [Expr1027]>(100) THEN (0) ELSE NULL END))
| |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1027], [Recr1004]))
| |--Compute Scalar(DEFINE:([Expr1027]=[Expr1026]+(1)))
| | |--Table Spool(WITH STACK)
| |--Filter(WHERE:([Expr1007]=(1)))
| |--Top(TOP EXPRESSION:(CASE WHEN (1) IS NULL OR (1)<(0) THEN (0) ELSE (1) END))
| |--Sequence Project(DEFINE:([Expr1007]=row_number))
| |--Compute Scalar(DEFINE:([Expr1024]=(1)))
| |--Segment
| |--Filter(WHERE:(STARTUP EXPR([Recr1004] IS NOT NULL)))
| |--Index Seek(OBJECT:([test].[20091130_groupwise].[t_distinct].[ix_distinct_glow_id] AS [di]), SEEK:([di].[glow] > [Recr1004]) ORDERED FORWARD)
|--Top(TOP EXPRESSION:((1)))
|--Compute Scalar(DEFINE:([Expr1014]=len([test].[20091201_ties].[t_distinct].[stuffing] as [di].[stuffing])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([di].[id], [Expr1029]) WITH ORDERED PREFETCH)
|--Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[ix_distinct_glow_lorderer_id] AS [di]), SEEK:([di].[glow]=[Recr1008]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([test].[20091201_ties].[t_distinct].[PK__t_distinct__7EB7AD3A] AS [di]), SEEK:([di].[id]=[test].[20091201_ties].[t_distinct].[id] as [di].[id]) LOOKUP ORDERED FORWARD)
5 ms.
Now that's what I call performance.
Summary
To retrieve all records holding group-wise minimums or maximums (including ties), two approaches can be used in SQL Server:
- Use DENSE_RANK partitioned by the grouper and return the rows holding a dense rank of 1
- Use DISTINCT or a recursive CTE to return the list of groupers, and CROSS APPLY a query with TOP 1 WITH TIES to each grouper
The first method requires sorting on (grouper, orderer), while the second one requires an index scan (with the implied row lookups) for each grouper.
Record for record, sorting large amounts of data is more efficient than retrieving them through index lookups. However, the index scan processes only a portion of the data (the records within the tie), while sorting has to process it all.
For low cardinality orderers (the values inside the MIN and MAX), too many records comprise a tie, and the index scan is less efficient than a table scan with sorting. For these orderers, the DENSE_RANK solution is more efficient.
For high cardinality orderers, the index scan returns only a small portion of the records and is almost instant. For these orderers, CROSS APPLY / TOP 1 WITH TIES is more efficient.
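If you are not sure which case your data falls into, a rough back-of-the-envelope check (my addition, not part of the original benchmarks) is to measure the average tie size per group:
-- For each group, count the records holding the group-wise minimum.
-- A large average tie size favors DENSE_RANK; a small one favors
-- CROSS APPLY / TOP 1 WITH TIES.
SELECT AVG(1.0 * tie_size) AS avg_tie_size
FROM (
    SELECT COUNT(*) AS tie_size
    FROM [20091201_ties].t_distinct d
    WHERE d.orderer =
        (
        SELECT MIN(orderer)
        FROM [20091201_ties].t_distinct di
        WHERE di.glow = d.glow
        )
    GROUP BY d.glow
    ) q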
Hope that helps.
I'm always glad to answer questions about database queries.
Ask me a question