EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: 5 Claims About SQL, Explained. You're welcome to read and comment on it.

Generating XML in subqueries

with one comment

From Stack Overflow:

I have a query that looks like

SELECT  P.Column1, P.Column2, P.Column3,
        (
        SELECT  A.ColumnX, A.ColumnY
        FROM    dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A
        WHERE   A.Key = P.Key
        FOR XML AUTO, TYPE  
        ),
        (
        SELECT  B.ColumnX, B.ColumnY,
        FROM    dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B
        WHERE   B.Key = P.Key
        FOR XML AUTO, TYPE  
        )
FROM    (
        /* joined tables here */
        ) AS P
FOR XML AUTO, ROOT('ROOT')

P has about 5,000 rows.

A and B, about 4,000 rows each.

This query has a runtime performance of more than 10 minutes.

Changing it to this, however:

SELECT  P.Column1,
        P.Column2,
        P.Column3
INTO    #P
FROM    (
        /* joined tables here
        ) p

SELECT  A.ColumnX,
        A.ColumnY
INTO    #A     
FROM    dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A

SELECT  B.ColumnX, B.ColumnY
INTO    #B     
FROM    dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B

SELECT  P.Column1, P.Column2, P.Column3,
        (
        SELECT  A.ColumnX, A.ColumnY
        FROM    #A AS A
        WHERE   A.Key = P.Key
        FOR XML AUTO, TYPE  
        ),
        (
        SELECT  B.ColumnX, B.ColumnY,
        FROM    #B AS B
        WHERE   B.Key = P.Key
        FOR XML AUTO, TYPE  
        )
FROM    #P AS P
FOR XML AUTO, ROOT('ROOT')

, has a performance of about 4 seconds.

This makes not a lot of sense, as it would seem the cost to insert into a temp table and then do the join should be higher by default.

My inclination is that SQL is doing the wrong type of "join" with the subquery, but maybe I've missed it, there's no way to specify the join type to use with correlated subqueries.

Is there a way to achieve this without using #temptable's or @table_variable's via indexes and/or hints?

The reason for this is obvious: SQL Server needs to reevaluate the function each time the subquery is called, which is long.

When the subquery result is cached in the temp table, SQL Server will just build an index over this table in an Eager Spool, and use this index in the joins.

But is it possible to increase performance without using temporary tables?

Let's create sample functions, tables and see:

CREATE SCHEMA [20090528_tvf]
CREATE TABLE t_integer (
        num INT NOT NULL,
        CONSTRAINT PK_integer_num PRIMARY KEY (num)
)
GO
CREATE FUNCTION [20090528_tvf].fn_num(@maxval INT)
RETURNS @retval TABLE (num INT NOT NULL)
AS
BEGIN
        INSERT
        INTO    @retval
        SELECT  num + @maxval
        FROM    t_integer
        RETURN
END
GO
DECLARE @cnt INT;
BEGIN TRANSACTION
SET @cnt = 1;
WHILE @cnt <= 10000
BEGIN
        INSERT
        INTO    [20090528_tvf].t_integer
        VALUES  (@cnt)
        SET @cnt = @cnt + 1
END
COMMIT
GO

We have a table of 10,000 first integers and a table valued function.

The function returns a list of the integers from the table, offset by the input parameter.

This is what contained in the table [20090528_tvf].t_integer:

SELECT  TOP 10 *
FROM    [20090528_tvf].t_integer
num
1
2
3
4
5
6
7
8
9
10

And this is what [20090528_tvf].fn_num returns, being called with parameter equal 10,000:

SELECT  TOP 10 *
FROM    [20090528_tvf].fn_num(10000)
num
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010
10 rows fetched in 0.0003s (0.0484s)

Now, let's try to repeat the original query.

We'll take a parameter of 9,900 so that there will be only 10 matching rows:

WITH    fo AS
        (
        SELECT  num
        FROM    [20090528_tvf].fn_num(9990) f
        )
SELECT  *
FROM    (
        SELECT  (
                SELECT  num
                FROM    fo
                WHERE   fo.num = i.num
                FOR XML AUTO, ELEMENTS ABSENT
                ) AS x
        FROM    [20090528_tvf].t_integer i
) q
WHERE   x IS NOT NULL
FOR XML AUTO

<ROOT><q x="<fo num="9991"/>"/><q x="<fo num="9992"/>"/><q x="<fo num="9993"/>"/><q x="<fo num="9994"/>"/><q x="<fo num="9995"/>"/><q x="<fo num="9996"/>"/><q x="<fo num="9997"/>"/><q x="<fo num="9998"/>"/><q x="<fo num="9999"/>"/><q x="<fo num="10000"/>"/></ROOT>
Table '#68336F3E'. Scan count 10000, logical reads 170000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 't_integer'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 8156 ms,  elapsed time = 8163 ms. 

As was promised, it's very long: more than 8 seconds.

This is because SQL Server reevaluates the function even being wrapped into the CTE.

How can we make SQL Server force caching the results of the subquery? Temporary tables are nice, but some ORM's, for instance, won't let user execute anything that is not a SELECT query.

One of the ways, which is not so elegant, but may be the only way in case when the temp tables are inavailable, is reusing the query plan.

We'll just need to inject Eager Spool between the resultset produced by the function.

Eager Spool takes all results of the appropriate query at once, and builds a temporary index over them in the tempdb.

The index is destroyed after the query completes.

Here's the way to do this:

WITH    fo AS
        (
        SELECT  num
        FROM    [20090528_tvf].fn_num(9990) f
        )
SELECT  *
FROM    (
        SELECT  (
                SELECT  num
                FROM    fo
                WHERE   fo.num = i.num
                FOR XML AUTO, ELEMENTS ABSENT
                ) AS x
        FROM    [20090528_tvf].t_integer i
) q
WHERE   x IS NOT NULL
FOR XML AUTO, ROOT('ROOT')
OPTION (USE PLAN N'
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0" Build="9.00.3042.00" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple>
          <StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" />
          <QueryPlan CachedPlanSize="12" CompileTime="1" CompileCPU="1" CompileMemory="128">
            <RelOp AvgRowSize="1055" EstimateCPU="0.001" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.108018">
              <OutputList>
                <ColumnReference Column="Expr1003" />
              </OutputList>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1003" />
                    <ScalarOperator ScalarString="[Expr1003]">
                      <Identifier>
                        <ColumnReference Column="Expr1003" />
                      </Identifier>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="1055" EstimateCPU="0.010001" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Sequence" NodeId="1" Parallel="false" PhysicalOp="Sequence" EstimatedTotalSubtreeCost="0.107018">
                  <OutputList>
                    <ColumnReference Column="Expr1003" />
                  </OutputList>
                  <Sequence>
                    <RelOp AvgRowSize="9" EstimateCPU="1.157E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Table-valued function" NodeId="2" Parallel="false" PhysicalOp="Table-valued function" EstimatedTotalSubtreeCost="1.157E-06">
                      <OutputList />
                      <TableValuedFunction>
                        <DefinedValues />
                        <Object Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" />
                        <ParameterList>
                          <ScalarOperator ScalarString="(9990)">
                            <Const ConstValue="(9990)" />
                          </ScalarOperator>
                        </ParameterList>
                      </TableValuedFunction>
                    </RelOp>
                    <RelOp AvgRowSize="1055" EstimateCPU="0.0418" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Inner Join" NodeId="4" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.0970159">
                      <OutputList>
                        <ColumnReference Column="Expr1003" />
                      </OutputList>
                      <NestedLoops Optimized="false">
                        <OuterReferences>
                          <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
                        </OuterReferences>
                        <RelOp AvgRowSize="11" EstimateCPU="0.011157" EstimateIO="0.0149769" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Clustered Index Scan" NodeId="5" Parallel="false" PhysicalOp="Clustered Index Scan" EstimatedTotalSubtreeCost="0.0261339">
                          <OutputList>
                            <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
                          </OutputList>
                          <IndexScan Ordered="false" ForcedIndex="false" NoExpandHint="false">
                            <DefinedValues>
                              <DefinedValue>
                                <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
                              </DefinedValue>
                            </DefinedValues>
                            <Object Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Index="[PK_integer_num]" Alias="[i]" />
                          </IndexScan>
                        </RelOp>
                        <RelOp AvgRowSize="1055" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="9999" EstimateRewinds="0" EstimateRows="1" LogicalOp="UDX" NodeId="6" Parallel="false" PhysicalOp="UDX" EstimatedTotalSubtreeCost="0.029082">
                          <OutputList>
                            <ColumnReference Column="Expr1003" />
                          </OutputList>
                          <MemoryFractions Input="0" Output="1" />
                          <Extension UDXName="FOR XML">
                            <DefinedValues>
                              <DefinedValue>
                                <ColumnReference Column="Expr1003" />
                              </DefinedValue>
                            </DefinedValues>
                            <UsedUDXColumns>
                              <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" />
                            </UsedUDXColumns>
                            <RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="5" Parallel="false" PhysicalOp="Index Spool" EstimatedTotalSubtreeCost="0">
                              <OutputList>
                                <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" />
                              </OutputList>
                              <Spool>
                                <SeekPredicate>
                                  <Prefix ScanType="EQ">
                                    <RangeColumns>
                                      <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" />
                                    </RangeColumns>
                                    <RangeExpressions>
                                      <ScalarOperator ScalarString="[test].[20090528_tvf].[t_integer].[num] as [i].[num]">
                                        <Identifier>
                                          <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
                                        </Identifier>
                                      </ScalarOperator>
                                    </RangeExpressions>
                                  </Prefix>
                                </SeekPredicate>
                                <RelOp AvgRowSize="11" EstimateCPU="7.96E-05" EstimateIO="0.0032035" EstimateRebinds="0" EstimateRewinds="9999" EstimateRows="1" LogicalOp="Table Scan" NodeId="7" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.014282">
                                  <OutputList>
                                    <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" />
                                  </OutputList>
                                  <TableScan Ordered="false" ForcedIndex="false" NoExpandHint="false">
                                    <DefinedValues>
                                      <DefinedValue>
                                        <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" />
                                      </DefinedValue>
                                    </DefinedValues>
                                    <Object Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" />
                                    <Predicate>
                                      <ScalarOperator ScalarString="[test].[20090528_tvf].[fn_num].[num] as [f].[num]=[test].[20090528_tvf].[t_integer].[num] as [i].[num]">
                                        <Compare CompareOp="EQ">
                                          <ScalarOperator>
                                            <Identifier>
                                              <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" />
                                            </Identifier>
                                          </ScalarOperator>
                                          <ScalarOperator>
                                            <Identifier>
                                              <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
                                            </Identifier>
                                          </ScalarOperator>
                                        </Compare>
                                      </ScalarOperator>
                                    </Predicate>
                                  </TableScan>
                                </RelOp>
                              </Spool>
                            </RelOp>
                          </Extension>
                        </RelOp>
                      </NestedLoops>
                    </RelOp>
                  </Sequence>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>
'
)

<ROOT><q x="<fo num="9991"/>"/><q x="<fo num="9992"/>"/><q x="<fo num="9993"/>"/><q x="<fo num="9994"/>"/><q x="<fo num="9995"/>"/><q x="<fo num="9996"/>"/><q x="<fo num="9997"/>"/><q x="<fo num="9998"/>"/><q x="<fo num="9999"/>"/><q x="<fo num="10000"/>"/></ROOT>
Table 'Worktable'. Scan count 10000, logical reads 40112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table '#68336F3E'. Scan count 1, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 't_integer'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 109 ms,  elapsed time = 110 ms. 

We just took the existing plan generated with sp_create_plan_guide and wrapped the RelOp that returns function rows into then following lines:

<RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="5" Parallel="false" PhysicalOp="Index Spool" EstimatedTotalSubtreeCost="0">
  <OutputList>
    <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" />
  </OutputList>
  <Spool>
    <SeekPredicate>
      <Prefix ScanType="EQ">
        <RangeColumns>
          <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" />
        </RangeColumns>
        <RangeExpressions>
          <ScalarOperator ScalarString="[test].[20090528_tvf].[t_integer].[num] as [i].[num]">
            <Identifier>
              <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" />
            </Identifier>
          </ScalarOperator>
        </RangeExpressions>
      </Prefix>
    </SeekPredicate>
    <!--

    Original RelOp here

    -->
  </Spool>
</RelOp>

This added an Eager Spool to the plan, and the query executes now in only 110 ms.

This is not the most elegant way, but it's a working solution to prevent a costly CTE from being reevaluated.

Written by Quassnoi

May 28th, 2009 at 11:00 pm

Posted in SQL Server

One Response to 'Generating XML in subqueries'

Subscribe to comments with RSS or TrackBack to 'Generating XML in subqueries'.

  1. Hi,

    This is an entertaining post, showing a novel use of manually-edited plan guides, but it is inaccurate in places.

    The query plan for the problematic execution shows a sequence operator. This calls the fn_num table-valued function *once* returning a fully populated table variable, @retval.

    The @retval table variable is then used in the main query branch on the second input to the sequence. It is this table variable that the manual spool adds an index to. The original inefficiency stems from scanning @retval 10,000 times.

    It is easier (and more supported) to simply add an index to @retval:

    ALTER FUNCTION [20090528_tvf].fn_num (@maxval integer)
    RETURNS @retval TABLE
    (
    num integer PRIMARY KEY
    )
    AS
    BEGIN
    INSERT @retval
    SELECT num + @maxval
    FROM t_integer
    RETURN
    END

    Adding the PRIMARY KEY constraint to the @retval declaration solves the performance problem shown in this post without a manual USE PLAN hint.

    The query is essentially a semi-join, which can be expressed using EXISTS, IN, or INTERSECT. For example:

    SELECT num
    FROM [20090528_tvf].t_integer AS ti
    INTERSECT
    SELECT num FROM [20090528_tvf].fn_num(9990) AS fn
    FOR XML PATH(‘fo’), ROOT(‘q’);

    There are still better solutions that use in-line functions instead, but I wanted to show that the manual index is not necessary here.

    The idea of manually editing an index spool into a plan guide is a very creative and interesting solution though!

    Paul

    Paul White

    28 Oct 12 at 14:54

Leave a Reply