Generating XML in subqueries
From Stack Overflow:
I have a query that looks like
SELECT P.Column1, P.Column2, P.Column3, ( SELECT A.ColumnX, A.ColumnY FROM dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A WHERE A.Key = P.Key FOR XML AUTO, TYPE ), ( SELECT B.ColumnX, B.ColumnY, FROM dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B WHERE B.Key = P.Key FOR XML AUTO, TYPE ) FROM ( /* joined tables here */ ) AS P FOR XML AUTO, ROOT('ROOT')
P
has about 5,000 rows.
A
andB
, about 4,000 rows each.This query has a runtime performance of more than 10 minutes.
Changing it to this, however:
SELECT P.Column1, P.Column2, P.Column3 INTO #P FROM ( /* joined tables here ) p SELECT A.ColumnX, A.ColumnY INTO #A FROM dbo.TableReturningFunc1(@StaticParam1, @StaticParam2) AS A SELECT B.ColumnX, B.ColumnY INTO #B FROM dbo.TableReturningFunc2(@StaticParam1, @StaticParam2) AS B SELECT P.Column1, P.Column2, P.Column3, ( SELECT A.ColumnX, A.ColumnY FROM #A AS A WHERE A.Key = P.Key FOR XML AUTO, TYPE ), ( SELECT B.ColumnX, B.ColumnY, FROM #B AS B WHERE B.Key = P.Key FOR XML AUTO, TYPE ) FROM #P AS P FOR XML AUTO, ROOT('ROOT'), has a performance of about 4 seconds.
This makes not a lot of sense, as it would seem the cost to insert into a temp table and then do the join should be higher by default.
My inclination is that SQL is doing the wrong type of "join" with the subquery, but maybe I've missed it, there's no way to specify the join type to use with correlated subqueries.
Is there a way to achieve this without using
#temptable
's or@table_variable
's via indexes and/or hints?
The reason for this is obvious: SQL Server needs to reevaluate the function each time the subquery is called, which is long.
When the subquery result is cached in the temp table, SQL Server will just build an index over this table in an Eager Spool
, and use this index in the joins.
But is it possible to increase performance without using temporary tables?
Let's create sample functions, tables and see:
CREATE SCHEMA [20090528_tvf] CREATE TABLE t_integer ( num INT NOT NULL, CONSTRAINT PK_integer_num PRIMARY KEY (num) ) GO CREATE FUNCTION [20090528_tvf].fn_num(@maxval INT) RETURNS @retval TABLE (num INT NOT NULL) AS BEGIN INSERT INTO @retval SELECT num + @maxval FROM t_integer RETURN END GO DECLARE @cnt INT; BEGIN TRANSACTION SET @cnt = 1; WHILE @cnt <= 10000 BEGIN INSERT INTO [20090528_tvf].t_integer VALUES (@cnt) SET @cnt = @cnt + 1 END COMMIT GO
We have a table of 10,000 first integers and a table valued function.
The function returns a list of the integers from the table, offset by the input parameter.
This is what contained in the table [20090528_tvf].t_integer
:
SELECT TOP 10 * FROM [20090528_tvf].t_integer
num |
---|
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
And this is what [20090528_tvf].fn_num
returns, being called with parameter equal 10,000:
SELECT TOP 10 * FROM [20090528_tvf].fn_num(10000)
num | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10001 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10002 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10003 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10004 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10005 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10006 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10007 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10008 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10009 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10010 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10 rows fetched in 0.0003s (0.0484s) |
Now, let's try to repeat the original query.
We'll take a parameter of 9,900
so that there will be only 10 matching rows:
WITH fo AS ( SELECT num FROM [20090528_tvf].fn_num(9990) f ) SELECT * FROM ( SELECT ( SELECT num FROM fo WHERE fo.num = i.num FOR XML AUTO, ELEMENTS ABSENT ) AS x FROM [20090528_tvf].t_integer i ) q WHERE x IS NOT NULL FOR XML AUTO
<ROOT><q x="<fo num="9991"/>"/><q x="<fo num="9992"/>"/><q x="<fo num="9993"/>"/><q x="<fo num="9994"/>"/><q x="<fo num="9995"/>"/><q x="<fo num="9996"/>"/><q x="<fo num="9997"/>"/><q x="<fo num="9998"/>"/><q x="<fo num="9999"/>"/><q x="<fo num="10000"/>"/></ROOT>
Table '#68336F3E'. Scan count 10000, logical reads 170000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 't_integer'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 8156 ms, elapsed time = 8163 ms.
As was promised, it's very long: more than 8 seconds.
This is because SQL Server reevaluates the function even being wrapped into the CTE.
How can we make SQL Server force caching the results of the subquery? Temporary tables are nice, but some ORM
's, for instance, won't let user execute anything that is not a SELECT
query.
One of the ways, which is not so elegant, but may be the only way in case when the temp tables are inavailable, is reusing the query plan.
We'll just need to inject Eager Spool
between the resultset produced by the function.
Eager Spool
takes all results of the appropriate query at once, and builds a temporary index over them in the tempdb.
The index is destroyed after the query completes.
Here's the way to do this:
WITH fo AS ( SELECT num FROM [20090528_tvf].fn_num(9990) f ) SELECT * FROM ( SELECT ( SELECT num FROM fo WHERE fo.num = i.num FOR XML AUTO, ELEMENTS ABSENT ) AS x FROM [20090528_tvf].t_integer i ) q WHERE x IS NOT NULL FOR XML AUTO, ROOT('ROOT') OPTION (USE PLAN N' <ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0" Build="9.00.3042.00" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"> <BatchSequence> <Batch> <Statements> <StmtSimple> <StatementSetOptions ANSI_NULLS="false" ANSI_PADDING="false" ANSI_WARNINGS="false" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="false" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="false" /> <QueryPlan CachedPlanSize="12" CompileTime="1" CompileCPU="1" CompileMemory="128"> <RelOp AvgRowSize="1055" EstimateCPU="0.001" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.108018"> <OutputList> <ColumnReference Column="Expr1003" /> </OutputList> <ComputeScalar> <DefinedValues> <DefinedValue> <ColumnReference Column="Expr1003" /> <ScalarOperator ScalarString="[Expr1003]"> <Identifier> <ColumnReference Column="Expr1003" /> </Identifier> </ScalarOperator> </DefinedValue> </DefinedValues> <RelOp AvgRowSize="1055" EstimateCPU="0.010001" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Sequence" NodeId="1" Parallel="false" PhysicalOp="Sequence" EstimatedTotalSubtreeCost="0.107018"> <OutputList> <ColumnReference Column="Expr1003" /> </OutputList> <Sequence> <RelOp AvgRowSize="9" EstimateCPU="1.157E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Table-valued function" NodeId="2" Parallel="false" PhysicalOp="Table-valued function" EstimatedTotalSubtreeCost="1.157E-06"> <OutputList /> <TableValuedFunction> <DefinedValues /> <Object Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" /> <ParameterList> <ScalarOperator ScalarString="(9990)"> <Const ConstValue="(9990)" /> </ScalarOperator> </ParameterList> </TableValuedFunction> </RelOp> <RelOp AvgRowSize="1055" EstimateCPU="0.0418" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Inner Join" NodeId="4" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.0970159"> <OutputList> <ColumnReference Column="Expr1003" /> </OutputList> <NestedLoops Optimized="false"> <OuterReferences> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </OuterReferences> <RelOp AvgRowSize="11" EstimateCPU="0.011157" EstimateIO="0.0149769" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="10000" LogicalOp="Clustered Index Scan" NodeId="5" Parallel="false" PhysicalOp="Clustered Index Scan" EstimatedTotalSubtreeCost="0.0261339"> <OutputList> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </OutputList> <IndexScan Ordered="false" ForcedIndex="false" NoExpandHint="false"> <DefinedValues> <DefinedValue> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </DefinedValue> </DefinedValues> <Object Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Index="[PK_integer_num]" Alias="[i]" /> </IndexScan> </RelOp> <RelOp AvgRowSize="1055" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="9999" EstimateRewinds="0" EstimateRows="1" LogicalOp="UDX" NodeId="6" Parallel="false" PhysicalOp="UDX" EstimatedTotalSubtreeCost="0.029082"> <OutputList> <ColumnReference Column="Expr1003" /> </OutputList> <MemoryFractions Input="0" Output="1" /> <Extension UDXName="FOR XML"> <DefinedValues> <DefinedValue> <ColumnReference Column="Expr1003" /> </DefinedValue> </DefinedValues> <UsedUDXColumns> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" /> </UsedUDXColumns> <RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="5" Parallel="false" PhysicalOp="Index Spool" EstimatedTotalSubtreeCost="0"> <OutputList> <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" /> </OutputList> <Spool> <SeekPredicate> <Prefix ScanType="EQ"> <RangeColumns> <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" /> </RangeColumns> <RangeExpressions> <ScalarOperator ScalarString="[test].[20090528_tvf].[t_integer].[num] as [i].[num]"> <Identifier> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </Identifier> </ScalarOperator> </RangeExpressions> </Prefix> </SeekPredicate> <RelOp AvgRowSize="11" EstimateCPU="7.96E-05" EstimateIO="0.0032035" EstimateRebinds="0" EstimateRewinds="9999" EstimateRows="1" LogicalOp="Table Scan" NodeId="7" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.014282"> <OutputList> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" /> </OutputList> <TableScan Ordered="false" ForcedIndex="false" NoExpandHint="false"> <DefinedValues> <DefinedValue> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" /> </DefinedValue> </DefinedValues> <Object Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" /> <Predicate> <ScalarOperator ScalarString="[test].[20090528_tvf].[fn_num].[num] as [f].[num]=[test].[20090528_tvf].[t_integer].[num] as [i].[num]"> <Compare CompareOp="EQ"> <ScalarOperator> <Identifier> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[fn_num]" Alias="[f]" Column="num" /> </Identifier> </ScalarOperator> <ScalarOperator> <Identifier> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </Identifier> </ScalarOperator> </Compare> </ScalarOperator> </Predicate> </TableScan> </RelOp> </Spool> </RelOp> </Extension> </RelOp> </NestedLoops> </RelOp> </Sequence> </RelOp> </ComputeScalar> </RelOp> </QueryPlan> </StmtSimple> </Statements> </Batch> </BatchSequence> </ShowPlanXML> ' )
<ROOT><q x="<fo num="9991"/>"/><q x="<fo num="9992"/>"/><q x="<fo num="9993"/>"/><q x="<fo num="9994"/>"/><q x="<fo num="9995"/>"/><q x="<fo num="9996"/>"/><q x="<fo num="9997"/>"/><q x="<fo num="9998"/>"/><q x="<fo num="9999"/>"/><q x="<fo num="10000"/>"/></ROOT>
Table 'Worktable'. Scan count 10000, logical reads 40112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#68336F3E'. Scan count 1, logical reads 17, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 't_integer'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 109 ms, elapsed time = 110 ms.
We just took the existing plan generated with sp_create_plan_guide
and wrapped the RelOp
that returns function rows into then following lines:
<RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="1" LogicalOp="Eager Spool" NodeId="5" Parallel="false" PhysicalOp="Index Spool" EstimatedTotalSubtreeCost="0"> <OutputList> <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" /> </OutputList> <Spool> <SeekPredicate> <Prefix ScanType="EQ"> <RangeColumns> <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#f]" Alias="[fi]" Column="num" /> </RangeColumns> <RangeExpressions> <ScalarOperator ScalarString="[test].[20090528_tvf].[t_integer].[num] as [i].[num]"> <Identifier> <ColumnReference Database="[test]" Schema="[20090528_tvf]" Table="[t_integer]" Alias="[i]" Column="num" /> </Identifier> </ScalarOperator> </RangeExpressions> </Prefix> </SeekPredicate> </Spool> </RelOp>
This added an Eager Spool
to the plan, and the query executes now in only 110 ms.
This is not the most elegant way, but it's a working solution to prevent a costly CTE from being reevaluated.
Hi,
This is an entertaining post, showing a novel use of manually-edited plan guides, but it is inaccurate in places.
The query plan for the problematic execution shows a sequence operator. This calls the fn_num table-valued function *once* returning a fully populated table variable, @retval.
The @retval table variable is then used in the main query branch on the second input to the sequence. It is this table variable that the manual spool adds an index to. The original inefficiency stems from scanning @retval 10,000 times.
It is easier (and more supported) to simply add an index to @retval:
ALTER FUNCTION [20090528_tvf].fn_num (@maxval integer)
RETURNS @retval TABLE
(
num integer PRIMARY KEY
)
AS
BEGIN
INSERT @retval
SELECT num + @maxval
FROM t_integer
RETURN
END
Adding the PRIMARY KEY constraint to the @retval declaration solves the performance problem shown in this post without a manual USE PLAN hint.
The query is essentially a semi-join, which can be expressed using EXISTS, IN, or INTERSECT. For example:
SELECT num
FROM [20090528_tvf].t_integer AS ti
INTERSECT
SELECT num FROM [20090528_tvf].fn_num(9990) AS fn
FOR XML PATH(‘fo’), ROOT(‘q’);
There are still better solutions that use in-line functions instead, but I wanted to show that the manual index is not necessary here.
The idea of manually editing an index spool into a plan guide is a very creative and interesting solution though!
Paul
Paul White
28 Oct 12 at 14:54