EXPLAIN EXTENDED

How to create fast database queries

Descending indexes

with 2 comments

From Stack Overflow:

When you create an index on a column or number of columns in MS SQL Server (I'm using version 2005), you can specify that the index on each column be either ascending or descending.

I'm having a hard time understanding why this choice is even here. Using binary sort techniques, wouldn't a lookup be just as fast either way?

What difference does it make which order I choose?

This matters when the composite indexes are used.

Let's create a sample table:

Table creation details

We created a composite index on t_orderer (column1, column2 DESC).

Now, let's try to select some rows with a given column1 value, ordered by column2 in both directions:

Ascending:

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT  TOP 10 *
FROM    [20090427_order].t_orderer
WHERE   column1 = 10
ORDER BY
column1 ASC, column2 ASC

id column1 column2
90000 10 1
90001 10 2
90002 10 3
90003 10 4
90004 10 5
90005 10 6
90006 10 7
90007 10 8
90008 10 9
90009 10 10
10 rows fetched in 0.0005s (0.0008s)
Table 't_orderer'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms. 

Descending:

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT  TOP 10 *
FROM    [20090427_order].t_orderer
WHERE   column1 = 10
ORDER BY
column1 ASC, column2 DESC

id column1 column2
99999 10 10000
99998 10 9999
99997 10 9998
99996 10 9997
99995 10 9996
99994 10 9995
99993 10 9994
99992 10 9993
99991 10 9992
99990 10 9991
10 rows fetched in 0.0005s (0.0007s)
Table 't_orderer'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms. 

The query plan shows the same for both queries:

INDEX SEEK TOP

At first blush, there is no difference: index is used with same efficiency for both directions.

But let's look closer into the query above.

We see a condition in the WHERE clause: column1 = 10.

This condition covers a continuously ordered index range: (10, -INFINITY) < (column1, column2) < (10, +INFINITY).

This index range can be used to retrieve ordered value in both directions.

Now, let's change the query a little:

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT  TOP 10 *
FROM    [20090427_order].t_orderer
WHERE   column1 BETWEEN 10 AND 100
ORDER BY
column1 ASC, column2 ASC

id column1 column2
90000 10 1
90001 10 2
90002 10 3
90003 10 4
90004 10 5
90005 10 6
90006 10 7
90007 10 8
90008 10 9
90009 10 10
10 rows fetched in 0.0005s (0.4460s)
Table 't_orderer'. Scan count 3, logical reads 1932, physical reads 0, read-ahead reads 1327, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 516 ms,  elapsed time = 446 ms. 

Now, we have same resultset returned as in the query above, but execution time is much greater, and the plan now says:

INDEX SEEK TOP N SORT GATHER STREAMS TOP

What happened?

We created index on (column1, column2 DESC). That means the column2 is sorted in opposite order to column1 in the index.

In the index, the row (10, 1) is immediately followed by (11, 10000), then by (11, 9999), etc.

In our initial query, any order could be satisifed by the index, since only a single value of column1 is to be selected. The optimizer takes this into account.

But in this query, INDEX SCAN does not yield the order on column1 ASC, column2 ASC anymore. That's why an additional SORT is required.

Now, if we use the index sorting condition in ORDER BY:

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT  TOP 10 *
FROM    [20090427_order].t_orderer
WHERE   column1 BETWEEN 10 AND 100
ORDER BY
column1 ASC, column2 DESC

id column1 column2
99999 10 10000
99998 10 9999
99997 10 9998
99996 10 9997
99995 10 9996
99994 10 9995
99993 10 9994
99992 10 9993
99991 10 9992
99990 10 9991
10 rows fetched in 0.0005s (0.0007s)
Table 't_orderer'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms. 

the plan returns to the efficient one:

INDEX SEEK TOP

and the query is instant again.

Note that we can also reverse the ORDER BY clause completely, and the index is still usable:

SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT  TOP 10 *
FROM    [20090427_order].t_orderer
WHERE   column1 BETWEEN 10 AND 100
ORDER BY
column1 DESC, column2 ASC

id column1 column2
990000 100 1
990001 100 2
990002 100 3
990003 100 4
990004 100 5
990005 100 6
990006 100 7
990007 100 8
990008 100 9
990009 100 10
10 rows fetched in 0.0005s (0.0007s)
Table 't_orderer'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms. 

Since an index can be traversed in either direction with same efficiency, this query is also instant.

Written by Quassnoi

April 27th, 2009 at 11:00 pm

Posted in SQL Server

2 Responses to 'Descending indexes'

Subscribe to comments with RSS

  1. Nice post!
    BTW, what SQL client you used?

    Sam

    16 Oct 12 at 08:06

  2. @Sam: thanks!

    I use SSMS for testing and a custom perl script for posting.

    It runs the queries on my dev server and produces WordPress-formatted output which I copy and paste into the post.

    Quassnoi

    16 Oct 12 at 10:29

Leave a Reply