EXPLAIN EXTENDED

How to create fast database queries

Date range with possible NULL values

with 2 comments

From Stack Overflow:

Using SQL Server 2008.

I have a stored proc which has start and end date as input parameters for date range.

Looking for a single SQL query which has a BETWEEN predicate on start and end date in the WHERE clause which can handle both cases where the dates are either both NULL or both have values.

This can be done by writing an IS NULL check in the expression, like this:

WHERE   (@startDate IS NULL OR date >= @startDate)
AND (@endDate IS NULL OR date >= @endDate)

or by using a COALESCE on the value of the date itself (so that the range always matches):

WHERE   date BETWEEN ISNULL(@startDate, date) AND ISNULL(@endDate, date)

However, none of these conditions uses an index on date efficiently.

Let's create a sample table ans see:

Table creation details

This table has 1,000,000 records stuffed with 200 bytes of data each.

IS NULL OR equal

DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = '20091001'
SET @endDate = NULL
SELECT  SUM(LEN(stuffing))
FROM    [20091005_range].t_events
WHERE   (@startDate IS NULL OR date >= @startDate)
AND (@endDate IS NULL OR date >= @endDate)

View query details

This query uses a Clustered Index Scan on the whole table. The records are quite large, that's why it takes much time to scan them all, and the query takes 5.6 seconds to complete.

ISNULL

DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = '20091001'
SET @endDate = NULL
SELECT  SUM(LEN(stuffing)) AS total
FROM    [20091005_range].t_events
WHERE   date BETWEEN ISNULL(@startDate, date) AND ISNULL(@endDate, date)

View query details

This is just like the previous query: a very expensive Clustered Index Scan.

MIN / MAX

Both conditions above are not sargable. SQL Server cannot define a single range to use it with an index.

To make the condition sargable we could possibly use some predefined constants and use them instead of the value in ISNULL or COALESCE.

But we better use the actual minimal and maximal values of date for this purpose.

MIN and MAX are almost free on an indexed field and we can easily use them to show the actual values of the range bounds or even returning them in the OUTPUT parameters of the stored proc.

Here's the query to do this:

DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
SET @startDate = '20091001'
SET @endDate = NULL

;
WITH    limits AS
(
SELECT  COALESCE(@startDate, MIN(date)) AS startDate,
COALESCE(@endDate, MAX(date)) AS endDate
FROM    [20091005_range].t_events
)
SELECT  startDate, endDate, total
FROM    limits
CROSS APPLY
(
SELECT  SUM(LEN(stuffing)) AS total
FROM    [20091005_range].t_events
WHERE   date BETWEEN startDate and endDate
) totals

View query details

We see not only the result, but the actual values of startDate and endDate that were substituted.

Note that the index range is additionally sorted by id. Since index records are much smaller in size than the rows, it's worth sorting them by id to make the row lookups sequential (LOOKUP ORDERED FORWARD). This saves much time on traversing the B-Tree that stores the clustered records.

This query takes only 50 ms which is 100 times as fast.

Written by Quassnoi

October 5th, 2009 at 11:00 pm

Posted in SQL Server

2 Responses to 'Date range with possible NULL values'

Subscribe to comments with RSS

  1. i need query for either both values are null or one may be null or both dates have values

    d.karthick

    2 Jan 15 at 15:48

  2. @d.karthick: have you read the article?

    Quassnoi

    4 Jan 15 at 04:31

Leave a Reply