Archive for August 11th, 2009
From Stack Overflow:
I have a table
Athat has a
DATETIMEcolumns besides some more other columns.
I have another table
Bthat has one
DATETIMEcolumn (call it
This is in SQL Server 2005.
How to best set up the indexes etc to get the following:SELECT * FROM a, b WHERE a.startDate >= b.dates AND a.endDate < b.dates
Both tables have several thousand records.
Queries of this kind involve searching a table for a range that contains a given constant and plain B-Tree indexes are not good for this query.
A B-Tree index (or any other index that relies on sorting) is efficient for searching a column in a range between a minimal and maximal value (an unknown value between two constants).
However, this query is just the opposite: we search for a constaint between two unknown values.
If we create the indexes on
endDate and issue a query with a condition like this:
startDate < = value AND endDate > value
, this will just give two range condition on two distinct indexes:
-INF < = startDate <= value
value < endDate <= +INF
Each of these ranges is too large to be efficient for searching, though their intersection can be quite small.
There is a special class of indexes: R-Tree indexes, which handle the conditions as described above just perfectly. They are mostly used to query 2-dimensional GIS data, that's why in MySQL they are called
In both these systems they can only be defined over geometry data which implies 2-dimensional objects (
LineStrings and such).
There is a way to use these indexes to speed up the query described above (see the article behind the link on how to do it in MySQL):
However, SQL Server 2005 does not support SPATIAL indexes.
In theory, an R-Tree index could be built over 1-dimensional data just as well, but no database of the major four supports it (as for now).
However, quite a close structure can be implemented using plain B-Tree indexes, and in this article I’ll show how to do it.
Let’s create a sample table:
Read the rest of this entry »