Archive for May 19th, 2010
Things SQL needs: determining range cardinality
What is the problem with this query?
SELECT * FROM orders WHERE quantity <= 4 AND urgency <= 4
The problem is indexing strategy, of course. Which columns should we index?
If we index quantity
, the optimizer will be able to use the index to filter on it. However, filtering on urgency
will require scanning all records with quantity < 4
and applying the urgency
filter to each record found.
Same with urgency
. We can use range access on urgency
using an index, but this will require filtering on quantity
.
Why, create a composite index!
, some will say.
Unfortunately, that won't help much.
A composite B-Tree index maintains what is called a lexicographical order of the records. This means that an index on (quantity, urgency)
will sort on quantity
, and only if the quantities are equal, it will take the urgency
into account.
The picture below shows how would the records be ordered in such an index:
As we can see, with a single index range scan (i. e. just following the arrows) we cannot select only the records within the dashed rectangle. There is no single index range that could be used to filter on both columns.
Even if we changed the field order in the index, it would just change the direction of the arrows connecting the records:
Read the rest of this entry »