Archive for July 1st, 2009
Overlapping ranges: MySQL
Continuing yesterday's post.
From Stack Overflow:
I have an event table that specifies a date range with
start_date
andend_date
fields.I have another date range, specified in code, that defines the current week as
week_start
andweek_end
.I'd like to query all events for the week.
Now, let's see how do it in MySQL.
In the previous post we saw that SQL Server is not very good in optimizing this construct:
start_date < @week_end AND end_date > @week_start
and using it leads to a full table scan. A more complex condition is more efficient in SQL Server:
(start_date > @week_start AND start_date < @week_end)
OR
(@week_start BETWEEN start_date AND end_date)
, since two different indexes can be applied to both parts of the condition.
However, the second condition is intrinsically not very efficient which is a problem for almost any datetime range analysis.
A plain B-Tree index is efficient for queries like this:
constant1 < column < constant1
, but not for the queries like this:
column1 < constant < column2
The latter condition is in fact rewritten as following by the optimizer:
column1 < constant AND column2 > constant
and the optimizer is forced to choose between two non-optimal options: either stick to one index and do the filtering (using table lookups) or perform two index lookups and merge the results.
If 1,000,000 rows satisfy the first condition and 1,000,000 satisfy the second one, the engine needs to inspect either 1,000,000 rows with table lookups or 2,000,000 rows using indexes, only to find out that the intersection contains but 10 rows.
Fortunately, MySQL supports R-Tree indexes (called SPATIAL in MySQL because they are primarily used to query spatial data).
Indexes of this kind are particularly good for the queries similar to the one above, and with a little effort they can be used to query time data too.
Let's create a sample table:
Read the rest of this entry »