How to create fast database queries

Archive for May 7th, 2009

Checking event dates

Comments enabled. I *really* need your comment

From Stack Overflow:

Suppose the following table structure in Oracle:

start_date DATE,
end_date DATE

Is there a way to query all of the events that fall on a particular day of the week?

For example, I would like to find a query that would find every event that falls on a Monday.

Figuring out if the start_date or end_date falls on a Monday is easy, but I'm not sure how to find it out for the dates between.

This is one of the range predicates which are very unfriendly to plain B-Tree indexes.

But even if there would be a range friendly index (like R-Tree), that would hardly be an improvement. Monday's make up 14.3% of all days, that means that an index if there were any, would have very low selectivity even for one-day intervals.

And if the majority of intervals last for more than one day, the selectivity of the condition yet decreases: 86% of 6-day intervals have a Monday inside.

Given the drawbacks of index scanning and joining on ROWID, we can say that a FULL TABLE SCAN will be a nice access path for this query, and we just need to represent it as an SQL condition (without bothering for its sargability)

We could check that a Monday is between end_date's day-of-week number and the range length's offset from this number:

FROM    "20090507_dates".event
WHERE   6 BETWEEN MOD(start_date - TO_DATE(1, 'J'), 7) AND MOD(start_date - TO_DATE(1, 'J'), 7) + end_date - start_date

This query converts each ranges into a pair of zero-based, Tuesday-based day of week offsets, and returns all records which have day 6 (a Monday) inside the range.

Note that we don't use Oracle's TO_DATE('D') function here: starting day of week depends on NLS_TERRITORY which only leads to confusion.

Now, this query works but looks quite ugly. And if we will check for more complex conditions, it will become even uglier.

What if we need to find all ranges that contain a Friday, 13th? Or a second week's Thursday? The conditions will become unreadable and unmaintainable.

Can we do it in some more elegant way?

What if we just iterate over the days of the range and check each day for the condition? This should be much more simple than inventing the boundaries.

Let's create a sample table and try it:
Read the rest of this entry »

Written by Quassnoi

May 7th, 2009 at 11:00 pm

Posted in Oracle