Archive for October, 2009
Date range with possible NULL values
From Stack Overflow:
Using SQL Server 2008.
I have a stored proc which has start and end date as input parameters for date range.
Looking for a single SQL query which has a
BETWEEN
predicate on start and end date in theWHERE
clause which can handle both cases where the dates are either bothNULL
or both have values.
This can be done by writing an IS NULL
check in the expression, like this:
WHERE (@startDate IS NULL OR date >= @startDate) AND (@endDate IS NULL OR date >= @endDate)
or by using a COALESCE
on the value of the date itself (so that the range always matches):
WHERE date BETWEEN ISNULL(@startDate, date) AND ISNULL(@endDate, date)
However, none of these conditions uses an index on date
efficiently.
Let's create a sample table ans see:
Read the rest of this entry »
Bitwise operations and indexes
From Stack Overflow:
Are the following queries efficient in MySQL:
SELECT * FROM table WHERE field & number = number -- to find values with superset of number's bitsSELECT * FROM table WHERE field | number = number -- to find values with subset of number's bits, if an index for the
field
has been created?If not, is there a way to make it run faster?
An index can be used for the following things:
- To limit the number of records scanned
- To lower the number of row lookups
When doing a full table scan, every record should be fetched and examined. If the table contains say, 1,000,000 records, and each record is 100 bytes long, then 100 Mb worth of data should be processed. These data are usually scanned sequentially.
An index row, on the other hand, contains the key value, the row pointer and some additional information (pointers to the parents and children). On a MyISAM table, each row of the index on an INT
column occupies 10 bytes (4 (sizeof(INT)
) + 6 (default MyISAM pointer size)) plus some overhead (block headers, leaf pointers etc).
Even if we cannot build a range on the index and need to look over all index values, 10 Mb is far less than 100 Mb.
However, scanning the index has two drawbacks:
- Traversing the B-Tree is more slow than a full table scan, since the former is not sequential
- When the condition we search for matches the index key, we should follow the pointer to fetch the unindexed values from the table row. This also takes some time.
Both these things can be optimized:
- Oracle has
INDEX FAST FULL SCAN
access method. It scans the index rows in their physical order, not logical. The key order is not maintained in such a scan, but we not always need it anyway.This makes the index scanning sequential (and taking the constant time)
- PostgreSQL has bitmap access method. PostgreSQL cannot disregard the index order as Oracle does so it has to traverse the tree anyway. But instead of following the pointer to fetch the values it fills a special bitmap with a bit set for each matching row.
The order of bits corresponds to the physical order of rows.
When it's over with setting bits it just reads the bitmap and fetches the rows with the bits set. Since the bits are ordered, the rows fetched are ordered too, so this scan is also sequential. Scanning the bitmap takes constant time (which is usually negligible since bitmaps are very small), pure time for fetching the rows depends on the number of bits set but is no longer than a full scan (since the scan is sequential).
MySQL, however, does neither of these optimizations.