Bitwise operations and indexes
From Stack Overflow:
Are the following queries efficient in MySQL:
SELECT * FROM table WHERE field & number = number -- to find values with superset of number's bitsSELECT * FROM table WHERE field | number = number -- to find values with subset of number's bits, if an index for the
field
has been created?If not, is there a way to make it run faster?
An index can be used for the following things:
- To limit the number of records scanned
- To lower the number of row lookups
When doing a full table scan, every record should be fetched and examined. If the table contains say, 1,000,000 records, and each record is 100 bytes long, then 100 Mb worth of data should be processed. These data are usually scanned sequentially.
An index row, on the other hand, contains the key value, the row pointer and some additional information (pointers to the parents and children). On a MyISAM table, each row of the index on an INT
column occupies 10 bytes (4 (sizeof(INT)
) + 6 (default MyISAM pointer size)) plus some overhead (block headers, leaf pointers etc).
Even if we cannot build a range on the index and need to look over all index values, 10 Mb is far less than 100 Mb.
However, scanning the index has two drawbacks:
- Traversing the B-Tree is more slow than a full table scan, since the former is not sequential
- When the condition we search for matches the index key, we should follow the pointer to fetch the unindexed values from the table row. This also takes some time.
Both these things can be optimized:
- Oracle has
INDEX FAST FULL SCAN
access method. It scans the index rows in their physical order, not logical. The key order is not maintained in such a scan, but we not always need it anyway.This makes the index scanning sequential (and taking the constant time)
- PostgreSQL has bitmap access method. PostgreSQL cannot disregard the index order as Oracle does so it has to traverse the tree anyway. But instead of following the pointer to fetch the values it fills a special bitmap with a bit set for each matching row.
The order of bits corresponds to the physical order of rows.
When it's over with setting bits it just reads the bitmap and fetches the rows with the bits set. Since the bits are ordered, the rows fetched are ordered too, so this scan is also sequential. Scanning the bitmap takes constant time (which is usually negligible since bitmaps are very small), pure time for fetching the rows depends on the number of bits set but is no longer than a full scan (since the scan is sequential).
MySQL, however, does neither of these optimizations.
Let's see what the query time depends on:
- Table scan time is constant. It depends on the row size.
- Index time varies and depends on the following parameters:
- Index scan time depends on whether the condition is sargable:
- Yes, The index scan time depends on the range size.
- No. The index scan time is constant.
- Pointer dereference time depends on the number of rows inside the range
- Index scan time depends on whether the condition is sargable:
Not that the range and the condition are not always the same thing. A condition can be a subset of range.
For instance, for a table with natural numbers a condition like this:
a BETWEEN 0 AND 999 AND a % 100 = 0
will be sargable, but the range and the filter return different subset of values. The range contains 1,000 values, but the filter returns only 10 of them.
MySQL, however, cannot benefit from this difference. It always fetches the row from the table if an unindexed value is used in a SELECT
clause, even if the condition uses only the columns covered by the index.
Say, if MySQL uses the index and fetches the value 1, it could look at this value, see that is does not satisfy the condition (1 % 100 <> 0
) and avoid using the row pointer to fetch the table record.
But MySQL always fetches the table row, no matter what. It performs an inefficient early table lookup.
Formulae
Summarizing all said above, here are the formulae to calculate the time:
Table scan
Time = T * R
T
is the number of rows in the tableR
is the single row scan time
Index scan
Time = R * (K + I + D)
R
is the number of rows in the index range. If the condition is not sargable,R
is equal toT
above (all rows should be scanned)K
is the key scan time. It depend on the size of the index key. It is almost always less thanR
aboveI
is the index traversal time (the extra time to find the next index record)D
is the row pointer dereference time
While K
alone is usually less than R
, I
and D
are quite large, so K + I + D
is almost always greater then R
.
That means that to benefit from the index, we should make R
significantly smaller than T
, i. e we should make our condition sargable and the range part selective.
Sample table
Now, let's create a sample table and see the real numbers:
The table has 1,000,000 records with random values in two columns.
The first column, ivalue
, is indexed, the second one, pvalue
, is not. The values are the same in both columns.
Let's try to query the unindexed column:
SELECT id, pvalue, ivalue, data FROM t_bit WHERE pvalue | 0x0000FFFF = 0x0000FFFF
This uses full table scan and completes in 750 ms
Now, same with the indexed column (and the index forced):
SELECT id, pvalue, ivalue, data FROM t_bit FORCE INDEX (ix_bit_ivalue) WHERE ivalue | 0x0000FFFF = 0x0000FFFF AND ivalue >= 0
This way it takes 11,82 seconds.
Now we can see that scanning an index records is about 15 times as expensive as scanning a MyISAM table record. That means that the index range should contain at most 6% of records to benefit from the index.
How can we make the condition sargable?
Subset
Let's see what happens when we search for a subet of rows (a bitwise OR
).
The result of bitwise OR
of the field and the number should give the number back. That means that the bits set to 0 should be the same in field
and the number
. A field
can have anything in bits set to 1 in number.
If we have first N
bits set to 0 in number this means that the first N
bits should be set to 0 in the field just as well.
But this means that the field should be less or equal to the maximum number with the first bits sets to 0. This number of course should have all other bits set to 1 (that is what makes it the maximum), and, therefore, be by one less than a power of two. And yes, this is a range and an index can be used to limit the number of records to browse.
To calculate this number, we shoud just find this power of two and subtract the 1 from it.
Here's the query:
SELECT id, pvalue, ivalue, data FROM t_bit FORCE INDEX (ix_bit_ivalue) WHERE ivalue | 0x0000FFFF = 0x0000FFFF AND ivalue <= 2 << FLOOR(LOG(2, 0x0000FFFF) + 1) - 1
The index limits the range so that only 16 rows are eligible. Since the index range and the filtering condition are the same in this query, all of there records are returned.
The query takes only 2 ms.
Superset
The superset as the name suggests tend to return more values than a subset. Now, the condition is reversed: the field
should have a 1 in the bits with a 1 in number
, and anything in those with a 0.
This imples that the field
should have al least as many leading bits set as the number
does:
1111 0110 1011 1010 1010 1011 0101 1010 -- number 1111 ???? ???? ???? ???? ???? ???? ???? -- field
The lower bound of the range can be found by taking a logarithm of the power of two closest to the bitwise complement of the number:
SELECT id, pvalue, ivalue, data FROM t_bit FORCE INDEX (ix_bit_ivalue) WHERE ivalue & 0xFFFF0000 = 0xFFFF0000 AND ivalue >= 0xFFFFFFFF & ~((2 << FLOOR(LOG(2, 0xFFFFFFFF & ~0xFFFF0000))) - 1)
This takes only 2 ms too.
We see that adding the range condition to the query improves performance it the field is indexed.
Range
But will any range condition improve the performance?
Since we calculated that scanning a row using index access is 15 times as costly as the table scan, the range should have at least 15 times as few records as the table does.
This means that the range is good only for the numbers with at least 4 bits unset at the beginning (for OR
queries) or at the end (for AND
queries)
Let's check it:
SELECT id, pvalue, ivalue, data FROM t_bit FORCE INDEX (ix_bit_ivalue) WHERE ivalue | 0x09876543 = 0x09876543 AND ivalue <= 2 << FLOOR(LOG(2, 0x09876543) + 1) - 1
This query takes 0.93 second which is about same efficiency as the full table scan (0.75 second), since exactly 4 bits are unset at the beginning of the number we compare to.
If more bits are set to 0 at the beginning, the index scan is more efficient. If less, then the table scan is more efficient.
Same holds for AND
queries too, but the bits should be counted starting from the end.
Thank you for you article! I was looking for a simple way to make requests with bitwise operations faster (i’m using an OR subset) and found my answer!
Jean
9 Feb 13 at 01:20
someone, buy this guy a beer.
dang
14 Jun 14 at 14:03
SELECT id, pvalue, ivalue, data
FROM t_bit FORCE INDEX (ix_bit_ivalue)
WHERE ivalue & 0xFFFF0000 = 0xFFFF0000
AND ivalue >= 0xFFFFFFFF & ~((2 <= 0xFFFF0000;
(4-byte integer implied)
zdave
9 Jul 17 at 04:27
why not simplify
`ivalue & 0xFFFF0000 = 0xFFFF0000
AND ivalue >= 0xFFFFFFFF & ~((2 <= 0xFFFF0000`
(4-byte integer implied)
zdave
9 Jul 17 at 04:30
to
`ivalue >= 0xFFFF0000`
(4-byte integer implied)
zdave
9 Jul 17 at 04:31