June 30, 2010 - EXPLAIN EXTENDED at EXPLAIN EXTENDED

Archive for June 30th, 2010

Indexing for ORDER BY / LIMIT

Comments enabled. I *really* need your comment

Answering questions asked on the site.

Frode Underhill asks:

I have some applications that are logging to a MySQL database table.

The table is pretty standard on the form:

time
BIGINT(20) source
TINYINT(4) severity
ENUM text
VARCHAR(255)

, where source identifies the system that generated the log entry.

There are very many entries in the table (>100 million), of which 99.9999% are debug or info messages.

I'm making an interface for browsing this log, which means I'll be doing queries like
SELECT  *
FROM    log
WHERE   source = 2
AND severity IN (1,2)
AND time &gt; 12345
ORDER BY
time ASC
LIMIT 30
, if I want to find debug or info log entries from a certain point in time, or
SELECT  *
FROM    log
WHERE   source = 2
AND severity IN (1,2)
AND time &lt; 12345 
ORDER BY
        time DESC
LIMIT 30
for finding entries right before a certain time.

How would one go about indexing & querying such a table?

I thought I had it figured out (I pretty much just tried every different combination of columns in an index), but there's always some set of parameters that results in a really slow query.

The problem is that you cannot use a single index both for filtering and ordering if you have a ranged condition (severity IN (1, 2) in this case).

Recently I wrote an article with a proposal to improve SQL optimizer to handle these conditions. If a range has low cardinality (this is, there are few values that con possibly satisfy the range), then the query could be improved by rewriting the range as a series of individual queries, each one using one of the values constituting the range in an equijoin:

Things SQL needs: determining range cardinality

No optimizers can handle this condition automatically yet, so we'll need to emulate it.

Since the severity field is defined as an enum with only 5 values possible, any range condition on this field can be satisfied by no more than 5 distinct values, thus making this table ideal for rewriting the query.

Let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

June 30th, 2010 at 11:00 pm

Posted in MySQL

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30

EXPLAIN EXTENDED

Archive for June 30th, 2010

Indexing for ORDER BY / LIMIT

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow

EXPLAIN EXTENDED

Archive for June 30th, 2010

Indexing for ORDER BY / LIMIT

Share this:

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow