February 2011 - EXPLAIN EXTENDED at EXPLAIN EXTENDED

Archive for February, 2011

Things SQL needs: SERIES()

Recently I had to deal with several scenarios which required processing and aggregating continuous series of data.

I believe this could be best illustrated with an example:

id	source	value
1	1	10
2	1	20
3	2	15
4	2	25
5	1	45
6	3	50
7	3	35
8	1	40
9	1	10

The records are ordered by id, and within this order there are continuous series of records which share the same value of source. In the table above, the series are separated by thick lines.

We want to calculate some aggregates across each of the series: MIN, MAX, SUM, AVG, whatever:

source	min	max	sum	avg
1	10	20	30	15.00
2	15	25	40	20.00
1	45	45	45	45.00
3	35	50	85	42.50
1	10	40	50	25.00

This can be used for different things. I used that for:

Reading sensors from a moving elevator (thus tracking its position)
Recording user's activity on a site
Tracking the primary node in a server cluster

, but almost any seasoned database developer can recall a need for such a query.

As you can see, the values of source are repeating so a mere GROUP BY won't work here.

In the systems supporting window functions there is a workaround for that:

Read the rest of this entry »

Written by Quassnoi

February 18th, 2011 at 11:00 pm

Posted in Miscellaneous

Late row lookups: InnoDB

with 8 comments

Answering questions asked on the site.

Aryé asks:

Thanks for your article about late row lookups in MySQL.

I have two questions for you please:

Is this workaround specific to MyISAM engine?

How does PostgreSQL handle this?

The questions concerns a certain workaround for MySQL LIMIT … OFFSET queries like this:

SELECT  *
FROM    mytable
ORDER BY
id
LIMIT   10
OFFSET  10000

which can be improved using a little rewrite:

SELECT  m.*
FROM    (
SELECT  id
FROM    mytable
ORDER BY
id
LIMIT   10
OFFSET  10000
) q
JOIN    mytable m
ON      m.id = q.id
ORDER BY
m.id

For the rationale behind this improvement, please read the original article.

Now, to the questions.

The second questions is easy: PostgreSQL won't pull the fields from the table until it really needs them. If a query involving an ORDER BY along with LIMIT and OFFSET is optimized to use the index for the ORDER BY part, the table lookups won't happen for the records skipped.

Though PostgreSQL does not reflect the table lookups in the EXPLAIN output, a simple test would show us that they are done only LIMIT times, not OFFSET + LIMIT (like MySQL does).

Now, let's try to answer the first question: will this trick improve the queries against an InnoDB table?

To do this, we will create a sample table:

Read the rest of this entry »

Written by Quassnoi

February 11th, 2011 at 11:00 pm

Posted in MySQL

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28

EXPLAIN EXTENDED

Archive for February, 2011

Things SQL needs: SERIES()

Late row lookups: InnoDB

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow

EXPLAIN EXTENDED

Archive for February, 2011

Things SQL needs: SERIES()

Share this:

Late row lookups: InnoDB

Share this:

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow