EXPLAIN EXTENDED

How to create fast database queries

Archive for March 11th, 2009

Analytic functions: optimizing SUM and ROW_NUMBER

Comments enabled. I *really* need your comment

In the previous articles I wrote about emulating numerous analytic function in MySQL.

Using methods described above, it's possible to emulate almost all analytic functions present in Oracle and SQL Server.

Here are these methods in a nutshell:

  • Select all table rows ordered by PARTITION BY columns, then by ORDER BY columns of the analytic function
  • Track the grouing sets by using session variables initialized in the first subquery
  • If the analytic function needs some precalculations to be evaluated (like, count of the rows in the grouping set, sum of the values etc), join the table with the precalculated aggregates
  • Use state session variables to calculate the analytic function and store intermediate values between rows
  • Initialize state session variables whenever the grouping set changes

This may sound confusing, but if you take a look on the examples from the previous articles, it will become clear as a bell.

This methods work and work well, if you need to select all rows from the tables.

But what if you need to implement some filtering? Do we really need to count millions of rows if we need first three? Do we really need to inspect all rows to find a maximum if we have an index?

Of course, no.

Analytic functions can be optimized as well as any other queries.
Read the rest of this entry »

Written by Quassnoi

March 11th, 2009 at 11:00 pm

Posted in MySQL