Archive for August 24th, 2010
20 latest unique records
From Stack Overflow:
I have a logfile which logs the insert/delete/updates from all kinds of tables.
I would like to get an overview of for example the last 20 people which records where updated, ordered by the last update (
datetime DESC
)
A common solution for such a task would be writing an aggregate query with ORDER BY
and LIMIT
:
SELECT person, MAX(ts) AS last_update FROM logfile GROUP BY person ORDER BY last_update DESC LIMIT 20
What's bad in this solution? Performance, as usual.
Since last_update
is an aggregate, it cannot be indexed. And ORDER BY
on unindexed fields results in our good old friend, filesort
.
Note that even in this case the indexes can be used and the full table scan can be avoided: if there is an index on (person, ts)
, MySQL
will tend to use a loose index scan on this index, which can save this query if there are relatively few persons in the table. However, if there are many (which is what we can expect for a log table), loose index scan can even degrade performance and generally will be avoided by MySQL
.
We should use another approach here. Let's create a sample table and test this approach:
Read the rest of this entry »