EXPLAIN EXTENDED

How to create fast database queries

Archive for July 29th, 2009

PostgreSQL 8.4: efficient MIN(COUNT(*))

Comments enabled. I *really* need your comment

Answering questions asked on the site.

Frances asks:

I have a table user_views which contains pages viewed by users.

How do I select the user that had the most and the fewest number of page views?

If you need them at the same time, you'll just have to group by myuser, then order by COUNT(*).

With new PostgreSQL 8.4, it's possible to do this in one CTE, to avoid double evaluation:

WITH    q AS
(
SELECT  myuser, COUNT(*)
FROM    user_pages
GROUP BY
myuser
)
SELECT  *
FROM    (
SELECT  'max'::TEXT AS which, myuser, count
FROM    q
ORDER BY
count DESC
LIMIT 1
) qmax
UNION ALL
SELECT  *
FROM    (
SELECT  'min'::TEXT AS which, myuser, count
FROM    q
ORDER BY
count
LIMIT 1
) qmin

However, it's takes quite a long time if you table is big.

This is probably an answer to the question you asked.

But as a bonus I'd like to tell how to optimize the query for the cases when you need just the user with minimal count of page views.

The main idea here that we should accumulate the least value of COUNT(*) calculated so far, and stop counting when we reach this threshold. This will save us some row lookups and improve the query.

Let's create a sample table:

Read the rest of this entry »

Written by Quassnoi

July 29th, 2009 at 11:00 pm

Posted in PostgreSQL