EXPLAIN EXTENDED

How to create fast database queries

Counting page views

Comments enabled. I *really* need your comment

From Stack Overflow:

I have tables called pages, page_views, page_items and page_votes.

The latter three tables contain a page_id foreign key in order to keep a record of each individual view, item and vote that belongs to a page.

When I query a page, I also want to retrieve COUNT(*) from page_views, COUNT(*) from page_items and SUM(vote) from page_votes.

I have pasted a query below. It retrieves the total number of views.

SELECT  Page.*, COUNT(*) AS views
FROM    pages AS Page 
INNER JOIN
        page_views AS PageView
ON      Page.id = PageView.page_id 
GROUP BY
        Page.id 
ORDER BY
        views DESC   
LIMIT 10 OFFSET 0

I've made various attempts to add items and votes to it, but the result is either a syntax error or views/items/votes returned as an identical and wrong number, probably due to the way I am joining.

How can I add items and votes to this query?

On a first thought, it's simple:

SELECT
        pages.
        (
        SELECT COUNT(*)
        FROM   page_views
        WHERE  page_views.page_id = pages.id
        ) AS views_count,
        (
        SELECT COUNT(*)
        FROM   page_items
        WHERE  page_items.page_id = pages.id
        ) AS items_count,
        COALESCE(
        (
        SELECT SUM(vote)
        FROM   page_votes
        WHERE  page_votes.page_id = pages.id
        ), 0) AS votes_sum
FROM    pages
ORDER BY
        views_count DESC
LIMIT 10
id name views_count items_count votes_sum
68990 Page 68990 20 40 180
4489 Page 4489 19 40 180
21179 Page 21179 19 40 180
9500 Page 9500 18 40 180
10715 Page 10715 18 40 180
26105 Page 26105 18 40 180
28413 Page 28413 18 40 180
29992 Page 29992 18 40 180
12933 Page 12933 18 40 180
34160 Page 34160 18 40 180
10 rows fetched in 0,0011s (1,2446s)

Can we optimize it somehow?

In this query, we select all rows from pages.

For each row, we use the subqueries in the SELECT clause to calculate the aggregates from other tables.

MySQL is even smart enough to see that items_count and votes_sum are used only in SELECT clause, so the values for them will be calculated only for first 10 rows.

But we don't really need the scan on pages. For each page, the view count may be calculated just from page_views, as it contains page_id we join on.

So we can do the following:

  1. Select TOP 10 viewed pages using only page_views
  2. Join pages with the resultset from the step above (will join only 10 rows)
  3. Calculate other aggregates
SELECT  pages.*,
        views_count,
        (
        SELECT COUNT(*)
        FROM   page_items
        WHERE  page_items.page_id = pages.id
        ) AS items_count,
        COALESCE(
        (
        SELECT SUM(vote)
        FROM   page_votes
        WHERE  page_votes.page_id = pages.id
        ), 0) AS votes_sum
FROM    (
        SELECT  page_id, COUNT(*) AS views_count
        FROM    page_views
        GROUP BY
                page_id
        ORDER BY
                views_count DESC, page_id
        LIMIT 10
        ) AS pvd,
        pages
WHERE   pages.id = pvd.page_id
68990 Page 68990 20 40 180
4489 Page 4489 19 40 180
21179 Page 21179 19 40 180
9500 Page 9500 18 40 180
10715 Page 10715 18 40 180
26105 Page 26105 18 40 180
28413 Page 28413 18 40 180
29992 Page 29992 18 40 180
12933 Page 12933 18 40 180
34160 Page 34160 18 40 180
10 rows fetched in 0,0019s (0,5736s)

By joining only the results we need, we gain a 100% increase in performance.

Written by Quassnoi

March 16th, 2009 at 11:00 pm

Posted in MySQL

Leave a Reply