March 2009 - Page 2 of 4 - EXPLAIN EXTENDED at EXPLAIN EXTENDED

Archive for March, 2009

Hierarchical queries in MySQL: finding loops

with 4 comments

This is a series of articles on hierarchical queries in MySQL:

Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains.
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops

See also:

Hierarchical data in MySQL: parents and children in one query

Today, we will check our structure for loops.

Loops in hierarchical queries occur when a row contains itself in its ancestry chain.

In general, it's a sign of a database logic flaw and should be avioded, but sometimes it's deliberately used.

The simplest and most easily detectable case is having id equal to parent (meaning that the row is both its father and child). This may be tested by using a simple comparison condition.

But a loop of course can be more complex: a row can be its own grandfather, grand-grandfather etc.

To detect such loops, we will improve our functions a little.
Read the rest of this entry »

Written by Quassnoi

March 21st, 2009 at 11:00 pm

Posted in MySQL

Hierarchical queries in MySQL: finding leaves

Comments enabled. I *really* need your comment

This is a series of articles on hierarchical queries in MySQL:

Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains.
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops

See also:

Hierarchical data in MySQL: parents and children in one query

In this article, we will find the leaves of the hierarchy tree.

A row is a leaf in a hierarchy tree if it has no children.

In Oracle, there is a speical pseudocolumn that tests if a given row is a leaf in a CONNECT BY query. Surprisingly, it's called CONNECT_BY_ISLEAF.
Read the rest of this entry »

Written by Quassnoi

March 20th, 2009 at 11:00 pm

Posted in MySQL

Hierarchical queries in MySQL: adding ancestry chains.

with 4 comments

This is a series of articles on hierarchical queries in MySQL:

Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains.
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops

See also:

Hierarchical data in MySQL: parents and children in one query

Today, we will improve the query a little more.

If we develop the catalog with categories, subcategories etc, it's useful to have breadcrumb links somewhere in the header of the page.

In Oracle, there is a special function called SYS_CONNECT_BY_PATH(row_expression, delimiter), that returns the ancestry chain of a given node.
Read the rest of this entry »

Written by Quassnoi

March 19th, 2009 at 11:00 pm

Posted in MySQL

Hierarchical queries in MySQL: adding level

with 4 comments

This is a series of articles on hierarchical queries in MySQL:

Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains.
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops

See also:

Hierarchical data in MySQL: parents and children in one query

Today I will add some more features to these queries.

We have almost 100,000 rows in the table, 8 hierarchy levels, and our query is written so that it always selects all rows.

Now let's try to present our data in a Yahoo-style catalog. We need only top 2 levels: parent categories and their subcategories.
Read the rest of this entry »

Written by Quassnoi

March 18th, 2009 at 11:00 pm

Posted in MySQL

Hierarchical queries in MySQL

with 52 comments

This is a series of articles on hierarchical queries in MySQL:

Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains.
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops

See also:

Hierarchical data in MySQL: parents and children in one query

There is no need in explaining how convenient hierarchical queries are.

A quick reminder: hierarchical data is a parent-child relationship contained in one table.

A typical task is to return values from the table in the following way:

Resultset should be sorted like a tree, that is lexical sort by ancestry chains
Depth level should be returned along with each row

It may sound confusing, but it's very simple in fact, like shown on this Oracle query:
Read the rest of this entry »

Written by Quassnoi

March 17th, 2009 at 11:00 pm

Posted in MySQL

Counting page views

Comments enabled. I *really* need your comment

From Stack Overflow:

I have tables called pages, page_views, page_items and page_votes.

The latter three tables contain a page_id foreign key in order to keep a record of each individual view, item and vote that belongs to a page.

When I query a page, I also want to retrieve COUNT(*) from page_views, COUNT(*) from page_items and SUM(vote) from page_votes.

I have pasted a query below. It retrieves the total number of views.
SELECT  Page.*, COUNT(*) AS views
FROM    pages AS Page 
INNER JOIN
        page_views AS PageView
ON      Page.id = PageView.page_id 
GROUP BY
        Page.id 
ORDER BY
        views DESC   
LIMIT 10 OFFSET 0
I've made various attempts to add items and votes to it, but the result is either a syntax error or views/items/votes returned as an identical and wrong number, probably due to the way I am joining.

How can I add items and votes to this query?

On a first thought, it's simple:
Read the rest of this entry »

Written by Quassnoi

March 16th, 2009 at 11:00 pm

Posted in MySQL

Aggregated hours

Comments enabled. I *really* need your comment

From Stack Overflow:

I would like to know the sum of all web services completions per hour of day.

Obviously, this can be easily done with SUM() and GROUP BY:
SELECT hour, SUM(calls) FROM sample s GROUP BY hour;  
hour SUM(calls)

0 634

1 642

2 633

3 624

4 420

5 479

6 428

7 424

8 473

9 434

10 485

11 567

12 526

13 513

14 555

15 679

16 624

17 796

18 752

19 843

20 827

21 774

22 647

23 533

My problem is that in old sets, the web service calls in the hours [00-11] were already summed up.

The simple statement as listed above would therefore lead to
SELECT hour, SUM(calls) FROM sample s GROUP BY hour;  
hour SUM(calls)

0 6234

12 526

13 513

14 555

15 679

16 624

17 796

18 752

19 843

20 827

21 774

22 647

23 533

This is an undesirable result. To make the old sets [00,12,...,23] comparable to the new sets [00,01,...,23], I would like to have one statement that averages the value of [00] and distributes it over the missing hours, e.g.:

hour SUM(calls)

0 6243/12

1 6243/12

12 526

23 533

I can easily do this using temporary tables or views, but i don't know how to accomplish this without them.

hour	SUM(calls)
0	634
1	642
2	633
3	624
4	420
5	479
6	428
7	424
8	473
9	434
10	485
11	567
12	526
13	513
14	555
15	679
16	624
17	796
18	752
19	843
20	827
21	774
22	647
23	533

hour	SUM(calls)
0	6234
12	526
13	513
14	555
15	679
16	624
17	796
18	752
19	843
20	827
21	774
22	647
23	533

hour	SUM(calls)
0	6243/12
1	6243/12

12	526

23	533

In this rowset, we have a row that needs to be split into 12 another rows.
Read the rest of this entry »

Written by Quassnoi

March 15th, 2009 at 11:00 pm

Posted in MySQL

Deleting duplicates

Comments enabled. I *really* need your comment

Microsoft Knowledge Base has an article KB139444 on how to delete duplicate rows from a table that has no primary key.

Though it works, it's quite an overkill to delete duplicate rows with grouping, temporary tables, deleting and reinserting.

With SQL Server 2005 and above, there is much more elegant solution.

Let's create a table and fill it with duplicate records:

CREATE TABLE t_duplicate (id INT NOT NULL, value VARCHAR(50) NOT NULL)
GO

SET NOCOUNT ON

BEGIN TRANSACTION

DECLARE @i INT
SET @i = 1
WHILE @i < 5000
BEGIN
        INSERT
        INTO    t_duplicate
        VALUES  (@i / 1000 + 1, 'Value ' + CAST(@i AS VARCHAR))
        SET @i = @i + 1
END

COMMIT

SELECT  *
FROM    t_duplicate
&#91;/sourcecode&#93;

<table class="terminal">
<tr><th>id</th><th>value</th></tr>
<tr><td>1</td><td>Value 1</td></tr>
<tr><td>1</td><td>Value 2</td></tr>
<tr><td>1</td><td>Value 3</td></tr>
<tr class="break"><td colspan="100"></td></tr>
<tr><td>5</td><td>Value 4997</td></tr>
<tr><td>5</td><td>Value 4998</td></tr>
<tr><td>5</td><td>Value 4999</td></tr>
</table>

And now let's delete the duplicates:


WITH    q AS
        (
        SELECT  d.*,
                ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS rn
        FROM    t_duplicate d
        ) 
DELETE
FROM    q
WHERE   rn > 1

SELECT  *
FROM    t_duplicate

id	value
1	Value 1
2	Value 1000
3	Value 2000
4	Value 3000
5	Value 4000

Done in a single query.

Written by Quassnoi

March 14th, 2009 at 11:00 pm

Posted in SQL Server

Finding incomplete orders

Comments enabled. I *really* need your comment

Imagine we are keeping an online shop and want to find the customers that don't have complete orders.

We'll make the structure of the orders a little bit complex:

Each customer may have a number of baskets
Each basket will have a list of positions in it
Each position has a number of discounts
An order is considered complete, when all entites are present: there is at least one basket, all baskets should have at least one position, and each position should have at least one discount

We will keep the data in four tables as following:
Read the rest of this entry »

Written by Quassnoi

March 13th, 2009 at 11:00 pm

Posted in MySQL

Analytic functions: optimizing LAG, LEAD, FIRST_VALUE, LAST_VALUE

Comments enabled. I *really* need your comment

In the previous article I wrote about optimized emulation of the analytic functions in MySQL.

Now, let's try to optimize LAG, LEAD, FIRST_VALUE and LAST_VALUE.

Imagine we have a table that keeps actions of a PC in an online game. This table has the following design:

Action id
PC id
Current PC level
Current PC score
Action data

For each action, the current level and current score of the PC are keeped. The table, of course, is designed in such a bad way just to illustrate our task :)

Now, for first 2 player characters, we need to know first 2 actions performed on first 2 levels; how much score did these actions yield; and how much score left to reach the next level.

With analytic functions, it would be done the following way:
Read the rest of this entry »

Written by Quassnoi

March 12th, 2009 at 11:00 pm

Posted in MySQL

EXPLAIN EXTENDED

Archive for March, 2009

Hierarchical queries in MySQL: finding loops

Hierarchical queries in MySQL: finding leaves

Hierarchical queries in MySQL: adding ancestry chains.

Hierarchical queries in MySQL: adding level

Hierarchical queries in MySQL

Counting page views

Aggregated hours

Deleting duplicates

Finding incomplete orders

Analytic functions: optimizing LAG, LEAD, FIRST_VALUE, LAST_VALUE

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow

hour	SUM(calls)
0	634
1	642
2	633
3	624
4	420
5	479
6	428
7	424
8	473
9	434
10	485
11	567
12	526
13	513
14	555
15	679
16	624
17	796
18	752
19	843
20	827
21	774
22	647
23	533

hour	SUM(calls)
0	634
1	642
2	633
3	624
4	420
5	479
6	428
7	424
8	473
9	434
10	485
11	567
12	526
13	513
14	555
15	679
16	624
17	796
18	752
19	843
20	827
21	774
22	647
23	533

Archive for March, 2009

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Subscribe

Subscribe by email

Contacts

Should I?

Recent articles

Calendar

Archives

Categories

Stack Overflow

hour	SUM(calls)
0	634
1	642
2	633
3	624
4	420
5	479
6	428
7	424
8	473
9	434
10	485
11	567
12	526
13	513
14	555
15	679
16	624
17	796
18	752
19	843
20	827
21	774
22	647
23	533