# EXPLAIN EXTENDED

How to create fast database queries

My latest article on SQL in general: Happy New Year!. You're welcome to read and comment on it.

## Grouping continuous ranges

From Stack Overflow:

I have a ticketing system.

Now I have to select adjacent places when the user asks for 2 or 3 tickets.

Every ticket has a line and column number. The concept of adjacent places is places in the same line with adjacent columns numbers.

These tickets are in an SQL Server database.

This is a problem known as grouping continuous ranges: finding the continuous ranges of the records (in a certain order) having the same value of the grouping column.

For the table in question, the groups will look like this:

Seats
Row Column Occupied Group
1 1 1 -
1 2 0 1
1 3 0 1
1 4 0 1
1 5 1 -
1 6 1 -
1 7 0 2
1 8 0 2
1 9 1 -
1 10 0 3

To find the spans of free seats with the required length, we need to group the continuous ranges of rows with `occupied = 0` in `column` order (within the rows), calculate the counts of records in these groups and return the groups having sufficient value of `COUNT(*)`.

To group something in SQL, we need to have an expression that would have the same value for all records belonging to the group.

How do we build such an expression for the continuous ranges?

In a movie theater, seats are numbered sequentially, so we can rely on the values of the `column` being continuous. Now, let’s calculate only the records that describe the free seats, and build their row numbers in the `column` order:

Seats
Row Column Occupied ROW_NUMBER
1 2 0 1
1 3 0 2
1 4 0 3
1 7 0 4
1 8 0 5
1 10 0 6

We see that filtering only free seats broke the column numbering order (the columns numbers are not continuous anymore), but the `ROW_NUMBER`s are continuous (by definition). Each occupied seat breaks the continuity of the columns, and the spans of the free seats correspond to the ranges of columns with unbroken continuity.

If the continuity is not broken, `column` and `ROW_NUMBER` both increase by 1 with each record. This means that their difference will be constant:

Seats
Row Column Occupied ROW_NUMBER column – ROW_NUMBER
1 2 0 1 1
1 3 0 2 1
1 4 0 3 1
1 7 0 4 3
1 8 0 5 3
1 10 0 6 4

The value of difference between `column` and `ROW_NUMBER`, therefore, uniquely defines the continuous group the records belong to.

This means we can group on it and do other things (the rest being pure technical).

Let’s create a sample table:
Read the rest of this entry »

Written by Quassnoi

December 30th, 2009 at 11:00 pm

Posted in SQL Server