Grouping continuous ranges at EXPLAIN EXTENDED

I have a ticketing system.

Now I have to select adjacent places when the user asks for 2 or 3 tickets.

Every ticket has a line and column number. The concept of adjacent places is places in the same line with adjacent columns numbers.

These tickets are in an SQL Server database.

Any ideas about this algorithm to search for available adjacent seats?

This is a problem known as grouping continuous ranges: finding the continuous ranges of the records (in a certain order) having the same value of the grouping column.

For the table in question, the groups will look like this:

Seats
Row	Column	Occupied	Group
1	1	1	-
1	2	0	1
1	3	0	1
1	4	0	1
1	5	1	-
1	6	1	-
1	7	0	2
1	8	0	2
1	9	1	-
1	10	0	3

To find the spans of free seats with the required length, we need to group the continuous ranges of rows with occupied = 0 in column order (within the rows), calculate the counts of records in these groups and return the groups having sufficient value of COUNT(*).

To group something in SQL, we need to have an expression that would have the same value for all records belonging to the group.

How do we build such an expression for the continuous ranges?

In a movie theater, seats are numbered sequentially, so we can rely on the values of the column being continuous. Now, let's calculate only the records that describe the free seats, and build their row numbers in the column order:

Seats
Row	Column	ROW_NUMBER
1	2	1
1	3	2
1	4	3
1	7	4
1	8	5
1	10	6

We see that filtering only free seats broke the column numbering order (the columns numbers are not continuous anymore), but the ROW_NUMBERs are continuous (by definition). Each occupied seat breaks the continuity of the columns, and the spans of the free seats correspond to the ranges of columns with unbroken continuity.

If the continuity is not broken, column and ROW_NUMBER both increase by 1 with each record. This means that their difference will be constant:

Seats
Row	Column	ROW_NUMBER	column - ROW_NUMBER
1	2	1	1
1	3	2	1
1	4	3	1
1	7	4	3
1	8	5	3
1	10	6	4

The value of difference between column and ROW_NUMBER, therefore, uniquely defines the continuous group the records belong to.

This means we can group on it and do other things (the rest being pure technical).

Let's create a sample table:

Table creation details

CREATE SCHEMA [20091230_seats]
CREATE TABLE t_seat (
      row INT NOT NULL,
      col INT NOT NULL,
      occupied BIT NOT NULL,
      CONSTRAINT pk_seat_row_col
      PRIMARY KEY  (row, col)
)
GO
BEGIN TRANSACTION
SELECT  RAND(20091230)
DECLARE @cnt INT
SET @cnt = 1
WHILE @cnt <= 100
BEGIN
        INSERT
        INTO    &#91;20091230_seats&#93;.t_seat (row, col, occupied)
        VALUES  (
                (@cnt - 1) / 10 + 1,
                (@cnt - 1) % 10 + 1,
                ROUND(RAND(), 0)
                )
        SET @cnt = @cnt + 1
END
COMMIT
GO
&#91;/sourcecode&#93;
</div>

and select the chart of the places:


SELECT  row, [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]
FROM    (
        SELECT  row, col,
                CASE occupied WHEN 0 THEN ' ' ELSE 'X' END AS occupied
        FROM    [20091230_seats].t_seat
        ) q
PIVOT   (
        MAX(occupied)
        FOR col IN
        ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])
        ) p

row	1	2	3	4	5	6	7	8	9	10
1		X	X					X
2		X		X				X		X
3		X	X	X		X	X			X
4		X			X	X		X
5	X							X		X
6	X		X			X	X		X	X
7		X	X
8	X			X	X	X
9	X			X		X	X	X
10	X	X			X	X		X
10 rows fetched in 0.0011s (0.0012s)

Now, let's select the seat spans having length 5 or more:

WITH    free AS
        (
        SELECT  row, col, ROW_NUMBER() OVER (PARTITION BY row ORDER BY col) AS rn
        FROM    [20091230_seats].t_seat
        WHERE   occupied = 0
        ),
        spans AS
        (
        SELECT  row, col, COUNT(*) OVER (PARTITION BY row, col - rn) AS rl
        FROM    free
        )
SELECT  *
FROM    spans
WHERE   rl >= 5

row	col	rl
5	2	6
5	3	6
5	4	6
5	5	6
5	6	6
5	7	6
7	4	7
7	5	7
7	6	7
7	7	7
7	8	7
7	9	7
7	10	7
13 rows fetched in 0.0005s (0.0016s)

and put them on the chart:

WITH free AS
(
SELECT row, col, ROW_NUMBER() OVER (PARTITION BY row ORDER BY col) AS rn
FROM [20091230_seats].t_seat
WHERE occupied = 0
),
spans AS
(
SELECT row, col, COUNT(*) OVER (PARTITION BY row, col - rn) AS rl
FROM free
)
SELECT row, [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]
FROM (
SELECT se.row, se.col,
CASE
WHEN rl IS NULL THEN
'X'
WHEN rl < 5 THEN ' ' ELSE CAST(rl AS VARCHAR) END AS occupied FROM [20091230_seats].t_seat se LEFT JOIN spans sp ON sp.row = se.row AND sp.col = se.col ) q PIVOT ( MAX(occupied) FOR col IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10]) ) p [/sourcecode]

row	1	2	3	4	5	6	7	8	9	10
1		X	X					X
2		X		X				X		X
3		X	X	X		X	X			X
4		X			X	X		X
5	X	6	6	6	6	6	6	X		X
6	X		X			X	X		X	X
7		X	X	7	7	7	7	7	7	7
8	X			X	X	X
9	X			X		X	X	X
10	X	X			X	X		X
10 rows fetched in 0.0012s (0.0098s)

Free spans of 5 seats or more (in rows 5 and 7) are marked on the chart with the numbers corresponding to the number of free seats in the spans.

Written by Quassnoi

December 30th, 2009 at 11:00 pm

Posted in SQL Server

One Response to 'Grouping continuous ranges'

Subscribe to comments with RSS

This is genius and I’ve been using it to handle partial de-duplication of rows. Instead of ‘occupied’, I’m using BINARY_CHECKSUM. Those with the same checksum, when subtracted from the overall row number, end up with the same “span id”. The other solutions in StackOverflow are nowhere near as elegant as this.

Abner Diaz

27 Dec 18 at 19:05

Subscribe in a reader

Subscribe by email

Contacts

Ask me a question

Drop me a line

Should I?

Yes. Feel free to ask questions and write me. An interesting question is a pleasure to answer, and I really enjoy receiving feedback

EXPLAIN EXTENDED