## Archive for December 30th, 2009

## Grouping continuous ranges

From **Stack Overflow**:

I have a ticketing system.

Now I have to select adjacent places when the user asks for

2or3tickets.Every ticket has a line and column number. The concept of adjacent places is places in the same line with adjacent columns numbers.

These tickets are in an

SQL Serverdatabase.Any ideas about this algorithm to search for available adjacent seats?

This is a problem known as grouping continuous ranges

: finding the continuous ranges of the records (in a certain order) having the same value of the grouping column.

For the table in question, the groups will look like this:

Row | Column | Occupied | Group |
---|---|---|---|

1 | 1 | 1 | - |

1 | 2 | 0 | 1 |

1 | 3 | 0 | 1 |

1 | 4 | 0 | 1 |

1 | 5 | 1 | - |

1 | 6 | 1 | - |

1 | 7 | 0 | 2 |

1 | 8 | 0 | 2 |

1 | 9 | 1 | - |

1 | 10 | 0 | 3 |

To find the spans of free seats with the required length, we need to group the continuous ranges of rows with `occupied = 0`

in `column`

order (within the rows), calculate the counts of records in these groups and return the groups having sufficient value of `COUNT(*)`

.

To group something in **SQL**, we need to have an expression that would have the same value for all records belonging to the group.

How do we build such an expression for the continuous ranges?

In a movie theater, seats are numbered sequentially, so we can rely on the values of the `column`

being continuous. Now, let's calculate only the records that describe the free seats, and build their row numbers in the `column`

order:

Row | Column | Occupied | ROW_NUMBER |
---|---|---|---|

1 | 2 | 0 | 1 |

1 | 3 | 0 | 2 |

1 | 4 | 0 | 3 |

1 | 7 | 0 | 4 |

1 | 8 | 0 | 5 |

1 | 10 | 0 | 6 |

We see that filtering only free seats broke the column numbering order (the columns numbers are not continuous anymore), but the `ROW_NUMBER`

s are continuous (by definition). Each occupied seat breaks the continuity of the columns, and the spans of the free seats correspond to the ranges of columns with unbroken continuity.

If the continuity is not broken, `column`

and `ROW_NUMBER`

both increase by **1** with each record. This means that their difference will be constant:

Row | Column | Occupied | ROW_NUMBER | column - ROW_NUMBER |
---|---|---|---|---|

1 | 2 | 0 | 1 | 1 |

1 | 3 | 0 | 2 | 1 |

1 | 4 | 0 | 3 | 1 |

1 | 7 | 0 | 4 | 3 |

1 | 8 | 0 | 5 | 3 |

1 | 10 | 0 | 6 | 4 |

The value of difference between `column`

and `ROW_NUMBER`

, therefore, uniquely defines the continuous group the records belong to.

This means we can group on it and do other things (the rest being pure technical).

Let's create a sample table:

Read the rest of this entry »