EXPLAIN EXTENDED

How to create fast database queries

Archive for June 14th, 2009

Flattening timespans: PostgreSQL

Comments enabled. I *really* need your comment

In previous three articles:

, I described various approaches to merging a set of intersecting timespans into a smaller set of larger, non-intersecting ranges.

Today, I'll describe how to do it in PostgreSQL.

A quick reminder of the problem, taken from Stack Overflow:

I have lots of data with start and stop times for a given ID and I need to flatten all intersecting and adjacent timespans into one combined timespan.

To make things a bit clearer, take a look at the sample data for 03.06.2009:

The following timespans are overlapping or contiunous and need to merge into one timespan:

date start stop
2009.06.03 05:54:48:000 10:00:13:000
2009.06.03 09:26:45:000 09:59:40:000

The resulting timespan would be from 05:54:48 to 10:00:13.

Since there's a gap between 10:00:13 and 10:12:50, we also have the following timespans:

date start stop
2009.06.03 10:12:50:000 10:27:25:000
2009.06.03 10:13:12:000 11:14:56:000
2009.06.03 10:27:25:000 10:27:31:000
2009.06.03 10:27:39:000 13:53:38:000
2009.06.03 11:14:56:000 11:15:03:000
2009.06.03 11:15:30:000 14:02:14:000
2009.06.03 13:53:38:000 13:53:43:000
2009.06.03 14:02:14:000 14:02:31:000

which result in one merged timespan from 10:12:50 to 14:02:31, since they're overlapping or adjacent.

Any solution, be it SQL or not, is appreciated.

PostgreSQL implements various procedural languages, like pl/PGSQL, pl/perl etc., which can be used to solve almost any task using procedural approaches.

Using, for instance, pl/PGSQL, we can implement an approach similar to one we used for SQL Server.

We should create a set-returning function, open a cursor inside it and select all timespans ordered by start, keeping track of a running maximum of stop.

Whenever we find a timespan that starts later than end all previously started timespans, we know we have a gap.

Here's the sample table:
Read the rest of this entry »

Written by Quassnoi

June 14th, 2009 at 11:00 pm

Posted in PostgreSQL