Archive for June 4th, 2009
Longest common prefix: SQL Server
Comments enabled. I *really* need your comment
From Stack Overflow:
I have some data:
id ref 1 3536757616 1 3536757617 1 3536757618 2 3536757628 2 3536757629 2 3536757630 and want to get the result like this:
id result 1 3536757616/7/8 2 3536757629/28/30 Essentially, the data should be aggregated on
id
, and theref
's should be concatenated together and separated by a/
(slash), but with longest common prefix removed.
I've already encoutered this problem several times, so I'll try to cover solutions for all RDBMS's my blog is about:
- Longest common prefix: SQL Server
- Longest common prefix: PostgreSQL
- Longest common prefix: Oracle
- Longest common prefix: MySQL
I hope this will be interesting as approaches will differ significantly for all four systems.
Today is SQL Server time.
I won't create sample tables here, since I'm demonstrating the principle. Instead, I'll just use dynamically generated data.
What do we need to do here is:
- Find least common prefix for each group and its length
- Cut off the prefix of each but the first
ref
, usingSUBSTRING
- Concatenate the strings using
FOR XML
Steps 2 and 3 are quite simple, but the first one needs some effort.
SQL Server doesn't provide a function to find the longest common prefix (LCP), so we'll have to implement it.
With some limitations, it may be done using pure SQL
, no UDF
's.
Read the rest of this entry »