EXPLAIN EXTENDED

How to create fast database queries

Archive for May 22nd, 2009

Ordering columns independently

Comments enabled. I *really* need your comment

From Stack Overflow:

I have a table with 5 columns in it.

What's the easiest way to select all rows, but where each column is individually randomized?

All I can think of is to select each column separately, along with

ROW_NUMBER() OVER (ORDER BY NEWID()) AS lookup

, and then join each column back together on lookup.

Is there an easier way?

Though it's possible to do this by means of JOIN'ing, there is an easier way.

Since all SQL operations are row-level, we will need to make 5 independent ordered sets, which we later can use to order each column set. This implies having 5 times as much rows as there are rows in the original table.

But instead of JOIN'ing, we can use SQL Server's clause which maps rows to columns, which is called PIVOT.

We need to do the following:

  1. Generate 5 sets of column values and UNION ALL them.
  2. For each set, select a ROW_NUMBER() OVER (ORDER BY NEWID()) along with each row. It will be unique for each row in any set and will give a row identifier for PIVOT'ing.
  3. For each set, select a set identifier: just a string with column name. This will give a column identifier for PIVOT'ing
  4. Finally, PIVOT the column values: they will get into appropriate row and column, based on ROW_NUMBER() and the set name

Since PIVOT requires aggregation (in case row and column identifiers are not unique), we need some aggregation function, but our query is designed so that there is exactly one row with a given ROW_NUMBER() and set name, so any aggregation function will do.

We will use MIN.

Now, let's create a sample table and see how it works:
Read the rest of this entry »

Written by Quassnoi

May 22nd, 2009 at 11:00 pm

Posted in SQL Server