Ordering columns independently
Comments enabled. I *really* need your comment
From Stack Overflow:
I have a table with 5 columns in it.
What's the easiest way to select all rows, but where each column is individually randomized?
All I can think of is to select each column separately, along with
ROW_NUMBER() OVER (ORDER BY NEWID()) AS lookup
, and then join each column back together on lookup.
Is there an easier way?
Though it's possible to do this by means of JOIN
'ing, there is an easier way.
Since all SQL
operations are row-level, we will need to make 5 independent ordered sets, which we later can use to order each column set. This implies having 5 times as much rows as there are rows in the original table.
But instead of JOIN
'ing, we can use SQL Server's clause which maps rows to columns, which is called PIVOT
.
We need to do the following:
- Generate 5 sets of column values and
UNION ALL
them. - For each set, select a
ROW_NUMBER() OVER (ORDER BY NEWID())
along with each row. It will be unique for each row in any set and will give a row identifier forPIVOT
'ing. - For each set, select a set identifier: just a string with column name. This will give a column identifier for
PIVOT
'ing - Finally,
PIVOT
the column values: they will get into appropriate row and column, based onROW_NUMBER()
and the set name
Since PIVOT
requires aggregation (in case row and column identifiers are not unique), we need some aggregation function, but our query is designed so that there is exactly one row with a given ROW_NUMBER()
and set name, so any aggregation function will do.
We will use MIN
.
Now, let's create a sample table and see how it works:
CREATE SCHEMA [20090522_random] CREATE TABLE t_random ( id INT NOT NULL PRIMARY KEY, col1 VARCHAR(50) NOT NULL, col2 VARCHAR(50) NOT NULL, col3 VARCHAR(50) NOT NULL, col4 VARCHAR(50) NOT NULL, col5 VARCHAR(50) NOT NULL ) GO CREATE INDEX IX_random_col1 ON [20090522_random].t_random(col1) CREATE INDEX IX_random_col2 ON [20090522_random].t_random(col2) CREATE INDEX IX_random_col3 ON [20090522_random].t_random(col3) CREATE INDEX IX_random_col4 ON [20090522_random].t_random(col4) CREATE INDEX IX_random_col5 ON [20090522_random].t_random(col5) DECLARE @cnt INT; BEGIN TRANSACTION SET @cnt = 1; WHILE @cnt <= 1000 BEGIN INSERT INTO [20090522_random].t_random VALUES ( @cnt, 'Value 1 ' + CAST(@cnt AS VARCHAR), 'Value 2 ' + CAST(@cnt AS VARCHAR), 'Value 3 ' + CAST(@cnt AS VARCHAR), 'Value 4 ' + CAST(@cnt AS VARCHAR), 'Value 5 ' + CAST(@cnt AS VARCHAR) ) SET @cnt = @cnt + 1 END COMMIT GO [/sourcecode] This table has <strong>1,000</strong> rows with <strong>5</strong> <code>VARCHAR</code> columns. There is an index on each of the columns, so that each column set can be selected by browsing the appropriate index, without lookup up the table itself. The query will look like this: SELECT * FROM ( SELECT 'col1' AS name, col1 AS col, ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn FROM "20090522_random".t_random UNION ALL SELECT 'col2', col2, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM "20090522_random".t_random UNION ALL SELECT 'col3', col3, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM "20090522_random".t_random UNION ALL SELECT 'col4', col4, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM "20090522_random".t_random UNION ALL SELECT 'col5', col5, ROW_NUMBER() OVER (ORDER BY NEWID()) FROM "20090522_random".t_random ) q PIVOT ( MIN(col) FOR name IN ([col1], [col2], [col3], [col4], [col5]) ) pt
rn | col1 | col2 | col3 | col4 | col5 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Value 1 563 | Value 2 676 | Value 3 605 | Value 4 951 | Value 5 49 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2 | Value 1 278 | Value 2 657 | Value 3 281 | Value 4 307 | Value 5 496 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 | Value 1 653 | Value 2 59 | Value 3 420 | Value 4 354 | Value 5 197 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4 | Value 1 635 | Value 2 682 | Value 3 620 | Value 4 139 | Value 5 31 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5 | Value 1 789 | Value 2 217 | Value 3 273 | Value 4 150 | Value 5 388 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
6 | Value 1 110 | Value 2 374 | Value 3 740 | Value 4 834 | Value 5 444 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
7 | Value 1 992 | Value 2 758 | Value 3 310 | Value 4 151 | Value 5 495 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
8 | Value 1 741 | Value 2 17 | Value 3 96 | Value 4 822 | Value 5 73 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
9 | Value 1 12 | Value 2 599 | Value 3 793 | Value 4 578 | Value 5 826 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10 | Value 1 446 | Value 2 588 | Value 3 941 | Value 4 741 | Value 5 603 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
997 | Value 1 704 | Value 2 950 | Value 3 610 | Value 4 600 | Value 5 399 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
998 | Value 1 898 | Value 2 980 | Value 3 590 | Value 4 698 | Value 5 747 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
999 | Value 1 864 | Value 2 241 | Value 3 983 | Value 4 592 | Value 5 882 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1000 | Value 1 180 | Value 2 993 | Value 3 981 | Value 4 470 | Value 5 920 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1000 rows fetched in 0.0688s (0.0483s) |
Table 't_random'. Scan count 5, logical reads 25, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 16 ms, elapsed time = 49 ms.
The values are ordered independently.
This method, though, has one drawback: it may only be used on columns of same datatype.
If we have more than one datatype, we will need to group columns by datatype, PIVOT
them in separate queries and then JOIN
the resultsets on rn
.
Alternatively, we can cast the values into BINARY
, and cast them back to appropriate types after PIVOT
'ing.