SQL Server: random records avoiding CTE reevaluation
Comments enabled. I *really* need your comment
From Stack Overflow:
I need to get 5 random records from a table plus a further record based on data from the users preferences.
How can I combine the two statements whilst ensuring that no results are repeated (i. e. the favourite is not present in the 5 random records)?
We should select the 6th record so that it would satisfy two conditions: first, it should be preferred by user, second, it should not be among the 5 random records selected earlier.
The problem with random records it that they are, um, random. We usually cannot reproduce 5 random records to apply further conditions on them.
That's where the CTEs would come handy, but there's a little problem.
A CTE, as Microsoft understands them, is just an alias for an online view.
Therefore, each time a CTE is used in a query, it may (or may not) be reevaluated.
It's not a problem for a plain CTEs which are built out of static columns.
However, this is totally a problem when we try to UNION two queries using an undeterministic CTE (like, ordered by NEWID()), as we would do in this case.
Let's create a sample table and see what happens:
CREATE SCHEMA [20090728_random]
CREATE TABLE records
(
id INT NOT NULL PRIMARY KEY,
preferred INT NOT NULL,
value VARCHAR(20) NOT NULL
)
GO
BEGIN TRANSACTION
DECLARE @cnt INT
SET @cnt = 1
WHILE @cnt <= 100000
BEGIN
INSERT
INTO [20090728_random].records
VALUES (
@cnt,
CASE @cnt % 1000 WHEN 0 THEN 1 ELSE 0 END,
'Record ' + CAST(@cnt AS VARCHAR)
)
SET @cnt = @cnt + 1
END
COMMIT
[/sourcecode]
There are <strong>100,000</strong> rows in the table, with <strong>100</strong> of them marked as <code>preferred</code>.
Let's see what's happening when we try to select <code>TOP 5 * ORDER BY RANDOM()</code> in a <strong>CTE</strong> and then <code>UNION ALL</code> two <strong>CTE</strong>s:
WITH rows AS
(
SELECT TOP 5 r.*, 1 AS rn
FROM [20090728_random].records r
ORDER BY
NEWID()
)
SELECT *
FROM rows
UNION ALL
SELECT *
FROM rows
| id | preferred | value | rn | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13696 | 0 | Record 13696 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 81375 | 0 | Record 81375 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 97171 | 0 | Record 97171 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 15569 | 0 | Record 15569 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 65057 | 0 | Record 65057 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 93492 | 0 | Record 93492 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 63785 | 0 | Record 63785 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 36144 | 0 | Record 36144 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 52825 | 0 | Record 52825 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 53930 | 0 | Record 53930 | 1 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 10 rows fetched in 0.0005s (0.1386s) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
We have two completely different sets. This is because the CTE was reevaluated in the second query.
This means we will not be able to use the CTE to check the 6th record. The CTE will be reevaluated, and the check will fail.
We need to build our query this way:
- Build a CTE called
randomsto select 5 random records - Build a CTE called
prefsto select all preferred records FULL OUTER JOINthese CTEs onid- Select
TOP 6results, ordering byrandoms.id DESC JOINthe table usingCOALESCE(randoms.id, prefs.id)
This will select each record exactly once from each of the CTEs, even it belongs to both of them.
Ordering by randoms.id DESC makes sure that 5 random records will be selected first.
The 6th record will not be among the 5 random ones (or it would satisfy the FULL OUTER JOIN condition and would be selected earlier), and it's guaranteed to have preferred = 1 (there may be at most 5 records not satisfying this condition, and they already had been selected).
And here's the query:
WITH randoms AS
(
SELECT TOP 5 r.*
FROM [20090728_random].records r
ORDER BY
NEWID()
),
prefs AS
(
SELECT id
FROM [20090728_random].records
WHERE preferred = 1
)
SELECT TOP 6 r.*
FROM randoms
FULL OUTER JOIN
prefs
ON randoms.id = prefs.id
JOIN [20090728_random].records r
ON r.id = COALESCE (randoms.id, prefs.id)
ORDER BY
randoms.id DESC
| id | preferred | value | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 83022 | 0 | Record 83022 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 79921 | 0 | Record 79921 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 78241 | 0 | Record 78241 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 58192 | 0 | Record 58192 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 18954 | 0 | Record 18954 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 1000 | 1 | Record 1000 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| 6 rows fetched in 0.0004s (0.1766s) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
First 5 records are random, the 6th one is preferred.
Subscribe in a reader