Archive for February 10th, 2010
SQL Server: EXCEPT ALL
Answering questions asked on the site.
myst asks:
I have two really large tables with lots of columns, many of them are nullable
I need to remove all rows from
table1
which are not present intable2
, but there can be duplicates in both tables and writing more than 70IS NULL
conditions would be a pain, and I want to make sure there's nothing I'm missing.Is there a more simple way?
SQL Server supports EXCEPT
clause which returns all records present in the first table and absent in the second one. But this clause eliminates duplicates and cannot be used as a subject to a DML
operation.
ANSI SQL standard describes EXCEPT ALL
which returns all records from the first table which are not present in the second table, leaving the duplicates as is. Unfortunately, SQL Server does not support this operator.
Similar behavior can be achieved using NOT IN
or NOT EXISTS
constructs. But in SQL Server, IN
predicate does not accept more than one field. NOT EXISTS
accepts any number of correlated columns, but it requires extra checks in the WHERE
clause, since equality operator does not treat two NULL values as equal. Each pair or nullable columns should be additionally checked for a NULL in both fields. This can only be done using OR
predicates or COALESCE
, neither of which adds to performance.
But there is a way to emulate EXCEPT ALL
in SQL Server quite elegantly and efficiently.
Let's create two sample tables:
Read the rest of this entry »