Archive for June 16th, 2009
IN vs. JOIN vs. EXISTS
From Stack Overflow:
I'm wanting to select rows in a table where the primary key is in another table.
I'm not sure if I should use a
JOIN
or theIN
operator in SQL Server 2005:SELECT * FROM a WHERE a.c IN ( SELECT d FROM b )SELECT a.* FROM a JOIN b ON a.c = b.dIs there any significant performance difference between these two SQL queries with a large dataset (i. e. millions of rows)?
Before we begin I should note that the queries provided by the author are not similar in fact and could produce different resultsets.
If the table used in the subquery returns a value twice, a JOIN
will also return the matching rows twice, while an IN
condition will return them only once.
That's why it's more correct to compare to the following query:
SELECT a.* FROM a JOIN ( SELECT DISTINCT d FROM b ) bo ON a.c = bo.d
Now, à nos moutons.
The folk wisdom advises against using IN
constructs in favor of JOIN
's because they say that the subquery will be executed many times in a loop
.
Let's create some sample tables and see if it's really so:
Read the rest of this entry »