## Matching whole sets

Comments enabled. I *really* need your comment

From **Stack Overflow**:

Given the following:

declare @a table ( pkid int, value int ) declare @b table ( otherID int, value int ) insert into @a values (1, 1000) insert into @a values (1, 1001) insert into @a values (2, 1000) insert into @a values (2, 1001) insert into @a values (2, 1002) insert into @b values (-1, 1000) insert into @b values (-1, 1001) insert into @b values (-1, 1002), how do I query for all the values in

`@a`

that completely match up with`@b`

?

`{@a.pkid = 1, @b.otherID = -1}`

would not be returned (only2of3values match)

`{@a.pkid = 2, @b.otherID = -1}`

would be returned (3of3values match)

We need to compare each set of values from `@a`

to each set of `@b`

and return the `id`

's of non-matching sets.

Two sets are considered matching when for each row from the first set there is an equal row from the second set and vise versa. That is, both the values and the count of the rows should match in both sets.

First thing that comes to mind is to calculate `COUNT(*)`

's of both sets, then calculate the `COUNT(*)`

for the result of both sets joined on `value`

and make sure all three are equal.

But we don't actually need to `COUNT(*)`

all values. We just need to find out the first row from either of the subsets that has no match in another subset. As soon as we find such a row, we can tell that the sets don't match; if we don't find one, then the sets do match.

This is where `FULL OUTER JOIN`

comes handy. If there is a record in either of the sets that doesn't have a match in another set, the join will return `NULL`

in one of the fields. We can match it in `WHERE`

clause and return just as soon as we encounter a `NULL`

`id`

returned in the resultset.

Finally, we need to produce all possible set combinations to compare. We can do it by cross-joining `@a`

and `@b`

, selecting `DISTINCT pkid, otherid`

from the resulting recordset and using it in a subquery to produce all possible set combinations.

The resulting query looks like this:

SELECT * FROM ( SELECT ab.pkid, ab.otherID, ( SELECT TOP 1 COALESCE(ai.value, bi.value) FROM ( SELECT * FROM @a aii WHERE aii.pkid = ab.pkid ) ai FULL OUTER JOIN ( SELECT * FROM @b bii WHERE bii.otherID = ab.otherID ) bi ON ai.value = bi.value WHERE ai.pkid IS NULL OR bi.otherID IS NULL ) unmatch FROM ( SELECT DISTINCT pkid, otherid FROM @a a , @b b ) ab ) q WHERE unmatch IS NOT NULL

pkid | otherid | unmatch |
---|---|---|

1 | -1 | 1002 |

As a free bonus, it also returns the first non-matching `value`

.