EXPLAIN EXTENDED

How to create fast database queries

Archive for February 25th, 2010

Matching sets: aggregates vs. first miss

Comments enabled. I *really* need your comment

From Stack Overflow:

Here is my schema:

Suppliers
sid sname address
Parts
pid pname color
Catalog
sid pid cost

I need to find the sids of suppliers who supply every red part or every green part.

This task requires matching the sets.

We need to compare two sets here: the first one is the set of the parts of given color; the second one is the set of parts provided by a given supplier. The former should be the subset of the latter.

Unlike other engines, MySQL does not provide the set operators like EXCEPT or MINUS that allow to check the subset / superset relationship very easily. We have to use the record-based solutions.

There are two ways to check that:

  • First miss technique: test each record from the subset candidate against the superset candidate, returning FALSE if there is no match.
  • Aggregate technique: compare the number of records in the subset candidate to the number of records in their intersection. If the numbers are equal, the sets match

Let's test which way is faster in which cases. To do this, we will need some sample tables:
Read the rest of this entry »

Written by Quassnoi

February 25th, 2010 at 11:00 pm

Posted in MySQL