How to create fast database queries

Archive for May 7th, 2010

Things SQL needs: MERGE JOIN that would seek

with 7 comments

One of the most known and least used join algorithms in SQL engines is MERGE JOIN.

This algorithm operates on two sorted recordsets, keeping two pointers that chase each other.

The Wikipedia entry above describes it quite well in terms of algorithms. I'll just make an animated GIF to make it more clear:

This is quite a nice and elegant algorithm, which, unfortunately, has two major drawbacks:

  1. It needs the recordsets to be sorted
  2. Even with the recordsets sorted, it is no better than a HASH JOIN

The sorting part is essential for this algorithm and there is nothing that can be done with it: the recordsets should be sorted, period. Databases, however, often provide the records in the sorted order: from clustered tables, indexes, previously sorted and ordered subqueries, spool tables etc.

But even when the recordsets are already sorted, on equijoins the MERGE JOIN is hardly faster than a HASH JOIN.

Read the rest of this entry »

Written by Quassnoi

May 7th, 2010 at 11:00 pm

Posted in Miscellaneous