Archive for September 28th, 2009
Adjacency list vs. nested sets: Oracle
Comments enabled. I *really* need your comment
Continuing the series:
What is better to store hierarchical data: nested sets model or adjacency list (parent-child) model?
For detailed explanations of the terms, see the first article in the series:
Today is Oracle time.
Adjacent sets require recursion, and Oracle has a native way to implement recursion in SQL statements. To do this it uses a special clause, CONNECT BY
.
Unlike other systems, Oracle always supported recursion in SQL statements since version 2 (this is the name the first version of Oracle was released under).
This was done because by that time there already were several hierarchical databases on the market which are quite good for storing hierarchical data. If not for this clause, transition from a hierarchical DBMS to the relational DBMS would be too hard.
Oracle's way to query self-relations is somewhat different from recursive CTE's that other systems use:
- Recursive CTEs can use an arbitrary set in recursive steps. The recursion base and recursion stack are not visible in further steps. The recursive operation is a set operation (usually a
JOIN
) on the recursion parameter (a set) and the result of the previous operation: CONNECT BY
queries can only use the value of a single row in recursive steps.The recursion base and recursion stack are visible, the values outside the recursion stack are not.
The recursive operation is a set operation on each row of the current row and the base rowset.
Unlike recursive CTE,
CONNECT BY
operates row-wise, that is each row of the result spans its own recursion branch with its own stack:
The main difference it that the CONNECT BY
operation cannot produce anything that was not in the original rowset, while a recursive CTE can.
However, CONNECT BY
has numerous benefits too: it's very fast, allows to track the hierarchy path easily (using a built-in funtion named SYS_CONNECT_BY_PATH
) and implicitly sorts the recordset in tree order (with additional clause ORDER SIBLINGS BY
to sort the siblings).
This method, of course, allows traversing adjacency lists easily.
To compare efficiency of adjacency lists and that of nested sets, let's create a sample table:
Read the rest of this entry »