EXPLAIN EXTENDED

How to create fast database queries

SQL Server: deleting with self-referential FOREIGN KEY

with 5 comments

From Stack Overflow:

I have an SQL Server table defined as below:

TestComposite
id (PK) siteUrl (PK) name parentId
1 site1 Item1 NULL
2 site1 Item2 NULL
3 site1 Folder1 NULL
4 site1 Folder1.Item1 3
5 site1 Folder1.Item2 3
6 site1 Folder1.Folder1 3
7 site1 Folder1.Folder1.Item1 6

Items and folders are stored inside the same table

If an item is inside a folder, the parentID column is the id of the folder.

I would like to be able to DELETE CASCADE items/folders when I delete a folder.

I tried to define a constraint similar to:

ALTER TABLE [TestComposite] 
ADD CONSTRAINT fk_parentid 
FOREIGN KEY (ParentID, SiteUrl) 
REFERENCES [TestComposite] (ID, SiteUrl) ON DELETE CASCADE

, but it gives me this error:

Introducing FOREIGN KEY constraint 'fk_parentid' on table 'TestComposite' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.

SQL Server does support chained CASCADE updates, but does not allow one table to participate more that once in a chain (i. e. does not allow loops).

SQL Server, unlike most other engines, optimizes cascading DML operations to be set-based which requires building a cycle-free DML order (which you can observe in the execution plan). With the loops, that would not be possible.

However, it is possible to define such a constraint without cascading operations, and with a little effort it is possible to delete a whole tree branch at once.

Let's create a sample table:

Table creation details

This table contains 50,000 records forming a hierarchy.

If we try to delete an entry that has some children, we'll fail:

DELETE
FROM    [20100303_cascade].TestComposite
WHERE   id = 42
        AND siteUrl = 'site1'
Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the SAME TABLE REFERENCE constraint "fk_TestComposite_self". The conflict occurred in database "test", table "20100303_cascade.TestComposite".
The statement has been terminated.

To delete an item and all of its children, we should build a hierarchical query to retrieve the whole branch, and delete the branch all at once.

To do it, we just semi-join the table to the results of the recursive CTE:

WITH    q AS
        (
        SELECT  id, siteUrl
        FROM    [20100303_cascade].TestComposite
        WHERE   id = 42
                AND siteUrl = 'site1'
        UNION ALL
        SELECT  tc.id, tc.siteUrl
        FROM    q
        JOIN    [20100303_cascade].TestComposite tc
        ON      tc.parentID = q.id
                AND tc.siteUrl = q.siteUrl
        )
DELETE
FROM    [20100303_cascade].TestComposite
OUTPUT  DELETED.*
WHERE   EXISTS
        (
        SELECT  id, siteUrl
        INTERSECT
        SELECT  id, siteUrl
        FROM    q
        )

View query results

We see that the whole branch of 31 records was deleted all at once, without violating the FOREIGN KEY.

Unlike some other systems, we don't have to worry about the order the records are deleted, since all SQL Server referential constraints are deferred till the end of the query.

Written by Quassnoi

March 3rd, 2010 at 11:00 pm

Posted in SQL Server

5 Responses to 'SQL Server: deleting with self-referential FOREIGN KEY'

Subscribe to comments with RSS

  1. In my case the query goes to infinite loop,
    If I use option (maxrecursion 365) or (maxrecursion 3267)
    any value it fails with error as reached highest recusion value . If I use 0 which is infinite the query runs for an hour with no output.

    Jay

    7 Jul 14 at 11:02

  2. This means you have loops in your data.

    Quassnoi

    7 Jul 14 at 16:16

  3. Hi Quassnoi,

    Any suggestion on how to avoid this situation?
    I understand it data related issue which is causing infinite loop, But I cannot control the data being sent to the DB . Any other way we can handle it?

    Thanks,
    Jay

    Jay

    9 Jul 14 at 09:35

  4. Quassnoi

    10 Jul 14 at 21:29

  5. Can you please explain the last part? Is it the same as:

    WHERE id in (SELECT id from q)

    ?

    Henry Ho

    16 Sep 14 at 05:15

Leave a Reply