INTERSECT ALL in SQL Server

(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan - you get a weekly email packed with all the essential knowledge you need to know about performance tuning on SQL Server.)

In my previous blog posting I talked about the basics of the INTERSECT set operation in SQL Server, what the differences are compared with an INNER JOIN, and why you need a very good supporting indexing strategy. Today I want to talk about the INTERSECT ALL operator that is not implemented in SQL Server.

INTERSECT ALL is part of the SQL specification, but SQL Server doesn’t care about it. The difference to the INTERSECT operator is very simple: INTERSECT ALL doesn’t eliminate duplicate rows. The nice thing is that you can simulate an INTERSECT ALL in SQL Server. Let’s try that by creating our 2 tables again, and by inserting some sample rows.

As you can see, the 2nd table consists of a duplicate record – the record with the values 2 appears twice in the table. When you now perform an INTERSECT between both tables, the records with the values of 2 just appear once in the result set. The duplicate row was just eliminated!

Duplicate Rows are eliminated by the set operation INTERSECT in SQL Server

If you want to preserve duplicate rows, you have to make them unique. One approach here is to use the ROW_NUMBER() Windowing Function that was introduced back with SQL Server 2005. With that function you just generate a unique row number for every duplicate record. Therefore your duplicate records become unique, and the “duplicate” row is returned twice as expected. The following listing shows this technique.

Summary

One of the side-effects of the INTERSECT operation in SQL Server is that duplicate rows are eliminated and not returned in the result set. If you want to preserve them you have to make them unique, e.g. by applying a ROW_NUMBER() calculation.

Like or share to get the source code.

Thanks for your time,

-Klaus

5 Comments

  • Joe Celko

    You can use this trick to get EXCEPT ALL, too.

  • Yuri Petrov

    I think your code contains small bug – row_number must be partitioned by the both col1 and col2 values.
    Otherwise you may obtain incorrect output, for example with the next data:

    declare @t1 TABLE (
    Col1 INT,
    Col2 INT,
    Col3 INT
    )

    declare @t2 TABLE (
    Col1 INT,
    Col2 INT
    )

    INSERT INTO @t1 VALUES (1, 1, 1), (2, 2, 2), (2, 2, 2), (3, 3, 3)
    INSERT INTO @t2 VALUES (2, 2), (2, 1), (2, 2), (3, 3)

    ;WITH IntersectAll AS
    (
    SELECT
    ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY (SELECT 0)) AS RowNumber,
    Col1,
    Col2
    FROM @t1

    INTERSECT

    SELECT
    ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY (SELECT 0)) AS RowNumber,
    Col1,
    Col2
    FROM @t2
    )
    SELECT Col1, Col2 FROM IntersectAll

    Regards,
    YP

    P.S. Grey foreground is so unreadable :((

    • Klaus Aschenbrenner

      Hello Yuri,

      Thanks for your comment.
      You are so right – the PARTITION BY was wrong!
      I have already changed it.

      Thanks!!!

      -Klaus

  • Thomas Franz

    Hi,

    the link to the previouse blog posting refers to this site again…

    best regards
    Thomas

    • Klaus Aschenbrenner

      Hello Thomas,

      Thanks for the info – already fixed it 🙂

      Cheers,

      -Klaus

It`s your turn

Your email address will not be published. Required fields are marked *

SQLpassion

Copyright © 2015 by SQLpassion · Klaus Aschenbrenner · Imprint · Offerings · Academy · Contact · Go to Top