INTERSECT ALL in SQL Server
In my previous blog posting I talked about the basics of the INTERSECT set operation in SQL Server, what the differences are compared with an INNER JOIN, and why you need a very good supporting indexing strategy. Today I want to talk about the INTERSECT ALL operator that is not implemented in SQL Server.
INTERSECT ALL is part of the SQL specification, but SQL Server doesn’t care about it. The difference to the INTERSECT operator is very simple: INTERSECT ALL doesn’t eliminate duplicate rows. The nice thing is that you can simulate an INTERSECT ALL in SQL Server. Let’s try that by creating our 2 tables again, and by inserting some sample rows.
-- Create the 1st table CREATE TABLE t1 ( Col1 INT, Col2 INT, Col3 INT ) GO -- Create the 2nd table CREATE TABLE t2 ( Col1 INT, Col2 INT ) GO -- Insert some records into both tables INSERT INTO t1 VALUES (1, 1, 1), (2, 2, 2), (2, 2, 2), (3, 3, 3) INSERT INTO t2 VALUES (2, 2), (2, 2), (3, 3) GO
As you can see, the 2nd table consists of a duplicate record – the record with the values 2 appears twice in the table. When you now perform an INTERSECT between both tables, the records with the values of 2 just appear once in the result set. The duplicate row was just eliminated!
If you want to preserve duplicate rows, you have to make them unique. One approach here is to use the ROW_NUMBER() Windowing Function that was introduced back with SQL Server 2005. With that function you just generate a unique row number for every duplicate record. Therefore your duplicate records become unique, and the “duplicate” row is returned twice as expected. The following listing shows this technique.
-- You can preserve duplicate rows by making them unique with the ROW_NUNBER() Windowing Function. WITH IntersectAll AS ( SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY (SELECT 0)) AS RowNumber, Col1, Col2 FROM t1 INTERSECT SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY (SELECT 0)) AS RowNumber, Col1, Col2 FROM t2 ) SELECT Col1, Col2 FROM IntersectAll GO
One of the side-effects of the INTERSECT operation in SQL Server is that duplicate rows are eliminated and not returned in the result set. If you want to preserve them you have to make them unique, e.g. by applying a ROW_NUMBER() calculation.
Thanks for your time,