Table Ordering for Joins
In today’s blog post I want to talk about a very interesting topic in SQL Server: whether or not it makes sense to rearrange the tables in a join in a specific order. Every time that I do a presentation about query and performance tuning, people ask me if they should arrange the tables in a join in a specific order to help the Query Optimizer to come up with a better performing execution plan. So let’s have a look at this very important and interesting question.
Imagine you want to make an Inner Join between the tables Sales.SalesOrderHeader and Sales.SalesOrderDetail in the AdventureWorks database:
SELECT h.SalesOrderID, h.CustomerID, d.SalesOrderDetailID, d.ProductID, d.LineTotal FROM Sales.SalesOrderHeader h JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID ORDER BY SalesOrderID GO
When we look at the resulting execution plan, we can see that the Query Optimizer has chosen a Merge Join as the physical Join operator, and that the table Sales.SalesOrderHeader is referenced as the outer table of the Merge Join. The tables are referenced in the execution plan in the same order as we have referenced them in the logical T-SQL query.
The question is now, what happens to the execution plan when we swap both tables in the logical T-SQL query? Let’s try it:
-- The logical ordering of the tables during an Inner Join -- doesn't matter. It's up to the Query Optimnizer to arrange -- the tables in the best order. -- This query produces the same execution plan as the previous one. SELECT h.SalesOrderID, h.CustomerID, d.SalesOrderDetailID, d.ProductID, d.LineTotal FROM Sales.SalesOrderDetail d JOIN Sales.SalesOrderHeader h ON d.SalesOrderID = h.SalesOrderID ORDER BY SalesOrderID GO
When we look at the resulting execution plan now, we can see something very interesting:
Nothing has changed in the execution plan! The Query Optimizer has still chosen the same physical execution plan as it did with our previous query. But why? The answer is easy: the Query Optimizer always references the smallest table (based on our statistics!) as the outer table of every physical join operator (Nested Loop Join, Merge Join, Hash Match Join). Therefore the logical ordering of our tables in the T-SQL query doesn’t influence the Query Optimizer in any way. It’s the task of the Query Optimizer to access our tables in the correct order.
A Inner Join between a table A and B is the same as an Inner Join between the table B and A.
What happens to the table ordering in an Outer Join (Left Join or Right Join)? Let’s have a look at the following query, where I perform a Left Join between the table Sales.Customer and Sales.SalesOrderHeader.
-- Execute the query with an Outer Join. -- Now we are also getting back customers that haven't placed orders. -- The left table is the preserving one, and missing rows from the right table are added with NULL values. -- SQL Server performs a "Merge Join (Left Outer Join)" in the execution plan. SELECT c.CustomerID, h.SalesOrderID FROM Sales.Customer c LEFT JOIN Sales.SalesOrderHeader h ON c.CustomerID = h.CustomerID GO
When we have a more detailed look at the resulting execution plan, we can see that the Query Optimizer has preserved the ordering of our tables.
Of course this time we can’t change the ordering of our tables in the logical T-SQL query, because then the query would return the wrong result. But what happens when we swap the tables in the query, and instead of performing a Left Join, we perform a Right Join? Let’s try it:
-- You can rewrite the query from above with a Right Outer Join when you swap the order -- of the tables. SELECT c.CustomerID, h.SalesOrderID FROM Sales.SalesOrderHeader h RIGHT JOIN Sales.Customer c ON c.CustomerID = h.CustomerID GO
When we look at the execution plan, we can see again that nothing has changed: the Query Optimizer has transformed the Right Join to a Left Join and has also rearranged our tables so that it is still able to return the correct logical result. The goal of the Query Optimizer was again to use the smallest table as the outer input of the physical join operator. Therefore the ordering of the tables in an Outer Join doesn’t matter either. As long as our statistics are correct, the Query Optimizer will choose the correct ordering.
A Left Join between table A and B is the same as a Right Join between the table B and A.
In this blog posting we have looked at whether the table ordering for a join influences the resulting execution plan. As we have seen it is completely up to the Query Optimizer to choose the optimal table ordering – based on our statistics. With an Inner Join the table ordering doesn’t matter at all, and with an Outer Join SQL Server can still rearrange the tables by swapping the Left/Right Joins.
Thanks for your time,