Is a Index Scan always a Index Scan?
A few weeks ago, someone posted an interesting question on Twitter:
Q: While watching a Summit14 recording, unanswered by speaker: Is there a way to tell how much of index is scanned with a NC scan operator?
One of the answers to this question was that a Non-Clustered Index Scan scans always the whole index.
A: Yes, it’s always 100%. A SCAN operator is always the WHOLE index ….
But there are some specific circumstances where this isn’t the case. I just want to concentrate in this blog posting about one specific example that you will always encounter – a TOP, MIN , or MAX expression in your SQL query.
TOP, MIN, MAX
Let’s have a look at the following 2 queries.
SELECT TOP 10 * FROM Person.Person
MIN(BusinessEntityID) AS 'Min',
MAX(BusinessEntityID) AS 'Max'
The first query returns the first 10 rows from the table Person.Person, the second query returns the minimum and the maximum of the column BusinessEntityID, which is the Clustered Key column on that table. When you look at the resulting execution plan, you will see interesting things:
The first query “scans” the Clustered Index to retrieve the first 10 rows, and for the second query the Clustered Index is also “scanned” 2 times to retrieve the minimum and maximum BusinessEntityID. But in these circumstances the Clustered Index Scan isn’t really a complete scan of the Clustered Index, because the Top-Operator short-circuits the Clustered Index Scan. What does this mean?
You have learned that normally you should read execution plans from right to left, because the rows in the execution plan are also flowing from right to left. But during the execution the execution plan is executed from left to right. SQL Server uses internally a so-called Iterator-Model where every operator in the execution plan just requests a new row from the operator on the right side. The following picture illustrates this very important concept.
Because of this iterator model the final data flow is from right to left. When you look now at the previous generated execution plan, you can see that the Top operator has a so-called Top Expression:
For the first query the Top expression is 10, and for the two Top operators in the second execution plan the Top expression is 1. This Top expression just defines how many rows the Top operator consumes from the input operator on the right hand side. When the Top operator in the first query has consumed 10 rows (the TOP 10 rows), then the Top operator just short-circuits the execution plan, and doesn’t return any more rows to the SELECT operator, which means that the query execution has finally finished.
And the same thing happens in the second execution plan. To retrieve the minimum value of the column BusinessEntityID (which is the Clustered Key value), the Top operator just consumes the first row from a Forward Clustered Index Scan. And the maximum value is just the first row from a Backward Clustered Index Scan.
When you see a Top operator in an execution plan, you always have to think about this specific scenario: the Top operator can just short-circuit your Scan operators. As a conclusion the Scan isn’t really always a Scan in the execution plan.
Thanks for your time,