A ever-increasing Clustered Key value doesn’t scale!
You know all the best practices how to choose a Clustered Key value? A good Clustered Key value should have the following 3 properties:
- Ever Increasing
Let’s have a more detailed look on all 3 properties, and why an ever increasing value doesn’t really scale in SQL Server.
A Clustered Key value should be as small as possible. Why? Because it takes space, and the Clustered Key Value also serves a logical pointer in the leaf level in every Non-Clustered Index. If you have a very wide Clustered Key value, you also deal with larger Non-Clustered Indexes. If you have defined a Non-Unique Non-Clustered Index (which is normally the case), the Clustered Key value is also part of the navigation structure of your Non-Clustered Index. Therefore the overhead in your index gets very large. And our goal is to minimize overhead on our index pages. Overhead that we have to pay in the physical storage, and also in the Buffer Pool, where SQL Server caches the read index pages from storage.
Normally you choose a technical key value (like INT/BIGINT data type) instead of a natural key value. I have already seen over the years Clustered Key value lengths of 100 bytes and more (combinations of LastName, FirstName, SocialSecurityNumber, etc.). Believe me – you are just waisting memory! You don’t have to do that. Choose a technical key, and you are fine.
Because the Clustered Key value is replicated in every Non-Clustered Index, your Clustered Key value should never ever change! Otherwise SQL Server has to maintain transparently in your UPDATE Execution Plan EVERY Non-Clustered Index that you have defined on your table. You are again just introducing additional computing overhead that you don’t need. Use your CPU cycles for more other important stuff. As you know, natural key values can change (like a LastName column, when you get married).
A technical key value (like an INT IDENTITY) can’t change (by default). Therefore the logical pointer (in the form of the Clustered Key value) in all your Non-Clustered Indexes remains stable – without any need to change them – forever!
And the 3rd final important property of a “good” Clustered Key value is that the chosen column should give you ever-increasing values. Why? Because you are always adding additional records at the end of your Clustered Index, and therefore you are avoiding expensive Page Splits (regarding CPU cycles, and Transaction Log overhead) and Index Fragmentation. In 99% of all cases you will be fine with an ever-increasing value like an INT IDENTITY column, but there are some scenarios, where this approach can lead to serious scalability problems. Imagine you have a workload, where a lot of different users are inserting permanently into the same table with an ever-increasing Clustered Key value. Just think a second about about Logging/Auditing tables.
Let’s have a more detailed look on what happens internally in SQL Server, when you reading and writing pages in memory. When SQL Server accesses certain memory structures (like pages that are stored in the Buffer Pool), these memory accesses must be synchronized among multiple threads. You can’t write concurrently to the same page in memory. When a thread writes to a page, some other thread can’t read the page at the same time. In traditional concurrent programming you solve that problem with Mutexes – like a Critical Section. Certain code paths are just mutually exclusive. A Latch in SQL Server is almost the same as a Critical Section. And latches are used to synchronize threads/queries among each other. Every time when you read a page, the worker thread has to acquire a Shared Latch (SH), every time when you write a page, the worker thread has to acquire an Exclusive Latch (EX). And both latches are incompatible to each other.
When you now perform an INSERT statement, the worker thread exclusively latches the page where the INSERT statement occurs. In the mean time no one else can read and write from/to this page. With an ever-increasing Clustered Key value this approach doesn’t really scale, because you are just inserting your records at the end of your Clustered Index. Therefore your parallel threads/queries are contending about an Exclusive Latch on the same last page in your Clustered Index. As a side-effect SQL Server executes your INSERT statement serially – one INSERT after the next one. You have hit the famous Last Page Insert Latch Contention. Let’s have a look at the following picture.
With the best practice of an ever-increasing Clustered Key value you have a single hotspot at the end of your Clustered Key. The smaller your records are, the more contention you are introducing here. How can you solve that problem? Easy: spread your INSERT statements across the whole B-Tree structure of the Clustered Index. There are various approaches how you can achieve that:
- Use a random Clustered Key value (like a UNIQUEIDENTIFIER). But be aware of the side-effects: larger logical pointer in EVERY Non-Clustered Index, Page Splits…)
- Implement Hash Partitioning, if you are using the Enterprise Edition of SQL Server.
- Eliminate latching through the use of In-Memory OLTP, that is part of SQL Server 2014.
- Use a so-called Reverse Index. Unfortunately SQL Server doesn’t provide you that kind of index out-of-the box, like Oracle. But you can implement it at your own…
At 99% you will be fine with a narrow, static, and ever-increasing Clustered Key value like an INT IDENTITY data type. But in some rare scenarios where you need a huge amount of parallel INSERT statements (like Logging/Auditing tables), you can hit the Last Page Insert Latch Contention with that approach. If you hit that specific problem, you have to leave your comfort zone, and you have to make sure that you spread the INSERT statements across your whole B-Tree structure. You are mainly fighting against a limitation of how multi-threaded access happens in a traditional B-Tree structure.
I hope that I have given you with that blog posting a good insight, why ever-increasing Clustered Key values can hurt the scalability of your tables.
If you are more interested in how to choose the right Clustered Key Value, I’m also offering a 1-hour long training video through the SQLpassion Online Academy.
Thanks for reading!