Primary Key vs. Clustered Index

(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan - you get a weekly email packed with all the essential knowledge you need to know about performance tuning on SQL Server.)

In today’s blog posting I want to talk about a very important topic in SQL Server that always confuses people when they first work with SQL Server. It is about the difference between the Primary Key constraint and the Clustered Index.

What is a Primary Key?

Let’s talk first about the Primary Key constraint itself. As the name implies it is just a constraint, and with that constraint you tell SQL Server that you want to have unique values in a specific column or in a specific group of columns. The following listing shows a very simple table definition where the Primary Key constraint is specified on the first column Col1.

Now when you insert records into that table, SQL Server makes sure that you always have unique values in the column Col1. If you try to insert a duplicate value, SQL Server returns error message.

Msg 2627, Level 14, State 1, Line 9 Violation of PRIMARY KEY constraint ‘PK__Foo__A259EE544224D12A’. Cannot insert duplicate key in object ‘dbo.Foo’. The duplicate key value is (1). The statement has been terminated.

The Primary Key constraint itself is defined on a logical level – you just tell SQL Server that you want to have unique values in a specific column. But SQL Server also has to enforce that uniqueness on the physical level – in the data structures where you store your table data. In the case of SQL Server, the uniqueness on the physical level is enforced with an index structure – with a Clustered Index or Non-Clustered Index. Let’s have a more detailed look at this.

Enforcement of the Primary Key constraint

When you specify the Primary Key constraint, SQL Server enforces it by default with a Unique Clustered Index on the physical level. When you look at sys.indexes, you can see that under the covers SQL Server has generated a Unique Clustered Index that is used to enforce the Primary Key constraint.

The Primary Key constraint is enforced with a Unique Clustered Index

As I have said, the Unique Clustered Index is created by default. You can also enforce a Primary Key constraint with a Unique Non-Clustered Index as shown in the following listing.

When you specify the Primary Key constraint, you can specify the following 2 options:

  • CLUSTERED
  • NONCLUSTERED

The option CLUSTERED is the default one, and therefore you don’t have to specify it. When you look now again at sys.indexes, you can see now that you have a heap table in front of you (a table without a Clustered Index), and that SQL Server has generated an additional Unique Non-Clustered Index to enforce the Primary Key constraint.

The Primary Key constraint is enforced with a Unqiue Non-Clustered Index

Primary Key <> Clustered Index

Therefore it doesn’t mean that a Primary Key and a Clustered Index is always the same in SQL Server. It is the same by default, but you can change this behavior if you want. The Primary Key constraint is always on the logical level, and the index structure is on the physical level to enforce the constraint itself.

The question now is when does it make sense to enforce the Primary Key constraint with a Unique Non-Clustered Index? Back in April 2014 I wrote about a problem in SQL Server that is called Last Page Insert Latch Contention: an ever-increasing Clustered Key column like an INT IDENTITY column doesn’t really scale in SQL Server.

If you hit that specific problem, it *could* make more sense to physically cluster/sort your table data on a random value – like a UNIQUEIDENTIFIER column. In that case you can still use the Primary Key constraint on the original ever-increasing key column, but enforce it with a Unique Non-Clustered Index, and cluster your table on a random key column. The following listing shows this approach.

When you look again at sys.indexes, you can see now that you have created a Clustered and Non-Clustered Index. But only the Non-Clustered Index is used to enforce the Primary Key constraint.

Now you have a Clustered- and Non-Clustered Index

Summary

A Primary Key constraint and a Clustered Index is not really the same in SQL Server. By default SQL Server enforces the Primary Key constraint with a Unique Clustered Index. But if you want you could instead enforce it with a Unique Non-Clustered Index. But that approach wouldn’t really make sense as a default, because you need to have specific problems to justify that approach.

Like or share to get the source code.

Thanks for your time,

-Klaus

10 Comments

  • Hi Klaus,

    Great post! It is good to see people addressing this issue in SQL Server, especially for those who come from another RDBMS technology where SQL Server’s treatment of PKs/Clustered Indexing is somewhat unique.

    As you make good reference to, a primary key is enforced through an index which may be clustered or non-clustered, it doesn’t really matter. What does matter is that a table may only have one clustered index (as this defines the physical order of the table). I wonder if it is worth more explicit treatment that the primary key will only default to a clustered index if another clustered index does not already exist?

    Regards,
    Nick

  • I’ve used this diagram I created to help show the distinction between constraints and indexes:
    http://www.e-squillace.com/tech/techdiagrams/SQLServerIndexDesignMatrix.png

    • Klaus Aschenbrenner

      Hello George,

      That’s great – nice 🙂

      Thanks,

      -Klaus

  • Thanks Klaus and other commentors – this can be a confusing thing that quickly slips into the “it is probably not too important” crack where it can lay dormant for years until it awakes to bite you where you least expected it!

    Ultimately I think the SQL Server terminology was a bad choice since that is where the confusion starts, plus enforcing a default behaviour of clustered index.

    I’ve written a bit more about it here: http://goodfast.info/now-make-it-fast

    – Kerneels

  • nauman

    Very nice article.

  • Maneesh

    Another great Article.

  • Thanks Klaus for interesting article. Why do we still need

    Col1 INT NOT NULL PRIMARY KEY NONCLUSTERED

    if Col2 becomes unique and clustered?

    Thanks

    • Klaus Aschenbrenner

      Hello Tadas,

      Thanks for your comment.
      You still want to enforce the primary key, but this time with a Non-Clustered Index.

      Thanks,

      -Klaus

  • Neeraj

    Excellent Sir ,I have not seen any better explanation of Primary and cluster Index before

It`s your turn

Your email address will not be published. Required fields are marked *

SQLpassion

Copyright © 2015 by SQLpassion · Klaus Aschenbrenner · Imprint · Offerings · Academy · Contact · Go to Top