Heap Tables in SQL Server

Today I want to talk about Heap tables in SQL Server. Heap tables are tables without a Clustered Index. A table in SQL Server can have a Clustered Index, then it’s called a Clustered Table, and without a Clustered Index, it’s called a Heap Table.

In a heap table, the data is not sorted in any way, it’s just a pile of unordered, unstructured records. When you access a heap table through a SELECT statement, SQL Server will use a Table Scan operator in the Execution Plan, when you have no suitable Non-Clustered Index defined. There is no Table Seek operator available. That’s very important. A heap table will not scale with the amount of data you have. The more data you have, the longer the operation takes. A Table Scan is always a O(n) operation (more about the Big O notation) – it will not scale as your table gets larger and larger. Now let’s have a more detailed look at the advantages and disadvantages of heap tables in SQL Server.

Advantages

Heap tables are very, very, very fast – for inserting data. As we have said earlier it’s just a pile of data – nothing more. When you crack out every page from a traditional phone book and place the individual pages on the desk in front of you, you have created a heap table. Inserting new phone book records into that heap table is very fast: you allocate a new page (of 8kb), write the new records onto that page, and finally put the page with the other ones in front of you. Finished. No sorting order has to be guaranteed.

It’s the same with SQL Server: a new page is allocated, the records are stored onto that page, and the page is assigned to the heap table. Finished. This is a very fast approach, because SQL Server doesn’t has to ensure any sorting order. It’s up to SQL Server where to put the new records.

For that reason it can be sometimes a good idea to leave some tables as heap tables in a database schema: tables on which you have a huge INSERT activity. Just think about your logging/auditing tables. But I’m never ever recommending using heap tables everywhere. There are only a few specific use cases where they make sense. But not everywhere!

Disadvantages

Besides the advantage that heap tables are very fast for inserting data, there are also a huge number of disadvantages, that you also have to incorporate into your thoughts when you decide to create a table as a heap table.

As a first disadvantage, a heap table leads to random I/O in the storage subsystem when you access the table data. Imagine you are executing a simple SELECT statement against your heap table. If the data isn’t yet cached in the Buffer Pool, SQL Server has to issue physical reads to your storage subsystem. These read will be random I/O, because the pages of the heap table are stored somewhere in your data files, but not next to each other.

If you are using traditional rotational storage (which is currently still the most common case), you will have performance problems on your storage level, because random I/O is slow, very slow. SSD drives are a big game changer here, because it’s doesn’t really matter anymore, if you are performing random or sequential I/O on SSDs. Both operations are almost the same regarding their speed (random I/O is still “a little” bit slower than sequential I/O).

Another big problem that you will definitely have with your heap tables are so-called Forwarding Records. Records that are stored on a heap table can move in some circumstances from one page to another one. If this occurs, SQL Server stores on the original page a forwarding record that points to the new location where the record is stored. When you are accessing your data, SQL Server still accesses the original page, and fetches the record through the forwarding record, which is an additional page read that you need. And this will also slow down your read performance tremendously. If you are more interested in forwarding records, and how you can avoid them, I suggest that you watch my 10th SQL Server Quickie on that specific topic.

Summary

Heap tables have their uses – in some cases. I always recommend creating a Clustered Table (define a Clustered Index on it), but think about some specific use cases, and whether a heap table would serve your business needs in a better way (regarding the performance characteristics). If you want to have a more detailed explanation of when heap tables can be suitable for your workload, I also highly recommend that you read Thomas Kejser’s blog posting Clustered Indexes vs. Heaps about it. Thomas gives you a very controversial, but still valid insight on when heap tables can make sense for you.

Thanks for your time,

-Klaus

5 thoughts on “Heap Tables in SQL Server”

  1. Nice read, was not aware of forwarding records.

    Is there a difference in terms of size when storing a table as a heap or clustered?

    My experience of heap tables is that they are very useful when there are a low level of reads. Also, if the table is partitioned, and you get the partition column right, you’ll only ever end up scanning the partition instead of the whole table.

    1. Klaus Aschenbrenner

      Hello Richie,

      Thanks for your comments.
      Yes there are differences when you store your data as heaps or clustered indexes, because *both* can introduce you fragmentation.

      Thanks,

      -Klaus

    2. Thomas Franz

      Regarding the size:
      – if you have no nonclustered indexes (and no fragmentation) the size should be the same
      – if you have nonclustered indexes on a heap table SQL Server would save a pointer to the page / row in the index
      – if you have nonclustered indexes (NCI) on table with clustered index it would save the clustered index columns again in the index (so that it could do a key lookup)
      – so it depends of the size of the clustered index (CLI)
      – if the CLI is only a 4 byte integer column it would be smaller than the 8 or 16 byte (?) pointer in a heap and you would save space
      – if your CLI is composed by an 32 byte char plus an bigint plus a datetime column the NCI’s would be bigger than with a heap (except you already have the columns in the NCI as index- or include-column)

Leave a Comment

Your email address will not be published. Required fields are marked *