Myths and Misconceptions about Hekaton
Over the last weeks and months, I have invested a lot of time in getting a better understanding of Hekaton, the new In-Memory technology that will be part of SQL Server 2014. I have done a lot of reading about the various internal data structures that Hekaton uses for its storage (mainly Hash Indexes and the Bw-Tree). In addition, I gave 5 Hekaton talks across Europe, and also attended various Hekaton sessions by other speakers.
But as I have seen several times, a lot of misinformation, myths, and misconceptions are being presented, and people are thinking about Hekaton in the wrong way. Even the questions asked by attendees show that a *huge* knowledge transfer is needed so that everyone gets a better understanding of the basic concepts of Hekaton, and of when Hekaton is suitable.
Here are just a few highlights from the last weeks that I have heard from presenters, and from attendees asking questions about Hekaton:
- “Hekaton is an In-Memory technology, means data is not persisted anymore”
- “Hekaton only runs on specific CPU architectures”
- “When you move to Hekaton, you gain a 100x improvement for your workload”
- “There is no Locking, Blocking, and Spinlocking in Hekaton”
Those are just a few of the misconceptions that I have heard. The goal of this weblog posting is to give you an overview of the most prominent misconceptions and questions from the last weeks, and to tell you *why* they are just wrong. So let's get started with my top highlights (in no specific order)!
“Is Hekaton providing the ACID properties for transactions?”
- Atomic: You are still dealing with transactions that can be rolled forward and rolled back, even when your SQL Server crashes and has to go through crash recovery for your Hekaton-enabled database (as long as you are persisting your data across SQL Server restarts; more on that later).
- Consistency: Hekaton also *always* provides you with consistent data. This is currently very easy to achieve, because one of the biggest limitations of Hekaton is that you cannot create Foreign Keys or constraints. So you mainly get consistency within a single Hekaton-enabled table.
- Isolation: A few days ago, when I delivered my Hekaton talk in Bristol/UK during my SQLRelay tour, someone wanted to convince me that Hekaton gives you Dirty Reads, because there is no Locking and Blocking anymore. WRONG! Hekaton just uses an approach called “Multi Version Concurrency Control” (MVCC), which makes the magic of no locks possible. When you read data, you get back the version that was valid at the beginning of your statement or your transaction (depending on the Transaction Isolation Level used). So there are no Dirty Reads in Hekaton.
- Durable: It depends 😉 When you create your Hekaton table with Schema and Data Durability, your data will also survive a SQL Server crash. It is simply recovered from the transaction log and the Checkpoint Files that Hekaton writes. That is a very important point: when you want your data to be durable, Hekaton still uses your transaction log, which means that your transaction log will be one of your final bottlenecks in Hekaton. But Hekaton uses a very efficient logging model compared to traditional disk-based tables. One change is that only data modifications are logged, and no longer at the index level. When you do an INSERT into a traditional disk-based table, SQL Server has to log the INSERT for *every* index (Clustered, Non-Clustered). In Hekaton, SQL Server logs the INSERT only once, because all your Hekaton Indexes (Hash Indexes, Range Indexes) are recreated during the startup of SQL Server. Therefore the impact on the transaction log is as small as possible.
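To make the Isolation point a bit more concrete, here is a minimal Python sketch of how a multi-version read can avoid Dirty Reads without taking any locks. It is only an illustration under my own assumptions: the names `RowVersion`, `begin_ts`, `end_ts`, and `visible_version` are hypothetical, and the real engine has far more to it (transaction validation, garbage collection of old versions, and so on).

```python
from dataclasses import dataclass
from typing import List, Optional

INFINITY = float("inf")


@dataclass
class RowVersion:
    payload: str
    begin_ts: float            # commit timestamp of the writer; INFINITY while uncommitted
    end_ts: float = INFINITY   # commit timestamp of the transaction that superseded it


def visible_version(versions: List[RowVersion], read_ts: float) -> Optional[RowVersion]:
    """Return the version that was valid at read_ts, or None if there was none."""
    for version in versions:
        if version.begin_ts <= read_ts < version.end_ts:
            return version
    return None


# Versions of one row: committed at 10, replaced at 20, plus an in-flight update.
history = [
    RowVersion("balance=100", begin_ts=10, end_ts=20),
    RowVersion("balance=80", begin_ts=20),
    RowVersion("balance=0", begin_ts=INFINITY),  # uncommitted, so never visible
]
```

A reader with timestamp 15 gets "balance=100", a reader with timestamp 25 gets "balance=80", and nobody ever sees the uncommitted "balance=0". The reader never waits for the writer, and the writer never waits for the reader.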
“Hekaton is the new No-SQL approach by Microsoft”
That is also a very nice one. Someone in Bristol wanted to convince me that Hekaton is just Microsoft's new attempt at creating a No-SQL product. Come on, with Hekaton we are still talking about a relational database with all ACID properties (see the previous point). Hekaton and No-SQL are 2 different things that have nothing in common. Internally, Hekaton just uses a more elegant and faster approach to implement the characteristics of a relational database, nothing more!
“Hekaton only works on specific CPU architectures”
Wow, I thought I was in the wrong dimension! The claim was that Hekaton only works on specific CPU models/architectures, because internally it uses so-called atomic CAS operations (Atomic Compare & Swap, or Atomic Compare & Exchange). That statement is more than completely wrong! Of course, the Bw-Tree internally uses CAS operations to perform SMOs (Structure Modification Operations) in the tree as multiple atomic steps. For this, Hekaton internally uses the Win32 API function “InterlockedCompareExchange” (see http://msdn.microsoft.com/en-us/library/windows/desktop/ms683560(v=vs.85).aspx). This function compares the value at a specific memory location against an expected old value, and if both are the same, a new value is written to the memory location. The function itself is executed as *one* atomic assembly instruction at the CPU level, which means no other thread can interrupt it. It is just executed as one atomic block from beginning to end.
The myth here was that the required assembly instruction is only supported on specific CPU architectures. That is true, but the compare-and-exchange instruction has been part of the x86 instruction set since the 486, and its 64-bit variant since the Pentium! Only the 386 lacks it… As you can also see from the requirements section of the above-mentioned MSDN article, the minimum supported Windows OS is Windows XP! So that myth would only be correct if you were running SQL Server 2014 on a pre-Windows XP installation.
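If you have never worked with CAS, the semantics described above can be written down in a few lines of Python. Please read this only as a sketch of *what* the operation does: the function name `compare_exchange` and the list standing in for memory are my own illustration, and plain Python code like this is of course not atomic, whereas the CPU performs the whole compare-and-write as one indivisible instruction.

```python
def compare_exchange(memory, address, expected, new_value):
    """Write new_value to memory[address] only if it currently holds expected.

    Returns the value that was found before the call, so the caller can tell
    whether the exchange happened (returned value == expected) or not.
    """
    previous = memory[address]
    if previous == expected:
        memory[address] = new_value
    return previous


memory = [5]

# The location holds the expected value 5, so 7 is written.
first = compare_exchange(memory, 0, expected=5, new_value=7)

# The location now holds 7, not 5, so nothing is written.
second = compare_exchange(memory, 0, expected=5, new_value=9)
```

After both calls the memory location still holds 7. The first call returns 5 (the exchange happened), the second returns 7 (the exchange was refused because someone had already changed the value).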
“There is no Locking, Blocking, and Spinlocking in Hekaton”
In theory this statement is correct, but it can be debated from various angles. Let's start with the first one. Hekaton *itself* is lock, latch, and spinlock free, but you are still dealing with the traditional relational engine of SQL Server. This means that as soon as you leave the Hekaton universe, you are dealing with the old code base (sorry for that..) of SQL Server, e.g. the Transaction Log Manager. And this code still has latches and spinlocks to synchronize access among the various threads. From that perspective the above statement is only partially true.
The second aspect where you can still have blocking in Hekaton is the atomic CAS operations themselves. An atomic CAS operation cannot be interrupted by a different thread. Therefore SMOs (Structure Modification Operations) in Bw-Trees can be implemented in a very clever, elegant way. But this also means that when two threads want to perform an SMO on the same page in the same Bw-Tree at the same time, one thread will win the race, and the other thread has to retry its atomic CAS operation. What happens in the meantime? The thread has to spin, and try the atomic CAS operation again. From my basic understanding, the atomic CAS operation itself is just an assembly function that is wrapped by a synchronization concept like a Critical Section. This means your thread has to spin, so you are losing CPU cycles and increasing the latency of your transaction. Of course, SMOs should be very, very rare, so this is not a big deal, but it is still possible that threads are spinning when there is contention for a low-level synchronization object (or assembly function).
This is my basic understanding on atomic CAS operations. If I am wrong on that, please feel free to correct me!
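The retry behaviour can be sketched in Python as follows. This is only a simulation under my own assumptions: the `AtomicCell` class is hypothetical, and its internal `threading.Lock` is merely a stand-in that makes the compare-and-exchange step indivisible in Python. It says nothing about how the CPU or SQL Server implements the operation.

```python
import threading


class AtomicCell:
    """Simulated memory location with an indivisible compare-and-exchange."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()  # stand-in for hardware atomicity

    def load(self):
        with self._guard:
            return self._value

    def compare_exchange(self, expected, new_value):
        """Store new_value if the cell holds expected; return the value seen."""
        with self._guard:
            previous = self._value
            if previous == expected:
                self._value = new_value
            return previous


def increment_with_retry(cell, times):
    """Increment the cell `times` times; return how often a CAS had to be retried."""
    retries = 0
    for _ in range(times):
        while True:
            seen = cell.load()
            if cell.compare_exchange(seen, seen + 1) == seen:
                break          # this thread won the race
            retries += 1       # another thread changed the value: spin and retry
    return retries


def run(thread_count=4, times=1000):
    cell = AtomicCell(0)
    retry_counts = []
    workers = [
        threading.Thread(target=lambda: retry_counts.append(increment_with_retry(cell, times)))
        for _ in range(thread_count)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return cell.load(), sum(retry_counts)


final_value, total_retries = run()
```

No increment is ever lost, so `final_value` is always 4000. The number in `total_retries` varies from run to run, and every retry is a loop iteration that burned CPU cycles without making progress.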
“Because INT IDENTITY values are not supported in Hekaton, use Sequences”
This was also a nice statement that was made yesterday at the SQLRelay conference in London. I do not subscribe to this statement in any way! Why? Because a sequence is a shared object in your SQL Server database, which means access to the current value has to be synchronized by SQL Server. And that synchronization ends in contention, which means you cannot scale your workload, and Hekaton is all about scaling workloads.
On CTP1 I ran some tests with Sequence values, and as soon as you run a huge number of parallel threads against your Hekaton tables/stored procedures, you hit latch contention in the Sequence Generator. The Sequence Generator stores the current value on some internal pages, and when it reads from and writes to these specialized pages, they have to be latched. You just end up with Latch Contention in the Sequence Generator, and your Hekaton workload will not scale! In my tests it made no difference whether I requested a whole range of sequence values or used caching. The Sequence Generator was always the bottleneck.
How can you overcome this specific problem? Use our good, old, bad friend UNIQUEIDENTIFIER. These values are generated completely independently of each other, which means there is no shared resource involved when you generate a new value. You therefore eliminate the bottleneck, and Hekaton will scale until you hit your CPUs at 100% usage (apart from other bottlenecks like the transaction log or your network).
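The difference between the two approaches can be sketched in Python. Again, this is just an illustration under my own assumptions: `SharedSequence` is a hypothetical stand-in for any generator with shared state, and Python's `uuid.uuid4()` plays the role of a UNIQUEIDENTIFIER value. It is not a benchmark and does not model the Sequence Generator of SQL Server.

```python
import threading
import uuid


class SharedSequence:
    """Every caller has to pass through the same lock to get the next value."""

    def __init__(self):
        self._next = 1
        self._lock = threading.Lock()

    def next_value(self):
        with self._lock:          # the single point of contention
            value = self._next
            self._next += 1
            return value


def generate_keys(thread_count, per_thread, make_key):
    """Let several threads generate keys concurrently and collect all of them."""
    keys = []

    def worker():
        local = [make_key() for _ in range(per_thread)]
        keys.extend(local)

    workers = [threading.Thread(target=worker) for _ in range(thread_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return keys


sequence = SharedSequence()
sequence_keys = generate_keys(8, 500, sequence.next_value)  # all threads share one lock
guid_keys = generate_keys(8, 500, uuid.uuid4)               # no shared state in our code
```

Both variants produce 4000 distinct keys, but only the sequence forces all threads through one synchronization point to do so. The price you pay for the GUIDs is the usual one: the keys are larger and not ordered.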
“Hekaton is completely transparent to your application”
This statement is true, as long as you have no idea about database design. As I have already mentioned, in its first release Hekaton provides no way to create Foreign Keys and Check Constraints. I deal with a huge number of badly performing databases during my various consulting engagements, but I have never ever seen a disk-based table that could be moved one-to-one to a Memory-Optimized table. You remember that INT IDENTITY values are currently not supported? When you move to Hekaton, you have to do more than flip a simple switch to get a 100x performance improvement. Sorry.
“Hekaton uses a traditional B+ Tree structure for Range Indexes”
No! Range Indexes use a so-called Bw-Tree, which is almost the same as the B+ Tree that SQL Server currently uses for Clustered and Non-Clustered Indexes. And the Bw-Tree is based on the B-Link Tree, just to confuse everyone now 😉 There are mainly 3 big differences between a Bw-Tree and a traditional B+ Tree:
- The intermediate pages store the High Range Key value, instead of the Low Range Key value, of the page in the level below. Because the High Range Key value is stored on a page, SMOs (like Page Splits) can be implemented as 2 atomic operations (performed by 2 atomic CAS operations). This concept comes from the design principles of the B-Link tree.
- Pages are *never ever* changed in place, because this would lead to CPU Cache Line Invalidations, which have to be propagated across the whole memory hierarchy and are very expensive (in terms of wasted CPU cycles). When Hekaton has to change a page in memory, the original memory location is left untouched, and Hekaton just creates a new Delta Record that describes the modification. A so-called “Page Mapping Table” then points to the new delta record, and the delta record points to the old, unchanged page. Because of this approach, CPU Cache Line Invalidations can be avoided.
- Page sizes are elastic, and are not always 8 KB.
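The delta record idea from the second point can be sketched in Python like this. It is a heavily simplified illustration under my own assumptions: the names `BasePage`, `Delta`, `apply_change`, and `contains` are hypothetical, the mapping table is a plain dictionary, and the pointer swap in `apply_change` would be one atomic CAS in a real Bw-Tree instead of a simple assignment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BasePage:
    keys: Tuple[int, ...]      # immutable: the page is never changed in place


@dataclass(frozen=True)
class Delta:
    op: str                    # "insert" or "delete"
    key: int
    next: object               # the older Delta, or the BasePage at the end of the chain


def apply_change(mapping_table, page_id, op, key):
    """Describe the change in a new delta record and repoint the mapping table."""
    old_head = mapping_table[page_id]
    mapping_table[page_id] = Delta(op, key, old_head)  # a real Bw-Tree would CAS here


def contains(mapping_table, page_id, key):
    """Walk the delta chain from newest to oldest, then fall back to the base page."""
    node = mapping_table[page_id]
    while isinstance(node, Delta):
        if node.key == key:
            return node.op == "insert"
        node = node.next
    return key in node.keys


base = BasePage(keys=(10, 20, 30))
mapping_table = {7: base}      # logical page id 7 -> head of its chain

apply_change(mapping_table, 7, "insert", 25)
apply_change(mapping_table, 7, "delete", 20)
```

After the two changes, key 25 is found, key 20 is gone, and the original page object still holds exactly (10, 20, 30). Only the single entry in the mapping table was ever overwritten.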
“Hekaton provides you super fast business logic inside the database”
From a Microsoft perspective this is a valid statement, because SQL Server has been licensed at the CPU core level since SQL Server 2012. The more CPU cycles you use, the more CPU cores you need, and the more you pay Microsoft for CPU core licenses. But from an architectural perspective you are doing everything wrong! A database deals with storing and retrieving data; it is not an application server that should perform business logic. Just think about that. When you have CPU contention because you are running a huge amount of business logic within your database server, you should re-architect your database so that the business logic moves to a dedicated application server, which is licensed completely differently from SQL Server: you are just paying for the OS, and that's it!
“How can I move my complete SAP database to Hekaton. Is Microsoft providing tools for that task?”
When you are thinking about moving a whole database into Hekaton, please stop and think first. Hekaton is there to solve specific problems, like Latch Contention. It only makes sense to move specific tables and stored procedures to Hekaton, but please, not the *whole* database! For every database object (table, stored procedure) SQL Server has to compile and link the corresponding DLL file (which is afterwards loaded into the process space of sqlservr.exe), and this also takes some time. The compilation and linking is also performed when you restart your SQL Server, or when you perform a cluster failover. So it directly affects the RTO (Recovery Time Objective) of your High Availability approach!
I hope that I was able to clarify some myths, misinformation, and misconceptions about Hekaton. As I always say in my Hekaton talks: Hekaton makes sense if you have the *right* problems. If you just have the traditional problems (bad indexing strategy, badly performing storage), forget about Hekaton and do your homework first. Hekaton is like a Formula 1 car: if you have a bad driving style, even a Formula 1 car can’t help you.
Thanks for reading, and I am looking forward to your comments.