How the CHECKPOINT process deals with uncommitted transactions

At the beginning of my SQL Server Performance Tuning Workshop I always start by giving a general overview of what happens within SQL Server when we execute a query. Executing a SELECT query is quite simple, but executing a DML statement is more complex, because SQL Server has to change the corresponding page in memory, and has to log the whole transaction within the transaction log.

After talking about these specific steps, I always ask the same question: what happens when we have an uncommitted transaction, then the CHECKPOINT process kicks in, and then SQL Server crashes? Do we have the uncommitted data in our data files? Think for a few seconds, and write down your answer.

Creating the test scenario

Now I want to recreate this specific scenario together with you, and finally you will see if you were right. In the first step of this scenario I create a new database, a new table, and insert a record into it.

-- Create a new database
CREATE DATABASE Checkpointing
GO

-- Use it
USE Checkpointing
GO

-- Create a new table
CREATE TABLE Foo
(
	Col1 CHAR(100) NOT NULL,
	Col2 CHAR(100) NOT NULL,
	Col3 CHAR(100) NOT NULL
)
GO

-- Insert a record
INSERT INTO Foo VALUES
(
	REPLICATE('A', 100),
	REPLICATE('B', 100),
	REPLICATE('C', 100)
)
GO

-- Retrieve the record
SELECT * FROM Foo
GO

After we have inserted the record we want to know the page number where SQL Server stored that specific record. We can use the DBCC IND command to return all the pages of a specific table. In my case SQL Server used the page id 303.

-- Retrieve the first data page for the specified table (columns PageFID and PagePID)
DBCC IND(Checkpointing, Foo, -1)
GO

Retrieving the page id

When we now dump out the page with the DBCC PAGE command (we have to have enabled traced flag 3604 in our session to see the output), we can see the hex values of the inserted A, B, and C values.

-- Enable DBCC trace flag 3604
DBCC TRACEON(3604)
GO

-- Dump the first data page of the table Customers retrieved by DBCC IND previously
DBCC PAGE (Checkpointing, 1, 303, 3)
GO

Dumping out the data page

But what happens now when we perform a CHECKPOINT process, and finally kill SQL Server? Does the uncommitted data get written physically to the data file? Let’s try it…

Crashing and recovering SQL Server

Let’s begin now a new transaction and update the first column of the inserted record.

-- Begin a new transaction without committing it...
BEGIN TRANSACTION

UPDATE Foo
SET Col1 = REPLICATE('X', 100)

As you can see from the code, we haven’t yet committed this transaction! It is still a pending, uncommitted transaction. And now let’s open another session, where we will manually perform a CHECKPOINT process, and finally crash SQL Server.

-- Execute it in a different session
CHECKPOINT
GO

SHUTDOWN WITH NOWAIT
GO

Do you think now that the uncommitted transaction was written to the data file? Not sure about it? Let’s figure it out! Let’s open the data file in a hex editor (like XVI32), and jump to the beginning of page id 303. The page id itself is just the physical offset within the data file where the page starts – multiplied by 8192 bytes, because the page size in SQL Server is 8kb. Therefore the page 303 starts at the integer offset 2482176 (303 x 8192). When we look at the hex dump we can see our uncommitted data!

We have uncommitted data in our data file!

The CHECKPOINT process doesn’t differentiate between a committed and an uncommitted transaction. It just goes to the Buffer Manager and asks for all dirty pages regardless of their transaction state.

Do we have now an inconsistent, corrupt database? No, not really. Because when we now startup SQL Server, every database goes through the Crash Recovery phase and all uncommitted transactions are just rolled back. We can see that behavior in the SQL Server log, when SQL Server starts up:

Crash Recovery runs during SQL Server startup

Summary

The CHECKPOINT doesn’t care about the state of your transactions. Every dirty page from the Buffer Pool is written out to the data files. If SQL Server crashes it doesn’t matter, because Crash Recovery is able to recover your database into a fully consistent state. I hope that this blog posting gave you a better understanding of the CHECKPOINT process, and how it works together with uncommitted transactions.

As home work you could also leave me a comment telling me about other scenarios where SQL Server has to run Crash Recovery to get a consistent view for your databases. How many different scenarios do you know in SQL Server? 🙂

Like or share to get the source code.

Thanks for your time,

-Klaus

4 thoughts on “How the CHECKPOINT process deals with uncommitted transactions”

  1. Whoa! I had this doubt on my mind so many times but when I just read this, it all vanished! Fantastic post on Checkpoint and not many would post topics with such intricacy.

Leave a Comment

Your email address will not be published. Required fields are marked *