A few weeks ago I have presented my Performance Tuning Workshop in Utrecht/Netherlands. On the 3rd day – the most intensive one, where I cover Locking, Blocking, Deadlocking, Latching, Spinlocking – we have also talked about the NOLOCK query hint. Some people always refer to the NOLOCK query hint as the Turbo Booster in SQL Server, because it avoids a huge amount of blocking situations. In this blog posting I want to show you why the NOLOCK query hint is a bad idea.
One of the largest side effects of the NOLOCK query hint is the fact that so-called Dirty Reads can appear in your result set. Let’s have a look at the following code.
BEGIN TRANSACTION UPDATE Person.Person SET FirstName = 'James' WHERE LastName = 'Jones'
As you can see I begin a new transaction, and perform an UPDATE statement against the table Person.Person in the AdventureWorks2012 database. When you now try to read this record concurrently in another session, this SELECT statement will block – in that case the requested Shared Lock is blocked by the already granted Exclusive Lock – the writer blocks the reader.
SELECT FirstName, LastName FROM Person.Person WHERE LastName = 'Jones' GO
Some people apply now the Turbo Booster option in SQL Server and use the query hint NOLOCK:
SELECT FirstName, LastName FROM Person.Person WITH (NOLOCK) -- It's a kind of magic... WHERE LastName = 'Jones' GO
As you can see the blocking situation is immediately resolved and you get back a result from SQL Server – unfortunately you have now a Dirty Read in front of you: you have read uncommitted data. Imagine now that your first transaction with the UPDATE statement is rolled back: in this case you have read data that never ever existed logically in SQL Server!!!
Therefore the query hint NOLOCK doesn’t really make sense in every possible scenario. It *can be* an option if you want to run a report, and you just want to get back a result very fast, and the result doesn’t have to be 100% correct (think about average sales numbers over the day). Of course, when you need a precise result, NOLOCK isn’t really an option. And there are also some specific cases where even NOLOCK can block 😉
Read Committed Snapshot Isolation
One person in Utrecht gave me another good point why the NOLOCK query hint is mostly a no-go option: you can’t switch your database queries in a transparent way to Read Committed Snapshot Isolation (RCSI). I never thought about this specific scenario, but it makes of course sense. Read Committed Snapshot Isolation is a database option. When you enable it *every* query runs in the new optimistic isolation level Read Committed Snapshot Isolation – as long as you don’t have specified any other isolation level in your transactions.
With the NOLOCK query hint you temporarily change the isolation level for the specific SQL statement to Read Uncommitted. Therefore the SQL statement will also not benefit from Read Committed Snapshot Isolation, because the statement is not running in the default isolation level of Read Committed. Makes sense! Just think about this additional argument the next time when write magically … WITH (NOLOCK)…
Running every possible query with the query hint NOLOCK doesn’t really make sense. On one hand you can get back inconsistent data through Dirty Reads. And on the other hand you can’t benefit from the optimistic isolation level Read Committed Snapshot Isolation, because you have changed temporarily the default isolation level of your SQL statement.
Thanks for your time,
11 thoughts on “Why the Query Hint NOLOCK is a bad idea”
Another example with NOLOCK that is frequently quoted but I had never seen, is duplicate reads due to page splits. I created an example of this behavior – http://wp.me/p3Vxvi-8F
Thanks for this another great example!
I agree with you with regards to transaction databases. However, in the data warehouse/BI world, dirty reads and isolation levels are rarely significant issues. The NOLOCK hint is the default for most queries for performance, scale and concurrency in the bulk operations for reporting.
Thanks for your comments.
I’m mainly a OLTP guy, so I can’t talk about DWH/reporting scenarios, but of course the requirements are here different.
There are definitely places where NOLOCK should be avoided due to the “dirty read” effects. However, there are vastly more places where NOLOCK can and *should* be used safely.
If you are retrieving a record set of unknown size based on a query AND you need to be sure that the record set does not include any uncommitted data, then NOLOCK may be a bad choice.
If you are retrieving a specific record based on a key, then NOLOCK can avoid unnecessary blocking with other transactions.
Even in the open record set case, you need to think about whether there is any harm in seeing uncommitted data? If the “uncommitted” data is rolled back, then yes – you saw some data that was invalid. But in an operational system, how often are rollbacks of transactions really occurring? A rollback is normally due to an error, which should be relatively rare in a working system.
Also – assuming your query is doing something meaningful, you are probably joining with multiple other tables. In that case, if the uncommitted data has not been completely instantiated in all the relevant tables, you won’t see it in your result set anyway.
Examples of why NOLOCKs are bad show simple retrievals of an open-ended set, which could then include additional uncommitted records. And the harm there is? Your report has an extra row?
On the other hand, I have seen too many systems where the developers settled for the default READ COMMITTED mode without considering NOLOCKs, resulting in unnecessary blocking and poor performance – sometimes even to the level of transaction timeouts when many users tried to perform similar actions (against entirely different records, but in the same tables).
Thanks for your comment.
In addition you also have to know that there are other side effects with the NOLOCK query hint – like wrong row counts: http://blogs.msdn.com/b/sqlcat/archive/2007/02/01/previously-committed-rows-might-be-missed-if-nolock-hint-is-used.aspx
I have seen NOLOCK used a lot, but never understanded too much its pros and cons. This post is quite clear helping to understand it better. Thank you.
Thanks for your comments.
Nolocks are not evil used correctly, we write high performance web sites with millions of rows in the db.
A website is stateless, the data you show to the user is disconnected so its committed state is not important. What is important is you understand when you need locks, when updating based on data etc. Reads to dump data, really who cares? Yes you can get phantom reads, potential duplciate rows. But Pros way out perfrom the cons.
There is no right or wrong here NOLOCK is not evil or bad, I’m not a fan of NOLOCK on everything, we have dbs where you will not see a nolock hint anywhere and others with it scattered around. Less bottlenecks, better throughput = happier users.
Use it correctly and not some magic turbo button.
Thanks for your comment.
A better option instead of the NOLOCK query hint would be IMHO Read Committed Snapshot Isolation.
Because it also prevents the reader/writer, writer/reader conflict, and still provides you consistent data.
But as I have written: migrating to RCSI with the NOLOCK query hint isn’t really straightforward…
Problem with Read Committed Snapshot Isolation is that it cant be used in views even if it was able to be used it would probably slow down the results of the view.