Killing me softly with Service Broker…

(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan – you get a weekly email packed with all the essential knowledge you need to know about performance tuning on SQL Server.)

A few days ago, I found a very interesting behavior regarding Service Broker on a customer system. The main thing is that your activation stored procedure is not allowed to block for a very long time, otherwise you will impact the health of your whole SQL Server instance. In this blog posting I want to demonstrate this behavior to you. In the first step I have to create the whole necessary Service Broker infrastructure objects, nothing special with this – I’ve just stolen that code from Chapter 4 of my Service Broker book:

USE master;

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'Chapter4_InternalActivation')
BEGIN
	PRINT 'Dropping database ''Chapter4_InternalActivation''';
	DROP DATABASE Chapter4_InternalActivation;
END
GO

CREATE DATABASE Chapter4_InternalActivation
GO

USE Chapter4_InternalActivation
GO

--*********************************************
--*  Create the message type "RequestMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = NONE
GO

--*********************************************
--*  Create the message type "ResponseMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = NONE
GO

--************************************************
--*  Changing the validation of the message types
--************************************************
ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = WELL_FORMED_XML
GO

ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = WELL_FORMED_XML
GO

--************************************************
--*  Create the contract "HelloWorldContract"
--************************************************
CREATE CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
(
	[http://ssb.csharp.at/SSB_Book/c04/RequestMessage] SENT BY INITIATOR,
	[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] SENT BY TARGET
)
GO

--**************************************************
--*  Create a table to store the processed messages
--**************************************************
CREATE TABLE ProcessedMessages
(
	ID UNIQUEIDENTIFIER NOT NULL,
	MessageBody XML NOT NULL,
	ServiceName NVARCHAR(MAX) NOT NULL
)
GO

--************************************************************************
--*  A stored procedure used for internal activation on the target queue
--************************************************************************
CREATE PROCEDURE ProcessRequestMessages
AS
	DECLARE @ch UNIQUEIDENTIFIER
	DECLARE @messagetypename NVARCHAR(256)
	DECLARE	@messagebody XML
	DECLARE @responsemessage XML;

	WHILE (1=1)
	BEGIN
		BEGIN TRY
			BEGIN TRANSACTION

			WAITFOR (
				RECEIVE TOP(1)
					@ch = conversation_handle,
					@messagetypename = message_type_name,
					@messagebody = CAST(message_body AS XML)
				FROM TargetQueue
			), TIMEOUT 60000

			IF (@@ROWCOUNT = 0)
			BEGIN
				ROLLBACK TRANSACTION
				BREAK
			END

			IF (@messagetypename = 'http://ssb.csharp.at/SSB_Book/c04/RequestMessage')
			BEGIN
				-- Store the received request message in a table
				INSERT INTO ProcessedMessages (ID, MessageBody, ServiceName) VALUES (NEWID(), @messagebody, 'TargetService')

				-- Construct the response message
				SET @responsemessage = '<HelloWorldResponse>' + @messagebody.value('/HelloWorldRequest[1]', 'NVARCHAR(MAX)') + '</HelloWorldResponse>';

				-- Send the response message back to the initiating service
				SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] (@responsemessage);

				-- End the conversation on the target's side
				END CONVERSATION @ch;
			END

			IF (@messagetypename = 'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog')
			BEGIN
				-- End the conversation
				END CONVERSATION @ch;
			END

			COMMIT TRANSACTION
		END TRY
		BEGIN CATCH
			ROLLBACK TRANSACTION
		END CATCH
	END
GO

--********************************************************
--*  Create the queues "InitiatorQueue" and "TargetQueue"
--********************************************************
CREATE QUEUE InitiatorQueue
WITH STATUS = ON
GO

CREATE QUEUE TargetQueue
WITH ACTIVATION
(
	STATUS = ON,
	PROCEDURE_NAME = [ProcessRequestMessages],
	MAX_QUEUE_READERS = 1,
	EXECUTE AS SELF
)
GO

--************************************************************
--*  Create the queues "InitiatorService" and "TargetService"
--************************************************************
CREATE SERVICE InitiatorService
ON QUEUE InitiatorQueue 
(
	[http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
)
GO

CREATE SERVICE TargetService
ON QUEUE TargetQueue
(
	[http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
)
GO

As you can see from this listing, the target queue is enabled for activation, means that the ProcessRequestMessages stored procedure automatically started by a Queue Monitor as soon as a new message arrives on that queue. The stored procedure itself just inserts the received message in the table ProcessesMessages. Now imagine that this table has a lock from some other query that is executing inside your database:

BEGIN TRANSACTION
SELECT * FROM ProcessedMessages WITH (TABLOCKX)
GO

I know that TABLOCKX is a little bit too much, but I just want to demonstrate the problem to you. When you afterwards send a new message, the activation stored procedure gets started, and finally blocks because the necessary IX lock for the INSERT statement can’t be acquired:

--********************************************************************
--*  Sending a message from the InitiatorService to the TargetService
--********************************************************************
BEGIN TRY
	BEGIN TRANSACTION;
		DECLARE @ch UNIQUEIDENTIFIER
		DECLARE @msg NVARCHAR(MAX);

		BEGIN DIALOG CONVERSATION @ch
			FROM SERVICE [InitiatorService]
			TO SERVICE 'TargetService'
			ON CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
			WITH ENCRYPTION = OFF;

		SET @msg = 
			'<HelloWorldRequest>
					Klaus Aschenbrenner
			</HelloWorldRequest>';

		SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] (@msg);
	COMMIT;
END TRY
BEGIN CATCH
	ROLLBACK TRANSACTION
END CATCH
GO

You can now check very easily that your activated stored procedure just blocks and is doing nothing inside SQL Server:

--********************************************************************
--*  View the currently activated stored procedures
--*  The activated stored procedure is now waiting on a LCK_M_IS wait type
--*  and is never *ever* finishing...
--********************************************************************
SELECT t.*, s.last_wait_type FROM sys.dm_broker_activated_tasks t
INNER JOIN sys.dm_exec_requests s ON s.session_id = t.spid
GO

The crazy thing about that behavior is, that you can’t just stop SQL Server, you have to do a SHUTDOWN WITH NOWAIT, otherwise SQL Server will not stop anymore…:

-- *************************************************************
-- * A normal shutdown of SQL Server will not working anymore...
-- *************************************************************
SHUTDOWN
GO

-- ***************************************
-- * You have to do a SHUTDOWN WITH NOWAIT
-- ***************************************
SHUTDOWN WITH NO WAIT
GO

Moral of the story: make sure that you release your locks as fast as possible. The other problem with locking in Service Broker is that it yields to high LOCK_HASH spinlocks, which will burn down a lot of CPU cycles without doing anything in SQL Server – it’s a very crazy scenario and in the first step very hard to explain. You can also crosscheck this with sys.dm_os_spinlock_stats. In that scenario the CPU is up to 100% but doesn’t do any work regarding Service Broker – very contraprodutive! Thanks to Thomas Kejser for that tip regarding the LOCK_HASH spinlock.

Thanks for reading

-Klaus

Comments

  1. Interesting problem. In your customers case what caused the lock to be held in the first place? I would have expected use of the RECIEVE command which should row lock (I haven’t tested but would expect this). Therefore the entire table should never lock.
    Regarding the shutdown I assume this was just a side effect and that you could just kill the blocking spid?
    I’m doing a lot of work with service broker right now, hopefully I won’t see this issue!
    Thanks

  2. Hello Simon,

    The root cause was an uncommitted transactions. The transaction logic in the activated stored procedure was written wrong.

    -Klaus