Killing me softly with Service Broker…

(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan, where you are getting week by week via email all the essential knowledge you need to know about performance tuning on SQL Server.)

A few days ago, I found a very interesting behavior regarding Service Broker on a customer system. The main thing is that your activation stored procedure is not allowed to block for a very long time, otherwise you will impact the health of your whole SQL Server instance. In this blog posting I want to demonstrate this behavior to you. In the first step I have to create the whole necessary Service Broker infrastructure objects, nothing special with this – I’ve just stolen that code from Chapter 4 of my Service Broker book:

USE master;

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'Chapter4_InternalActivation')
BEGIN
	PRINT 'Dropping database ''Chapter4_InternalActivation''';
	DROP DATABASE Chapter4_InternalActivation;
END
GO

CREATE DATABASE Chapter4_InternalActivation
GO

USE Chapter4_InternalActivation
GO

--*********************************************
--*  Create the message type "RequestMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = NONE
GO

--*********************************************
--*  Create the message type "ResponseMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = NONE
GO

--************************************************
--*  Changing the validation of the message types
--************************************************
ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = WELL_FORMED_XML
GO

ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = WELL_FORMED_XML
GO

--************************************************
--*  Create the contract "HelloWorldContract"
--************************************************
CREATE CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
(
	[http://ssb.csharp.at/SSB_Book/c04/RequestMessage] SENT BY INITIATOR,
	[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] SENT BY TARGET
)
GO

--**************************************************
--*  Create a table to store the processed messages
--**************************************************
CREATE TABLE ProcessedMessages
(
	ID UNIQUEIDENTIFIER NOT NULL,
	MessageBody XML NOT NULL,
	ServiceName NVARCHAR(MAX) NOT NULL
)
GO

--************************************************************************
--*  A stored procedure used for internal activation on the target queue
--************************************************************************
CREATE PROCEDURE ProcessRequestMessages
AS
	DECLARE @ch UNIQUEIDENTIFIER
	DECLARE @messagetypename NVARCHAR(256)
	DECLARE	@messagebody XML
	DECLARE @responsemessage XML;

	WHILE (1=1)
	BEGIN
		BEGIN TRY
			BEGIN TRANSACTION

			WAITFOR (
				RECEIVE TOP(1)
					@ch = conversation_handle,
					@messagetypename = message_type_name,
					@messagebody = CAST(message_body AS XML)
				FROM TargetQueue
			), TIMEOUT 60000

			IF (@@ROWCOUNT = 0)
			BEGIN
				ROLLBACK TRANSACTION
				BREAK
			END

			IF (@messagetypename = 'http://ssb.csharp.at/SSB_Book/c04/RequestMessage')
			BEGIN
				-- Store the received request message in a table
				INSERT INTO ProcessedMessages (ID, MessageBody, ServiceName) VALUES (NEWID(), @messagebody, 'TargetService')

				-- Construct the response message
				SET @responsemessage = '<HelloWorldResponse>' + @messagebody.value('/HelloWorldRequest[1]', 'NVARCHAR(MAX)') + '</HelloWorldResponse>';

				-- Send the response message back to the initiating service
				SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] (@responsemessage);

				-- End the conversation on the target's side
				END CONVERSATION @ch;
			END

			IF (@messagetypename = 'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog')
			BEGIN
				-- End the conversation
				END CONVERSATION @ch;
			END

			COMMIT TRANSACTION
		END TRY
		BEGIN CATCH
			ROLLBACK TRANSACTION
		END CATCH
	END
GO

--********************************************************
--*  Create the queues "InitiatorQueue" and "TargetQueue"
--********************************************************
CREATE QUEUE InitiatorQueue
WITH STATUS = ON
GO

CREATE QUEUE TargetQueue
WITH ACTIVATION
(
	STATUS = ON,
	PROCEDURE_NAME = [ProcessRequestMessages],
	MAX_QUEUE_READERS = 1,
	EXECUTE AS SELF
)
GO

--************************************************************
--*  Create the queues "InitiatorService" and "TargetService"
--************************************************************
CREATE SERVICE InitiatorService
ON QUEUE InitiatorQueue 
(
	[http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
)
GO

CREATE SERVICE TargetService
ON QUEUE TargetQueue
(
	[http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
)
GO

As you can see from this listing, the target queue is enabled for activation, means that the ProcessRequestMessages stored procedure automatically started by a Queue Monitor as soon as a new message arrives on that queue. The stored procedure itself just inserts the received message in the table ProcessesMessages. Now imagine that this table has a lock from some other query that is executing inside your database:

BEGIN TRANSACTION
SELECT * FROM ProcessedMessages WITH (TABLOCKX)
GO

I know that TABLOCKX is a little bit too much, but I just want to demonstrate the problem to you. When you afterwards send a new message, the activation stored procedure gets started, and finally blocks because the necessary IX lock for the INSERT statement can’t be acquired:

--********************************************************************
--*  Sending a message from the InitiatorService to the TargetService
--********************************************************************
BEGIN TRY
	BEGIN TRANSACTION;
		DECLARE @ch UNIQUEIDENTIFIER
		DECLARE @msg NVARCHAR(MAX);

		BEGIN DIALOG CONVERSATION @ch
			FROM SERVICE [InitiatorService]
			TO SERVICE 'TargetService'
			ON CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
			WITH ENCRYPTION = OFF;

		SET @msg = 
			'<HelloWorldRequest>
					Klaus Aschenbrenner
			</HelloWorldRequest>';

		SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] (@msg);
	COMMIT;
END TRY
BEGIN CATCH
	ROLLBACK TRANSACTION
END CATCH
GO

You can now check very easily that your activated stored procedure just blocks and is doing nothing inside SQL Server:

--********************************************************************
--*  View the currently activated stored procedures
--*  The activated stored procedure is now waiting on a LCK_M_IS wait type
--*  and is never *ever* finishing...
--********************************************************************
SELECT t.*, s.last_wait_type FROM sys.dm_broker_activated_tasks t
INNER JOIN sys.dm_exec_requests s ON s.session_id = t.spid
GO

The crazy thing about that behavior is, that you can’t just stop SQL Server, you have to do a SHUTDOWN WITH NOWAIT, otherwise SQL Server will not stop anymore…:

-- *************************************************************
-- * A normal shutdown of SQL Server will not working anymore...
-- *************************************************************
SHUTDOWN
GO

-- ***************************************
-- * You have to do a SHUTDOWN WITH NOWAIT
-- ***************************************
SHUTDOWN WITH NO WAIT
GO

Moral of the story: make sure that you release your locks as fast as possible. The other problem with locking in Service Broker is that it yields to high LOCK_HASH spinlocks, which will burn down a lot of CPU cycles without doing anything in SQL Server – it’s a very crazy scenario and in the first step very hard to explain. You can also crosscheck this with sys.dm_os_spinlock_stats. In that scenario the CPU is up to 100% but doesn’t do any work regarding Service Broker – very contraprodutive! Thanks to Thomas Kejser for that tip regarding the LOCK_HASH spinlock.

Thanks for reading

-Klaus

Comments

  1. Interesting problem. In your customers case what caused the lock to be held in the first place? I would have expected use of the RECIEVE command which should row lock (I haven’t tested but would expect this). Therefore the entire table should never lock.
    Regarding the shutdown I assume this was just a side effect and that you could just kill the blocking spid?
    I’m doing a lot of work with service broker right now, hopefully I won’t see this issue!
    Thanks

  2. Hello Simon,

    The root cause was an uncommitted transactions. The transaction logic in the activated stored procedure was written wrong.

    -Klaus