Service Broker Adventures: Stucked Queue Monitors

This week I’m in London helping a client with its large scale Service Broker deployment, mainly troubleshooting some strange problems with Service Broker and helping to improve the overall scalability and messaging throughput. Today I want to share a strange scenario with you, how a badly written Activated Stored Procedure isn’t really activated by Service Broker. Let’s assume the following basic setup of Service Broker, where we are sending messages from the Initiator Service to the Target Service:

CREATE DATABASE Chapter4_InternalActivation
GO

USE Chapter4_InternalActivation
GO

--*********************************************
--*  Create the message type "RequestMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = NONE
GO

--*********************************************
--*  Create the message type "ResponseMessage"
--*********************************************
CREATE MESSAGE TYPE
[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = NONE
GO

--************************************************
--*  Changing the validation of the message types
--************************************************
ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage]
VALIDATION = WELL_FORMED_XML
GO

ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage]
VALIDATION = WELL_FORMED_XML
GO

--************************************************
--*  Create the contract "HelloWorldContract"
--************************************************
CREATE CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
(
	[http://ssb.csharp.at/SSB_Book/c04/RequestMessage] SENT BY INITIATOR,
	[http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] SENT BY TARGET
)
GO

--**************************************************
--*  Create a table to store the processed messages
--**************************************************
CREATE TABLE ProcessedMessages
(
	ID UNIQUEIDENTIFIER NOT NULL,
	MessageBody XML NOT NULL,
	ServiceName NVARCHAR(MAX) NOT NULL
)
GO

--********************************************************
--*  Create the queues "InitiatorQueue" and "TargetQueue"
--********************************************************
CREATE QUEUE InitiatorQueue
WITH STATUS = ON
GO

--************************************************************
--*  Create the queues "InitiatorService" and "TargetService"
--************************************************************
CREATE SERVICE InitiatorService
ON QUEUE InitiatorQueue 
(
	[http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract]
)
GO

For the Target Service an Activated Stored Procedure was written, and this Stored Procedure was driven by a configuration table. The configuration table specified if the Stored Procedure was retrieving messages from the underlying queue, or not. Have a look at the following code:

--************************************************
--*  Create a simple config table
--************************************************
CREATE TABLE Config
(
	QueueName SYSNAME NOT NULL PRIMARY KEY,
	QueueEnabled BIT NOT NULL
)
GO

--************************************************
--*  Insert a config record
--************************************************
INSERT INTO Config VALUES ('TargetQueue', 0)
GO

--************************************************************************
--*  A stored procedure used for internal activation on the target queue
--************************************************************************
CREATE PROCEDURE ProcessRequestMessages
AS
	DECLARE @ch UNIQUEIDENTIFIER
	DECLARE @messagetypename NVARCHAR(256)
	DECLARE	@messagebody XML
	DECLARE @responsemessage XML
	DECLARE @enabled BIT;

	WHILE (1=1)
	BEGIN
		BEGIN TRY
			BEGIN TRANSACTION

			-- Check if the queue is enabled
			SELECT @enabled = QueueEnabled FROM Config
			WHERE QueueName = 'TargetQueue'

			IF (@enabled = 1)
			BEGIN
				WAITFOR (
					RECEIVE TOP(1)
						@ch = conversation_handle,
						@messagetypename = message_type_name,
						@messagebody = CAST(message_body AS XML)
					FROM TargetQueue
				), TIMEOUT 60000

				IF (@@ROWCOUNT = 0)
				BEGIN
					ROLLBACK TRANSACTION
					BREAK
				END

				IF (@messagetypename = 'http://ssb.csharp.at/SSB_Book/c04/RequestMessage')
				BEGIN
					-- Store the received request message in a table
					INSERT INTO ProcessedMessages (ID, MessageBody, ServiceName) VALUES (NEWID(), @messagebody, 'TargetService')

					-- Construct the response message
					SET @responsemessage = '' + @messagebody.value('/HelloWorldRequest[1]', 'NVARCHAR(MAX)') + '';

					-- Send the response message back to the initiating service
					SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] (@responsemessage);

					-- End the conversation on the target's side
					END CONVERSATION @ch;
				END

				IF (@messagetypename = 'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog')
				BEGIN
					-- End the conversation
					END CONVERSATION @ch;
				END
			END
			ELSE
			BEGIN
				-- When the queue is not "enabled" in the config table, we just return
				COMMIT TRANSACTION
				BREAK
			END

			COMMIT TRANSACTION
		END TRY
		BEGIN CATCH
			ROLLBACK TRANSACTION
		END CATCH
	END
GO

This approach is a little bit strange, because you just have to disable activation, which leads to the same scenario, so that the Activated Stored Procedure isn’t doing anything anymore (as soon as the queue is empty). This scenario was invented, because sometimes it was impossible to disable Service Broker queues, because of a Locking/Blocking scenario. The Locking/Blocking scenario occurred, because there were uncommitted transactions from the Activated Stored Procedure, which means that some locks were held forever in SQL Server, which finally blocked the disabling of the queue. See my last blog posting on this phenomenon: https://www.sqlpassion.at/blog/PermaLink,guid,fc4f98af-f42a-4b3d-872e-c31815e6fc02.aspx. So when you have changed the entry in the config table, the Activated Stored Procedure was just exiting – so far so good in the theory…

The real truth about that “solution” was the fact, that the Activated Stored Procedure wasn’t ever processing messages. You might now ask why. The answer on that question was finally very easy: As soon as your Stored Procedure gets activated by Service Broker, you MUST process messages from your queue through the RECEIVE statement, otherwise Service Broker assumes that your Stored Procedure has encountered a problem, and considers the Stored Procedure to be failed. See the following remarks from Books Online (http://msdn.microsoft.com/en-us/library/ms171585(v=sql.105).aspx):

An activated stored procedure must receive messages from the queue that activated the procedure. If the stored procedure exits without receiving messages or the queue monitor detects that the stored procedure is not receiving messages after a short time-out, the queue monitor considers the stored procedure to have failed. In this case, the queue monitor stops activating the stored procedure.

But how can you now troubleshoot that specific problem to find out if Service Broker had considered your Stored Procedure to be failed? There’s the DMV sys.dm_broker_queue_monitors, which shows the so-called Queue Monitors. Those components are responsible for activating your Stored Procedure. A Queue Monitor can be in 3 different states:

  • INACTIVE
  • NOTIFIED
  • RECEIVE_OCCURING

When a Stored Procedure is not activated for your queue, then the corresponding Queue Monitor is in the INACTIVE state. As soon as the Queue Monitor has started your Stored Procedure, the Queue Monitor goes into the NOTIFIED state. And finally the Queue Monitor goes into the RECEIVE_OCCURING state, when the Activated Stored Procedure receives messages. This means that the Queue Monitor remains in the NOTIFIED state, when your Stored Procedure isn’t receiving any messages. As long as your Queue Monitor is stucked in the NOTIFIED state, you are not processing any messages from your queue!

In this specific scenario the Queue Monitor is also not moving into the RECEIVE_OCCURING state, when you change the entry in the user-defined configuration table – it just remains in the NOTIFIED state. The only solution is to disable and re-enable the queue to restart the corresponding Queue Monitor. But when your queues are blocked through locks from an uncommitted transaction… you see one problem leads to another problem… You can find here the download to the script with which you can reproduce the stucked Queue Monitor within Service Broker.

What’s the moral of this story: read the f… manual and code your Activated Stored Procedures in a very robust way J. I’ve talked a lot to different people over the last years about Service Broker. Everyone is just scared about Service Broker, because it’s such an overcomplicated technology. In my opinion Service Broker isn’t really complicated, but you have to know the various design patterns behind Service Broker, and it’s up to your Activated Stored Procedures what you’re doing with Service Broker, and how healthy your Service Broker solution will be. Service Broker itself is very robust (he’s just sending messages from A to B, nothing more), and will not cause any problems to you, almost of the time you are the trigger of the problems…

Thanks for reading

-Klaus