Killing me softly with Service Broker…
A few days ago, I found a very interesting behavior regarding Service Broker on a customer system. The main thing is that your activation stored procedure is not allowed to block for a very long time, otherwise you will impact the health of your whole SQL Server instance. In this blog posting I want to demonstrate this behavior to you. In the first step I have to create the whole necessary Service Broker infrastructure objects, nothing special with this – I’ve just stolen that code from Chapter 4 of my Service Broker book:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 |
USE master; IF EXISTS (SELECT * FROM sys.databases WHERE name = 'Chapter4_InternalActivation') BEGIN PRINT 'Dropping database ''Chapter4_InternalActivation'''; DROP DATABASE Chapter4_InternalActivation; END GO CREATE DATABASE Chapter4_InternalActivation GO USE Chapter4_InternalActivation GO --********************************************* --* Create the message type "RequestMessage" --********************************************* CREATE MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] VALIDATION = NONE GO --********************************************* --* Create the message type "ResponseMessage" --********************************************* CREATE MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] VALIDATION = NONE GO --************************************************ --* Changing the validation of the message types --************************************************ ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] VALIDATION = WELL_FORMED_XML GO ALTER MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] VALIDATION = WELL_FORMED_XML GO --************************************************ --* Create the contract "HelloWorldContract" --************************************************ CREATE CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract] ( [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] SENT BY INITIATOR, [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] SENT BY TARGET ) GO --************************************************** --* Create a table to store the processed messages --************************************************** CREATE TABLE ProcessedMessages ( ID UNIQUEIDENTIFIER NOT NULL, MessageBody XML NOT NULL, ServiceName NVARCHAR(MAX) NOT NULL ) GO --************************************************************************ --* A stored procedure used for internal activation on the target queue --************************************************************************ CREATE PROCEDURE ProcessRequestMessages AS DECLARE @ch UNIQUEIDENTIFIER DECLARE @messagetypename NVARCHAR(256) DECLARE @messagebody XML DECLARE @responsemessage XML; WHILE (1=1) BEGIN BEGIN TRY BEGIN TRANSACTION WAITFOR ( RECEIVE TOP(1) @ch = conversation_handle, @messagetypename = message_type_name, @messagebody = CAST(message_body AS XML) FROM TargetQueue ), TIMEOUT 60000 IF (@@ROWCOUNT = 0) BEGIN ROLLBACK TRANSACTION BREAK END IF (@messagetypename = 'http://ssb.csharp.at/SSB_Book/c04/RequestMessage') BEGIN -- Store the received request message in a table INSERT INTO ProcessedMessages (ID, MessageBody, ServiceName) VALUES (NEWID(), @messagebody, 'TargetService') -- Construct the response message SET @responsemessage = '<HelloWorldResponse>' + @messagebody.value('/HelloWorldRequest[1]', 'NVARCHAR(MAX)') + '</HelloWorldResponse>'; -- Send the response message back to the initiating service SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/ResponseMessage] (@responsemessage); -- End the conversation on the target's side END CONVERSATION @ch; END IF (@messagetypename = 'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog') BEGIN -- End the conversation END CONVERSATION @ch; END COMMIT TRANSACTION END TRY BEGIN CATCH ROLLBACK TRANSACTION END CATCH END GO --******************************************************** --* Create the queues "InitiatorQueue" and "TargetQueue" --******************************************************** CREATE QUEUE InitiatorQueue WITH STATUS = ON GO CREATE QUEUE TargetQueue WITH ACTIVATION ( STATUS = ON, PROCEDURE_NAME = [ProcessRequestMessages], MAX_QUEUE_READERS = 1, EXECUTE AS SELF ) GO --************************************************************ --* Create the queues "InitiatorService" and "TargetService" --************************************************************ CREATE SERVICE InitiatorService ON QUEUE InitiatorQueue ( [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract] ) GO CREATE SERVICE TargetService ON QUEUE TargetQueue ( [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract] ) GO |
As you can see from this listing, the target queue is enabled for activation, means that the ProcessRequestMessages stored procedure automatically started by a Queue Monitor as soon as a new message arrives on that queue. The stored procedure itself just inserts the received message in the table ProcessesMessages. Now imagine that this table has a lock from some other query that is executing inside your database:
1 2 3 |
BEGIN TRANSACTION SELECT * FROM ProcessedMessages WITH (TABLOCKX) GO |
I know that TABLOCKX is a little bit too much, but I just want to demonstrate the problem to you. When you afterwards send a new message, the activation stored procedure gets started, and finally blocks because the necessary IX lock for the INSERT statement can’t be acquired:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
--******************************************************************** --* Sending a message from the InitiatorService to the TargetService --******************************************************************** BEGIN TRY BEGIN TRANSACTION; DECLARE @ch UNIQUEIDENTIFIER DECLARE @msg NVARCHAR(MAX); BEGIN DIALOG CONVERSATION @ch FROM SERVICE [InitiatorService] TO SERVICE 'TargetService' ON CONTRACT [http://ssb.csharp.at/SSB_Book/c04/HelloWorldContract] WITH ENCRYPTION = OFF; SET @msg = '<HelloWorldRequest> Klaus Aschenbrenner </HelloWorldRequest>'; SEND ON CONVERSATION @ch MESSAGE TYPE [http://ssb.csharp.at/SSB_Book/c04/RequestMessage] (@msg); COMMIT; END TRY BEGIN CATCH ROLLBACK TRANSACTION END CATCH GO |
You can now check very easily that your activated stored procedure just blocks and is doing nothing inside SQL Server:
1 2 3 4 5 6 7 8 |
--******************************************************************** --* View the currently activated stored procedures --* The activated stored procedure is now waiting on a LCK_M_IS wait type --* and is never *ever* finishing... --******************************************************************** SELECT t.*, s.last_wait_type FROM sys.dm_broker_activated_tasks t INNER JOIN sys.dm_exec_requests s ON s.session_id = t.spid GO |
The crazy thing about that behavior is, that you can’t just stop SQL Server, you have to do a SHUTDOWN WITH NOWAIT, otherwise SQL Server will not stop anymore…:
1 2 3 4 5 6 7 8 9 10 11 |
-- ************************************************************* -- * A normal shutdown of SQL Server will not working anymore... -- ************************************************************* SHUTDOWN GO -- *************************************** -- * You have to do a SHUTDOWN WITH NOWAIT -- *************************************** SHUTDOWN WITH NO WAIT GO |
Moral of the story: make sure that you release your locks as fast as possible. The other problem with locking in Service Broker is that it yields to high LOCK_HASH spinlocks, which will burn down a lot of CPU cycles without doing anything in SQL Server – it’s a very crazy scenario and in the first step very hard to explain. You can also crosscheck this with sys.dm_os_spinlock_stats. In that scenario the CPU is up to 100% but doesn’t do any work regarding Service Broker – very contraprodutive! Thanks to Thomas Kejser for that tip regarding the LOCK_HASH spinlock.
Thanks for reading
-Klaus
2 Comments
Interesting problem. In your customers case what caused the lock to be held in the first place? I would have expected use of the RECIEVE command which should row lock (I haven’t tested but would expect this). Therefore the entire table should never lock.
Regarding the shutdown I assume this was just a side effect and that you could just kill the blocking spid?
I’m doing a lot of work with service broker right now, hopefully I won’t see this issue!
Thanks
Hello Simon,
The root cause was an uncommitted transactions. The transaction logic in the activated stored procedure was written wrong.
-Klaus