So you built your first Service Broker app, you’ve sent the first
message and now you’re looking for the message on the target queue. Yet, the
message is not there. What do you do? Where do you look first? Well,
troubleshooting Service Broker is a bit different than troubleshooting your
everyday database app.
So here is a short guide that you can follow to troubleshoot Service Broker
issues.
First, you should ensure that the message was actually sent and committed.
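A quick way to rule out the 'sent but never committed' case is to look at how the send is wrapped in a transaction. The following is only a minimal sketch: the service, contract and message type names (everything under //example/) are hypothetical placeholders, not names from any real application.

```sql
-- Hypothetical object names; replace them with your own services,
-- contract and message type.
DECLARE @handle UNIQUEIDENTIFIER;

BEGIN TRANSACTION;

BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE [//example/InitiatorService]
    TO SERVICE N'//example/TargetService'
    ON CONTRACT [//example/Contract];

SEND ON CONVERSATION @handle
    MESSAGE TYPE [//example/Message] (N'<hello/>');

-- The message only leaves the database once the transaction commits.
-- A rollback (or a connection dropped before this point) discards it.
COMMIT TRANSACTION;

-- Should return 0 here; a value above 0 means a transaction is still open.
SELECT @@TRANCOUNT AS open_transactions;
```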
Next, check if the message exists in the sys.transmission_queue. The
transmission queue is similar to an Outgoing mailbox, an ‘Outbox’. Messages are
kept there until the target acknowledges that it received the message, after
which they are deleted. Therefore, if the message is still in the transmission
queue it means that it was not yet acknowledged by the destination. How do you
diagnose what the problem is? My recommendation is to follow the message flow:
the message is sent by the sender, then accepted by the target, then an ack is
sent by the target, and finally this ack is accepted by the sender. If any of
these steps has a problem, the message will sit in sys.transmission_queue.
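To see what is currently sitting in the xmit queue, a query along these lines can be run in the sender's database. It is a sketch that only uses columns of the sys.transmission_queue catalog view; the column selection and ordering are my choice.

```sql
-- Run in the sender's database: messages not yet acknowledged by the target.
SELECT conversation_handle,
       from_service_name,
       to_service_name,
       to_broker_instance,
       message_type_name,
       enqueue_time,
       transmission_status   -- empty while no send error has been recorded
FROM sys.transmission_queue
ORDER BY enqueue_time;
```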
Let’s now look at how to diagnose each step of the message flow. BTW, I often
refer to the acknowledgement as ‘ack’, and to the transmission queue as ‘xmit
queue’.
1. The sender cannot send the message, for whatever reason. If this is the
case, the transmission_status column in sys.transmission_queue will contain an
error message that will point at the problem. The appropriate action depends on
the error being displayed.
Common problems include security problems (no database master key, no remote
service binding, no certificates etc.), classification problems (no route for
the target service etc.) and adjacent transport connection issues (connection
handshake errors, unreachable target host etc.).
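The configuration problems listed above can be checked from the catalog views on the sender. The queries below are a rough checklist, not an exhaustive one; which rows you should expect depends entirely on your own setup.

```sql
-- Run in the sender's database.

-- Is there a database master key? (no row means there is none)
SELECT name
FROM sys.symmetric_keys
WHERE name = N'##MS_DatabaseMasterKey##';

-- Remote service bindings, needed for secure dialogs to remote services
SELECT name, remote_service_name, is_anonymous_on
FROM sys.remote_service_bindings;

-- Certificates available in this database, with their expiry dates
SELECT name, subject, start_date, expiry_date
FROM sys.certificates;

-- Routes: is there one that matches the target service?
SELECT name, remote_service_name, broker_instance, address
FROM sys.routes;
```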
2. The sender does send the message and the message reaches the target but the
target does not accept the message. In this case, the sender’s
transmission_status will be empty. To diagnose this issue, you must attach the
Profiler to the target machine and enable the following events:
‘Broker/Broker:Conversation’, ‘Broker/Broker:Message Undeliverable’ and
‘Broker/Broker:Remote Message Acknowledgement’. When the message arrives, you
will see the event ‘Broker/Broker:Remote Message Acknowledgement‘ with the
EventSubClass ‘Message with Acknowledgement Received‘ followed by
‘Broker/Broker:Message Undeliverable‘ event. The TextData of this last event
will contain an error message that will point at the problem.
Common problems in this case are security problems (to investigate these you
must additionally turn on the ‘Audit Security/Audit Broker Conversation‘ event
in the Profiler; the TextData should pinpoint the cause of the failure), typos
in service names or the broker instance id, and disabled target queues.
Note that if the error in TextData says ‘This message could not be
delivered because it is a duplicate.’ it means that the message was actually
accepted by the target, but the acks don’t reach the sender and
therefore the sender is retrying the message again and again (see below).
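Some of the non-security causes from this step (typos in service names, wrong broker instance id, disabled queues) can be verified directly on the target without the Profiler. This is a sketch to be run in the target database; compare its output by eye with what the sender used in BEGIN DIALOG.

```sql
-- Run in the target database.

-- Is the target queue enabled? A disabled queue shows 0 in both columns.
SELECT name, is_receive_enabled, is_enqueue_enabled
FROM sys.service_queues;

-- Exact service names and the queues they are bound to.
-- Compare character by character with the name used by the sender.
SELECT s.name AS service_name, q.name AS queue_name
FROM sys.services AS s
JOIN sys.service_queues AS q
    ON q.object_id = s.service_queue_id;

-- The broker instance id of this database, and whether the broker is enabled.
SELECT name, is_broker_enabled, service_broker_guid
FROM sys.databases
WHERE database_id = DB_ID();
```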
3. The sender does send the message, the message reaches the target and is
accepted, but the target is unable to send back an ack. Same as above, you must
attach the Profiler to the target machine and you will see repeated occurrences
of the ‘Broker/Broker:Message Undeliverable‘ event with the TextData ‘This
message could not be delivered because it is a duplicate.‘. The event will be
generated each time the sender retries the message, which happens about
once a minute (strictly speaking, after 4, 8, 16, 32 and then once every 64
seconds).
Typically the problem is a misconfigured route back from the target to the
sender (the route for the initiator service is missing). The Profiler event
‘Broker:Message Classify‘ will show this, with an EventSubClass ‘3 – Delayed’
and a TextData message of ‘The target service name could not be found. Ensure
that the service name is specified correctly and/or the routing information has
been supplied.’.
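To check for the missing route back, you can list the routes in the target database and see whether any of them matches the initiator service. In this sketch the service name //example/InitiatorService is a hypothetical placeholder for your own initiator service name.

```sql
-- Run in the target database.
-- N'//example/InitiatorService' is a placeholder for the initiator service.
SELECT name, remote_service_name, broker_instance, address
FROM sys.routes
WHERE remote_service_name = N'//example/InitiatorService'
   OR remote_service_name IS NULL;   -- a NULL service name matches any service
```

If this returns only a route whose address is LOCAL while the initiator lives on another instance, the acks have no way of getting back to it.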
Another possible cause is a typo in the route configured on the target for the
sender service. Since the ack is not stored in sys.transmission_queue, you
don’t have the handy transmission_status. Or do you? Actually, you can use the
get_transmission_status function to get the transmission status of the ack!
Look up the conversation handle in sys.conversation_endpoints and then use this
function to query the transmission status of the ack sent by that dialog back
to the sender.
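The lookup described above can be done in a single query. This is a sketch meant to be run in the target database; filtering on is_initiator = 0 is my assumption that you only care about the target side of each dialog.

```sql
-- Run in the target database: one row per target-side conversation endpoint,
-- with the transmission status of whatever it last tried to send back.
SELECT conversation_handle,
       state_desc,
       far_service,
       far_broker_instance,
       GET_TRANSMISSION_STATUS(conversation_handle) AS ack_transmission_status
FROM sys.conversation_endpoints
WHERE is_initiator = 0;
```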
4. The sender does send the message, but the message never reaches the target.
This can happen only if there are intermediate hops (SQL Server instances
acting as forwarders). To determine which forwarder drops the messages, you
have to connect the Profiler to each forwarder and see which one traces
‘Broker:Forwarded Message Dropped’ events. The most likely causes are either a
message timeout (the forwarder can’t send the message in time due to high load)
or misconfigured routing information on the forwarder (like missing routes in
the msdb database, which is the one used for forwarding).
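Before attaching the Profiler, it may be worth a quick look at each forwarder's configuration. The sketch below checks whether forwarding is enabled on the Service Broker endpoint and lists the routes in msdb; it says nothing about timeouts under load, which still need the trace.

```sql
-- Run on each forwarder instance.

-- Is the broker endpoint started, and is message forwarding turned on?
SELECT name, state_desc, is_message_forwarding_enabled, message_forwarding_size
FROM sys.service_broker_endpoints;

-- Routes used for forwarding live in msdb.
SELECT name, remote_service_name, broker_instance, address
FROM msdb.sys.routes;
```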
5. The sender does send the message, the message reaches the target and is
accepted, the target sends back the ack, but the ack never makes it back to the
initiator (again, a forwarder is required for this to happen). Investigating
this issue is identical to the issue above: attach the Profiler to each
forwarder until you find the one that is dropping messages. Note that the
forwarders from the sender to the target are not necessarily the same as the
ones on the route from the target back to the sender!
From the “ITPUB blog”, link: http://blog.itpub.net/16754036/viewspace-734014/. If you repost this, please cite the source; otherwise legal action may be pursued.