-
Notifications
You must be signed in to change notification settings - Fork 958
Description
Bug Report
- import path of package in question
github.com/Azure/azure-sdk-for-go/sdk/messaging/azservicebus - SDK version
v1.10.0- Specify the exact commit if possible; one way to get this is the REVISION
column output bygo list -m <module>, for examplego list -m github.com/Azure/azure-sdk-for-go/sdk/azcore.
- Specify the exact commit if possible; one way to get this is the REVISION
- output of
go version1.24.4
- What happened?
We had a short service side unavailability which caused InternalServerErrors during ReceiveMessages() calls. It seems like the internal link recovery attempts are executed without any backoffs, therefore hammering the service even when it's down.
Sampled 1 hour of data and from a single container we made close to 3k link recovery requests to Service Bus (roughly 1 recovery attempt every second?).
Note that Service Bus did have to do a manual mitigation for this to recover, but I am not sure if us hammering their endpoints made the problems worse.
Here is a sample of the link recovery loop, we had this pattern until we manually restarted our container. @richardpark-msft I can DM you some query links too if you need them!
| TIMESTAMP | level | msg |
|---|---|---|
| 2025-09-17T23:00:00.0115926Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) refreshing claim |
| 2025-09-17T23:00:00.0116228Z | info | azservicebus SDK event: azsb.Conn, content: Creating new client, current rev: 4379 |
| 2025-09-17T23:00:00.0443104Z | info | azservicebus SDK event: azsb.Conn, content: Client created, new rev: 4380, took 32ms |
| 2025-09-17T23:00:00.0443277Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) negotiate claim, token expires on 2025-09-18T08:38:56Z |
| 2025-09-17T23:00:00.0453975Z | info | azservicebus SDK event: azsb.Auth, content: negotiating claim for audience amqps://akssvcbus-israelnorthwest.servicebus.windows.net/redacted-queue-name with token type jwt and expiry of 1758184736 |
| 2025-09-17T23:00:00.0499255Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) refreshing claim |
| 2025-09-17T23:00:00.0499377Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) negotiate claim, token expires on 2025-09-18T08:38:56Z |
| 2025-09-17T23:00:00.0500455Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) next refresh in 9h28m55.950088677s |
| 2025-09-17T23:00:00.0509761Z | info | azservicebus SDK event: azsb.Auth, content: negotiating claim for audience amqps://akssvcbus-israelnorthwest.servicebus.windows.net/redacted-queue-name/$management with token type jwt and expiry of 1758184736 |
| 2025-09-17T23:00:00.0522487Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) next refresh in 9h28m55.947863564s |
| 2025-09-17T23:00:00.0555609Z | info | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Links created |
| 2025-09-17T23:00:00.0555723Z | info | azservicebus SDK event: azsb.Receiver, content: [c:4380, l:4380, r:name:mBbW_4] Issuing 100 credits, have 0 |
| 2025-09-17T23:00:00.3072010Z | info | azservicebus SDK event: azsb.Receiver, content: [c:4380, l:4380, r:name:mBbW_4] Received 0/100 messages |
| 2025-09-17T23:00:00.3072188Z | error | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Closing connection AND links for error *Error{Condition: amqp:internal-error, Description: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d6234e7-d7c3-4646-b139-0086e82c3aef, TrackingId:654c14f1000000030003629668cb3d70_G11_B22, SystemTracker:gi::G11:2193799:mBbW_4ZYxiqCjKm59_Flvpp9ZtSYTqh9VBfCaH_MKRu3CQnTuPiMog, bi::link221846(2193800_G11), Timestamp:2025-09-17T23:00:00, Info: map[]} |
| 2025-09-17T23:00:00.3072214Z | info | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Links closing (permanent: false) |
| 2025-09-17T23:00:00.3072242Z | info | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Closing Receiver |
| 2025-09-17T23:00:00.3072261Z | info | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Closing Session |
| 2025-09-17T23:00:00.3075907Z | info | azservicebus SDK event: azsb.Conn, content: [c:4380, l:4380, r:name:mBbW_4] Closing RPC |
| 2025-09-17T23:00:00.3082437Z | error | azservicebus SDK event: azsb.Receiver, content: [c:4380, l:4380, r:name:mBbW_4] Failure when receiving messages: *Error{Condition: amqp:internal-error, Description: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:9d6234e7-d7c3-4646-b139-0086e82c3aef, TrackingId:654c14f1000000030003629668cb3d70_G11_B22, SystemTracker:gi::G11:2193799:mBbW_4ZYxiqCjKm59_Flvpp9ZtSYTqh9VBfCaH_MKRu3CQnTuPiMog, bi::link221846(2193800_G11), Timestamp:2025-09-17T23:00:00, Info: map[]} |
| 2025-09-17T23:00:01.3087903Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) refreshing claim |
| 2025-09-17T23:00:01.3088149Z | info | azservicebus SDK event: azsb.Conn, content: Creating new client, current rev: 4380 |
| 2025-09-17T23:00:01.3357072Z | info | azservicebus SDK event: azsb.Conn, content: Client created, new rev: 4381, took 26ms |
| 2025-09-17T23:00:01.3357246Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) negotiate claim, token expires on 2025-09-18T08:38:56Z |
| 2025-09-17T23:00:01.3367936Z | info | azservicebus SDK event: azsb.Auth, content: negotiating claim for audience amqps://akssvcbus-israelnorthwest.servicebus.windows.net/redacted-queue-name with token type jwt and expiry of 1758184736 |
| 2025-09-17T23:00:01.3380178Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) refreshing claim |
| 2025-09-17T23:00:01.3380299Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) negotiate claim, token expires on 2025-09-18T08:38:56Z |
| 2025-09-17T23:00:01.3381426Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name) next refresh in 9h28m54.662004374s |
| 2025-09-17T23:00:01.3389710Z | info | azservicebus SDK event: azsb.Auth, content: negotiating claim for audience amqps://akssvcbus-israelnorthwest.servicebus.windows.net/redacted-queue-name/$management with token type jwt and expiry of 1758184736 |
| 2025-09-17T23:00:01.3401031Z | info | azservicebus SDK event: azsb.Auth, content: (redacted-queue-name/$management) next refresh in 9h28m54.659994162s |
| 2025-09-17T23:00:01.3462211Z | info | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Links created |
| 2025-09-17T23:00:01.3462994Z | info | azservicebus SDK event: azsb.Receiver, content: [c:4381, l:4381, r:name:Jb5Ji6] Issuing 100 credits, have 0 |
| 2025-09-17T23:00:01.5727826Z | info | azservicebus SDK event: azsb.Receiver, content: [c:4381, l:4381, r:name:Jb5Ji6] Received 0/100 messages |
| 2025-09-17T23:00:01.5728086Z | error | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Closing connection AND links for error *Error{Condition: amqp:internal-error, Description: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 Reference:2667034d-508e-4049-af04-2443a300de7c, TrackingId:654c14f1000000030003629968cb3d71_G9_B22, SystemTracker:gi::G9:7863226:Jb5Ji6Bwreb6EG3oiVcJ8Ds27EMkNjLzBIpLMeaQKk4ymSLxVAjVsg, bi::link221849(7863227_G9), Timestamp:2025-09-17T23:00:01, Info: map[]} |
| 2025-09-17T23:00:01.5728164Z | info | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Links closing (permanent: false) |
| 2025-09-17T23:00:01.5728220Z | info | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Closing Receiver |
| 2025-09-17T23:00:01.5728437Z | info | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Closing Session |
| 2025-09-17T23:00:01.5733018Z | info | azservicebus SDK event: azsb.Conn, content: [c:4381, l:4381, r:name:Jb5Ji6] Closing RPC |
- What did you expect or want to happen?
Should we have some kind of backoff if the link recovery fails repeatedly?
- How can we reproduce it?
Unsure.. We only caught it because one of our queues were totally down
- Anything we should know about your environment.
Running on k8s