Skip to content

Conversation

danielmarbach
Copy link
Contributor

@danielmarbach danielmarbach commented Sep 3, 2024

Related issue:

This PR follows through on related changes done in Amazon SQS but adjusts the core pieces of the change to Azure Service Bus-specific behavior. Additionally, it brings in a number of layered tests that verify the code works as expected and doesn't accidentally break later.

An LRU cache to track the delivery attempts is not necessary because the receive count is automatically updated by the storage queue infrastructure.

The changes have only been applied for the receive only transaction mode because with the sends-atomic receive mode the outgoing messages are enlisted with a send-via information together with the incoming message. The outgoing messages are only sent when the incoming message successfully completed. In cases when the incoming message wasn't successfully completed, the outgoing operations are discarded by the broker and there is no way to refer to outgoing operations of a previous message handling attempt.

  • Caching Acknowledged Messages: A cache (using BitFaster.Caching.Lru) is introduced to keep track of messages that have been marked as successfully completed. This helps in avoiding unnecessary retries for messages that are already acknowledged.
  • Error Handling Improvements: Enhanced handling around retries and error recovery, including better logging and recovery actions for different failure scenarios.
  • Minor cleanups such as removing the async suffix or making the abandon actually safe.

Before ReceiveOnly

graph TD
    A[Receive Message] --> B[Invoke OnMessage]
    B --> C{Success?}
    C -- Yes --> D[Acknowledge message]
    D --> E[Message acknowledged]  
    C -- No --> F[Invoke OnError]
    F --> G{Error requires retry?}
    G -- Yes --> H[Requeue message]
    G -- No --> I[Acknowledge message]
    H --> J[Message requeued]
    I --> E[Message acknowledged]
Loading

After ReceiveOnly

graph TD
    A[Receive Message] --> B[ID in LRU]
    B -- Yes --> M[Acknowledge message]
    F --> G[Message tracked]
    B -- No --> H[Invoke OnMessage]
    H --> I{Success?}
    I -- Yes --> M
    I -- No --> J[Invoke OnError]
    J --> K{Error requires retry?}
    K -- Yes --> L[Requeue message]
    K -- No --> M[Acknowledge message]
    M --> N{Ack success?}
    N -- Yes --> E[Message acknowledged]
    N -- No --> F[Track ID in LRU]
    G -- Message requeued --> A
Loading

Tradeoffs

The tradeoffs are the sames as the other fixes applied in RabbitMQ and Amazon SQS. In competing consumer scenarios there can still be more retries because the LRU cache is in memory, but it is still marginally better than having a potentially unlimited number of retries.

As for the transaction mode None, there is no change required since we are not giving any guarantees there and messages are acknowledged early on with ReceiveAndDelete.

Ideally at some point, if the complexity grows further, it would be good to split the pump into different receive strategies. Given the message settlement operations already had specific code around transport transaction modes and the risk of doing such a split is high a decision was made to not introduce a receive strategy and treat the settlement operation extensions as closesly related to the pump (hence they share the same logger)

@danielmarbach danielmarbach force-pushed the receive-only-lock-lost-discussion branch from bd315c2 to 42ba8e7 Compare September 10, 2024 14:39
@danielmarbach danielmarbach changed the title LockLost: WIP for discussion Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases in receiveonly mode Sep 13, 2024
@danielmarbach danielmarbach marked this pull request as ready for review September 13, 2024 09:58
@danielmarbach danielmarbach self-assigned this Sep 13, 2024
@danielmarbach danielmarbach force-pushed the receive-only-lock-lost-discussion branch from 827009a to 9032693 Compare September 17, 2024 14:22
@danielmarbach
Copy link
Contributor Author

There is intermittent flakiness happening with the concurrency tests that got introduced recently. It also happens on master so it is quite likely not related to the changes proposed in this PR

@danielmarbach
Copy link
Contributor Author

danielmarbach commented Sep 20, 2024

Requires a rebase after #1042 has been merged

@danielmarbach danielmarbach force-pushed the receive-only-lock-lost-discussion branch from 9032693 to d4621e4 Compare September 20, 2024 13:44
@TravisNickels TravisNickels merged commit 5388e89 into master Sep 23, 2024
3 checks passed
@TravisNickels TravisNickels deleted the receive-only-lock-lost-discussion branch September 23, 2024 23:55
danielmarbach added a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach added a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach added a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach added a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)

# Conflicts:
#	src/AcceptanceTests/NServiceBus.Transport.AzureServiceBus.AcceptanceTests.csproj
#	src/CommandLine/NServiceBus.Transport.AzureServiceBus.CommandLine.csproj
#	src/CommandLineTests/NServiceBus.Transport.AzureServiceBus.CommandLine.Tests.csproj
#	src/Tests/FakeReceiver.cs
#	src/Tests/NServiceBus.Transport.AzureServiceBus.Tests.csproj
#	src/Transport/NServiceBus.Transport.AzureServiceBus.csproj
#	src/TransportTests/NServiceBus.Transport.AzureServiceBus.TransportTests.csproj
danielmarbach pushed a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach pushed a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach pushed a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach pushed a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach pushed a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach added a commit that referenced this pull request Sep 24, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1034)
danielmarbach added a commit that referenced this pull request Sep 25, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1045)

* Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases in receiveonly mode (#1034)

* Update src/AcceptanceTests/Receiving/When_message_visibility_expired.cs

Co-authored-by: Travis Nickels <travis.nickels@particular.net>

---------

Co-authored-by: Travis Nickels <travis.nickels@particular.net>
danielmarbach added a commit that referenced this pull request Sep 25, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1046)

* Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases in receiveonly mode (#1034)

# Conflicts:
#	src/AcceptanceTests/NServiceBus.Transport.AzureServiceBus.AcceptanceTests.csproj
#	src/CommandLine/NServiceBus.Transport.AzureServiceBus.CommandLine.csproj
#	src/CommandLineTests/NServiceBus.Transport.AzureServiceBus.CommandLine.Tests.csproj
#	src/Tests/FakeReceiver.cs
#	src/Tests/NServiceBus.Transport.AzureServiceBus.Tests.csproj
#	src/Transport/NServiceBus.Transport.AzureServiceBus.csproj
#	src/TransportTests/NServiceBus.Transport.AzureServiceBus.TransportTests.csproj

* Update src/AcceptanceTests/Receiving/When_message_visibility_expired.cs

Co-authored-by: Travis Nickels <travis.nickels@particular.net>

---------

Co-authored-by: Travis Nickels <travis.nickels@particular.net>
danielmarbach added a commit that referenced this pull request Sep 25, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1044)

* Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases in receiveonly mode (#1034)

* Update src/AcceptanceTests/Receiving/When_message_visibility_expired.cs

Co-authored-by: Travis Nickels <travis.nickels@particular.net>

---------

Co-authored-by: Travis Nickels <travis.nickels@particular.net>
danielmarbach added a commit that referenced this pull request Sep 25, 2024
…ine but failed to get acknowledged due to expired leases in receiveonly mode (#1047)

* Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases in receiveonly mode (#1034)

* Update src/AcceptanceTests/Receiving/When_message_visibility_expired.cs

Co-authored-by: Travis Nickels <travis.nickels@particular.net>

---------

Co-authored-by: Daniel Marbach <danielmarbach@users.noreply.github.com>
Co-authored-by: Travis Nickels <travis.nickels@particular.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants