Description
Describe the bug
I have an application running under IIS. It is one of many in our microservices architecture, but this one has problems at startup: some consumers are not created.
So far I have identified 2 exceptions that are thrown somewhere inside RabbitMQ.Client (or in code it calls).
What my application does at startup:
1. creates a connection to the server
2. gets the list of consumers it needs to create
3. goes through the list (one by one)
4. gets a channel from the pool (always from the same connection)
5. checks the exchange, queue and bindings (using ExchangeDeclareAsync, QueueDeclareAsync, QueueBindAsync)
6. starts the consumer
7. repeats from 4. with the next consumer until all are configured
However, one service gets a TaskCanceledException at random places in the code: sometimes inside ExchangeDeclareAsync, sometimes inside QueueDeclareAsync, sometimes inside QosAsync, basically in any of the async methods that are part of RabbitMQ.Client.
I double-checked: the cancellation token I pass to these methods is never cancelled.
Here is an example stack trace:
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at RabbitMQ.Client.Impl.Channel.QueueDeclareAsync(String queue, Boolean durable, Boolean exclusive, Boolean autoDelete, IDictionary`2 arguments, Boolean passive, Boolean noWait, CancellationToken cancellationToken)
   at RabbitMQ.Client.Impl.AutorecoveringChannel.QueueDeclareAsync(String queue, Boolean durable, Boolean exclusive, Boolean autoDelete, IDictionary`2 arguments, Boolean passive, Boolean noWait, CancellationToken cancellationToken)
   at ESCID.ESP.Messaging.RabbitMQ.Server.Hosting.RabbitMqServerManager.TryDeclareQueueAsync(RabbitMqChannelPool channels, ILogger logger, String queueName, Boolean autoDelete, Dictionary`2 arguments, CancellationToken cancellationToken)
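One way to back up the claim that the caller's token is not the source of the cancellation is to classify each cancellation at the call site. The following is a minimal sketch using only the .NET base class library; `ClassifyAsync` and `CanceledInternally` are hypothetical names introduced for illustration and are not part of RabbitMQ.Client. `CanceledInternally` stands in for a library call whose internal task is cancelled independently of the token it receives.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of RabbitMQ.Client): runs an operation and
// reports whether a cancellation came from the caller's token or elsewhere.
static async Task<string> ClassifyAsync(
    Func<CancellationToken, Task> operation, CancellationToken callerToken)
{
    try
    {
        await operation(callerToken);
        return "completed";
    }
    catch (OperationCanceledException) when (callerToken.IsCancellationRequested)
    {
        // The caller's own token was cancelled.
        return "canceled by caller token";
    }
    catch (OperationCanceledException)
    {
        // TaskCanceledException derives from OperationCanceledException,
        // so it lands here when the caller's token is still uncancelled.
        return "canceled by something else";
    }
}

// Stand-in for a library call that ignores the passed token and returns
// a task that was cancelled by some internal mechanism.
static Task CanceledInternally(CancellationToken _) =>
    Task.FromCanceled(new CancellationToken(canceled: true));

using var cts = new CancellationTokenSource(); // never cancelled
string outcome = await ClassifyAsync(CanceledInternally, cts.Token);
Console.WriteLine(outcome); // prints "canceled by something else"
```

Wrapping the declare calls in a helper like this and logging the result would show, per call, whether the cancellation originated outside the token passed in.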
More rarely, I get a different exception:
RabbitMQ.Client.Exceptions.OperationInterruptedException: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='An attempt was made to transition a task to a final state when it had already completed.', classId=0, methodId=0, exception=System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetResult(TResult result)
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
 ---> System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetResult(TResult result)
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
   --- End of inner exception stack trace ---
   at RabbitMQ.Client.Impl.Channel.OpenAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Impl.RecoveryAwareChannel.CreateAndOpenAsync(ISession session, CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.AutorecoveringConnection.CreateChannelAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at ESCID.ESP.Messaging.RabbitMQ.Connections.Channels.RabbitMqChannelPool.Get(String channelOwner, CancellationToken cancellationToken)
The last one looks like a multithreading issue. However, I double-checked: every call in this code path is awaited, so the whole sequence runs sequentially and each step completes before the next one starts.
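For reference, both exception messages in the traces above can be produced by a single `TaskCompletionSource` that is completed twice. The sketch below uses only the .NET base class library. It assumes, without confirmation from the traces, that some internal path (for example a timeout) cancels the pending task before the broker's reply is handled; `pendingReply` is a hypothetical stand-in, not the library's actual continuation type.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a pending RPC reply.
var pendingReply = new TaskCompletionSource<bool>(
    TaskCreationOptions.RunContinuationsAsynchronously);

// Assumed side A: a timeout or internal cancellation completes the task first.
pendingReply.TrySetCanceled();

// The awaiting caller sees TaskCanceledException although it never
// cancelled any token of its own.
string callerSaw;
try
{
    await pendingReply.Task;
    callerSaw = "result";
}
catch (TaskCanceledException ex)
{
    callerSaw = ex.Message;
}

// Assumed side B: the reply arrives late and the handler calls SetResult,
// which throws because the task is already in a final state.
string handlerSaw;
try
{
    pendingReply.SetResult(true);
    handlerSaw = "ok";
}
catch (InvalidOperationException ex)
{
    handlerSaw = ex.Message;
}

Console.WriteLine(callerSaw);
Console.WriteLine(handlerSaw);
```

On an English-locale runtime the two printed messages should correspond to the ones in the two traces. `TrySetResult` would return `false` in the second step instead of throwing. Whether this ordering is what actually happens inside the client is an open question for the maintainers.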
The behavior can be reproduced quite reliably during deployment to the server, but I was not able to reproduce it on my local machine or on other developers' machines. The server it is deployed to is also one of the slower ones, which might point to a timing or thread-starvation issue.
Maybe the OperationInterruptedException gives you some clues about how this can happen and how it can be fixed?
Channels are not shared between threads in the app. Around 32 consumers are created (and about 32 channels, one for each consumer). I also made sure that the application is not stopping right after start, so something else must be cancelling the operations.
Reproduction steps
I am not able to reproduce it on any machine except that one server.
Expected behavior
The exceptions should not happen.
Additional context
No response