-
Notifications
You must be signed in to change notification settings - Fork 48
Add Redis PRQ observability infrastructure #690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add IMessageSerialiserAndDataStreamStorageExceptionObserver interface for monitoring message serialization exceptions - Create MessageSerialiserAndDataStreamStorageWithExceptionObserver decorator to wrap existing implementations - Add NoOpMessageSerialiserAndDataStreamStorageExceptionObserver as default implementation - Update RedisPendingRequestQueueFactory to accept and use exception observer - Add IRedisFacadeObserver interface for monitoring Redis connection events and retry operations - Create NoOpRedisFacadeObserver as default implementation - Update RedisFacade to accept observer and notify on: - Redis connection failures (OnRedisConnectionFailed) - Redis error messages (OnRedisServerRepliedWithAnErrorMessage) - Redis connection restoration (OnRedisConnectionRestored) - Redis operation failures and retries (OnRedisOperationFailed) - Update RedisFacadeBuilder to accept observer parameter - Add comprehensive tests for both observer implementations - Update existing code to use no-op observers by default This enables monitoring and logging of Redis infrastructure issues and message processing errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds observer interfaces and no-op implementations to provide observability for Redis connectivity issues and message serialization/data stream storage exceptions. Key changes introduce IRedisFacadeObserver and IMessageSerialiserAndDataStreamStorageExceptionObserver along with decorator wiring and corresponding tests.
- Added observer pattern to RedisFacade for connection, error, retry, and restoration events.
- Added exception observer decorator for message serialization/data stream storage operations and factory integration.
- Added tests for both observer mechanisms (Redis connectivity and serialization/storage failures).
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
RedisPendingRequestQueueFactory.cs | Injects exception observer and optional queue decorator; wraps storage with observer decorator. |
RedisFacade.cs | Adds observer integration and modifies retry logic to surface retry/final failure state. |
NoOpRedisFacadeObserver.cs | Provides no-op implementation for optional Redis observer. |
IRedisFacadeObserver.cs | Defines Redis observer interface. |
NoOpMessageSerialiserAndDataStreamStorageExceptionObserver.cs | Provides no-op implementation for serialization/storage exception observer. |
MessageSerialiserAndDataStreamStorageWithExceptionObserver.cs | Decorator capturing and forwarding exceptions to observer. |
IMessageSerialiserAndDataStreamStorageExceptionObserver.cs | Defines exception observer interface for message serialization/storage. |
RedisFacadeBuilder.cs | Test builder updated to accept optional observer. |
RedisFacadeObserverTest.cs | New tests for Redis observer notifications (connection failure/restoration and retry behavior). |
MessageSerialiserAndDataStreamStorageExceptionObserverTest.cs | New test validating exception observer invocation during storage failures. |
ManyPollingTentacleTests.cs | Adds using for new message storage namespace. |
RedisPendingRequestQueueBuilder.cs | Wraps message storage with no-op exception observer in tests. |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
source/Halibut.Tests/Queue/Redis/RedisHelpers/RedisFacadeObserverTest.cs
Show resolved
Hide resolved
source/Halibut.Tests/Queue/Redis/RedisHelpers/RedisFacadeObserverTest.cs
Show resolved
Hide resolved
IMessageSerialiserAndDataStreamStorage inner, | ||
IMessageSerialiserAndDataStreamStorageExceptionObserver exceptionObserver) | ||
{ | ||
this.inner = inner ?? throw new ArgumentNullException(nameof(inner)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we guarding for null? This isn't a pattern we tend to do at Octopus, as we tend to use the compiler for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we mostly don't really need to as the type system mostly guards against this although not in all cases I think we might still have some null checks off in halibut I can't remember. In any case "the ai wrote this, and I don't think it hurts to have it."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but still, if the AI wrote this, can we make it so the AI doesn't write code that's outside our conventions? It caught my eye, and will likely catch the eye of others.
It just feels like the start of a bad precedent to me, that's all.
var messageSerializer = new QueueMessageSerializerBuilder().Build(); | ||
var messageReaderWriter = new MessageSerialiserAndDataStreamStorage(messageSerializer, dataStreamStore); | ||
var baseMessageReaderWriter = new MessageSerialiserAndDataStreamStorage(messageSerializer, dataStreamStore); | ||
var messageReaderWriter = new MessageSerialiserAndDataStreamStorageWithExceptionObserver(baseMessageReaderWriter, NoOpMessageSerialiserAndDataStreamStorageExceptionObserver.Instance); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, why do we wrap baseMessageReaderWriter
in messageReaderWriter
here?
We are using the "no op" version of the observer.
Also, it's a decorator, so we can pass in baseMessageReaderWriter
as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do I not see this comment on the Files tab, removing the decorator nonsense
@@ -0,0 +1,131 @@ | |||
#if NET8_0_OR_GREATER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a while since I've been in this code base, but why are we doing this for .NET 8 or higher only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, because the queue is only supported in net8 or higher. The Redis PRQ does not exist for net48. Because screw net48!
var getStringTask = redisFacade.GetString("foo", CancellationToken); | ||
|
||
// Wait a bit for retries to happen, then restore connection | ||
await Task.Delay(6000); // By-default (somewhere) the redis client will wait 5s for a request to get to Redis so we need to wait longer than that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to wait until the first error is thrown or something else? Delays in tests make me nervous, as they can lead to flaky tests.
Ok if not possible, but nice if we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call out, we can do that instead it is waaay better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just some questions and minor comments
Summary
This PR adds the ability to observe the Redis PRQ, this will allow us to monitor how the Redis PRQ is behaving during roll out.
The changes allow us to monitor specific parts of of the queue for example:
IMessageSerialiserAndDataStreamStorageExceptionObserver
: Allows us to observe a section of the queue that should never throw exceptions.IRedisFacadeObserver
: Allows us to observe problematic communications with Redis.RedisPendingRequestQueueFactory
will allow us to observe exceptions coming out of the queue.This will make it very easy to setup alerting for example, we would create an alert should any exception bubble out of
IMessageSerialiserAndDataStreamStorageExceptionObserver
AI Summary
This PR introduces comprehensive observability for Redis operations in Halibut, enabling monitoring and logging of Redis infrastructure issues and message processing errors.
Changes
Redis Facade Observer
IRedisFacadeObserver
interface for monitoring Redis eventsOnRedisConnectionFailed
)OnRedisServerRepliedWithAnErrorMessage
)OnRedisConnectionRestored
)OnRedisOperationFailed
)RedisFacade
to accept and use observerMessage Serializer Observer
IMessageSerialiserAndDataStreamStorageExceptionObserver
for monitoring serialization errorsRedisPendingRequestQueueFactory
to use observer patternTesting
RedisFacadeObserverTest
with scenarios for connection disruptionsMessageSerialiserAndDataStreamStorageExceptionObserverTest
This enables applications to monitor Redis health and respond to infrastructure issues proactively.
How to review this PR
Quality ✔️
Pre-requisites