Skip to content

Conversation

LukeButters
Copy link
Contributor

@LukeButters LukeButters commented Sep 22, 2025

Summary

This PR adds the ability to observe the Redis PRQ, this will allow us to monitor how the Redis PRQ is behaving during roll out.

The changes allow us to monitor specific parts of of the queue for example:

  • IMessageSerialiserAndDataStreamStorageExceptionObserver: Allows us to observe a section of the queue that should never throw exceptions.
  • IRedisFacadeObserver: Allows us to observe problematic communications with Redis.
  • The Queue decorator in RedisPendingRequestQueueFactory will allow us to observe exceptions coming out of the queue.

This will make it very easy to setup alerting for example, we would create an alert should any exception bubble out of IMessageSerialiserAndDataStreamStorageExceptionObserver

AI Summary

This PR introduces comprehensive observability for Redis operations in Halibut, enabling monitoring and logging of Redis infrastructure issues and message processing errors.

Changes

Redis Facade Observer

  • Added IRedisFacadeObserver interface for monitoring Redis events
  • Implemented observer notifications for:
    • Connection failures (OnRedisConnectionFailed)
    • Server error messages (OnRedisServerRepliedWithAnErrorMessage)
    • Connection restoration (OnRedisConnectionRestored)
    • Operation failures and retries (OnRedisOperationFailed)
  • Updated RedisFacade to accept and use observer
  • Added comprehensive tests demonstrating functionality

Message Serializer Observer

  • Added IMessageSerialiserAndDataStreamStorageExceptionObserver for monitoring serialization errors
  • Created decorator pattern implementation for exception tracking
  • Updated RedisPendingRequestQueueFactory to use observer pattern

Testing

  • Added RedisFacadeObserverTest with scenarios for connection disruptions
  • Added MessageSerialiserAndDataStreamStorageExceptionObserverTest
  • All tests pass and demonstrate observer functionality

This enables applications to monitor Redis health and respond to infrastructure issues proactively.

How to review this PR

Quality ✔️

Pre-requisites

  • I have read How we use GitHub Issues for help deciding when and where it's appropriate to make an issue.
  • I have considered informing or consulting the right people, according to the ownership map.
  • I have considered appropriate testing for my change.

- Add IMessageSerialiserAndDataStreamStorageExceptionObserver interface for monitoring message serialization exceptions
- Create MessageSerialiserAndDataStreamStorageWithExceptionObserver decorator to wrap existing implementations
- Add NoOpMessageSerialiserAndDataStreamStorageExceptionObserver as default implementation
- Update RedisPendingRequestQueueFactory to accept and use exception observer
- Add IRedisFacadeObserver interface for monitoring Redis connection events and retry operations
- Create NoOpRedisFacadeObserver as default implementation
- Update RedisFacade to accept observer and notify on:
  - Redis connection failures (OnRedisConnectionFailed)
  - Redis error messages (OnRedisServerRepliedWithAnErrorMessage)
  - Redis connection restoration (OnRedisConnectionRestored)
  - Redis operation failures and retries (OnRedisOperationFailed)
- Update RedisFacadeBuilder to accept observer parameter
- Add comprehensive tests for both observer implementations
- Update existing code to use no-op observers by default

This enables monitoring and logging of Redis infrastructure issues and message processing errors.
@LukeButters LukeButters requested a review from a team as a code owner September 22, 2025 05:03
@LukeButters LukeButters requested review from Copilot and removed request for a team September 22, 2025 05:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds observer interfaces and no-op implementations to provide observability for Redis connectivity issues and message serialization/data stream storage exceptions. Key changes introduce IRedisFacadeObserver and IMessageSerialiserAndDataStreamStorageExceptionObserver along with decorator wiring and corresponding tests.

  • Added observer pattern to RedisFacade for connection, error, retry, and restoration events.
  • Added exception observer decorator for message serialization/data stream storage operations and factory integration.
  • Added tests for both observer mechanisms (Redis connectivity and serialization/storage failures).

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
RedisPendingRequestQueueFactory.cs Injects exception observer and optional queue decorator; wraps storage with observer decorator.
RedisFacade.cs Adds observer integration and modifies retry logic to surface retry/final failure state.
NoOpRedisFacadeObserver.cs Provides no-op implementation for optional Redis observer.
IRedisFacadeObserver.cs Defines Redis observer interface.
NoOpMessageSerialiserAndDataStreamStorageExceptionObserver.cs Provides no-op implementation for serialization/storage exception observer.
MessageSerialiserAndDataStreamStorageWithExceptionObserver.cs Decorator capturing and forwarding exceptions to observer.
IMessageSerialiserAndDataStreamStorageExceptionObserver.cs Defines exception observer interface for message serialization/storage.
RedisFacadeBuilder.cs Test builder updated to accept optional observer.
RedisFacadeObserverTest.cs New tests for Redis observer notifications (connection failure/restoration and retry behavior).
MessageSerialiserAndDataStreamStorageExceptionObserverTest.cs New test validating exception observer invocation during storage failures.
ManyPollingTentacleTests.cs Adds using for new message storage namespace.
RedisPendingRequestQueueBuilder.cs Wraps message storage with no-op exception observer in tests.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@LukeButters LukeButters changed the title Add Redis observability infrastructure Add Redis PRQ observability infrastructure Sep 22, 2025
IMessageSerialiserAndDataStreamStorage inner,
IMessageSerialiserAndDataStreamStorageExceptionObserver exceptionObserver)
{
this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we guarding for null? This isn't a pattern we tend to do at Octopus, as we tend to use the compiler for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we mostly don't really need to as the type system mostly guards against this although not in all cases I think we might still have some null checks off in halibut I can't remember. In any case "the ai wrote this, and I don't think it hurts to have it."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but still, if the AI wrote this, can we make it so the AI doesn't write code that's outside our conventions? It caught my eye, and will likely catch the eye of others.

It just feels like the start of a bad precedent to me, that's all.

var messageSerializer = new QueueMessageSerializerBuilder().Build();
var messageReaderWriter = new MessageSerialiserAndDataStreamStorage(messageSerializer, dataStreamStore);
var baseMessageReaderWriter = new MessageSerialiserAndDataStreamStorage(messageSerializer, dataStreamStore);
var messageReaderWriter = new MessageSerialiserAndDataStreamStorageWithExceptionObserver(baseMessageReaderWriter, NoOpMessageSerialiserAndDataStreamStorageExceptionObserver.Instance);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, why do we wrap baseMessageReaderWriter in messageReaderWriter here?

We are using the "no op" version of the observer.

Also, it's a decorator, so we can pass in baseMessageReaderWriter as is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do I not see this comment on the Files tab, removing the decorator nonsense

@@ -0,0 +1,131 @@
#if NET8_0_OR_GREATER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while since I've been in this code base, but why are we doing this for .NET 8 or higher only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, because the queue is only supported in net8 or higher. The Redis PRQ does not exist for net48. Because screw net48!

var getStringTask = redisFacade.GetString("foo", CancellationToken);

// Wait a bit for retries to happen, then restore connection
await Task.Delay(6000); // By-default (somewhere) the redis client will wait 5s for a request to get to Redis so we need to wait longer than that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to wait until the first error is thrown or something else? Delays in tests make me nervous, as they can lead to flaky tests.

Ok if not possible, but nice if we can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out, we can do that instead it is waaay better.

Copy link
Contributor

@sburmanoctopus sburmanoctopus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just some questions and minor comments

@LukeButters LukeButters enabled auto-merge (squash) September 22, 2025 23:44
@LukeButters LukeButters merged commit 2e55041 into main Sep 23, 2025
17 checks passed
@LukeButters LukeButters deleted the luke/support-observing-RedisPRQ branch September 23, 2025 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants