
Conversation


@johnsimons johnsimons commented Aug 29, 2025

Summary of the PostgreSQL spike

  • We store each audit message in a single row, including the body (up to the configured maximum body size to store).
  • We are using jsonb columns to store both headers and message_metadata.
  • The indexed columns are stored as native columns for performance reasons, so there is a bit of data duplication happening here, but according to my research, indexing JSON stored in jsonb columns via a GIN index does not perform as well as a native column (see the index sketch after this list), for example:
    • Native BOOLEAN: WHERE is_system_message = true → B-tree index, instant
    • JSON cast: WHERE (message_metadata->>'IsSystemMessage')::boolean = true → JSON extraction + cast + comparison
  • Full-text search is on by default, and we are indexing both the body and the headers.
  • The indexes are tuned to the usage patterns of ServicePulse v2, where we removed quite a few sorting options.
  • IFailedAuditStorage is not implemented. I assume we need to create another table where we store messages that could not be imported.
  • The saga snapshot storage is not fully tuned or complete. At the moment we store all the "changes" in a jsonb column, but this design will not scale for never-ending sagas, so I think we need to improve it and perhaps store a maximum of 50 changes.
  • The clean-up based on the configured retention is done but has not been properly tested; if performance becomes an issue, we would need to consider partitioned tables with daily or hourly rotation (see the partitioning sketch after this list).
  • The known_endpoints storage is a bit more complex because we need to update the last_seen column. What we found is that on a busy system there were too many deadlocks due to collisions on the id. To mitigate this we ended up creating an insert-only table and a background job that updates the "real" known_endpoints table. This design would also need to be considered for the saga snapshot if there are lots of concurrent updates to the same saga_id.
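
To illustrate the native-column vs jsonb point, a minimal index sketch (table, column, and index names here are illustrative, not the actual spike schema):

-- Native column: a plain B-tree index answers the filter directly.
CREATE INDEX idx_messages_is_system ON messages (is_system_message);
SELECT id FROM messages WHERE is_system_message = true;

-- jsonb: a GIN index only accelerates containment/existence operators (@>, ?),
-- so the extract-and-cast filter below cannot use it and needs either a
-- dedicated expression index or a sequential scan.
CREATE INDEX idx_messages_metadata ON messages USING GIN (message_metadata);
SELECT id FROM messages
WHERE (message_metadata->>'IsSystemMessage')::boolean = true;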
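
And a rough partitioning sketch of what retention-based clean-up could look like (again, table and column names are illustrative only, not the spike code):

-- Hypothetical: range-partition the audit messages by processing date so that
-- retention clean-up becomes "drop the oldest partition" instead of bulk DELETEs.
CREATE TABLE messages_partitioned (
    id UUID NOT NULL,
    processed_at TIMESTAMPTZ NOT NULL,
    headers JSONB,
    message_metadata JSONB,
    body BYTEA
) PARTITION BY RANGE (processed_at);

CREATE TABLE messages_2025_08_29 PARTITION OF messages_partitioned
    FOR VALUES FROM ('2025-08-29') TO ('2025-08-30');

-- Clean-up for a day that falls outside the retention window:
DROP TABLE messages_2025_08_29;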

Testing

We used an MBP with an Apple M3 Pro chip and 36 GB of memory.
We used RabbitMQ and the latest PostgreSQL, all running in Docker.
A single Audit instance was able to ingest on average 1400 msg/s; this is ingestion only, with no other queries or clean-up running.
When scaled out to 4 instances, we could get the ingestion rate to 5000 msg/s.

@johnsimons johnsimons self-assigned this Aug 29, 2025

This does not really need to be implemented, given that we store the message body with the message row.

Comment on lines +36 to +41
// Read the Content-Type from the jsonb headers column, defaulting to text/xml
var contentType = reader.GetFieldValue<Dictionary<string, string>>(reader.GetOrdinal("headers")).GetValueOrDefault(Headers.ContentType, "text/xml");

// Stream the body column and copy it into a stream obtained from the stream manager
using var stream = await reader.GetStreamAsync(reader.GetOrdinal("body"), cancellationToken);
var responseStream = manager.GetStream();
await stream.CopyToAsync(responseStream, cancellationToken);
responseStream.Position = 0;

return MessageBodyView.FromStream(responseStream, contentType, (int)stream.Length, string.Empty);

I couldn't figure out a way to pass the stream down the pipeline without allocating.
It's possible that our abstraction is preventing it, or perhaps it's me.

@SzymonPobiega @danielmarbach any thoughts?

Comment on lines 125 to 132
cmd.CommandText = @"
INSERT INTO known_endpoints (
id, name, host_id, host, last_seen
) VALUES (
@id, @name, @host_id, @host, @last_seen
)
ON CONFLICT (id) DO UPDATE SET
last_seen = GREATEST(known_endpoints.last_seen, EXCLUDED.last_seen);";

After running a test with the Audit instance scaled out, this query caused too many deadlocks.
It would have to be redesigned to prevent that.


This can be fixed with:

Suggested change (take a transaction-scoped advisory lock on the endpoint id before the upsert):

cmd.CommandText = @"
SELECT pg_advisory_xact_lock(hashtext(@id));
INSERT INTO known_endpoints (
id, name, host_id, host, last_seen
) VALUES (
@id, @name, @host_id, @host, @last_seen
)
ON CONFLICT (id) DO UPDATE SET
last_seen = GREATEST(known_endpoints.last_seen, EXCLUDED.last_seen);";

But this may impact performance depending on the number of collisions.
So I'm not sure whether this is the right option.


So the pg_advisory_xact_lock did not work 😢

Comment on lines +107 to +110
INSERT INTO saga_snapshots (id, saga_id, saga_type, changes)
VALUES (@saga_id, @saga_id, @saga_type, @new_change)
ON CONFLICT (id) DO UPDATE SET
changes = COALESCE(saga_snapshots.changes, '[]'::jsonb) || @new_change::jsonb;";
@johnsimons johnsimons Sep 5, 2025

Comment on lines +169 to +175
CREATE TABLE IF NOT EXISTS saga_snapshots (
id UUID PRIMARY KEY,
saga_id UUID,
saga_type TEXT,
changes JSONB,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);", connection))
The design of this table is incorrect. It needs to be updated to the same design as known_endpoints, where we use an insert-only table for the hot path and a background job to reconcile the results into the more permanent table, something like the sketch below.
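
For context, a rough sketch of that insert-only pattern (the staging table name and exact columns are illustrative, not the actual spike code): the hot path only ever INSERTs, and a periodic background job folds the staged rows into the real table.

-- Hypothetical staging table: the ingestion hot path only ever INSERTs here,
-- so there is no row contention and no upsert conflicts during ingestion.
CREATE TABLE IF NOT EXISTS known_endpoints_staging (
    id UUID NOT NULL,
    name TEXT,
    host_id UUID,
    host TEXT,
    last_seen TIMESTAMPTZ NOT NULL
);

-- Background job, run periodically: drain the staging table and upsert the
-- latest value per endpoint into the "real" known_endpoints table.
WITH drained AS (
    DELETE FROM known_endpoints_staging
    RETURNING id, name, host_id, host, last_seen
), latest AS (
    SELECT DISTINCT ON (id) id, name, host_id, host, last_seen
    FROM drained
    ORDER BY id, last_seen DESC
)
INSERT INTO known_endpoints (id, name, host_id, host, last_seen)
SELECT id, name, host_id, host, last_seen FROM latest
ON CONFLICT (id) DO UPDATE SET
    last_seen = GREATEST(known_endpoints.last_seen, EXCLUDED.last_seen);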

id UUID PRIMARY KEY,
saga_id UUID,
saga_type TEXT,
changes JSONB,

Also, this design is based on the Raven design, where we append changes to the changes column.
But for never-ending sagas this could get quite large and, in essence, stop working.
I think we need to keep rotating changes so that we keep a maximum of x changes and that is it. Maybe a bit smarter, so that we always keep the first change, ... (see the sketch below).
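
A minimal sketch of what that rotation could look like, assuming we cap the array at 50 entries and always keep the first change (the limit and the WHERE shape are illustrative, not a settled design):

-- Trim the jsonb changes array: keep the first change plus the 49 most recent ones.
UPDATE saga_snapshots
SET changes = (
    SELECT jsonb_agg(elem ORDER BY idx)
    FROM jsonb_array_elements(saga_snapshots.changes) WITH ORDINALITY AS t(elem, idx)
    WHERE idx = 1
       OR idx > jsonb_array_length(saga_snapshots.changes) - 49
)
WHERE jsonb_array_length(changes) > 50;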
