Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR removes the socket pooling infrastructure from the plugin execution system, replacing it with a simpler per-execution socket model. Changes span documentation, TypeScript executor implementation, Rust configuration and constants, shared socket service logic, and corresponding tests. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files
Flags with carried forward coverage won't be shown. Click here to find out more. @@ Coverage Diff @@
## main #684 +/- ##
==========================================
+ Coverage 90.91% 90.94% +0.02%
==========================================
Files 288 288
Lines 117665 118002 +337
==========================================
+ Hits 106981 107315 +334
- Misses 10684 10687 +3
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/plugins/shared_socket.rs (1)
1274-1311:⚠️ Potential issue | 🟠 Major
test_idle_connection_cleaned_up_by_safety_timeoutdoes not validate cleanup.This test never asserts timeout-triggered cleanup/closure; it only sleeps briefly and drops the client. It can pass even if safety-timeout logic regresses.
💡 Suggested direction for a deterministic assertion
- #[tokio::test] + #[tokio::test(start_paused = true)] async fn test_idle_connection_cleaned_up_by_safety_timeout() { ... - tokio::time::sleep(Duration::from_millis(200)).await; - drop(client); + tokio::time::advance(Duration::from_secs( + crate::constants::DEFAULT_PLUGIN_TIMEOUT_SECONDS + 61 + )) + .await; + + let write_result = client + .write_all((serde_json::to_string(&PluginMessage::Shutdown).unwrap() + "\n").as_bytes()) + .await; + assert!(write_result.is_err(), "Connection should be closed by safety timeout");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/plugins/shared_socket.rs` around lines 1274 - 1311, The test test_idle_connection_cleaned_up_by_safety_timeout currently never asserts that the safety-timeout cleanup occurs; update it to deterministically verify the connection is closed by either (A) configuring the SharedSocketService to use a much shorter safety timeout for the test (override DEFAULT_PLUGIN_TIMEOUT_SECONDS or provide a test-only ctor/option) then await past that timeout and assert the socket is closed (e.g., attempt a read/write on the UnixStream and expect EOF or an error), or (B) expose/ call the internal cleanup/check method on SharedSocketService from the test to trigger cleanup immediately and assert the client receives a disconnect; reference SharedSocketService::new, register_execution, and the safety timeout constant (DEFAULT_PLUGIN_TIMEOUT_SECONDS) when locating code to modify. Ensure the test uses tokio::time::timeout to avoid flakiness and asserts the connection closure rather than just dropping the client.
🧹 Nitpick comments (2)
src/services/plugins/shared_socket.rs (1)
497-501: Doc comment still describes pooled-socket behavior.Lines 499-501 mention reused/pooled sockets, but this code path now uses the simplified non-pooled model. This will confuse future maintenance and incident debugging.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/plugins/shared_socket.rs` around lines 497 - 501, The doc comment above the plugin connection handler in shared_socket.rs still refers to reused/pooled sockets; update it to reflect the current simplified non-pooled model by removing references to pooling/reuse and clarifying that the inactivity timeout resets on each message but connections are not pooled or reused—mention that connections are per-client and closed when idle rather than retained for reuse so future maintainers debugging the connection handler (the plugin connection handling code) won't be misled.plugins/tests/lib/pool-executor-reconnect.test.ts (1)
166-168: Strengthen unavailable-server assertions to lock in API contract.This test should also assert the sanitized error code/status (
SERVICE_UNAVAILABLE,503), not onlysuccess=false.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/tests/lib/pool-executor-reconnect.test.ts` around lines 166 - 168, The test currently only asserts result.success is false; update the assertions after expect(result.success).toBe(false) to also verify the sanitized error code and status on result.error by adding assertions such as expect(result.error?.code).toBe('SERVICE_UNAVAILABLE') and expect(result.error?.status).toBe(503) (or expect(result.error?.statusCode).toBe(503) if the error uses statusCode), ensuring you reference the same result variable used in this test block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/lib/pool-executor.ts`:
- Around line 231-239: The race happens because
handleSocketError/handleSocketClose unconditionally clear the shared
connectionPromise while connectWithRetry() may still be running, allowing send()
to start a second connect loop; to fix, make connectionPromise the single source
of truth and only clear or replace it when its specific promise settles — e.g.,
in ensureConnected() set connectionPromise = connectWithRetry(); in
connectWithRetry() ensure it resolves/rejects without external clears, and
update handleSocketError/handleSocketClose to compare the current
connectionPromise to the promise they observed before clearing (or set a
separate reconnecting flag) so they only clear/reset connectionPromise when it
matches the in-flight promise or when that promise has settled; update
references in ensureConnected, connectWithRetry, handleSocketError,
handleSocketClose and send accordingly.
In `@plugins/tests/lib/pool-executor-reconnect.test.ts`:
- Around line 117-128: The current nested setTimeout/setInterval polling around
destroyFirstConnection is flaky; replace the wall-clock polling with a
deterministic trigger tied to the first request/response event: wait for the
first request to be completed (or subscribe to the pool/executor
response/acquire event used in the test) and call destroyFirstConnection()
immediately after that event instead of using the checkInterval/20ms
delays—locate the timers surrounding destroyFirstConnection, remove the
setInterval/setTimeout logic, and invoke destroyFirstConnection from the
first-response callback or resolved promise produced by the test's request
invocation.
In `@src/services/plugins/shared_socket.rs`:
- Around line 540-546: The safety_timeout currently uses
DEFAULT_PLUGIN_TIMEOUT_SECONDS and can close valid long-running requests; change
the calculation to use the configured request timeout
(PLUGIN_POOL_REQUEST_TIMEOUT_SECS) and take the max between that and
DEFAULT_PLUGIN_TIMEOUT_SECONDS (then add the +60 margin) so the guard never
fires earlier than the request budget; update the safety_timeout assignment (and
the analogous logic around lines 548-563) to
Duration::from_secs(std::cmp::max(crate::constants::DEFAULT_PLUGIN_TIMEOUT_SECONDS,
crate::constants::PLUGIN_POOL_REQUEST_TIMEOUT_SECS) + 60) so it respects
configured timeouts.
---
Outside diff comments:
In `@src/services/plugins/shared_socket.rs`:
- Around line 1274-1311: The test
test_idle_connection_cleaned_up_by_safety_timeout currently never asserts that
the safety-timeout cleanup occurs; update it to deterministically verify the
connection is closed by either (A) configuring the SharedSocketService to use a
much shorter safety timeout for the test (override
DEFAULT_PLUGIN_TIMEOUT_SECONDS or provide a test-only ctor/option) then await
past that timeout and assert the socket is closed (e.g., attempt a read/write on
the UnixStream and expect EOF or an error), or (B) expose/ call the internal
cleanup/check method on SharedSocketService from the test to trigger cleanup
immediately and assert the client receives a disconnect; reference
SharedSocketService::new, register_execution, and the safety timeout constant
(DEFAULT_PLUGIN_TIMEOUT_SECONDS) when locating code to modify. Ensure the test
uses tokio::time::timeout to avoid flakiness and asserts the connection closure
rather than just dropping the client.
---
Nitpick comments:
In `@plugins/tests/lib/pool-executor-reconnect.test.ts`:
- Around line 166-168: The test currently only asserts result.success is false;
update the assertions after expect(result.success).toBe(false) to also verify
the sanitized error code and status on result.error by adding assertions such as
expect(result.error?.code).toBe('SERVICE_UNAVAILABLE') and
expect(result.error?.status).toBe(503) (or
expect(result.error?.statusCode).toBe(503) if the error uses statusCode),
ensuring you reference the same result variable used in this test block.
In `@src/services/plugins/shared_socket.rs`:
- Around line 497-501: The doc comment above the plugin connection handler in
shared_socket.rs still refers to reused/pooled sockets; update it to reflect the
current simplified non-pooled model by removing references to pooling/reuse and
clarifying that the inactivity timeout resets on each message but connections
are not pooled or reused—mention that connections are per-client and closed when
idle rather than retained for reuse so future maintainers debugging the
connection handler (the plugin connection handling code) won't be misled.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
docs/plugins/index.mdxplugins/ARCHITECTURE.mdplugins/lib/pool-executor.tsplugins/tests/lib/direct-executor.test.tsplugins/tests/lib/pool-executor-reconnect.test.tssrc/constants/plugins.rssrc/services/plugins/config.rssrc/services/plugins/shared_socket.rs
💤 Files with no reviewable changes (2)
- src/constants/plugins.rs
- plugins/ARCHITECTURE.md
Summary
This PR will remove socket connection pool logic in favor of simpler model in order to avoid issues with connection reuse and timeouts.
Testing Process
Checklist
Note
If you are using Relayer in your stack, consider adding your team or organization to our list of Relayer Users in the Wild!
Summary by CodeRabbit
Release Notes
Bug Fixes
Documentation
Chores