refactor(trie): remove proof task manager #18934

yongkangc · 2025-10-10T10:21:50Z

Context:

As part of our performance work to reduce overhead and improve scheduling, we added worker pooling for multiproof generation.
This PR cleans up after that work by removing the ProofTaskManager abstraction, since proof jobs can now be dispatched directly to the workers.

Impact:

| Change | Impact |
| --- | --- |
| Remove `run()` loop thread | -1 thread, -1 channel hop |
| Direct channel sends | ~some time saved per task |
| Eliminate enum wrapping | ~2 allocations saved per task |
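
To make the "direct channel sends" row concrete, here is a minimal sketch of a pool whose handle enqueues jobs straight onto the workers' queue, with no routing thread in between. It uses only `std`, and every name in it (`ProofJob`, `WorkerPoolHandle`, `spawn_workers`, `run_batch`) is hypothetical; it is not the code from this PR, and the doubling step is only a stand-in for proof computation.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical job: an input plus a channel on which the worker replies.
struct ProofJob {
    input: u64,
    reply: Sender<u64>,
}

/// Hypothetical handle: cloning it clones the sender, so every caller
/// enqueues work on the pool's queue directly.
#[derive(Clone)]
struct WorkerPoolHandle {
    queue: Sender<ProofJob>,
}

impl WorkerPoolHandle {
    /// Enqueues a job and returns the receiver for its result.
    fn queue_job(&self, input: u64) -> Receiver<u64> {
        let (reply, result_rx) = channel();
        self.queue
            .send(ProofJob { input, reply })
            .expect("worker pool has shut down");
        result_rx
    }
}

/// Spawns `worker_count` threads that all pull from one shared queue.
fn spawn_workers(worker_count: usize) -> WorkerPoolHandle {
    let (queue, jobs) = channel::<ProofJob>();
    let jobs = Arc::new(Mutex::new(jobs));
    for _ in 0..worker_count {
        let jobs = Arc::clone(&jobs);
        thread::spawn(move || loop {
            // The lock guard is released at the end of this statement,
            // so the lock is held while receiving but not while working.
            let next = jobs.lock().unwrap().recv();
            // An error means every sender was dropped: shut the worker down.
            let Ok(job) = next else { break };
            // Stand-in for the real proof computation.
            let _ = job.reply.send(job.input * 2);
        });
    }
    WorkerPoolHandle { queue }
}

/// Queues every input, then collects the results in submission order.
fn run_batch(inputs: &[u64]) -> Vec<u64> {
    let handle = spawn_workers(4);
    let pending: Vec<Receiver<u64>> =
        inputs.iter().map(|&i| handle.queue_job(i)).collect();
    pending
        .iter()
        .map(|rx| rx.recv().expect("worker dropped the reply"))
        .collect()
}
```

Each job carries its own reply channel, so the caller talks to a worker through exactly one queue; a manager thread in the middle would add a second hop for every task.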

reference PRs:

- Replaced the ProofTaskManager with a new spawn_proof_workers function for better clarity and maintainability.
- Updated related code to utilize the new function, simplifying the worker spawning process.
- Enhanced metrics tracking for storage and account proof requests, ensuring thread-safe operations.
- Improved error handling and code structure across proof task implementations.

- Added a constant `MIN_WORKER_COUNT` to enforce a minimum number of workers for storage and account proof tasks.
- Updated `default_storage_worker_count` and `default_account_worker_count` functions to utilize the new minimum constraint.
- Enhanced setter methods in `TreeConfig` to ensure worker counts do not fall below the minimum.
- Modified command-line argument parsing to validate worker counts against the minimum requirement.

- Added a debug assertion to ensure active_handles does not underflow when dropping a ProofTaskManagerHandle.
- Implemented metrics recording to flush before exit when the last handle is dropped, enhancing monitoring capabilities.

Copilot

Pull Request Overview

This PR refactors the proof task management by removing the ProofTaskManager abstraction and replacing it with direct worker pool spawning. The change eliminates the routing thread overhead by providing direct channel access to storage and account worker pools, simplifying the architecture while maintaining the same worker pool functionality.

Key changes:

- Replaced ProofTaskManager with spawn_proof_workers function for direct worker spawning
- Converted ProofTaskManagerHandle to provide type-safe queue methods with direct channel access
- Updated metrics to use lock-free atomic counters for thread-safe operations
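
For readers unfamiliar with the last point, lock-free counters of this kind can be sketched as follows. The struct and method names are hypothetical rather than the fields of the real `ProofTaskMetrics`; the only assumption about the PR is that it counts storage and account proof requests.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical counters; `AtomicU64` implements `Default`, so the
/// struct can derive it instead of hand-writing a constructor.
#[derive(Debug, Default)]
struct ProofCounters {
    storage_proofs: AtomicU64,
    account_proofs: AtomicU64,
}

impl ProofCounters {
    /// Takes `&self`: no mutex and no `&mut` access is needed.
    fn record_storage_proof(&self) {
        // Relaxed is enough for an independent counter, because only
        // the atomicity of the increment matters here.
        self.storage_proofs.fetch_add(1, Ordering::Relaxed);
    }

    fn record_account_proof(&self) {
        self.account_proofs.fetch_add(1, Ordering::Relaxed);
    }

    /// Returns `(storage, account)` totals.
    fn snapshot(&self) -> (u64, u64) {
        (
            self.storage_proofs.load(Ordering::Relaxed),
            self.account_proofs.load(Ordering::Relaxed),
        )
    }
}

/// Increments the counters from several threads and returns the totals.
fn count_from_threads(threads: u64, per_thread: u64) -> (u64, u64) {
    let counters = Arc::new(ProofCounters::default());
    let workers: Vec<_> = (0..threads)
        .map(|_| {
            let counters = Arc::clone(&counters);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counters.record_storage_proof();
                }
                counters.record_account_proof();
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    counters.snapshot()
}
```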

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/trie/parallel/src/proof_task_metrics.rs | Converts metrics fields to atomic counters for lock-free thread safety |
| crates/trie/parallel/src/proof_task.rs | Replaces ProofTaskManager with spawn_proof_workers function and updates handle interface |
| crates/trie/parallel/src/proof.rs | Updates proof generation to use new direct queue methods |
| crates/node/core/src/args/engine.rs | Adds minimum worker count validation to CLI arguments |
| crates/engine/tree/src/tree/payload_validator.rs | Updates error message to reflect new spawning approach |
| crates/engine/tree/src/tree/payload_processor/multiproof.rs | Updates multiproof manager to use new queue methods |
| crates/engine/tree/src/tree/payload_processor/mod.rs | Replaces ProofTaskManager instantiation with spawn_proof_workers |
| crates/engine/primitives/src/config.rs | Adds MIN_WORKER_COUNT constant and enforces minimum worker limits |

Copilot

Pull Request Overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

- Introduced helper functions to streamline error conversion from ProviderError and channel receive errors to SparseTrieError.
- Enhanced readability and maintainability of the trie_node method by reducing repetitive error handling code.

Copilot

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Copilot

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Copilot · 2025-10-10T10:59:04Z

crates/trie/parallel/src/proof_task.rs

+        debug_assert_ne!(
+            previous_handles, 0,
+            "active_handles underflow in ProofTaskManagerHandle::drop (previous={})",
+            previous_handles
+        );


The debug assertion checks for underflow after the fetch_sub operation, but this creates a race condition. If multiple threads drop handles simultaneously, one could observe 0 while another decrements below 0. Move the check before fetch_sub or use compare_and_swap to prevent underflow.
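
For context on this comment: `AtomicUsize::fetch_sub` is a single atomic read-modify-write that returns the value held before the subtraction, so each dropping thread sees its own distinct previous value and there is no separate load that could interleave with the decrement. Note also that `compare_and_swap` is deprecated in Rust's standard library in favor of `compare_exchange`. The sketch below uses a hypothetical `Handle` type, not the real `ProofTaskManagerHandle`, to show the post-decrement check under concurrent drops.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical handle that tracks how many clones are alive.
struct Handle {
    active_handles: Arc<AtomicUsize>,
}

impl Handle {
    fn new() -> Self {
        Self { active_handles: Arc::new(AtomicUsize::new(1)) }
    }
}

impl Clone for Handle {
    fn clone(&self) -> Self {
        self.active_handles.fetch_add(1, Ordering::SeqCst);
        Self { active_handles: Arc::clone(&self.active_handles) }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // `fetch_sub` returns the value from *before* the decrement.
        let previous = self.active_handles.fetch_sub(1, Ordering::SeqCst);
        debug_assert_ne!(previous, 0, "active_handles underflow");
        if previous == 1 {
            // This was the last handle; a real implementation could
            // flush its metrics here.
        }
    }
}

/// Drops `n` clones on separate threads, then the original, and
/// returns the counter value observed afterwards.
fn drop_concurrently(n: usize) -> usize {
    let root = Handle::new();
    let counter = Arc::clone(&root.active_handles);
    let threads: Vec<_> = (0..n)
        .map(|_| {
            let clone = root.clone();
            thread::spawn(move || drop(clone))
        })
        .collect();
    for t in threads {
        t.join().unwrap();
    }
    drop(root);
    counter.load(Ordering::SeqCst)
}
```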

- Introduced a `clamp_worker_count` function to centralize the logic for enforcing the minimum worker count.
- Updated setter methods in `TreeConfig` to utilize the new clamping function, improving code readability and maintainability.

Copilot

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Copilot · 2025-10-10T11:06:51Z

crates/trie/parallel/src/proof_task.rs

+    executor: Handle,
+    view: ConsistentDbView<Factory>,
+    task_ctx: ProofTaskCtx,
    storage_worker_count: usize,
+    account_worker_count: usize,


The spawn_proof_workers function lacks documentation for its parameters. Consider adding parameter documentation to explain what each argument does, especially the worker count parameters and their impact on performance.

Copilot · 2025-10-10T11:06:52Z

crates/trie/parallel/src/proof_task.rs

 impl TrieNodeProvider for ProofTaskTrieNodeProvider {
    fn trie_node(&self, path: &Nibbles) -> Result<Option<RevealedNode>, SparseTrieError> {
-        let (tx, rx) = channel();
+        /// Helper to convert `ProviderError` to `SparseTrieError`
+        fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError {
+            SparseTrieErrorKind::Other(Box::new(std::io::Error::other(e.to_string()))).into()
+        }
+
+        /// Helper to convert channel recv error to `SparseTrieError`
+        fn recv_err_to_trie_err(_: std::sync::mpsc::RecvError) -> SparseTrieError {
+            SparseTrieErrorKind::Other(Box::new(std::io::Error::other("channel closed"))).into()
+        }
+


These helper functions are defined inside the trie_node method, which creates unnecessary code nesting. Consider moving these helper functions outside the method or to a module level for better maintainability and potential reuse.

shekhirin · 2025-10-10T11:48:33Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

+    /// Handle to the proof worker pool for storage proofs.
    storage_proof_task_handle: ProofTaskManagerHandle,
-    /// Handle to the proof task manager used for account multiproofs.
+    /// Handle to the proof worker pool for account multiproofs.
    account_proof_task_handle: ProofTaskManagerHandle,


any reason why we need to separate them here, if it's always the same proof_task_handle passed to both fields?

thanks for the note

thanks for the note, addressed them

shekhirin · 2025-10-10T12:07:45Z

crates/trie/parallel/src/proof_task.rs

-        let (tx, rx) = channel();
+        /// Helper to convert `ProviderError` to `SparseTrieError`
+        fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError {
+            SparseTrieErrorKind::Other(Box::new(std::io::Error::other(e.to_string()))).into()


can't we do SparseTrieErrorKind::Other(Box::new(e)) here?
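
A small sketch of why boxing the error directly can be preferable to stringifying it: the boxed value keeps its concrete type, so callers can still downcast to it. `DbError` and `TrieErrorKind` below are hypothetical stand-ins for `ProviderError` and `SparseTrieErrorKind`, and the shape of the `Other` variant is an assumption.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for a provider error type.
#[derive(Debug)]
struct DbError {
    code: u32,
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "database error {}", self.code)
    }
}

impl Error for DbError {}

/// Hypothetical stand-in for an error kind with a boxed catch-all variant.
#[derive(Debug)]
enum TrieErrorKind {
    Other(Box<dyn Error + Send + Sync>),
}

/// Stringifies first: the original type is replaced by `std::io::Error`.
fn wrap_stringified(e: DbError) -> TrieErrorKind {
    TrieErrorKind::Other(Box::new(std::io::Error::other(e.to_string())))
}

/// Boxes the error as-is: the original type survives inside the box.
fn wrap_direct(e: DbError) -> TrieErrorKind {
    TrieErrorKind::Other(Box::new(e))
}

/// Recovers the original error code if the concrete type was preserved.
fn original_code(kind: &TrieErrorKind) -> Option<u32> {
    let TrieErrorKind::Other(inner) = kind;
    inner.downcast_ref::<DbError>().map(|e| e.code)
}
```

With `wrap_direct`, `original_code` can find the `DbError` again; with `wrap_stringified`, only the message text is left.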

yup, good note - addressed that as well

addressed this in commit

shekhirin

LGTM overall, just have some nits regarding the usage of std::io::Error and separate account/storage proof handles.

- Merged separate storage and account proof task handles into a single proof task handle for improved code clarity and maintainability.
- Updated related methods to utilize the consolidated handle, streamlining the management of proof tasks.

- Updated the ProofTaskMetrics struct to derive Default, removing the manual implementation of the default method.
- This change enhances code clarity and reduces boilerplate, while maintaining the same functionality.

- Updated the error conversion helper function in ProofTaskTrieNodeProvider to directly wrap the ProviderError, enhancing clarity and maintainability.
- This change simplifies the error handling logic within the trie_node method.

yongkangc · 2025-10-10T12:43:52Z

@shekhirin thanks for the review, just addressed all your comments in the commits

shekhirin · 2025-10-10T12:55:20Z

crates/trie/parallel/src/proof_task.rs

+        /// Helper to convert `ProviderError` to `SparseTrieError`
+        fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError {
+            SparseTrieErrorKind::Other(Box::new(e)).into()
+        }
+
+        /// Helper to convert channel recv error to `SparseTrieError`
+        fn recv_err_to_trie_err(_: std::sync::mpsc::RecvError) -> SparseTrieError {
+            SparseTrieErrorKind::Other(Box::new(std::io::Error::other("channel closed"))).into()
+        }


can we just do this without helper functions?

reth/crates/trie/trie/src/proof/trie_node.rs

Line 96 in f5840fc

.map_err(|error| SparseTrieErrorKind::Other(Box::new(error)))?;

RecvError already implements Error, so should work?
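
A self-contained sketch of the inline form suggested here. `std::sync::mpsc::RecvError` implements `std::error::Error`, so it can be boxed directly inside a `map_err` closure without a helper function. `TrieErrorKind` and `wait_for_node` are hypothetical stand-ins, not reth's types.

```rust
use std::error::Error;
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical stand-in for an error kind with a boxed catch-all variant.
#[derive(Debug)]
enum TrieErrorKind {
    Other(Box<dyn Error + Send + Sync>),
}

/// Waits for a value; a closed channel is boxed into the catch-all
/// variant right at the call site.
fn wait_for_node(rx: &Receiver<u32>) -> Result<u32, TrieErrorKind> {
    rx.recv()
        .map_err(|error| TrieErrorKind::Other(Box::new(error)))
}

/// Returns true if receiving on a channel with no senders yields an
/// error whose boxed payload is still a `RecvError`.
fn closed_channel_keeps_error_type() -> bool {
    let (tx, rx) = channel::<u32>();
    drop(tx);
    match wait_for_node(&rx) {
        Err(TrieErrorKind::Other(inner)) => inner.is::<std::sync::mpsc::RecvError>(),
        Ok(_) => false,
    }
}
```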

shekhirin · 2025-10-10T12:59:20Z

crates/engine/primitives/src/config.rs

+/// Clamps the worker count to the minimum allowed value.
+///
+/// Ensures that the worker count is at least [`MIN_WORKER_COUNT`].
+const fn clamp_worker_count(count: usize) -> usize {


is this just .max(MIN_WORKER_COUNT)? Let's move this to with_*_worker_count functions, no need to have a separate helper fn for this
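
What this suggestion could look like, as a sketch: the floor is applied inline in each builder method instead of through a helper. `TreeConfigSketch` is a hypothetical reduction of `TreeConfig`, and the value `2` for `MIN_WORKER_COUNT` is an assumption, since the real value does not appear in this thread.

```rust
/// Assumed floor for illustration only; the real value is not shown here.
const MIN_WORKER_COUNT: usize = 2;

/// Hypothetical, reduced stand-in for `TreeConfig`.
#[derive(Debug, Default)]
struct TreeConfigSketch {
    storage_worker_count: usize,
    account_worker_count: usize,
}

impl TreeConfigSketch {
    /// Sets the storage worker count, never going below the floor.
    fn with_storage_worker_count(mut self, count: usize) -> Self {
        self.storage_worker_count = count.max(MIN_WORKER_COUNT);
        self
    }

    /// Sets the account worker count, never going below the floor.
    fn with_account_worker_count(mut self, count: usize) -> Self {
        self.account_worker_count = count.max(MIN_WORKER_COUNT);
        self
    }
}
```

Values below the floor are raised to it, and larger values pass through unchanged.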

yongkangc added 2 commits October 10, 2025 09:49

refactor: yeet proof task manager

ed45ebd

github-project-automation bot added this to Reth Tracker Oct 10, 2025

github-project-automation bot moved this to Backlog in Reth Tracker Oct 10, 2025

yongkangc self-assigned this Oct 10, 2025

yongkangc moved this from Backlog to In Progress in Reth Tracker Oct 10, 2025

yongkangc added 3 commits October 10, 2025 10:25

fix comment

d44180d

yongkangc requested a review from Copilot October 10, 2025 10:36

yongkangc changed the title ~~refactor: remove proof task manager~~ refactor(trie): remove proof task manager Oct 10, 2025

Copilot AI reviewed Oct 10, 2025

yongkangc requested a review from Copilot October 10, 2025 10:38

yongkangc added 2 commits October 10, 2025 10:39

clippy

8e00a4a

fmt

2b90133

Copilot AI reviewed Oct 10, 2025

yongkangc force-pushed the yk/pool_clean branch from 2c025cd to 2b90133 Compare October 10, 2025 10:40

yongkangc mentioned this pull request Oct 10, 2025

perf(tree): worker pooling for account proofs #18901

Open

yongkangc requested a review from Copilot October 10, 2025 10:51

yongkangc marked this pull request as ready for review October 10, 2025 10:51

yongkangc requested review from Rjected, fgimenez, mattsse, mediocregopher, rkrasiuk and shekhirin as code owners October 10, 2025 10:51

Copilot AI reviewed Oct 10, 2025

fix count

f302447

yongkangc requested a review from Copilot October 10, 2025 10:58

Copilot AI reviewed Oct 10, 2025

refactor: streamline worker count validation

833b031

yongkangc commented Oct 10, 2025

yongkangc requested a review from Copilot October 10, 2025 11:05

paradigmxyz deleted a comment from Copilot AI Oct 10, 2025

Copilot AI reviewed Oct 10, 2025

shekhirin reviewed Oct 10, 2025

shekhirin reviewed Oct 10, 2025

yongkangc added 3 commits October 10, 2025 12:42

refactor: simplify ProofTaskMetrics default implementation

e49791e

refactor: improve error handling in trie_node method

c02a68d

shekhirin reviewed Oct 10, 2025
