Conversation

@yongkangc (Member) commented Oct 8, 2025

This PR builds on top of #18887 by adding worker pooling for account proofs and blinded account node requests, which reduces the overhead of spawning a worker per request.

Changes:

  • Move account proofs to the pooling logic
  • Move blinded account nodes to the pooling logic; removed the underlying structure and functions (pending tasks)

Core logic:

  1. Determine the number of workers based on available parallelism
  2. Spawn that many workers, each with its own long-lived database transaction
  3. Queue storage proofs to the workers via crossbeam channels
  4. Await the storage proofs to build the account proofs

Flow (a minimal code sketch follows the diagram):

Caller
    ↓ (requests account multiproof)
ProofTaskManager
    ↓ (routes to account worker pool)
Account Worker
    ↓ (calls collect_storage_proofs)
    ↓ (queues all storage proofs to storage worker pool)
Storage Workers (parallel computation)
    ↓ (return storage proofs)
Account Worker
    ↓ (calls build_account_multiproof_with_storage_roots)
    ↓ (walks account trie, looks up pre-computed storage roots)
Return to Caller (final account multiproof)
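
A minimal sketch of steps 1-4, assuming the crossbeam-channel crate for the shared work queue; the job and transaction types are hypothetical stand-ins rather than reth's actual types:

use crossbeam_channel as channel; // multi-consumer queue, as used by the worker pools
use std::{sync::mpsc, thread};

/// Hypothetical stand-ins for the real job and transaction types.
struct StorageJob {
    hashed_address: u8,
    result_tx: mpsc::Sender<u64>,
}
struct WorkerTx(usize);

fn main() {
    // 1. Determine the number of workers from available parallelism.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    // 2. Spawn workers; each owns its own long-lived transaction and a clone of the queue receiver.
    let (work_tx, work_rx) = channel::unbounded::<StorageJob>();
    for id in 0..workers {
        let rx = work_rx.clone();
        thread::spawn(move || {
            let _tx = WorkerTx(id); // reused for every job this worker handles
            while let Ok(job) = rx.recv() {
                // A real worker computes a storage proof against `_tx` here.
                let _ = job.result_tx.send(u64::from(job.hashed_address));
            }
        });
    }

    // 3. Queue storage proofs, keeping one result receiver per hashed address.
    let receivers: Vec<_> = (0..3u8)
        .map(|addr| {
            let (result_tx, result_rx) = mpsc::channel();
            work_tx.send(StorageJob { hashed_address: addr, result_tx }).unwrap();
            result_rx
        })
        .collect();

    // 4. Await the storage proofs while assembling the account-level result.
    for rx in receivers {
        let _storage_root = rx.recv().unwrap();
    }
}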

Issues closed:

References:

Next Steps:

  • Removing ProofTaskManager + other cleanups

@github-project-automation bot moved this to Backlog in Reth Tracker (Oct 8, 2025)
@yongkangc self-assigned this (Oct 8, 2025)
@yongkangc moved this from Backlog to In Progress in Reth Tracker (Oct 8, 2025)
@yongkangc marked this pull request as ready for review (October 8, 2025 12:03)
@yongkangc requested a review from Copilot (October 8, 2025 12:03)
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds worker pooling support for account proof operations to complement the existing storage proof worker pool. The main goal is to reduce overhead from spawning workers by reusing dedicated workers with long-lived database transactions for account multiproof computations.

Key changes:

  • Added account worker pool alongside existing storage worker pool
  • Integrated account multiproof requests into the worker pooling system
  • Modified configuration to support separate account worker count parameter

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Summary per file:

  • crates/trie/parallel/src/proof_task.rs: core implementation of the account worker pool, job structures, and worker loop functions
  • crates/trie/parallel/src/proof.rs: refactored multiproof generation to use the account worker pool and extracted helper functions
  • crates/engine/tree/src/tree/payload_processor/multiproof.rs: updated the multiproof manager to use the account worker pool for proof calculations
  • crates/engine/tree/src/tree/payload_processor/mod.rs: added the account worker count configuration parameter
  • crates/engine/primitives/src/config.rs: added the account_worker_count field and related configuration methods
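
As a rough illustration of the config.rs change, a hedged sketch (the real `TreeConfig` has many more fields, and the setter name here is a guess):

/// Sketch of the additions to `TreeConfig` described above.
#[derive(Debug, Clone)]
pub struct TreeConfig {
    /// Number of dedicated workers for account multiproof computation.
    account_worker_count: usize,
}

impl TreeConfig {
    /// Returns the configured number of account proof workers.
    pub const fn account_worker_count(&self) -> usize {
        self.account_worker_count
    }

    /// Sets the number of account proof workers (hypothetical builder-style setter).
    pub fn with_account_worker_count(mut self, account_worker_count: usize) -> Self {
        self.account_worker_count = account_worker_count;
        self
    }
}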


@yongkangc (Member, Author) commented:

@mediocregopher

re #18887 (comment)

I'm thinking about it again: do we want to `expect`, or might it be better to log an error? I'm thinking there might not be a cost to erroring instead of using `expect`.

@yongkangc requested a review from Copilot (October 8, 2025 12:17)
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.



@mediocregopher (Collaborator) commented:

> @mediocregopher
>
> re #18887 (comment)
>
> I'm thinking about it again: do we want to `expect`, or might it be better to log an error? I'm thinking there might not be a cost to erroring instead of using `expect`.

My opinion on these situations is that if something isn't supposed to be possible we should acknowledge it in the code; adding unnecessary error handling/logging creates mental overhead when reading the code because the reader might now think through this extra code-path, not realizing it's not really possible.

In this case, if it were possible for sending to the worker channel to fail, I don't think just logging an error would be sufficient; the proof task is getting silently dropped, meaning the sparse trie won't be revealed, meaning possibly a failure to calculate the state root. I would rather, as a reader, not have to think through all of that, and just assert that this can't even really happen.

Put another way: less code == less problems
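
To make the trade-off concrete, a toy sketch (std mpsc instead of the crossbeam channels used in the PR; the channel and message types are made up):

use std::sync::mpsc;

fn main() {
    let (work_tx, work_rx) = mpsc::channel::<u32>();

    // Logging-and-continuing silently drops the job: the caller waiting on its
    // result never gets woken, and the sparse trie is never revealed.
    if work_tx.send(1).is_err() {
        eprintln!("worker pool unavailable, dropping proof task");
    }

    // Asserting the invariant instead: workers outlive the sender, so a failed
    // send means a bug, and panicking surfaces it at the point of failure.
    work_tx.send(2).expect("proof worker channel closed unexpectedly");

    // Consume the queued jobs so the example is self-contained.
    drop(work_tx);
    while let Ok(job) = work_rx.recv() {
        println!("processed job {job}");
    }
}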

- Added `AccountMultiproofInput` struct for input parameters related to account multiproof computation.
- Introduced `AccountMultiproofJob` struct for managing account multiproof tasks.
- Updated `ProofTaskKind` enum to include an account multiproof variant.
- Modified task routing logic to ensure account multiproof operations are directed to the appropriate worker pools.
- Introduced `account_worker_count` field in `TreeConfig` struct.
- Updated default implementation and constructor to include `account_worker_count`.
- Modified `ProofTaskManager` instantiation to utilize `account_worker_count`.
- Added getter and setter methods for `account_worker_count` in `TreeConfig`.
- Added support for account multiproof operations with dedicated account workers.
- Introduced `account_work_tx` and `account_worker_count` fields in `ProofTaskManager`.
- Updated worker loop to handle account multiproof jobs and collect storage proofs.
- Enhanced task routing to ensure proper handling of account multiproof requests.
- Improved documentation for clarity on worker lifecycle and transaction reuse.
- Implemented account worker functionality to handle account multiproof operations.
- Introduced `account_work_tx` and `account_worker_count` fields for managing account jobs.
- Updated worker loop to process account multiproof tasks and collect associated storage proofs.
- Improved task routing to ensure efficient handling of account multiproof requests.
- Enhanced documentation for better understanding of worker lifecycle and transaction reuse.
- Introduced `WorkerType` enum for better logging of worker types.
- Updated `ProofTaskManager` to spawn account workers alongside storage workers.
- Implemented `account_worker_loop` to handle account multiproof operations and collect storage proofs.
- Improved documentation on worker lifecycle and transaction reuse.
- Refactored task management to streamline the spawning of worker pools.
- Added `account_proof_task_handle` to manage account multiproof tasks.
- Updated `MultiproofManager` constructor to accept the new account proof task handle.
- Enhanced task queuing for account multiproof operations, including error handling for task submission and channel communication.
- Improved prefix set handling for account multiproof inputs.
- Added `account_worker_loop` function to handle account multiproof tasks.
- Introduced `collect_storage_proofs` function for queuing and collecting storage proofs.
- Enhanced error handling for task submission and result collection.
- Updated documentation to clarify the workflow of account multiproof operations and storage proof handling.
- Introduced `extend_prefix_sets_with_targets` method to streamline prefix set preparation for multiproof generation.
- Updated `decoded_multiproof` to utilize the new method, improving clarity and reducing code duplication.
- Refactored account multiproof task queuing to enhance error handling and communication with worker pools.
- Improved documentation for methods related to account multiproof operations and storage proof handling.
- Updated the field name from `storage_proof_task_handle` to `proof_task_handle` in the `ParallelProof` struct to better reflect its purpose.
- Adjusted related method signatures and internal references to maintain consistency throughout the codebase.
- Removed the generic type parameter `Factory` from the `ProofTaskManager` struct and its implementation, streamlining the code.
- Updated the `new` and `spawn_worker_pool` methods to maintain functionality while simplifying type handling.
- Enhanced clarity and reduced complexity in the proof task management logic.
…Manager`

- Updated comments for `storage_proof_task_handle` and `account_proof_task_handle` to clarify their roles in managing proof tasks.
- Adjusted method calls to use the updated handle references, enhancing code readability and maintainability.
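
Tying those commit messages together, a hypothetical sketch of the job and routing shapes (field contents and the exact variant name are guesses, not reth's definitions):

use std::sync::mpsc::Sender;

/// Hypothetical stand-ins for the real input/job types in `proof_task.rs`.
struct AccountMultiproofInput {
    targets: Vec<[u8; 32]>,
}
struct AccountMultiproofJob {
    input: AccountMultiproofInput,
    result_tx: Sender<Result<u64, String>>,
}

enum ProofTaskKind {
    StorageProof(u8),
    BlindedAccountNode(u8),
    BlindedStorageNode(u8),
    AccountMultiproof(AccountMultiproofJob),
}

/// Routing sketch: account-level work goes to the account worker pool,
/// storage-level work to the storage worker pool.
fn route(task: ProofTaskKind) {
    match task {
        ProofTaskKind::AccountMultiproof(job) => {
            let _ = job; // account_work_tx.send(job) in the real code
        }
        ProofTaskKind::BlindedAccountNode(_) => {
            // also moved to the account pooling logic per the PR description
        }
        ProofTaskKind::StorageProof(_) | ProofTaskKind::BlindedStorageNode(_) => {
            // storage_work_tx.send(..) in the real code
        }
    }
}

fn main() {
    route(ProofTaskKind::StorageProof(0));
}
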
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.



…n in parallel

- Replace standard mpsc channels with crossbeam channels in `ProofTaskManager`

- Updated `proof_task_rx` and `proof_task_tx` to use `CrossbeamReceiver` and `CrossbeamSender` for improved performance and non-blocking behavior.
- Renamed `collect_storage_proofs` to `queue_storage_proofs` to better reflect its functionality of returning receivers for storage proofs.
- Enhanced documentation to clarify the interleaved parallelism between account trie traversal and storage proof computation.
- Before this change the code did:

  StorageRootTargets::new(
      prefix_sets.account_prefix_set.iter().map(|nibbles| B256::from_slice(&nibbles.pack())),
      prefix_sets.storage_prefix_sets.clone(),
  ).len()

  StorageRootTargets::new(..) builds a brand-new B256Map<PrefixSet> by:

  - cloning every entry in storage_prefix_sets (each PrefixSet can hold dozens of nibble prefixes), and
  - allocating a fresh B256Map to hold them.

- We only needed the resulting length, so those clones/allocations were immediately thrown away; see the sketch below.
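
A hedged sketch of the cheaper computation implied above, reusing the names from the snippet (the actual replacement in the PR may differ):

// Count unique hashed addresses across both inputs without cloning any PrefixSet;
// accounts that also appear in storage_prefix_sets are only counted once.
let storage_root_targets_len = prefix_sets
    .account_prefix_set
    .iter()
    .map(|nibbles| B256::from_slice(&nibbles.pack()))
    .filter(|hashed_address| !prefix_sets.storage_prefix_sets.contains_key(hashed_address))
    .count() +
    prefix_sets.storage_prefix_sets.len();
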
…age jobs

- Replaced the `storage_proof_handle` parameter with `storage_work_tx` in the `account_worker_loop` function to facilitate direct communication with the storage worker queue.
- Modified the `queue_storage_proofs` function to accept a mutable reference to `storage_prefix_sets`, improving efficiency by avoiding unnecessary clones.
- Enhanced the initialization of `collected_decoded_storages` to use pre-allocated capacity, optimizing memory usage during proof collection.
…nation

- Introduced a new function `default_account_worker_count` to calculate the default number of account worker threads, set to 1.5 times the storage worker count.
- Updated the `TreeConfig` implementation to use the new account worker count function, enhancing performance for I/O-bound tasks related to storage proof collection and account trie traversal.
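
A sketch of that heuristic, assuming the storage worker count is passed in; the real helper's signature and rounding may differ:

/// Account workers coordinate storage-proof collection and spend much of their time
/// blocked on receivers, so give them roughly 1.5x the storage worker count.
fn default_account_worker_count(storage_worker_count: usize) -> usize {
    (storage_worker_count.saturating_mul(3) / 2).max(1)
}
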
@yongkangc (Member, Author) commented Oct 9, 2025

Mainnet reth4, on commit 9e177f4:


@yongkangc (Member, Author) commented:

Based on commit 9e177f4:


@yongkangc requested a review from Copilot (October 9, 2025 07:29)
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.




impl<Factory> ProofTaskManager<Factory>
// TODO: Refactor this with storage_worker_loop. ProofTaskManager should be removed in the following
// pr and `MultiproofManager` should be used instead to dispatch jobs directly.
Copilot AI commented (Oct 9, 2025):

This TODO comment indicates technical debt. Consider creating a tracking issue or estimate for this refactoring to avoid it being forgotten.

Suggested change (adding a tracking reference below the existing comment):

// pr and `MultiproofManager` should be used instead to dispatch jobs directly.
// Tracking issue: https://github.com/your-org/your-repo/issues/1234


- Updated comments for `proof_task_handle` parameters in the `MultiproofManager` initialization to specify their roles for storage and account proof workers, improving code clarity and maintainability.
@yongkangc requested a review from Copilot (October 9, 2025 07:41)
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.



@yongkangc requested a review from Copilot (October 9, 2025 08:07)
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.



Comment on lines +485 to +494
#[allow(clippy::too_many_arguments)]
fn build_account_multiproof_with_storage_roots<C, H>(
    trie_cursor_factory: C,
    hashed_cursor_factory: H,
    targets: &MultiProofTargets,
    prefix_set: PrefixSet,
    collect_branch_node_masks: bool,
    multi_added_removed_keys: Option<&Arc<MultiAddedRemovedKeys>>,
    mut storage_proof_receivers: B256Map<Receiver<StorageProofResult>>,
    missed_leaves_storage_roots: &DashMap<B256, B256>,
Copilot AI commented (Oct 9, 2025):

Function has too many arguments (8). Consider grouping related parameters into a struct to improve readability and maintainability. For example, the cursor factories and configuration parameters could be grouped.

Suggested change:

/// Struct grouping the cursor factories for account multiproof.
struct CursorFactories<C, H> {
    trie_cursor_factory: C,
    hashed_cursor_factory: H,
}

/// Struct grouping configuration parameters for account multiproof.
struct AccountMultiproofConfig<'a> {
    collect_branch_node_masks: bool,
    multi_added_removed_keys: Option<&'a Arc<MultiAddedRemovedKeys>>,
    storage_proof_receivers: B256Map<Receiver<StorageProofResult>>,
    missed_leaves_storage_roots: &'a DashMap<B256, B256>,
}

#[allow(clippy::too_many_arguments)]
fn build_account_multiproof_with_storage_roots<C, H>(
    cursor_factories: CursorFactories<C, H>,
    targets: &MultiProofTargets,
    prefix_set: PrefixSet,
    config: AccountMultiproofConfig,



@yongkangc (Member, Author) commented:

@mediocregopher addressed your changes :)

@shekhirin @Rjected would appreciate a review here

@mediocregopher (Collaborator) left a comment

One final question

@Rjected (Member) left a comment

The structure for workers makes sense to me, I have some comments on docs and style

/// Account workers primarily coordinate storage proof collection and account trie traversal.
/// They spend significant time blocked on `receiver.recv()` calls waiting for storage proofs,
/// so we use higher concurrency (1.5x storage workers) to maximize throughput and overlap.
/// While storage workers are CPU-bound, account workers are I/O-bound coordinators.
@Rjected (Member):

is this comment true? I would expect the storage workers to be I/O bound as well

}

impl<Factory> ProofTaskManager<Factory>
// TODO: Refactor this with storage_worker_loop. ProofTaskManager should be removed in the following
@Rjected (Member) commented (Oct 9, 2025):

I do think it is helpful to have the state required for each looping task be encapsulated in a struct rather than in a free function

Comment on lines +485 to +494
#[allow(clippy::too_many_arguments)]
fn build_account_multiproof_with_storage_roots<C, H>(
    trie_cursor_factory: C,
    hashed_cursor_factory: H,
    targets: &MultiProofTargets,
    prefix_set: PrefixSet,
    collect_branch_node_masks: bool,
    multi_added_removed_keys: Option<&Arc<MultiAddedRemovedKeys>>,
    mut storage_proof_receivers: B256Map<Receiver<StorageProofResult>>,
    missed_leaves_storage_roots: &DashMap<B256, B256>,
@Rjected (Member):

agree with the comment that this is a lot of arguments, I think it would be nice to make this simpler by introducing some sort of input or context type

Comment on lines +793 to +813
// spawns workers that will execute worker_fn
for worker_id in 0..worker_count {
    let provider_ro = view.provider_ro()?;
    let tx = provider_ro.into_tx();
    let proof_task_tx = ProofTaskTx::new(tx, task_ctx.clone(), worker_id);
    let work_rx_clone = work_rx.clone();
    let worker_fn_clone = worker_fn.clone();

    executor
        .spawn_blocking(move || worker_fn_clone(proof_task_tx, work_rx_clone, worker_id));

    spawned_workers += 1;

    tracing::debug!(
        target: "trie::proof_task",
        worker_id,
        spawned_workers,
        worker_type = %worker_type,
        "{} worker spawned successfully", worker_type
    );
}

(The same diff removes the old on-demand transaction path shown below.)

    self.total_transactions += 1;
    return Ok(Some(ProofTaskTx::new(tx, self.task_ctx.clone(), self.total_transactions)));
}

Ok(None)
}

/// Spawns the next queued proof task on the executor with the given input, if there are any
/// transactions available.
///
/// This will return an error if a transaction must be created on-demand and the consistent view
/// provider fails.
pub fn try_spawn_next(&mut self) -> ProviderResult<()> {
    let Some(task) = self.pending_tasks.pop_front() else { return Ok(()) };

    let Some(proof_task_tx) = self.get_or_create_tx()? else {
        // if there are no txs available, requeue the proof task
        self.pending_tasks.push_front(task);
        return Ok(())
    };

    let tx_sender = self.tx_sender.clone();
    self.executor.spawn_blocking(move || match task {
        ProofTaskKind::BlindedAccountNode(path, sender) => {
            proof_task_tx.blinded_account_node(path, sender, tx_sender);
        }
        // Storage trie operations should never reach here as they're routed to worker pool
        ProofTaskKind::BlindedStorageNode(_, _, _) | ProofTaskKind::StorageProof(_, _) => {
            unreachable!("Storage trie operations should be routed to worker pool")
        }
    });
@Rjected (Member):

I understand that the Fn here is used to be generic over storage and account workers, and to have only one function for spawning worker pools. But IMO this makes it harder to follow what's going on. What would be clearer to me would be structs that encapsulate:

  1. an account worker / loop
  2. a storage worker / loop

And either duplicating the spawning code or finding another way to abstract the tasks
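
A rough sketch of the struct-per-worker shape being suggested, with simplified stand-in types rather than the PR's actual ones:

use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical job and transaction stand-ins.
struct StorageProofJob {
    hashed_address: [u8; 32],
    result_tx: Sender<u64>,
}
struct WorkerTx;

/// One storage worker: owns its receiver and (in the real code) a long-lived db transaction.
struct StorageWorker {
    id: usize,
    tx: WorkerTx,
    work_rx: Receiver<StorageProofJob>,
}

impl StorageWorker {
    /// Runs until every sender is dropped, i.e. until the pool shuts down.
    fn run(self) {
        while let Ok(job) = self.work_rx.recv() {
            // The real loop computes a storage multiproof against `self.tx` here.
            let _ = (&self.tx, self.id, job.hashed_address);
            let _ = job.result_tx.send(0);
        }
    }
}

// An `AccountWorker` would mirror this shape, additionally holding a sender into the
// storage pool so it can queue storage proofs while walking the account trie.

fn main() {
    let (work_tx, work_rx) = std::sync::mpsc::channel();
    let worker = StorageWorker { id: 0, tx: WorkerTx, work_rx };
    let handle = std::thread::spawn(move || worker.run());

    let (result_tx, result_rx) = std::sync::mpsc::channel();
    work_tx.send(StorageProofJob { hashed_address: [0; 32], result_tx }).unwrap();
    println!("storage root placeholder: {}", result_rx.recv().unwrap());

    drop(work_tx); // shut the worker down
    handle.join().unwrap();
}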
