-
Notifications
You must be signed in to change notification settings - Fork 1.9k
perf(tree): worker pooling for account proofs #18901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: yk/worker_pool_storage_2
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds worker pooling support for account proof operations to complement the existing storage proof worker pool. The main goal is to reduce overhead from spawning workers by reusing dedicated workers with long-lived database transactions for account multiproof computations.
Key changes:
- Added account worker pool alongside existing storage worker pool
- Integrated account multiproof requests into the worker pooling system
- Modified configuration to support separate account worker count parameter
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
crates/trie/parallel/src/proof_task.rs |
Core implementation of account worker pool, job structures, and worker loop functions |
crates/trie/parallel/src/proof.rs |
Refactored multiproof generation to use account worker pool and extracted helper functions |
crates/engine/tree/src/tree/payload_processor/multiproof.rs |
Updated multiproof manager to use account worker pool for proof calculations |
crates/engine/tree/src/tree/payload_processor/mod.rs |
Added account worker count configuration parameter |
crates/engine/primitives/src/config.rs |
Added account_worker_count field and related configuration methods |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
I am thinking about it again, do we wanna expect or might be better to log an error? i am thinking there might not be a cost to erroring instead of expect? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
My opinion on these situations is that if something isn't supposed to be possible we should acknowledge it in the code; adding unnecessary error handling/logging creates mental overhead when reading the code because the reader might now think through this extra code-path, not realizing it's not really possible. In this case, if it were possible for sending to the worker channel to fail, I don't think just logging an error would be sufficient; the proof task is getting silently dropped, meaning the sparse trie won't be revealed, meaning possibly a failure to calculate the state root. I would rather, as a reader, not have to think through all of that, and just assert that this can't even really happen. Put another way: less code == less problems |
- Added `AccountMultiproofInput` struct for input parameters related to account multiproof computation. - Introduced `AccountMultiproofJob` struct for managing account multiproof tasks. - Updated `ProofTaskKind` enum to include an account multiproof variant. - Modified task routing logic to ensure account multiproof operations are directed to the appropriate worker pools.
- Introduced `account_worker_count` field in `TreeConfig` struct. - Updated default implementation and constructor to include `account_worker_count`. - Modified `ProofTaskManager` instantiation to utilize `account_worker_count`. - Added getter and setter methods for `account_worker_count` in `TreeConfig`.
- Added support for account multiproof operations with dedicated account workers. - Introduced `account_work_tx` and `account_worker_count` fields in `ProofTaskManager`. - Updated worker loop to handle account multiproof jobs and collect storage proofs. - Enhanced task routing to ensure proper handling of account multiproof requests. - Improved documentation for clarity on worker lifecycle and transaction reuse.
- Implemented account worker functionality to handle account multiproof operations. - Introduced `account_work_tx` and `account_worker_count` fields for managing account jobs. - Updated worker loop to process account multiproof tasks and collect associated storage proofs. - Improved task routing to ensure efficient handling of account multiproof requests. - Enhanced documentation for better understanding of worker lifecycle and transaction reuse.
- Introduced `WorkerType` enum for better logging of worker types. - Updated `ProofTaskManager` to spawn account workers alongside storage workers. - Implemented `account_worker_loop` to handle account multiproof operations and collect storage proofs. - Improved documentation on worker lifecycle and transaction reuse. - Refactored task management to streamline the spawning of worker pools.
- Added `account_proof_task_handle` to manage account multiproof tasks. - Updated `MultiproofManager` constructor to accept the new account proof task handle. - Enhanced task queuing for account multiproof operations, including error handling for task submission and channel communication. - Improved prefix set handling for account multiproof inputs.
- Added `account_worker_loop` function to handle account multiproof tasks. - Introduced `collect_storage_proofs` function for queuing and collecting storage proofs. - Enhanced error handling for task submission and result collection. - Updated documentation to clarify the workflow of account multiproof operations and storage proof handling.
- Introduced `extend_prefix_sets_with_targets` method to streamline prefix set preparation for multiproof generation. - Updated `decoded_multiproof` to utilize the new method, improving clarity and reducing code duplication. - Refactored account multiproof task queuing to enhance error handling and communication with worker pools. - Improved documentation for methods related to account multiproof operations and storage proof handling.
- Updated the field name from `storage_proof_task_handle` to `proof_task_handle` in the `ParallelProof` struct to better reflect its purpose. - Adjusted related method signatures and internal references to maintain consistency throughout the codebase.
- Removed the generic type parameter `Factory` from the `ProofTaskManager` struct and its implementation, streamlining the code. - Updated the `new` and `spawn_worker_pool` methods to maintain functionality while simplifying type handling. - Enhanced clarity and reduced complexity in the proof task management logic.
…Manager` - Updated comments for `storage_proof_task_handle` and `account_proof_task_handle` to clarify their roles in managing proof tasks. - Adjusted method calls to use the updated handle references, enhancing code readability and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
…n in parallel - replace standard mpsc channels with crossbeam channels in `ProofTaskManager` - Updated `proof_task_rx` and `proof_task_tx` to use `CrossbeamReceiver` and `CrossbeamSender` for improved performance and non-blocking behavior. - Renamed `collect_storage_proofs` to `queue_storage_proofs` to better reflect its functionality of returning receivers for storage proofs. - Enhanced documentation to clarify the interleaved parallelism between account trie traversal and storage proof computation.
- Before this change the code did: StorageRootTargets::new( prefix_sets.account_prefix_set.iter().map(|n| B256::from_slice(&nibbles.pack())), prefix_sets.storage_prefix_sets.clone(), ).len() StorageRootTargets::new(..) builds a brand-new B256Map<PrefixSet> by: - cloning every entry in storage_prefix_sets (each PrefixSet can hold dozens of nibble prefixes), and - allocating a fresh B256Map to hold them. - We only needed the resulting length, so those clones/allocations were immediately thrown away.
…age jobs - Replaced the `storage_proof_handle` parameter with `storage_work_tx` in the `account_worker_loop` function to facilitate direct communication with the storage worker queue. - Modified the `queue_storage_proofs` function to accept a mutable reference to `storage_prefix_sets`, improving efficiency by avoiding unnecessary clones. - Enhanced the initialization of `collected_decoded_storages` to use pre-allocated capacity, optimizing memory usage during proof collection.
…nation - Introduced a new function `default_account_worker_count` to calculate the default number of account worker threads, set to 1.5 times the storage worker count. - Updated the `TreeConfig` implementation to use the new account worker count function, enhancing performance for I/O-bound tasks related to storage proof collection and account trie traversal.
Mainnet reth4 On Commit: 9e177f4 ![]() ![]() |
Base on commit 9e177f4 ![]() ![]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
impl<Factory> ProofTaskManager<Factory> | ||
// TODO: Refactor this with storage_worker_loop. ProofTaskManager should be removed in the following | ||
// pr and `MultiproofManager` should be used instead to dispatch jobs directly. |
Copilot
AI
Oct 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO comment indicates technical debt. Consider creating a tracking issue or estimate for this refactoring to avoid it being forgotten.
// pr and `MultiproofManager` should be used instead to dispatch jobs directly. | |
// pr and `MultiproofManager` should be used instead to dispatch jobs directly. | |
// Tracking issue: https://github.com/your-org/your-repo/issues/1234 |
Copilot uses AI. Check for mistakes.
- Updated comments for `proof_task_handle` parameters in the `MultiproofManager` initialization to specify their roles for storage and account proof workers, improving code clarity and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
#[allow(clippy::too_many_arguments)] | ||
fn build_account_multiproof_with_storage_roots<C, H>( | ||
trie_cursor_factory: C, | ||
hashed_cursor_factory: H, | ||
targets: &MultiProofTargets, | ||
prefix_set: PrefixSet, | ||
collect_branch_node_masks: bool, | ||
multi_added_removed_keys: Option<&Arc<MultiAddedRemovedKeys>>, | ||
mut storage_proof_receivers: B256Map<Receiver<StorageProofResult>>, | ||
missed_leaves_storage_roots: &DashMap<B256, B256>, |
Copilot
AI
Oct 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function has too many arguments (8). Consider grouping related parameters into a struct to improve readability and maintainability. For example, the cursor factories and configuration parameters could be grouped.
#[allow(clippy::too_many_arguments)] | |
fn build_account_multiproof_with_storage_roots<C, H>( | |
trie_cursor_factory: C, | |
hashed_cursor_factory: H, | |
targets: &MultiProofTargets, | |
prefix_set: PrefixSet, | |
collect_branch_node_masks: bool, | |
multi_added_removed_keys: Option<&Arc<MultiAddedRemovedKeys>>, | |
mut storage_proof_receivers: B256Map<Receiver<StorageProofResult>>, | |
missed_leaves_storage_roots: &DashMap<B256, B256>, | |
/// Struct grouping the cursor factories for account multiproof. | |
struct CursorFactories<C, H> { | |
trie_cursor_factory: C, | |
hashed_cursor_factory: H, | |
} | |
/// Struct grouping configuration parameters for account multiproof. | |
struct AccountMultiproofConfig<'a> { | |
collect_branch_node_masks: bool, | |
multi_added_removed_keys: Option<&'a Arc<MultiAddedRemovedKeys>>, | |
storage_proof_receivers: B256Map<Receiver<StorageProofResult>>, | |
missed_leaves_storage_roots: &'a DashMap<B256, B256>, | |
} | |
#[allow(clippy::too_many_arguments)] | |
fn build_account_multiproof_with_storage_roots<C, H>( | |
cursor_factories: CursorFactories<C, H>, | |
targets: &MultiProofTargets, | |
prefix_set: PrefixSet, | |
config: AccountMultiproofConfig, |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with the comment that this is a lot of arguments, I think it would be nice to make this simpler by introducing some sort of input or context type
@mediocregopher addressed ur changes :) @shekhirin @Rjected would appreciate a review here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One final question
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structure for workers makes sense to me, I have some comments on docs and style
/// Account workers primarily coordinate storage proof collection and account trie traversal. | ||
/// They spend significant time blocked on `receiver.recv()` calls waiting for storage proofs, | ||
/// so we use higher concurrency (1.5x storage workers) to maximize throughput and overlap. | ||
/// While storage workers are CPU-bound, account workers are I/O-bound coordinators. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this comment true? I would expect the storage workers to be I/O bound as well
} | ||
|
||
impl<Factory> ProofTaskManager<Factory> | ||
// TODO: Refactor this with storage_worker_loop. ProofTaskManager should be removed in the following |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think it is helpful to have the state required for each looping task, to be encapsulated in a struct rather than be a free function
#[allow(clippy::too_many_arguments)] | ||
fn build_account_multiproof_with_storage_roots<C, H>( | ||
trie_cursor_factory: C, | ||
hashed_cursor_factory: H, | ||
targets: &MultiProofTargets, | ||
prefix_set: PrefixSet, | ||
collect_branch_node_masks: bool, | ||
multi_added_removed_keys: Option<&Arc<MultiAddedRemovedKeys>>, | ||
mut storage_proof_receivers: B256Map<Receiver<StorageProofResult>>, | ||
missed_leaves_storage_roots: &DashMap<B256, B256>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with the comment that this is a lot of arguments, I think it would be nice to make this simpler by introducing some sort of input or context type
// spawns workers that will execute worker_fn | ||
for worker_id in 0..worker_count { | ||
let provider_ro = view.provider_ro()?; | ||
let tx = provider_ro.into_tx(); | ||
self.total_transactions += 1; | ||
return Ok(Some(ProofTaskTx::new(tx, self.task_ctx.clone(), self.total_transactions))); | ||
} | ||
let proof_task_tx = ProofTaskTx::new(tx, task_ctx.clone(), worker_id); | ||
let work_rx_clone = work_rx.clone(); | ||
let worker_fn_clone = worker_fn.clone(); | ||
|
||
Ok(None) | ||
} | ||
executor | ||
.spawn_blocking(move || worker_fn_clone(proof_task_tx, work_rx_clone, worker_id)); | ||
|
||
/// Spawns the next queued proof task on the executor with the given input, if there are any | ||
/// transactions available. | ||
/// | ||
/// This will return an error if a transaction must be created on-demand and the consistent view | ||
/// provider fails. | ||
pub fn try_spawn_next(&mut self) -> ProviderResult<()> { | ||
let Some(task) = self.pending_tasks.pop_front() else { return Ok(()) }; | ||
|
||
let Some(proof_task_tx) = self.get_or_create_tx()? else { | ||
// if there are no txs available, requeue the proof task | ||
self.pending_tasks.push_front(task); | ||
return Ok(()) | ||
}; | ||
|
||
let tx_sender = self.tx_sender.clone(); | ||
self.executor.spawn_blocking(move || match task { | ||
ProofTaskKind::BlindedAccountNode(path, sender) => { | ||
proof_task_tx.blinded_account_node(path, sender, tx_sender); | ||
} | ||
// Storage trie operations should never reach here as they're routed to worker pool | ||
ProofTaskKind::BlindedStorageNode(_, _, _) | ProofTaskKind::StorageProof(_, _) => { | ||
unreachable!("Storage trie operations should be routed to worker pool") | ||
} | ||
}); | ||
spawned_workers += 1; | ||
|
||
tracing::debug!( | ||
target: "trie::proof_task", | ||
worker_id, | ||
spawned_workers, | ||
worker_type = %worker_type, | ||
"{} worker spawned successfully", worker_type | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the Fn
here is used to be generic over storage and account workers, and have only one function for spawning worker pools. But IMO this makes it harder to follow what's going on. What would be more clear to me, would be structs that encapsulate:
- an account worker / loop
- a storage worker / loop
And either duplicating the spawning code or finding another way to abstract the tasks
This PR builds on top of #18887 by adding worker pooling to account and blinded account proofs, which allows us to reduce overhead from spawning the worker.
Changes:
Core logic:
Issues closed:
ProofTaskManager
for account proofs fromMultiProofTask
#18734ProofTaskManager
for Blinded Node Migration #18833References:
Next Steps: