-
Notifications
You must be signed in to change notification settings - Fork 1.9k
refactor(trie): remove proof task manager #18934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: refactor/remove-factory-generic
Are you sure you want to change the base?
Conversation
- Replaced the ProofTaskManager with a new spawn_proof_workers function for better clarity and maintainability. - Updated related code to utilize the new function, simplifying the worker spawning process. - Enhanced metrics tracking for storage and account proof requests, ensuring thread-safe operations. - Improved error handling and code structure across proof task implementations.
- Added a constant `MIN_WORKER_COUNT` to enforce a minimum number of workers for storage and account proof tasks. - Updated `default_storage_worker_count` and `default_account_worker_count` functions to utilize the new minimum constraint. - Enhanced setter methods in `TreeConfig` to ensure worker counts do not fall below the minimum. - Modified command-line argument parsing to validate worker counts against the minimum requirement.
- Added a debug assertion to ensure active_handles does not underflow when dropping a ProofTaskManagerHandle. - Implemented metrics recording to flush before exit when the last handle is dropped, enhancing monitoring capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the proof task management by removing the ProofTaskManager
abstraction and replacing it with direct worker pool spawning. The change eliminates the routing thread overhead by providing direct channel access to storage and account worker pools, simplifying the architecture while maintaining the same worker pool functionality.
Key changes:
- Replaced
ProofTaskManager
withspawn_proof_workers
function for direct worker spawning - Converted
ProofTaskManagerHandle
to provide type-safe queue methods with direct channel access - Updated metrics to use lock-free atomic counters for thread-safe operations
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
crates/trie/parallel/src/proof_task_metrics.rs | Converts metrics fields to atomic counters for lock-free thread safety |
crates/trie/parallel/src/proof_task.rs | Replaces ProofTaskManager with spawn_proof_workers function and updates handle interface |
crates/trie/parallel/src/proof.rs | Updates proof generation to use new direct queue methods |
crates/node/core/src/args/engine.rs | Adds minimum worker count validation to CLI arguments |
crates/engine/tree/src/tree/payload_validator.rs | Updates error message to reflect new spawning approach |
crates/engine/tree/src/tree/payload_processor/multiproof.rs | Updates multiproof manager to use new queue methods |
crates/engine/tree/src/tree/payload_processor/mod.rs | Replaces ProofTaskManager instantiation with spawn_proof_workers |
crates/engine/primitives/src/config.rs | Adds MIN_WORKER_COUNT constant and enforces minimum worker limits |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
2c025cd
to
2b90133
Compare
- Introduced helper functions to streamline error conversion from ProviderError and channel receive errors to SparseTrieError. - Enhanced readability and maintainability of the trie_node method by reducing repetitive error handling code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
debug_assert_ne!( | ||
previous_handles, 0, | ||
"active_handles underflow in ProofTaskManagerHandle::drop (previous={})", | ||
previous_handles | ||
); |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The debug assertion checks for underflow after the fetch_sub operation, but this creates a race condition. If multiple threads drop handles simultaneously, one could observe 0 while another decrements below 0. Move the check before fetch_sub or use compare_and_swap to prevent underflow.
Copilot uses AI. Check for mistakes.
- Introduced a `clamp_worker_count` function to centralize the logic for enforcing the minimum worker count. - Updated setter methods in `TreeConfig` to utilize the new clamping function, improving code readability and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
executor: Handle, | ||
view: ConsistentDbView<Factory>, | ||
task_ctx: ProofTaskCtx, | ||
storage_worker_count: usize, | ||
account_worker_count: usize, |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spawn_proof_workers
function lacks documentation for its parameters. Consider adding parameter documentation to explain what each argument does, especially the worker count parameters and their impact on performance.
Copilot uses AI. Check for mistakes.
impl TrieNodeProvider for ProofTaskTrieNodeProvider { | ||
fn trie_node(&self, path: &Nibbles) -> Result<Option<RevealedNode>, SparseTrieError> { | ||
let (tx, rx) = channel(); | ||
/// Helper to convert `ProviderError` to `SparseTrieError` | ||
fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError { | ||
SparseTrieErrorKind::Other(Box::new(std::io::Error::other(e.to_string()))).into() | ||
} | ||
|
||
/// Helper to convert channel recv error to `SparseTrieError` | ||
fn recv_err_to_trie_err(_: std::sync::mpsc::RecvError) -> SparseTrieError { | ||
SparseTrieErrorKind::Other(Box::new(std::io::Error::other("channel closed"))).into() | ||
} | ||
|
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These helper functions are defined inside the trie_node
method, which creates unnecessary code nesting. Consider moving these helper functions outside the method or to a module level for better maintainability and potential reuse.
Copilot uses AI. Check for mistakes.
/// Handle to the proof worker pool for storage proofs. | ||
storage_proof_task_handle: ProofTaskManagerHandle, | ||
/// Handle to the proof task manager used for account multiproofs. | ||
/// Handle to the proof worker pool for account multiproofs. | ||
account_proof_task_handle: ProofTaskManagerHandle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why do we need to separate them here, if it's always the same proof_task_handle
passed to both fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the note, addresssed them
let (tx, rx) = channel(); | ||
/// Helper to convert `ProviderError` to `SparseTrieError` | ||
fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError { | ||
SparseTrieErrorKind::Other(Box::new(std::io::Error::other(e.to_string()))).into() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we do SparseTrieErrorKind::Other(Box::new(e))
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, good note - addressed that as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed this in commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just have some nits regarding the usage of std::io::Error
and separate account/storage proof handles.
- Merged separate storage and account proof task handles into a single proof task handle for improved code clarity and maintainability. - Updated related methods to utilize the consolidated handle, streamlining the management of proof tasks.
- Updated the ProofTaskMetrics struct to derive Default, removing the manual implementation of the default method. - This change enhances code clarity and reduces boilerplate, while maintaining the same functionality.
- Updated the error conversion helper function in ProofTaskTrieNodeProvider to directly wrap the ProviderError, enhancing clarity and maintainability. - This change simplifies the error handling logic within the trie_node method.
@shekhirin thanks for the review, just addressed all your comments in the commits |
/// Helper to convert `ProviderError` to `SparseTrieError` | ||
fn provider_err_to_trie_err(e: ProviderError) -> SparseTrieError { | ||
SparseTrieErrorKind::Other(Box::new(e)).into() | ||
} | ||
|
||
/// Helper to convert channel recv error to `SparseTrieError` | ||
fn recv_err_to_trie_err(_: std::sync::mpsc::RecvError) -> SparseTrieError { | ||
SparseTrieErrorKind::Other(Box::new(std::io::Error::other("channel closed"))).into() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just do this without helper functions?
.map_err(|error| SparseTrieErrorKind::Other(Box::new(error)))?; |
RecvError
already implements Error
, so should work?
/// Clamps the worker count to the minimum allowed value. | ||
/// | ||
/// Ensures that the worker count is at least [`MIN_WORKER_COUNT`]. | ||
const fn clamp_worker_count(count: usize) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this just .max(MIN_WORKER_COUNT)
? Let's move this to with_*_worker_count
functions, no need to have a separate helper fn for this
Context:
closes: #18801
impact:
run()
loop threadreference PRs: