@yongkangc yongkangc commented Oct 16, 2025

Problem

During parallel proof computation, the miss ratio was 85.99%: most accounts encountered during the account trie walk had not been pre-dispatched to parallel workers. Each miss forced synchronous database access on the critical path.

Metrics Before Fix:

  • Miss ratio: 85.99%
  • Missed leaves: 3,291/sec
  • Precomputed: 527/sec
  • StoragesTrie DB ops: 24,851/sec
  • Total DB ops: 92,958/sec
  • page_get_inline (MDBX): 6.7% CPU time in flamegraph

Root Cause:
Only accounts in targets (modified accounts from block) were pre-dispatched:

dispatch_storage_proofs(&storage_work_tx, &input.targets, ...)

But the trie walk encounters ALL accounts in the prefix set (modified + witnesses), so 86% of accounts triggered synchronous fallback at proof_task.rs:914-957.
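To make the miss mechanism concrete, here is a minimal sketch of a walk that looks up each account in a map of precomputed storage roots and falls back to a synchronous computation on a miss. Everything in it is a hypothetical stand-in (`Address`, `StorageRoot`, `WalkStats`, `walk_accounts`, `compute_storage_root_sync`); it is not the code in `proof_task.rs`, only a model of the behaviour described above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a hashed account address.
type Address = u64;
/// Hypothetical stand-in for a storage root hash.
type StorageRoot = u64;

/// Counters modelled on the "precomputed" and "missed leaves" metrics.
#[derive(Debug, Default, PartialEq)]
struct WalkStats {
    precomputed: usize,
    missed: usize,
}

impl WalkStats {
    /// Fraction of walked accounts that needed the synchronous fallback.
    fn miss_ratio(&self) -> f64 {
        let total = self.precomputed + self.missed;
        if total == 0 {
            0.0
        } else {
            self.missed as f64 / total as f64
        }
    }
}

/// Placeholder for the expensive synchronous path (assumed to hit the
/// database in the real code); here it is just cheap arithmetic.
fn compute_storage_root_sync(account: Address) -> StorageRoot {
    account.wrapping_mul(31)
}

/// Walks every account, using a precomputed root when one exists and
/// falling back to the synchronous computation otherwise.
fn walk_accounts(
    walked: &[Address],
    precomputed: &HashMap<Address, StorageRoot>,
) -> (Vec<StorageRoot>, WalkStats) {
    let mut stats = WalkStats::default();
    let mut roots = Vec::with_capacity(walked.len());
    for account in walked {
        match precomputed.get(account) {
            Some(root) => {
                stats.precomputed += 1;
                roots.push(*root);
            }
            None => {
                stats.missed += 1;
                roots.push(compute_storage_root_sync(*account));
            }
        }
    }
    (roots, stats)
}
```

If only 2 of 10 walked accounts were pre-dispatched, the sketch reports 8 misses, which is the same shape of problem as the 86% figure above.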

Goal:

The change aims to ensure every account encountered during the trie walk has its storage root pre-computed in parallel.

…set accounts

Aims to reduce the "missed leaf" ratio from ~86% to roughly 5% by dispatching
storage proofs for ALL accounts in storage_prefix_sets instead of only those in
targets.

## Problem

Previously, only accounts in `targets` (modified accounts) were pre-dispatched
to parallel workers for storage proof computation. However, during the account
trie walk, we encounter many more accounts that are part of the proof path
(unmodified witness accounts). This caused:

- 85.99% miss ratio (3,291 missed leaves/sec vs 527 precomputed/sec)
- 24,851 synchronous StoragesTrie DB ops/sec
- 92,958 total DB ops/sec
- page_get_inline showing 6.7% in flamegraph

## Solution

Dispatch storage proofs for ALL accounts in `storage_prefix_sets` before
starting the account trie walk. This ensures every account encountered during
the walk has a pre-computed storage root ready.
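The idea of computing every storage root up front, in parallel, before the walk starts can be sketched with standard-library threads and a channel. This is an illustration under assumptions: `precompute_storage_roots` and `compute_storage_root` are hypothetical names, the root computation is a placeholder, and the real change reuses the existing worker infrastructure rather than spawning threads like this.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a hashed account address.
type Address = u64;
/// Hypothetical stand-in for a storage root hash.
type StorageRoot = u64;

/// Placeholder for the per-account storage root computation.
fn compute_storage_root(account: Address) -> StorageRoot {
    account.wrapping_mul(31)
}

/// Computes a storage root for every given account on up to `workers`
/// threads and returns the results keyed by account, ready for the walk.
fn precompute_storage_roots(
    accounts: &[Address],
    workers: usize,
) -> HashMap<Address, StorageRoot> {
    let workers = workers.max(1);
    let (result_tx, result_rx) = mpsc::channel();

    thread::scope(|scope| {
        // Split the accounts into roughly equal chunks, one per worker.
        let chunk_len = ((accounts.len() + workers - 1) / workers).max(1);
        for chunk in accounts.chunks(chunk_len) {
            let result_tx = result_tx.clone();
            scope.spawn(move || {
                for &account in chunk {
                    // The receiver outlives the scope, so send cannot fail here.
                    let _ = result_tx.send((account, compute_storage_root(account)));
                }
            });
        }
    });

    // All workers have finished; drop the last sender so the receiver ends.
    drop(result_tx);
    result_rx.into_iter().collect()
}
```

With the map fully populated before the walk begins, every lookup during the walk would be a hit, at the cost of computing roots for accounts the walk may never need.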

## Expected Impact

- Miss ratio: 86% → ~5%
- Missed leaves: 3,291/sec → ~200/sec (94% reduction)
- StoragesTrie ops: 24,851/sec → ~1,600/sec (93% reduction)
- Total DB ops: 92,958/sec → ~65,000/sec (30% reduction)
- page_get_inline CPU: 6.7% → ~2-3% (50%+ reduction)
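The reduction percentages in this list follow from the before/after rates. A small helper (hypothetical name `percent_reduction`) makes the arithmetic checkable; it only restates the projected numbers above and says nothing about measured results.

```rust
/// Percentage by which `after` is lower than `before`.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Projected reductions for missed leaves, StoragesTrie ops and total DB ops,
/// using the before/after rates quoted in the list above.
fn projected_reductions() -> [f64; 3] {
    [
        percent_reduction(3_291.0, 200.0),
        percent_reduction(24_851.0, 1_600.0),
        percent_reduction(92_958.0, 65_000.0),
    ]
}
```

The three values come out at roughly 94%, 94% and 30%, in line with the rounded figures in the list.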

## Changes

Modified `proof_task.rs:322-343` to:
1. Collect ALL account addresses from `storage_prefix_sets`
2. Merge with target slots from `input.targets`
3. Dispatch storage proofs for the combined set

This leverages existing parallel infrastructure without architectural changes.
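The three steps can be sketched as follows. The types are simplified assumptions (`Address` and `Slot` as plain integers, the prefix sets reduced to a set of account keys), and `build_dispatch_set` and `dispatch_combined` are hypothetical helpers, not the signatures used in `proof_task.rs`.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

/// Hypothetical stand-in for a hashed account address.
type Address = u64;
/// Hypothetical stand-in for a hashed storage slot.
type Slot = u64;

/// Steps 1 and 2: take every account named in the storage prefix sets,
/// then merge in the target slots. Accounts that appear only in the prefix
/// sets get an empty slot set (assumed to mean "storage root only").
fn build_dispatch_set(
    targets: &HashMap<Address, HashSet<Slot>>,
    storage_prefix_accounts: &HashSet<Address>,
) -> BTreeMap<Address, BTreeSet<Slot>> {
    let mut combined: BTreeMap<Address, BTreeSet<Slot>> = BTreeMap::new();
    for account in storage_prefix_accounts {
        combined.entry(*account).or_default();
    }
    for (account, slots) in targets {
        combined
            .entry(*account)
            .or_default()
            .extend(slots.iter().copied());
    }
    combined
}

/// Step 3: hand each account and its slots to a dispatch callback and
/// return how many accounts were dispatched.
fn dispatch_combined<F>(combined: &BTreeMap<Address, BTreeSet<Slot>>, mut dispatch: F) -> usize
where
    F: FnMut(Address, &BTreeSet<Slot>),
{
    for (account, slots) in combined {
        dispatch(*account, slots);
    }
    combined.len()
}
```

A `BTreeMap` is used here only to make the dispatch order deterministic in the sketch; the merge itself does not depend on ordering.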
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Oct 16, 2025
@yongkangc yongkangc moved this from Backlog to In Review in Reth Tracker Oct 16, 2025
@yongkangc yongkangc moved this from In Review to Done in Reth Tracker Oct 16, 2025
@yongkangc
Member Author

This was an experiment that didn't give good results.

@yongkangc yongkangc closed this Oct 16, 2025
@yongkangc yongkangc self-assigned this Oct 16, 2025