Skip to content

Conversation

@joshieDo
Copy link
Collaborator

@joshieDo joshieDo commented Nov 6, 2025

Before transactions can be pruned, we need to ensure we won't delete any transactions that the tx lookup job will need for its own run.

@joshieDo joshieDo requested a review from mattsse November 6, 2025 16:43
@joshieDo joshieDo self-assigned this Nov 6, 2025
@joshieDo joshieDo added C-bug An unexpected or incorrect behavior A-pruning Related to pruning or full node labels Nov 6, 2025
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Nov 6, 2025
Comment on lines +89 to +95
return Ok(SegmentOutput {
progress: PruneProgress::Finished,
pruned: 0,
checkpoint: input
.previous_checkpoint
.map(SegmentOutputCheckpoint::from_prune_checkpoint),
})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by and unsure.

on one hand returning an empty checkpoint is wrong, on the other deleted_headers is empty because input.to_block falls in the middle of a static file (eg. block 10 in a range 0..=499_999k)

opted for finished, but could also be not_done with a interrupt reason, thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we checking deleted_headers here?

let deleted_headers = provider
            .static_file_provider()
            .delete_segment_below_block(StaticFileSegment::Transactions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure what the impact of this is

Copy link
Collaborator Author

@joshieDo joshieDo Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted_headers refers to a list of SegmentHeader { block_range, tx_range} of static files that we were able to delete.

if the prune target is a block in the middle of a static file, we would return empty here

@joshieDo joshieDo force-pushed the joshie/coordinate-tx-pruning branch from 7c106f5 to 718feb9 Compare November 6, 2025 16:47
@joshieDo joshieDo marked this pull request as ready for review November 6, 2025 16:49
@joshieDo joshieDo requested a review from shekhirin as a code owner November 6, 2025 16:49
Comment on lines 48 to 52
// Check if tx_lookup will eventually need to prune any blocks <= our target
Ok(tx_lookup_mode
.next_pruned_block(tx_lookup_checkpoint)
.is_some_and(|next_block| next_block <= target)
.not())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it not enough to just check tx_lookup_checkpoint >= target here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could have a Before(100) for txlookup and a Before(200) for bodies. (lets forget sf ranges for now)

our tx_lookup_checkpoint would always be 99 and bodies would never run

Comment on lines +89 to +95
return Ok(SegmentOutput {
progress: PruneProgress::Finished,
pruned: 0,
checkpoint: input
.previous_checkpoint
.map(SegmentOutputCheckpoint::from_prune_checkpoint),
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we checking deleted_headers here?

let deleted_headers = provider
            .static_file_provider()
            .delete_segment_below_block(StaticFileSegment::Transactions

Comment on lines +89 to +95
return Ok(SegmentOutput {
progress: PruneProgress::Finished,
pruned: 0,
checkpoint: input
.previous_checkpoint
.map(SegmentOutputCheckpoint::from_prune_checkpoint),
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure what the impact of this is

Comment on lines 78 to 81
return Ok(SegmentOutput::not_done(
PruneInterruptReason::WaitingOnSegment(PruneSegment::TransactionLookup),
input.previous_checkpoint.map(SegmentOutputCheckpoint::from_prune_checkpoint),
))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this I understand

@joshieDo joshieDo marked this pull request as draft November 7, 2025 12:27
@joshieDo joshieDo force-pushed the joshie/coordinate-tx-pruning branch from c04a960 to 5c00ece Compare November 7, 2025 12:50
@joshieDo joshieDo marked this pull request as ready for review November 7, 2025 12:50
@joshieDo joshieDo marked this pull request as draft November 7, 2025 13:01
@joshieDo joshieDo marked this pull request as ready for review November 7, 2025 13:11
Comment on lines +127 to +129
// The highest block number in the deleted files is the actual checkpoint. to_block might
// refer to a block in the middle of a undeleted file.
let checkpoint_block = deleted_headers
Copy link
Collaborator Author

@joshieDo joshieDo Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if checkpoint_block != input.to_block we can't return PruneProgress::not_done, otherwise the PrunerStage would get stuck trying over and over again

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more question

Comment on lines +92 to +97
/// # Examples
///
/// - `Before(10)` with checkpoint at block 5 returns `Some(6)`
/// - `Before(10)` with checkpoint at block 9 returns `None` (done)
/// - `Distance(100)` with checkpoint at block 1000 returns `Some(1001)` (always has more)
/// - `Full` always returns the next block after checkpoint
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are helpful

pub const fn next_pruned_block(&self, checkpoint: Option<BlockNumber>) -> Option<BlockNumber> {
let next = match checkpoint {
Some(c) => c + 1,
None => 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we start at genesis here or 1?

Comment on lines +75 to +78
// Transaction lookup (should be run before bodies)
.segment_opt(transaction_lookup.map(TransactionLookup::new))
// Bodies
.segment_opt(bodies_history.map(|mode| Bodies::new(mode, transaction_lookup)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add more note for dummies: because we still need bodies for txlookup

Comment on lines +20 to +22
/// Transaction lookup prune mode. Used to determine if we need to wait for tx lookup pruning
/// before deleting transaction bodies.
tx_lookup_mode: Option<PruneMode>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, now this makes sense to me

Comment on lines +46 to +48
let tx_lookup_checkpoint = provider
.get_prune_checkpoint(PruneSegment::TransactionLookup)?
.and_then(|cp| cp.block_number);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also have the uncommitted txlookip checkpoint, right?

even if not, this wouldnt be an issue because this would then just trail 1 prune run behind

// Determine the safe prune target, if any.
let to_block = match tx_lookup_mode.next_pruned_block(tx_lookup_checkpoint) {
None => Some(input.to_block), /* tx_lookup is done */
Some(tx_lookup_next) if tx_lookup_next > input.to_block => Some(input.to_block), /* tx_lookup is ahead */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we document what this means exactly,

basically why can't we also prune up to tx_lookup_next?

let pruned = tx_ranges.clone().map(|range| range.len()).sum::<u64>() as usize;

// The highest block number in the deleted files is the actual checkpoint. to_block might
// refer to a block in the middle of a undeleted file.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// refer to a block in the middle of a undeleted file.
// refer to a block in the middle of an undeleted file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-pruning Related to pruning or full node C-bug An unexpected or incorrect behavior

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

4 participants