Simplex reconfiguration mechanism specification #4124

Open
wants to merge 1 commit into
base: master

Conversation

Contributor

@yacovm commented Jul 27, 2025

Why this should be merged

Specifies the Simplex reconfiguration mechanism.

Preview is available here.

How this works

Just documentation

How this was tested

CI

Need to be documented in RELEASES.md?

No

@Copilot Copilot AI review requested due to automatic review settings July 27, 2025 21:16
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces comprehensive documentation for the Simplex L1 reconfiguration mechanism, detailing how the consensus protocol handles validator set changes and epoch management.

  • Defines the distinction between ICM epochs and Simplex epochs with their respective purposes and behaviors
  • Introduces the Metadata State-Machine (MSM) for managing consensus protocol state transitions
  • Specifies the onboarding process for new nodes to validate blocks across different epochs

Comments suppressed due to low confidence (1)

simplex/reconfiguration.md:97

  • The reference to PChainHeight should be PChainReferenceHeight to match the field name defined in the structure above
- The validator set of the epoch numbered `EpochNumber` is therefore derived from `PChainHeight`.

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
@yacovm yacovm force-pushed the simplex-reconfig branch from fe15094 to ffb7341 Compare July 27, 2025 21:19

- The `PChainReferenceHeight` hasn't changed from the previous block $B_i$.
- `NextPChainReferenceHeight > PChainReferenceHeight`.
- They have locally the P-chain height corresponding to `NextPChainReferenceHeight`.
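
As a rough illustration of the three conditions quoted above, here is a minimal Go sketch. The `Metadata` struct layout, the function name, and the `localPChainHeight` parameter are assumptions made for this example and are not taken from the PR; in particular, "having the P-chain height locally" is modelled as the local height being at least `NextPChainReferenceHeight`. The heights 100, 150 and 151 are illustrative.

```go
package main

import "fmt"

// Metadata holds only the two fields named in the quoted conditions.
// The struct layout is an assumption made for illustration.
type Metadata struct {
	PChainReferenceHeight     uint64
	NextPChainReferenceHeight uint64
}

// sealingProposalAcceptable is a hypothetical helper that checks the three
// quoted conditions. localPChainHeight stands in for the highest P-chain
// height this node has locally (an assumption about how "having the height
// locally" could be modelled).
func sealingProposalAcceptable(prev, cur Metadata, localPChainHeight uint64) bool {
	// Condition 1: the reference height did not change from the previous block.
	if cur.PChainReferenceHeight != prev.PChainReferenceHeight {
		return false
	}
	// Condition 2: the next reference height moves strictly forward.
	if cur.NextPChainReferenceHeight <= cur.PChainReferenceHeight {
		return false
	}
	// Condition 3: the node already knows the proposed P-chain height.
	return localPChainHeight >= cur.NextPChainReferenceHeight
}

func main() {
	prev := Metadata{PChainReferenceHeight: 100}
	cur := Metadata{PChainReferenceHeight: 100, NextPChainReferenceHeight: 151}
	fmt.Println(sealingProposalAcceptable(prev, cur, 150)) // node is behind
	fmt.Println(sealingProposalAcceptable(prev, cur, 151)) // node has caught up
}
```

A node that is still at height 150 would not accept the proposal in this sketch, which is the situation the review comment below asks about.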
Contributor

Could it often happen that a node has just received a new P-Chain height and sends a block before a quorum of nodes has synced that P-Chain height? I'm concerned this could lead to many nodes declaring blocks invalid, then timing out, only to repeat this process with the next block.

Contributor Author

Yeah. What will happen as a result is that the block proposal is rejected and we immediately agree on the empty block (this is the reason for making this optimization), and the next leader will propose the block. Eventually we will get to a point where we have a correct leader and a quorum of nodes with that P-chain height.

I did consider this problem, and while I think there is a heuristic that may minimize the chance we encounter it (only advance the epoch once enough time has passed since the timestamp of the P-chain block), I don't think it's really a problem, as it has zero effect on user experience if we do the empty block agreement right away and don't time out.

By the way, I think we have a similar problem for the proposerVM:

A block must have a PChainHeight that is less or equal to current P-Chain height.

Consider for example, the Simplex consensus protocol, where a block may be notarized in some round but not finalized in that round,
but only in later rounds.
In such a case, not enough nodes will finalize the block in that round, and in order for that block to be finalized, additional blocks need to be built on top of it,
and finalized. It is only when the additional blocks are finalized, that the aforementioned block can be considered as finalized.
Contributor

I think this is weirdly worded, as it implies blocks of round i + 1 must be finalized before blocks of round i. If I get this point, what you're saying is that we may see that i + 1 was finalized but not i, meaning we will initiate the replication process to finalize i.

Contributor Author

It is only weirdly worded because you know the internals of how our Simplex implementation works.
We collect finalizations recursively because we commit blocks along with the finalization certificate in a single API call.

We only collect finalization certificates on each block because it gives built-in validity for block replication, and also because we could build a threshold VRF for this in the future.

However, there could be other implementations of Simplex where you only collect finalization certificates for blocks in rounds for which an empty block was not notarized, and the application above the consensus layer doesn't care whether a finalization certificate exists for each block sequence.

Consider for instance, an SQL database replicated via Simplex. It only needs total order among its transactions so it matters not if each block has a finalization certificate or not.
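
To make the point about total order concrete, here is a minimal Go sketch of "implicit" finalization. It assumes a single linear chain in which a block at a higher sequence extends all blocks at lower sequences, so a finalization certificate on a later block is enough to treat earlier blocks as finalized. The function name and the slice of certified sequences are hypothetical and are not part of any Simplex implementation's API.

```go
package main

import "fmt"

// isFinalized reports whether the block at seq can be treated as finalized,
// given the sequences for which a finalization certificate was collected.
// Assumption: blocks form one linear chain, so a certificate at sequence c
// also covers every sequence less than or equal to c.
func isFinalized(seq uint64, certifiedSeqs []uint64) bool {
	for _, c := range certifiedSeqs {
		if c >= seq {
			return true
		}
	}
	return false
}

func main() {
	// Only the block at sequence 7 has its own certificate.
	certified := []uint64{7}

	fmt.Println(isFinalized(5, certified)) // covered by the certificate at 7
	fmt.Println(isFinalized(8, certified)) // nothing at or above 8 yet
}
```

An application that only needs total order, such as the replicated database in the example, could rely on a check of this kind instead of requiring a certificate per block.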


Since telocks are purged once an epoch changes, their block sequence numbers are also available for reuse in the next epoch.
For example, if in epoch $e$ the sealing block is $B_k$, there can be several telocks $B_{k+1}, B_{k+2}, \ldots, B_{k+l}$ belonging to epoch $e$.
In the successive epoch $k$, the first block will have sequence $k+1$, the second block will have sequence $k+2$, and so on.
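
The sequence reuse described above can be sketched in Go as follows. Both helper names are hypothetical, and representing blocks by their bare sequence numbers is a simplification made for this example; the value 30 for the sealing sequence is illustrative.

```go
package main

import "fmt"

// purgeTelocks keeps only the sequences up to and including the sealing
// block; everything after it (the telocks) is dropped on epoch change.
func purgeTelocks(seqs []uint64, sealingSeq uint64) []uint64 {
	kept := make([]uint64, 0, len(seqs))
	for _, s := range seqs {
		if s <= sealingSeq {
			kept = append(kept, s)
		}
	}
	return kept
}

// nextEpochFirstSeq returns the sequence of the first block of the next
// epoch, which reuses the sequence of the first purged telock.
func nextEpochFirstSeq(sealingSeq uint64) uint64 {
	return sealingSeq + 1
}

func main() {
	const k = 30                          // sealing block sequence (illustrative)
	epochE := []uint64{28, 29, 30, 31, 32} // 31 and 32 are telocks

	fmt.Println(purgeTelocks(epochE, k))
	fmt.Println(nextEpochFirstSeq(k))
}
```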
Contributor

what about the rounds of the block?

Contributor Author

Good point, so I think the same holds for round numbers. I will add it in the next revision.

1. If the current block (the block being built) is the sealing block of the current epoch, `SealingBlockSeq` is set to the current block sequence
and `NextPChainReferenceHeight` is set to be the P-chain height sampled.
2. If the current block isn't the last block of the current epoch, it is either in the epoch of the sealing block or in the successive epoch:
- 2.1 If `SealingBlockSeq` of the previous block and is smaller than the current block sequence but is greater than `0`, then:
Contributor

Suggested change
- 2.1 If `SealingBlockSeq` of the previous block and is smaller than the current block sequence but is greater than `0`, then:
- 2.1 If the `SealingBlockSeq` of the previous block is smaller than the current block sequence but is greater than `0`, then:

and `NextPChainReferenceHeight` is set to be the P-chain height sampled.
2. If the current block isn't the last block of the current epoch, it is either in the epoch of the sealing block or in the successive epoch:
- 2.1 If `SealingBlockSeq` of the previous block and is smaller than the current block sequence but is greater than `0`, then:
- 2.1.1 If the finalization certificate of the sealing block is available, then the MSM sets `SealingBlockSeq` to be 0,
Contributor

I'm assuming we plan on updating the MSM state of finalizations on Storage.Index?

Contributor Author

No, I think the state will be updated not only upon Index but also as we build blocks on top of other blocks.
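
For readers following the rules quoted above, here is a minimal Go sketch that covers only the parts visible in the excerpt: rule 1 and the first half of rule 2.1.1. The function name and the `isSealing`, `sampledPChainHeight`, and `sealingFinalized` parameters are hypothetical. How a block is determined to be the sealing block is not shown in the excerpt, so it is passed in as a flag, and whatever rule 2.1.1 does after resetting `SealingBlockSeq` is cut off in the quote and is deliberately left out.

```go
package main

import "fmt"

// Metadata uses the field names that appear in the quoted spec excerpts.
// The struct layout itself is an assumption.
type Metadata struct {
	PChainReferenceHeight     uint64
	NextPChainReferenceHeight uint64
	EpochNumber               uint64
	SealingBlockSeq           uint64
}

// nextMetadata applies only the rules visible in the quoted excerpt.
func nextMetadata(prev Metadata, curSeq uint64, isSealing bool,
	sampledPChainHeight uint64, sealingFinalized bool) Metadata {
	next := prev
	switch {
	case isSealing:
		// Rule 1: record the sealing block and the sampled P-chain height.
		next.SealingBlockSeq = curSeq
		next.NextPChainReferenceHeight = sampledPChainHeight
	case prev.SealingBlockSeq > 0 && prev.SealingBlockSeq < curSeq && sealingFinalized:
		// Rule 2.1.1 (partial): the sealing block is finalized, so reset it.
		// Further updates in this rule are truncated in the excerpt and
		// are not modelled here.
		next.SealingBlockSeq = 0
	}
	return next
}

func main() {
	prev := Metadata{PChainReferenceHeight: 100}

	sealed := nextMetadata(prev, 30, true, 151, false)
	fmt.Printf("%+v\n", sealed)

	after := nextMetadata(sealed, 31, false, 0, true)
	fmt.Printf("%+v\n", after)
}
```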

```
{
PChainReferenceHeight: 151
EpochNumber: 31
```
Contributor

should the epoch be 30 since it is set to the sequence of the sealing block?

Contributor Author

yeah, oops...

Contributor

would it make sense to move this to simplex/docs/reconfiguration.md just to avoid file bloat in the simplex package? or simplex/specs/..?

Contributor Author

yeah I can move it in a follow-up PR, sure.
