Skip to content

quic: Resurrect prq svc queue #5481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

akhinvasara-jumptrading
Copy link
Contributor

@akhinvasara-jumptrading akhinvasara-jumptrading commented Jun 27, 2025

This PR re-applies the previously reverted commit to use a prq as the underlying data structure for the quic service queue. It also fixes the associated bugs that caused the initial reversion. This has been tested on both testnet and mainnet validators.

Probably worth looking at the commits separately - the first is just reverting the revert.

This reverts commit 543e6f0fb72451eb6294f249c135322e05b8a50d.
@@ -2020,7 +2020,6 @@ fd_quic_lazy_ack_pkt( fd_quic_t * quic,
( ( pkt->enc_level == fd_quic_enc_level_initial_id ) |
( pkt->enc_level == fd_quic_enc_level_handshake_id ) );
if( ack_sz_threshold_hit | force_instant_ack ) {
conn->unacked_sz = 0UL;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already reset after the acks are actually sent out, and shouldn't be here.

@@ -2462,6 +2467,11 @@ fd_quic_process_packet( fd_quic_t * quic,
}

rc = fd_quic_process_quic_packet_v1( quic, &pkt, cur_ptr, cur_sz );
if( FD_UNLIKELY( fd_quic_svc_cnt_events( state->svc_timers ) != quic->metrics.conn_active_cnt ) ) {
FD_LOG_WARNING(( "only %lu out of %lu connections are in timer",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love some feedback on these checks/logs - it's a good invariant to 'test' live, and fairly cheap, but probs not super useful for operators?

@@ -3573,7 +3593,8 @@ fd_quic_conn_tx( fd_quic_t * quic,
ulong now = fd_quic_get_state( quic )->now;

/* initialize expiry and tx_time */
fd_quic_pkt_meta_t pkt_meta_tmpl[1] = {{.expiry = now+500000000UL, .tx_time = now}};
ulong const retx_timeout = 1250UL*1000000UL; /* 1250M ticks ~ 500 ms?*/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on the tick rate. This should probably be replaced soon by the PTO stuff anyway, so thought simpler/safer for now to just hardcode ticks instead of throwing the conversion in there?? Open to changing that also!

      reset next_timeout in svc_schedule

      increase retx timeout to 1250M ticks

      rm unnecessary unacked_sz reset
      WARN if svc_timer_cnt != active_conn_cnt

      test_quic_conformance: update for service heap
@ripatel-fd
Copy link
Contributor

ripatel-fd commented Jul 2, 2025

@akhinvasara-jumptrading Kevin is working on a new fast nanosecond-based clock source. This allows us to retire fd_tickcount(). We might want to wait for that feature to arrive before merging this, because it eliminates the time scaling class of bugs. (No strong opinion on this though. If you think it's a good idea to merge this now, we shall do so.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants