Skip to content

Conversation

sstone
Copy link
Member

@sstone sstone commented Jun 9, 2025

This PR implements lightning/bolts#995 which introduces a new channel format where funding transactions send to an aggregated musig2 public key instead of a 2-of-2 multisig address:

funding and closing transactions become cheaper (by about 15%)
on-chain footprint becomes more private: funding and closing transactions are impossible to distinguish from other p2tr transactions.

This PR is compatible with lightning/bolts@1ad02ee with the assumption that option_simple_close is mandatory for simple taproot channels (which is not yet reflected in the BOLT proposal).

Support for dual-funding and splicing has also been implemented through extensions to the interactive transaction construction protocol.

EDIT: This PR also includes code to upgrade channels to the new taproot commitment format during a splice.

@sstone sstone force-pushed the simple-taproot-channels-v3 branch 5 times, most recently from 8292732 to 4c933f6 Compare June 19, 2025 08:02
@sstone sstone force-pushed the simple-taproot-channels-v3 branch 5 times, most recently from f45142a to 7e101ce Compare June 30, 2025 16:20
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice that this doesn't require touching too many parts of the codebase now that we've done some refactoring in previous PRs, it's easy to focus on the important taproot-related features!

Overall, I think the main issue is that there is a lot of unsafe patterns where you introduce require, casting to Right(_), .get on Options and throwing exceptions. This must be improved: we cannot risk having an exception in most of those cases, because it would mean that the actor would crash, which could potentially be exploited to prevent us from publishing penalty transactions if our peer published a revoked commitment while our channel actor constantly crashes.

One of the culprits for that is the partial signing function and the signature aggregation: we must either make the functions that call them return an Either (bubble up the failure), or if we know that we've previously validated the input and it will always work, we should change the low-level signing function to always return a partial signature without ever throwing.

Similarly, when we depend on remote nonces, we should probably use different function overloads that take a non-optional nonce (to prove that we've verified that it existed before calling the function) instead of doing a .get inside the function. That may require creating different function overloads for taproot commitment formats in some cases, but that seems acceptable?

@sstone
Copy link
Member Author

sstone commented Jul 1, 2025

Overall, I think the main issue is that there is a lot of unsafe patterns where you introduce require, casting to Right(_), .get on Options and throwing exceptions. This must be improved: we cannot risk having an exception in most of those cases, because it would mean that the actor would crash, which could potentially be exploited to prevent us from publishing penalty transactions if our peer published a revoked commitment while our channel actor constantly crashes.

One of the culprits for that is the partial signing function and the signature aggregation: we must either make the functions that call them return an Either (bubble up the failure), or if we know that we've previously validated the input and it will always work, we should change the low-level signing function to always return a partial signature without ever throwing.

Yes, I first wanted to get everything to work including splices and changes required for the interactive sessions, but I'll change this to safer code now.

Similarly, when we depend on remote nonces, we should probably use different function overloads that take a non-optional nonce (to prove that we've verified that it existed before calling the function) instead of doing a .get inside the function. That may require creating different function overloads for taproot commitment formats in some cases, but that seems acceptable?

I had something similar in mind.

By far the biggest issue I had was nonce management, especially with splices and re-connections, it got much easier when we switched to sending funding tx id -> nonce maps.

@sstone sstone force-pushed the simple-taproot-channels-v3 branch 2 times, most recently from 2755e8d to f74c1e6 Compare July 10, 2025 14:26
@sstone sstone force-pushed the simple-taproot-channels-v3 branch from f74c1e6 to 227a8e6 Compare July 15, 2025 17:36
@t-bast
Copy link
Member

t-bast commented Jul 16, 2025

Here is a (non-exhaustive) list of important tests that must be added to ensure that we have good enough coverage compared to other commitment formats:

  • add unit tests of the various stages of channel opening (for single-funding and dual-funding), verifying in particular that nonces are different every time and that we handle the cases where nonces or partial sigs are missing
  • add tests for the interactive-tx process: verify that nonces change every time a new tx_add_input / tx_add_output is added, signatures work, that we reject invalid stuff, etc
  • add tests for reconnection while a taproot interactive-tx is being signed (commit_sig / tx_signatures retransmission)
  • add tests for normal channel operations (adding HTLCs, fulfill / fail, etc)
  • verify that we correctly force-close if a commit_sig is missing partial sigs, or if commit_sig or revoke_and_ack is missing nonces
  • add tests for reconnection during normal operation:
    • disconnect when one side has sent update_add_htlc and commit_sig but the other side has not received commit_sig (should be re-signed with new nonces)
    • disconnect when one side has sent commit_sig, the other side received it and sent revoke_and_ack, which hasn't been received yet (retransmist revoke_and_ack and re-sign commit_sig)
    • resume using the channel for HTLCs (I'm not sure we correctly clear the nonces maps on disconnection yet)
  • in ClosingStateSpec.scala, there are a lot of tests that we run for each commitment format: add one instance for each of the two taproot commitment formats
  • Commitment upgrade during a splice tests:
    • add test for the normal scenario where we upgrade from anchor outputs to taproot: exchange HTLCs after the splice before the taproot tx confirms (the batch of commit_sig messages will contain a mix of taproot and segwit v0 transactions)
    • add force-close tests for various scenarios, such as:
      • our peer force-closes using the pre-taproot commitment (non-revoked, with pending HTLCs in both directions)
      • our peer force-closes using the taproot commitment (non-revoked, with pending HTLCs in both directions)
      • our peer force-closes using an active pre-taproot revoked commitment (with pending HTLCs in both directions)
      • our peer force-closes using an inactive pre-taproot revoked commitment (with pending HTLCs in both directions)
      • our peer force-closes using a revoked taproot commitment (with pending HTLCs in both directions) while we also have a non-taproot active commitment (we haven't yet received splice_locked for the taproot migration)
      • our peer force-closes using an inactive revoked taproot commitment
  • in ChannelIntegrationSpec.scala, add a test suite for the taproot commitment format

One thing that could make sense would be to update many of the existing channel unit tests that currently use the default commitment format (which we plan to remove) to instead use taproot (mostly in NormalStateSpec): this would be less effort than writing new tests, would provide some test coverage, and would make it easier to remove the default commitment format in the future.

@sstone sstone force-pushed the simple-taproot-channels-v3 branch 3 times, most recently from 1f96546 to d1659b9 Compare July 22, 2025 17:42
@sstone sstone force-pushed the simple-taproot-channels-v3 branch 7 times, most recently from 3dadcc6 to be3eb9e Compare July 29, 2025 14:55
@sstone
Copy link
Member Author

sstone commented Jul 29, 2025

Here is a (non-exhaustive) list of important tests that must be added to ensure that we have good enough coverage compared to other commitment formats:

  • add unit tests of the various stages of channel opening (for single-funding and dual-funding), verifying in particular that nonces are different every time and that we handle the cases where nonces or partial sigs are missing

  • add tests for the interactive-tx process: verify that nonces change every time a new tx_add_input / tx_add_output is added, signatures work, that we reject invalid stuff, etc

  • add tests for reconnection while a taproot interactive-tx is being signed (commit_sig / tx_signatures retransmission)

  • add tests for normal channel operations (adding HTLCs, fulfill / fail, etc)

  • verify that we correctly force-close if a commit_sig is missing partial sigs, or if commit_sig or revoke_and_ack is missing nonces

  • add tests for reconnection during normal operation:

    • disconnect when one side has sent update_add_htlc and commit_sig but the other side has not received commit_sig (should be re-signed with new nonces)
    • disconnect when one side has sent commit_sig, the other side received it and sent revoke_and_ack, which hasn't been received yet (retransmist revoke_and_ack and re-sign commit_sig)
    • resume using the channel for HTLCs (I'm not sure we correctly clear the nonces maps on disconnection yet)
  • in ClosingStateSpec.scala, there are a lot of tests that we run for each commitment format: add one instance for each of the two taproot commitment formats

  • Commitment upgrade during a splice tests:

    • add test for the normal scenario where we upgrade from anchor outputs to taproot: exchange HTLCs after the splice before the taproot tx confirms (the batch of commit_sig messages will contain a mix of taproot and segwit v0 transactions)

    • add force-close tests for various scenarios, such as:

      • our peer force-closes using the pre-taproot commitment (non-revoked, with pending HTLCs in both directions)
      • our peer force-closes using the taproot commitment (non-revoked, with pending HTLCs in both directions)
      • our peer force-closes using an active pre-taproot revoked commitment (with pending HTLCs in both directions)
      • our peer force-closes using an inactive pre-taproot revoked commitment (with pending HTLCs in both directions)
      • our peer force-closes using a revoked taproot commitment (with pending HTLCs in both directions) while we also have a non-taproot active commitment (we haven't yet received splice_locked for the taproot migration)
      • our peer force-closes using an inactive revoked taproot commitment
  • in ChannelIntegrationSpec.scala, add a test suite for the taproot commitment format

One thing that could make sense would be to update many of the existing channel unit tests that currently use the default commitment format (which we plan to remove) to instead use taproot (mostly in NormalStateSpec): this would be less effort than writing new tests, would provide some test coverage, and would make it easier to remove the default commitment format in the future.

Almost all of these have been implemented by creating a superclass of the original test that uses taproot channels.
Upgrading NormalStateSpec first to use anchor outputs (and then taproot channels) would be a significant change that is better done is a specific PR.
ChannelIntegrationSpec tests rely on channel announcements and there is no clean way to upgrade them for taproot channels (which for now is restricted to private channels).

@t-bast
Copy link
Member

t-bast commented Aug 8, 2025

Almost all of these have been implemented by creating a superclass of the original test that uses taproot channels.
Upgrading NormalStateSpec first to use anchor outputs (and then taproot channels) would be a significant change that is better done is a specific PR.

Unfortunately, duplicating test suites like you did wasn't a good strategy: it gave a false sense of security but wasn't actually adding coverage where we need it. I've reverted most of those changes in #3136 (with more details about the motivation behind it) and added more focused tests (which let me find many subtle bugs).

We add new commitment formats and TLV extensions to include musig2 nonces. This includes a specific commitment format for phoenix
taproot channels.

The old v1 channel establishment protocol is updated to include nonces and partial signatures.

The v2 channel estalishment protocol, based on the interactive tx constuction protocol, is also updated, and the
interactive tx session now includes:
- an optional funding nonce for the shared input (i.e. the funding tx that is being spent)
- a nonce for the commit tx that is being created, and another nonce that will become the channel's "next remote nonce" once the session completes
The funding nonce is random and its lifecycle is bound to the interactive session.

Side note: the new v2 protocol is both simpler to extend and gives us support for dual-funding and splices.

Since there can be several different commitment transactions that valid at the same time while splices are pending, revoke_and_ack
and channel_restablish are extended to include a list of funding_tx_id -> nonce tuples (one for each active commitment).

channel_restablish also includes as an optional "current commit nonce": if we got disconnected while a splice was in progress
before both nodes exchanged their commit signatures: if that is the case, we need to re-send our peer's current signature
and will use this nonce to compute it.

We also update the simple close protocol to include closing nonces.

We allow upgrading channels to taproot during splices, with an optional channel_type TLV added to splice_init/splice_ack.
This is not part of the BOLT proposal, and upgrading is currently limited to phoenix taproot channels from phoenix anchor channels.
@sstone sstone force-pushed the simple-taproot-channels-v3 branch from bb2016a to f8a9860 Compare August 19, 2025 14:25
t-bast
t-bast previously approved these changes Aug 19, 2025
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM let's goooo 🚀

@t-bast t-bast marked this pull request as ready for review August 19, 2025 16:40
@sstone sstone merged commit d8ce91b into master Aug 19, 2025
1 of 5 checks passed
@sstone sstone deleted the simple-taproot-channels-v3 branch August 19, 2025 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants