
Conversation

@wpaulino
Contributor

HolderCommitmentPoint currently tracks the current and next point used on counterparty commitments, which are unrevoked. When we reestablish a channel, the counterparty sends us the commitment height, along with the corresponding secret, for the state they believe to be the latest. We compare the point derived from that secret against the one we fetch from the signer to know whether the peer is being honest.

Since the protocol does not allow peers (assuming no data loss) to be behind the current state by more than one update, we can cache the two latest revoked commitment points alongside HolderCommitmentPoint, such that we no longer need to reach the signer asynchronously when handling channel_reestablish messages throughout the happy path. By doing so, we avoid complexity in needing to pause the state machine (which may also result in needing to stash any update messages from the counterparty) while the signer response is pending.

The only remaining case to handle is when the counterparty presents a channel_reestablish with a state later than what we know. This can only result in one of two terminal outcomes: either they provided a valid commitment secret proving we are behind, in which case we must panic, or they lied and we force close the channel. This is the only case we choose to handle asynchronously, as it's relatively trivial to do so.

@wpaulino wpaulino added this to the 0.3 milestone Oct 30, 2025
@wpaulino wpaulino requested a review from TheBlueMatt October 30, 2025 23:02
@wpaulino wpaulino self-assigned this Oct 30, 2025
@ldk-reviews-bot

ldk-reviews-bot commented Oct 30, 2025

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 86.74699% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.86%. Comparing base (d88f0f8) to head (7c25f35).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/ln/channel.rs | 80.61% | 19 Missing ⚠️ |
| lightning/src/ln/channelmanager.rs | 75.00% | 1 Missing and 2 partials ⚠️ |
Additional details and impacted files
```diff
@@           Coverage Diff            @@
##             main    #4197    +/-   ##
========================================
  Coverage   88.85%   88.86%
========================================
  Files         180      180
  Lines      137901   138038   +137
  Branches   137901   138038   +137
========================================
+ Hits       122533   122666   +133
- Misses      12553    12563    +10
+ Partials     2815     2809     -6
```
| Flag | Coverage | Δ |
|---|---|---|
| fuzzing | 21.43% <14.54%> | (-0.01%) ⬇️ |
| tests | 88.70% <86.74%> | (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.


@ldk-reviews-bot

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

```rust
.ok();
if expected_point.is_none() {
	self.context.signer_pending_stale_state_verification =
		Some((commitment_number, given_secret));
	return Err(ChannelError::Ignore(
		"Waiting on async signer to verify stale state proof".to_owned(),
	));
}
```
Collaborator

In practice I think this means we'll often never panic - the peer will reconnect, we'll ignore the message, then they'll send some other message which will cause us to, for example, `ChannelError::close("Got commitment signed message when channel was not in an operational state")`. We'll either have to add logic to ~every message handler to ignore the message while `signer_pending_stale_state_verification` is set, or we can just disconnect them here and let them sit in a reconnect loop until the signer resolves (which I think is fine?).

Contributor Author

Good point, ended up disconnecting. Is there any reason for us to close in those cases though? We could just make those ChannelError::close a WarnAndDisconnect instead.

Collaborator

No, those cases could definitely move to a warn-and-disconnect. Historically we've been pretty happy to just close if the peer does something dumb, and in 95% of the cases we've never seen peers do anything so dumb, so we've never really had a motivation to change it. Not crazy to do though.

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@TheBlueMatt
Collaborator

fwiw clippy is unhappy.

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 7c25f35 to 6b8123d Compare November 3, 2025 22:04
@TheBlueMatt
Collaborator

ln::async_signer_tests::test_async_force_close_on_invalid_secret_for_stale_state is failing in CI.

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 6b8123d to fa13381 Compare November 4, 2025 17:50
@wpaulino wpaulino requested a review from TheBlueMatt November 4, 2025 17:50