Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-claim forwarded HTLCs on startup #2364

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

@TheBlueMatt TheBlueMatt commented Jun 20, 2023

Because ChannelMonitorUpdates can complete asynchronously and
out-of-order now, a commitment_signed ChannelMonitorUpdate from
a downstream channel could complete prior to the preimage
ChannelMonitorUpdate on the upstream channel. In that case, we may
not get a update_fulfill_htlc replay on startup. Thus, we have to
ensure any payment preimages contained in that downstream update are
re-claimed on startup.

Here we do this during the existing walk of the ChannelMonitor
preimages for closed channels.

This was originally a part of #2167 but got dropped as it was buggy. Its now been fixed. This should ideally go in 116 so that async users of 117 can safely downgrade, but if it doesn't make it that's somewhat okay.

Based on #2362.

@codecov-commenter
Copy link

codecov-commenter commented Jun 20, 2023

Codecov Report

Patch coverage: 79.22% and project coverage change: +0.11 🎉

Comparison is base (0f2c4c0) 90.32% compared to head (192c5d2) 90.43%.

❗ Current head 192c5d2 differs from pull request most recent head 9ce7e8e. Consider uploading reports for the commit 9ce7e8e to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2364      +/-   ##
==========================================
+ Coverage   90.32%   90.43%   +0.11%     
==========================================
  Files         106      106              
  Lines       54948    58208    +3260     
  Branches    54948    58208    +3260     
==========================================
+ Hits        49633    52642    +3009     
- Misses       5315     5566     +251     
Impacted Files Coverage Δ
lightning/src/ln/channel.rs 89.42% <ø> (ø)
lightning/src/ln/channelmanager.rs 89.70% <79.22%> (+3.44%) ⬆️

... and 10 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

A few nits that came up in review to make the docs clearer, but not
anything super critical.
@TheBlueMatt
Copy link
Collaborator Author

Rebased.

Copy link
Contributor

@alecchendev alecchendev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks pretty good to me, just one comment

Comment on lines 4766 to 4772
// Note that while its safe to use `ClosingMonitorUpdateRegeneratedOnStartup` here (the
// channel is already closed) we need to ultimately handle the monitor update
// completion action only after we've completed the monitor update. This is the only
// way to guarantee this update *will* be regenerated on startup (otherwise if this was
// from a forwarded HTLC the downstream preimage may be deleted before we claim
// upstream). Thus, we need to transition to some new `BackgroundEvent` type which will
// complete the monitor update completion action from `completion_action`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused with this comment - I'm confused why only handling the completion_action after completing the monitor update is the only way to guarantee regenerating this update on startup? I'm thinking the completion_action that's used for claiming a forwarded HTLC upstream is just emitting a PaymentForwarded event right (I'm just looking at where this is used in claim_funds_internal), why would doing that before waiting for the monitor update to complete possibly mean the downstream preimage may be deleted before claiming upstream?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an HTLC which is forwarded is claimed, first we receive the preimage from the outbound edge, and we store the preimage in that channel's monitor. Then, we go and store the preimage in the inbound edge's monitor as well when we go to claim upstream. In the rare case that the outbound edge not only completes the initial monitor update but also one further monitor update, we will remove the preimage from that monitor and call it a day. By that point, we must make sure that the inbound edge's monitor has been safely updated and the preimage is durably in the previous channel, that is what the completion action does - unblocks the downstream channel monitor so that it can be updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm okay, that part makes sense, thanks.

Just checking, we only unblock the downstream channel monitor when handling a MonitorUpdateCompletionAction::EmitEventAndFreeOtherChannel where downstream_counterparty_and_funding_outpoint = Some(..) right?

I'm wondering, is this then talking about the case where we generate this BackgroundEvent::ClosingMonitorUpdateRegeneratedOnStartup on startup, but then because we immediately handle the completion_action (instead of having a different kind of BackgroundEvent that does the completion_action after the monitor update completed like the comment suggests), in the situation that claim_funds_from_hop is called with a completion_action that returns a monitor update completion action that would unblock the outbound channel monitor, we'd risk the outbound channel completing a further monitor update and deleting the preimage? But in its current state this is safe because right now we only hit this path with a completion_action that returns a monitor completion action that doesn't free up an outbound channel (downstream_counterparty_and_funding_outpoint = None)?

I think my main confusion is around which parts of the comment are talking about this specific code path versus generally within claim_funds_from_hop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in its current state this is safe..

No, in the current state it is not safe. We need to fix this on both this (the during-startup) case and on the not-during-startup case, but in either case its possible we remove the preimage before its durably on the inbound edge.

lightning/src/ln/channelmanager.rs Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Show resolved Hide resolved
lightning/src/ln/channelmanager.rs Show resolved Hide resolved
@valentinewallace
Copy link
Contributor

CI sad, otherwise LGTM.

@TheBlueMatt TheBlueMatt force-pushed the 2023-06-htlc-preimage-replay branch from 192c5d2 to aabd35e Compare July 7, 2023 21:14
@TheBlueMatt
Copy link
Collaborator Author

Oops, background_events_processed_since_startup was test-only, I made it always-on.

@TheBlueMatt TheBlueMatt force-pushed the 2023-06-htlc-preimage-replay branch from aabd35e to d3811cd Compare July 7, 2023 21:34
@TheBlueMatt
Copy link
Collaborator Author

Squashed without further changes:

$ git diff-tree -U1 aabd35e7 d3811cd7
$ 

wpaulino
wpaulino previously approved these changes Jul 7, 2023
Because `ChannelMonitorUpdate`s can complete asynchronously and
out-of-order now, a `commitment_signed` `ChannelMonitorUpdate` from
a downstream channel could complete prior to the preimage
`ChannelMonitorUpdate` on the upstream channel. In that case, we may
not get a `update_fulfill_htlc` replay on startup. Thus, we have to
ensure any payment preimages contained in that downstream update are
re-claimed on startup.

Here we do this during the existing walk of the `ChannelMonitor`
preimages for closed channels.
Now that we also use the "Closing" `BackgroundEvent` for
already-closed channels we need to rename it and tweak the docs.
@TheBlueMatt
Copy link
Collaborator Author

Grrr, I missed one cfg:

$ git diff-tree -U1 d3811cd7 9ce7e8e6
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index faed19407..398975c65 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -8888,3 +8888,2 @@ where
 			total_consistency_lock: RwLock::new(()),
-			#[cfg(debug_assertions)]
 			background_events_processed_since_startup: AtomicBool::new(false),

@wpaulino wpaulino merged commit dba3e8f into lightningdevkit:main Jul 10, 2023
Sharmalm added a commit to Sharmalm/rust-lightning that referenced this pull request Sep 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants