Add infra to block ChannelMonitorUpdates on forwarded claims #2167
Conversation
Codecov Report

@@            Coverage Diff             @@
##             main    #2167       +/-   ##
==========================================
+ Coverage   90.94%   92.11%    +1.17%
==========================================
  Files         104      104
  Lines       52750    66889    +14139
  Branches    52750    66889    +14139
==========================================
+ Hits        47971    61612    +13641
- Misses       4779     5277      +498

☔ View full report in Codecov by Sentry.
Slipping to 116.
Force-pushed from 13c72ab to 8624570
Rebased, now no longer based on anything.
Force-pushed from 37a595d to ab7c327
lightning/src/ln/channelmanager.rs
Outdated
// The ChannelMonitor that gave us this
// preimage is for a now-closed channel -
// no further updates to that channel can
// happen which would result in the
// preimage being removed, thus we're
// guaranteed to regenerate this claim on
// restart as long as the source monitor
// sticks around.
Does that mean we need to further keep monitors around even if they have zero claimable balances until we can unblock any dependent channels?
Uhhh, yea, kinda. I mean only until the upstream ChannelMonitor gets its update persisted, but indeed we don't currently have any infrastructure to let the user know whether that's the case when pruning the downstream monitor. We may want to add something like that (e.g. "no removing monitors while any monitors are syncing"), but a few blocks of extra time should suffice in most cases.
Will rebase this on #2287 in a day or two, but that should go first, I think.
Force-pushed from 3c3fb09 to 6f0c817
Force-pushed from 34c6292 to bdd8f78
lightning/src/ln/channel.rs
Outdated
@@ -5044,10 +5044,20 @@ impl<Signer: WriteableEcdsaChannelSigner> Channel<Signer> {
	self.pending_monitor_updates.is_empty()
}

pub fn complete_all_mon_updates_through(&mut self, update_id: u64) {
	self.pending_monitor_updates.retain(|upd| upd.update.update_id > update_id);
Somewhat unrelated to this method, but can a counterparty force us to drop these updates in any way and then play a commitment onchain for which we'd need one of those dropped updates to claim funds?
That shouldn't be the case - if a monitor update is blocked/hasn't completed we should never ever ever give our peer whatever message would be required to broadcast the state included in that message. This is obviously different for payment preimages, which is why they "jump the queue".
/// completes a monitor update containing the payment preimage. In that case, after the inbound
/// edge completes, we will surface an [`Event::PaymentForwarded`] as well as unblock the
/// outbound edge.
EmitEventAndFreeOtherChannel {
Why not just name it `PaymentForwarded` and rename the action below to `downstream_action`? That fully describes the action we're performing after the update completes.
// support async monitor updates even in LDK 0.0.115 and once we do we'll require no
// downgrades to prior versions. Thus, while this would break on downgrade, we don't
// support it even without downgrade, so if it breaks its not on us ¯\_(ツ)_/¯.
(1, downstream_counterparty_and_funding_outpoint, option),
So that means we could make this even once we land the follow-up PR as part of 117?
There's no mechanism currently to do that. Sadly, `downstream_counterparty_and_funding_outpoint` is always set, so we can't make it even or we break all downgrades. Instead, we should make `channel`'s `pending_monitor_updates` even, which it for some reason currently isn't. We'll want to do that for 116 - #2317.
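For context on the even/odd distinction in this exchange: under the usual TLV convention a reader skips unknown odd-typed fields but must refuse to deserialize on an unknown even-typed one, which is why making a field even breaks downgrades. A minimal sketch of that rule, using an illustrative parser that is not LDK's actual serialization code:

```rust
// Sketch of the "it's OK to be odd" rule assumed above: skip unknown odd
// types, fail on unknown even ones. For simplicity, type and length are
// single bytes (real BOLT TLV uses BigSize); all names are illustrative.
fn read_tlv_stream(mut buf: &[u8], known_types: &[u64]) -> Result<Vec<(u64, Vec<u8>)>, &'static str> {
    let mut records = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 2 { return Err("truncated TLV record"); }
        let typ = buf[0] as u64;
        let len = buf[1] as usize;
        buf = &buf[2..];
        if buf.len() < len { return Err("truncated TLV value"); }
        let (value, rest) = buf.split_at(len);
        buf = rest;
        if known_types.contains(&typ) {
            records.push((typ, value.to_vec()));
        } else if typ % 2 == 1 {
            // Unknown odd type: safe to ignore, so an older reader still succeeds.
            continue;
        } else {
            // Unknown even type: the writer considered it required, so a
            // downgraded reader must refuse rather than silently lose state.
            return Err("unknown required (even) TLV type");
        }
    }
    Ok(records)
}
```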
Still need to do a few more followups but pushed a bunch of the more trivial changes here.
Haven't looked at tests yet, but the rest LGTM so far.
Did you mean to push @TheBlueMatt? Also feel free to squash IMO.
Force-pushed from 2935e00 to 9f2e9f1
I did.
Our `no-std` locks simply panic if a lock cannot be taken as there should be no lock contention in a single-threaded environment. However, the `held_by_thread` debug methods were delegating to the lock methods which resulted in a panic when asserting that a lock *is* held by the current thread. Instead, they are updated here to call the relevant `RefCell` testing methods.
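As a rough illustration of the fix described in that commit message, here is a minimal sketch of a single-threaded lock whose held-by-thread check consults the `RefCell` borrow state instead of re-taking the lock; the type and method names are hypothetical rather than LDK's actual sync shims:

```rust
use core::cell::{RefCell, RefMut};

/// Sketch of a single-threaded `no-std` mutex: locking panics on contention,
/// and the debug helper inspects the `RefCell` borrow state directly.
pub struct NoStdMutex<T> {
    inner: RefCell<T>,
}

pub enum LockHeldState {
    HeldByThread,
    NotHeldByThread,
}

impl<T> NoStdMutex<T> {
    pub fn new(val: T) -> Self { Self { inner: RefCell::new(val) } }

    /// In a single-threaded environment contention is a bug, so just panic.
    pub fn lock(&self) -> RefMut<'_, T> {
        self.inner.try_borrow_mut().expect("lock contention in single-threaded context")
    }

    /// Previously this kind of helper delegated to `lock()`, which panicked
    /// exactly when the lock *was* held. Asking the `RefCell` answers the
    /// question without taking the lock.
    pub fn held_by_thread(&self) -> LockHeldState {
        if self.inner.try_borrow_mut().is_err() {
            LockHeldState::HeldByThread
        } else {
            LockHeldState::NotHeldByThread
        }
    }
}
```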
This allows us to make the `force_shutdown` definition less verbose
In the coming commits we'll need the counterparty node_id when handling a background monitor update as we may need to resume normal channel operation as a result. Thus, we go ahead and pipe it through from the shutdown end, as it makes the codepaths consistent. Sadly, the monitor-originated shutdown case doesn't allow for a required counterparty node_id as some versions of LDK didn't have it present in the ChannelMonitor.
Rather than letting `AChannelManager` be bounded by all traits being `Sized` we make them explicitly `?Sized`. We also make the trait no longer test-only as it will be used in a coming commit.
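For readers unfamiliar with the `?Sized` relaxation mentioned here: associated types carry an implicit `Sized` bound, so a trait object can only satisfy them once that bound is loosened. A tiny illustrative sketch with made-up trait and type names (not LDK's `AChannelManager`):

```rust
use std::sync::Arc;

trait Logger { fn log(&self, msg: &str); }

trait AManager {
    // Without `+ ?Sized`, `dyn Logger` could not be used for `L` below.
    type L: Logger + ?Sized;
    fn logger(&self) -> &Arc<Self::L>;
}

struct ConsoleLogger;
impl Logger for ConsoleLogger { fn log(&self, msg: &str) { println!("{msg}"); } }

struct Manager { logger: Arc<dyn Logger> }
impl AManager for Manager {
    // A trait object is unsized, so this only compiles thanks to `?Sized`.
    type L = dyn Logger;
    fn logger(&self) -> &Arc<Self::L> { &self.logger }
}

fn main() {
    let mgr = Manager { logger: Arc::new(ConsoleLogger) };
    mgr.logger().log("hello");
}
```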
Force-pushed from 9f2e9f1 to 5509788
Squashed with one further commit added at the top to fix the
LGTM pending the follow-up work, but CI needs a small fix.
`BackgroundEvent` was used to store `ChannelMonitorUpdate`s which result in a channel force-close, avoiding relying on `ChannelMonitor`s having been loaded while `ChannelManager` block-connection methods are called during startup. In the coming commit(s) we'll also generate non-channel-closing `ChannelMonitorUpdate`s during startup, which will need to be replayed prior to any other `ChannelMonitorUpdate`s generated from normal operation. In the next commit we'll handle that by handling `BackgroundEvent`s immediately after locking the `total_consistency_lock`.
When we generate a `ChannelMonitorUpdate` during `ChannelManager` deserialization, we must ensure that it gets processed before any other `ChannelMonitorUpdate`s. The obvious hook for this is when taking the `total_consistency_lock`, which makes it unlikely we'll regress by forgetting this. Here we add that call in the `PersistenceNotifierGuard`, with a test-only atomic bool to test that this criterion is met.
If a `ChannelMonitorUpdate` was created and given to the user but left uncompleted when the `ChannelManager` is persisted prior to a restart, the user likely lost the `ChannelMonitorUpdate`(s). Thus, we need to replay them for the user, which we do here using the new `BackgroundEvent::MonitorUpdateRegeneratedOnStartup` variant.
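A much-simplified sketch of the startup-replay flow these commit messages describe; the real LDK types carry far more data, and the names below only loosely mirror the description:

```rust
// Illustrative only: stand-ins for the real LDK types.
struct OutPoint;
struct ChannelMonitorUpdate;

enum BackgroundEvent {
    // A monitor update regenerated while deserializing the ChannelManager
    // (e.g. one the user lost before a restart). It must be replayed before
    // any update generated during normal operation.
    MonitorUpdateRegeneratedOnStartup {
        funding_txo: OutPoint,
        update: ChannelMonitorUpdate,
    },
}

struct ChannelManager {
    pending_background_events: Vec<BackgroundEvent>,
}

impl ChannelManager {
    /// Called whenever the (hypothetical) total_consistency_lock is taken, so
    /// startup-generated updates are guaranteed to go out first.
    fn process_background_events(&mut self) {
        for event in self.pending_background_events.drain(..) {
            match event {
                BackgroundEvent::MonitorUpdateRegeneratedOnStartup { funding_txo, update } => {
                    // Stand-in for handing the update back to chain::Watch.
                    apply_monitor_update(funding_txo, update);
                }
            }
        }
    }
}

fn apply_monitor_update(_funding_txo: OutPoint, _update: ChannelMonitorUpdate) {}
```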
When we forward a payment and receive an `update_fulfill_htlc` message from the downstream channel, we immediately claim the HTLC on the upstream channel, before even doing a `commitment_signed` dance on the downstream channel. This implies that our `ChannelMonitorUpdate`s "go out" in the right order - first we ensure we'll get our money by writing the preimage down, then we write the update that resolves giving money on the downstream node. This is safe as long as `ChannelMonitorUpdate`s complete in the order in which they are generated, but of course looking forward we want to support asynchronous updates, which may complete in any order. Here we add infrastructure to handle downstream `ChannelMonitorUpdate`s which are blocked on an upstream preimage-containing one. We don't yet actually do the blocking which will come in a future commit.
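And a rough sketch of the blocking bookkeeping this commit introduces, where a downstream channel's monitor update is held back until the upstream, preimage-carrying update completes; all names below are illustrative rather than LDK's actual types:

```rust
use std::collections::HashMap;

// Illustrative stand-in for a channel identifier.
type ChannelId = [u8; 32];

#[derive(PartialEq)]
enum RAAMonitorUpdateBlockingAction {
    // Blocked on the upstream (inbound-edge) channel writing the preimage.
    ForwardedPaymentUpstreamClaim { upstream_channel: ChannelId },
}

#[derive(Default)]
struct BlockedUpdates {
    actions_blocking_raa_monitor_updates: HashMap<ChannelId, Vec<RAAMonitorUpdateBlockingAction>>,
}

impl BlockedUpdates {
    /// Record that `downstream`'s next monitor update must wait for the
    /// preimage-carrying update on `upstream` to complete.
    fn block_downstream(&mut self, downstream: ChannelId, upstream: ChannelId) {
        self.actions_blocking_raa_monitor_updates.entry(downstream).or_default().push(
            RAAMonitorUpdateBlockingAction::ForwardedPaymentUpstreamClaim {
                upstream_channel: upstream,
            });
    }

    /// The downstream channel's update may only go out once nothing blocks it.
    fn can_release(&self, downstream: &ChannelId) -> bool {
        self.actions_blocking_raa_monitor_updates
            .get(downstream).map(|v| v.is_empty()).unwrap_or(true)
    }

    /// Called when the upstream preimage-containing update completes.
    fn upstream_update_completed(&mut self, downstream: ChannelId, upstream: ChannelId) {
        let completed = RAAMonitorUpdateBlockingAction::ForwardedPaymentUpstreamClaim {
            upstream_channel: upstream,
        };
        if let Some(blockers) = self.actions_blocking_raa_monitor_updates.get_mut(&downstream) {
            // Drop the blocker tied to the upstream channel; once the list is
            // empty the downstream update can be released.
            blockers.retain(|b| *b != completed);
        }
    }
}
```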
Force-pushed from 5509788 to 394f54d
Oops, sorry, doc bug on an intermediate commit, shuffled diff around across commits to fix it.
Nothing blocking!
) -> bool {
	actions_blocking_raa_monitor_updates
		.get(&channel_funding_outpoint.to_channel_id()).map(|v| !v.is_empty()).unwrap_or(false)
	|| self.pending_events.lock().unwrap().iter().any(|(_, action)| {
Would it be cleaner to get rid of `EventCompletionAction::ReleaseRAAChannelMonitorUpdate` and store that data in `actions_blocking_raa_monitor_updates` now, so RAA monitor update blockers are only stored in one place?
Hmm, maybe? I really hate the `actions_blocking_raa_monitor_updates` thing - it's redundant state in two places that has to always be in sync, whereas the event stuff is one place that isn't redundant and is checked in-place - it's comparatively harder to screw up.
The only reason for `actions_blocking_raa_monitor_updates` is that without it we'd have to lock and walk each peer and all our channels to figure out if we're being blocked anywhere. Originally I was gonna do that if we found a blocking action but figured it was overengineering, but either way I'd kinda rather not have the issue twice, even if we have to have it once.
I don't think it's worth waiting, some of the followups have to wait for 0.0.117, and the next round of followups that do have to go in 116 I haven't finished writing yet 😭
Gonna get this out of the way. Given some followups are required for 116 anyway, I'll address the above doc comments there.
Ok, I'll rebase in the morning and we'll see.
Based on #2111; the follow-up will be based on #2112 plus this.