-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multi: log additional information for local force closures #7787
Conversation
A local force close could be due to one of the following reasons: - User initiated - Link failure (automatic) - On-chain HTLC conditions such as timeout (automatic) This commit adds a new struct that can encapsulate these different reasons, and logs that struct to the channel's ArbitratorLog.
In this commit we use the functional optional for specifying the local force close reason that we built in the previous commit. The option can be for user-initiated FC, or link failure or FC, or just logging the ChainActionMap in the case of chainTrigger.
During the local force close flow, at the time of actually marking the channel as closed, we need to fetch the local force close info from the channel's arbitrator log and write it into the persistent closed channels bucket in the DB. This commit achieves that by creating a new exported struct inside the channeldb package, populates the struct, and writes it to the DB.
We add a new proto message to populate the local force close info in the ClosedChannels response and generate the new RPC files. We also add the info to the response.
1c67ef3
to
08ca9fc
Compare
Hey @saubyk this is mostly ready for review (I just need to investigate some failing tests) but I've fixed all the lint issues and added a release notes entry for 0.17.1 per the milestone you assigned. Let me know whom we can ask for reviews 🙂 |
Hi @shaurya947 great work so far. thanks. Will start assigning reviewers when we start the dev cycle for 17.1 |
looking forward to this PR ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea for approaching this quite complex task. Very nice work, looks pretty far along.
Main comments are around using TLV to serialize the force close info and to share the code between the arbitrator and the channel DB if possible.
b := new(bytes.Buffer) | ||
|
||
// first byte is userInitiated bool | ||
err = binary.Write(b, endian, info.userInitiated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use TLV for any new data structure that we save to the DB. That will make it way easier to extend this info structure as well. See the example in fetchChanInfo()
in channeldb/channel.go
.
return err | ||
} | ||
|
||
// next bytes are actual map data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: all inline comments should also be full sentences with proper capitalization and punctuation.
func UserInitiatedForceClose(ca *ChannelArbitrator) { | ||
// Ignore errors since logging force close info is not a critical part | ||
// of the force close flow. | ||
_ = ca.log.LogLocalForceCloseInfo(localForceCloseInfo{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should at least log any errors?
// of ForceCloseContract(). We don't export this type directly, but rather the | ||
// concrete functions implementing this type, so that callers can choose the | ||
// concrete function that best describes their use case for force closing. | ||
type forceCloseContractOption func(*ChannelArbitrator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this doesn't look like the normal option pattern, as it's just a function type. Maybe just call it forceCloseContractCallback
instead?
@@ -1071,6 +1104,7 @@ func (c *ChainArbitrator) ForceCloseContract(chanPoint wire.OutPoint) (*wire.Msg | |||
} | |||
|
|||
log.Infof("Attempting to force close ChannelPoint(%v)", chanPoint) | |||
opt(arbitrator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice design with the callback 👍
@@ -870,6 +870,11 @@ func (c *ChannelArbitrator) stateStep( | |||
// next state, as we still need to broadcast the commitment | |||
// transaction. | |||
case chainTrigger: | |||
// Ignore errors since logging force close info is not a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a named callback for this as well? Then we have all the code in one place.
// this may be extended in the future. For example, we may decide to log HTLC | ||
// information that doesn't automatically result in a force close, but leaves | ||
// the channel in a state whereby the user has no option but to force close. | ||
type LocalForceCloseInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use this in the contractcourt
directly? I see that the HtlcActions isn't exactly the same type, but maybe we could convert? Otherwise we need encode/decode code in two places.
If we share this struct, it can have a public Encode()
and Decode()
method.
@@ -3579,6 +3606,18 @@ func serializeChannelCloseSummary(w io.Writer, cs *ChannelCloseSummary) error { | |||
} | |||
} | |||
|
|||
// Write whether the local force close info is present. | |||
if err := WriteElements(w, cs.LocalFCInfo != nil); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work without changes to the WriteElement
and ReadElement
functions in channeldb/codec.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you say this? bool
is handled by WriteElement
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I misread it as just WriteElements(w, cs.LocalFCInfo)
, missed the != nil
part. So maybe extract to a bool to make it more obvious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the general pattern of writing it looking at the function.
for action, htlcs := range localFCInfo.HtlcActions { | ||
htlcHashes := make([][]byte, len(htlcs)) | ||
for i, htlc := range htlcs { | ||
htlcHashes[i] = htlc.RHash[:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use copy()
here to avoid running into the famous golang loop variable issues.
@@ -1601,6 +1601,25 @@ enum Initiator { | |||
INITIATOR_BOTH = 3; | |||
} | |||
|
|||
message ListHTLCHash { | |||
// Hash of HTLC | |||
repeated bytes hash = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should be plural since it's a slice of hashes.
Thanks @Chinwendu20 |
|
||
// htlcActions is a non-empty map iff the force close was automatically | ||
// initiated by an on-chain trigger such as HTLC timeout. | ||
htlcActions ChainActionMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @guggero would be a good idea to just indicate that the closure was due to ahtlcAction
and we just put the particular htlc action responsible for the closure. Would there be need for the actual htlc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would be nice to be able to correlate the closure resulting from an HTLC with the actual HTLC in the channel (to see why it timed out, what its CLTV delta was and so on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay thanks.
@shaurya947, remember to re-request review from reviewers when ready |
Continued here: #8072 |
Replaced by #8072. |
Change Description
Addresses #7696 by logging the reason for a local force close in the DB, and echoing it within the ClosedChannels response. Since initiating a force close and marking the channel as closed happen at two distinct parts of the code, we utilize the ChannelArbitrator log to persist the necessary information between these two moments in time. Once we mark the channel as closed, we log that info in the closed channels bucket. In this PR we address 3 possible reasons of local force closure: user-initiated, link failure, and HTLC/chainTrigger-related.
Steps to Test
itests have been updated to test for all 3 conditions, but manual testing can be done by simulating any of the 3 local force close situations and then checking the channel information in the ClosedChannels response.
Pull Request Checklist
Testing
Code Style and Documentation
[skip ci]
in the commit message for small changes.📝 Please see our Contribution Guidelines for further guidance.