
multi: log additional information for local force closures #8072

Open · wants to merge 7 commits into base: master

Conversation

@Chinwendu20 (Contributor) commented Oct 7, 2023

Change Description

Follow up to: #7787

Fixes: #7696

@Chinwendu20 Chinwendu20 marked this pull request as draft October 7, 2023 20:07
@Chinwendu20 Chinwendu20 marked this pull request as ready for review October 9, 2023 05:58
@guggero guggero self-requested a review October 9, 2023 07:37
@hieblmi (Collaborator) left a comment


Hi, thank you for working on extending force closure information. I've left some comments.

Resolved/outdated review threads on: contractcourt/briefcase.go (×3), contractcourt/chain_arbitrator.go (×4)
@saubyk (Collaborator) commented Oct 12, 2023

cc: @ziggie1984 for review

@ziggie1984 (Collaborator)

@Chinwendu20 can you please fix all the linter issues first: make lint or golangci-lint run

@Chinwendu20 (Contributor, Author)

I am aware there is a lint issue; it's because this PR depends on #8057, so I added a temporary replace directive for tlv. When a new tag for tlv is released and bumped in lnd, I will remove it; it's just there for review's sake. I do not know if there are any other lint issues: make lint and golangci-lint run time out on my system even when using multiple cores, but I see no lint issues in the CI (the workflow has not been approved). I will also fix the file conflicts soon, but hopefully that should not get in the way of reviews.

@ziggie1984 (Collaborator) left a comment


Took a look at the PR (first commit only for now), good work 🙌

Definitely a ConceptAck.
In terms of the approach I think we can do even better, especially in terms of future enhancements.

Overall I think we can already combine the infos when a force-close happens, meaning that we could always add the HTLC action map. Why restrict it to closures which are explicitly triggered by the chain trigger? In other words, why not add the detailed state of the HTLCs when the user or any link failure triggered the FC?
Now when we combine different infos, we see clearly that the current approach would always require a new DB transaction. This is something we should improve, especially when we want to try to add even more infos in the future. We should add all the necessary infos and commit them to the db when we mark the channel as closed.
Another point we should add to the FC info is the broadcast height of the FC, because that would fill the HTLC action map with more life, under the condition that we add more info about each HTLC in the summary than just the hash (mainly the timeout block heights).

Will provide a comment on how I envision the design of the data structure shortly.

Let me know what you think.


// HtlcActions is a non-empty map iff the force close was automatically
// initiated by an on-chain trigger such as HTLC timeout.
HtlcActions map[string][]HTLC
Collaborator:

we should use ChainAction instead of string here, which is way more space efficient.

Contributor Author:

Is it the ChainAction type in the contractcourt package? The channeldb package is already imported there, so that would be a circular dependency.

Collaborator:

Understood, so use a byte (uint8) instead and convert to the string type when delivering to the user in the rpcserver?

Collaborator:

We can solve the circular dependency by making ChainAction a common subpackage that is pulled into both scopes. I agree that we shouldn't be using strings when there is a finitely enumerable set of keys.
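For illustration, a minimal sketch of such a shared enum (the package name, constants and String values here are hypothetical, not lnd's actual layout). Both contractcourt and channeldb could then import it without a cycle, and the RPC layer would render it with String():

// Package chainaction is a hypothetical shared subpackage that both
// contractcourt and channeldb could import without creating an import cycle.
package chainaction

// ChainAction is a compact, enumerable key describing what we intend to do
// with an HTLC on chain.
type ChainAction uint8

const (
	// NoAction indicates no on-chain action is required for the HTLC.
	NoAction ChainAction = iota

	// HtlcTimeoutAction indicates we need to time the HTLC out on chain.
	HtlcTimeoutAction

	// HtlcClaimAction indicates we can claim the HTLC on chain with the
	// preimage.
	HtlcClaimAction
)

// String renders the action for RPC responses and logs.
func (a ChainAction) String() string {
	switch a {
	case NoAction:
		return "NoAction"
	case HtlcTimeoutAction:
		return "HtlcTimeoutAction"
	case HtlcClaimAction:
		return "HtlcClaimAction"
	default:
		return "UnknownAction"
	}
}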

Contributor Author:

Note that this is an outdated version of this PR. Also, why do we need a separate package just because we want to import the chainAction type?

Resolved/outdated review thread: channeldb/channel.go

action := []byte(action)

actionStream, err := tlv.NewStream(
Collaborator:

Nit: naming actionStream => tlvStream

l uint64) error {

if v, ok := val.(*map[string][]HTLC); ok {
// First, we'll decode the varint that encodes how many set IDs
Collaborator:

Copy & paste error: "set IDs"?

Contributor Author:

thanks for catching this.


var action []byte

actionStream, err := tlv.NewStream(
Collaborator:

Nit: naming actionStream => tlvStream

}

infoBytes := scopeBucket.Get(localForceCloseInfoKey)
if len(infoBytes) == 0 {
Collaborator:

infoBytes == nil ...

}

// HtlcActionEncoder is a custom TLV encoder for the htlcAction record.
func HtlcActionEncoder(w io.Writer, val interface{}, buf *[8]byte) error {
Collaborator:

I think we should also add these customized encoders/decoders to the fuzz suite as well, or what do you think @morehouse?

Collaborator:

This would be great, though we currently have no fuzz tests for channeldb. Writing those tests is probably a bigger project to tackle separately, rather than holding up this PR.

See #8045 for some progress on channeldb fuzzing.
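For reference, a minimal round-trip fuzz harness for these custom TLV encoders could look something like the sketch below (assuming it lives in the channeldb package next to HtlcActionEncoder/HtlcActionDecoder; the test name and the exact round-trip property are illustrative, not part of this PR):

package channeldb

import (
	"bytes"
	"testing"

	"github.com/stretchr/testify/require"
)

// FuzzHtlcActionRoundTrip feeds arbitrary bytes to the custom decoder and,
// whenever decoding succeeds, checks that encode(decode(data)) decodes back
// to the same value.
func FuzzHtlcActionRoundTrip(f *testing.F) {
	f.Fuzz(func(t *testing.T, data []byte) {
		var buf [8]byte
		decoded := make(map[string][]HTLC)

		// Most random inputs will simply fail to decode, which is
		// fine; we only care about panics and round-trip mismatches.
		r := bytes.NewReader(data)
		err := HtlcActionDecoder(r, &decoded, &buf, uint64(len(data)))
		if err != nil {
			return
		}

		// Re-encode the decoded value and decode it again.
		var b bytes.Buffer
		require.NoError(t, HtlcActionEncoder(&b, &decoded, &buf))

		reDecoded := make(map[string][]HTLC)
		r2 := bytes.NewReader(b.Bytes())
		require.NoError(
			t, HtlcActionDecoder(r2, &reDecoded, &buf, uint64(b.Len())),
		)

		require.Equal(t, decoded, reDecoded)
	})
}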

})

if err != nil {
log.Errorf("Error logging "+
Collaborator:

Nit: Format

})

if err != nil {
log.Errorf("Error logging "+
Collaborator:

Nit: Format

go.mod Outdated
@@ -208,6 +208,8 @@ replace github.com/gogo/protobuf => github.com/gogo/protobuf v1.3.2
// allows us to specify that as an option.
replace google.golang.org/protobuf => github.com/lightninglabs/protobuf-go-hex-display v1.30.0-hex-display

replace github.com/lightningnetwork/lnd/tlv => github.com/Chinwendu20/lnd/tlv v0.0.0-20231006214256-76b8755dfd3b
Collaborator:

Can you separate the temporary mod changes into a separate commit?

@ziggie1984 (Collaborator) commented Oct 13, 2023

I thought about something like this:


type OptionalForceCloseInfo struct {
	userInitiated *bool

	linkFailureError *string

	htlcActions map[string][]HTLC
}

func (f *OptionalForceCloseInfo) Apply(optionalForceCloseFields ...OptionalForceCloseField) {
	for _, optionalForceCloseField := range optionalForceCloseFields {
		optionalForceCloseField(f)
	}
}

=> add the OptionalForceCloseInfo to the ChanArbitrator

type OptionalForceCloseField func(*OptionalForceCloseInfo)

func UserInitiated(userInitiated bool) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.userInitiated = &userInitiated
	}
}
func LinkFailureError(linkFailureError string) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.linkFailureError = &linkFailureError
	}
}


record the infos you want to add where necessary ?

....

func (b *boltArbitratorLog) LogLocalForceCloseInfo(
	info channeldb.LocalForceCloseInfo)
    
      //Log the infos when the channel is closed ....

...

}

Log infos when the channel close transaction is broadcast, and provide the information not only when channels are completely closed but also while they are pending close, for example channels which are in waiting_close_channels or pending_force_closing_channels?

This approach would reduce the db operations to only one when we combine the ForceClose infos.

Let me know what you think or whether this is helpful; if you need a more detailed picture I can provide it as well.

@Chinwendu20 (Contributor, Author) commented Oct 14, 2023

Log infos when the channel is broadcasted

Do you mean when the channel is marked as open? @ziggie1984

@ziggie1984 (Collaborator)

Do you mean when the channel is marked as open?

I updated the comment, was referring to the broadcasting of the channel close transaction.

@ziggie1984 (Collaborator)

I am aware there is a lint issue; it's because this PR depends on #8057, so I added a temporary replace directive for tlv. When a new tag for tlv is released and bumped in lnd, I will remove it; it's just there for review's sake. I do not know if there are any other lint issues: make lint and golangci-lint run time out on my system even when using multiple cores, but I see no lint issues in the CI (the workflow has not been approved). I will also fix the file conflicts soon, but hopefully that should not get in the way of reviews.

FYI you can create a PR on your fork of the lnd repo on GitHub, and then the CI tests will run without you needing permission every time.

@Chinwendu20 (Contributor, Author)

Took a look at the PR (first commit only for now), good work 🙌

Definitely a ConceptAck. In terms of the approach I think we can do even better, especially in terms of future enhancements.

Overall I think we can already combine the infos when a force-close happens, meaning that we could always add the HTLC action map. Why restrict it to closures which are explicitly triggered by the chain trigger? In other words, why not add the detailed state of the HTLCs when the user or any link failure triggered the FC?

Are you saying that beyond indicating (via a bool) that a force close was caused by the user, and indicating the link failure error when the cause of the force close is a link failure, the localForceCloseInfo should also provide the HTLCs associated with the channel as well?

@ziggie1984 (Collaborator)

Are you saying that beyond indicating (via a bool) that a force close was caused by the user, and indicating the link failure error when the cause of the force close is a link failure, the localForceCloseInfo should also provide the HTLCs associated with the channel as well?

Exactly, why not record this info as well when it's there? It could also paint a more detailed picture when a link_error happens for whatever reason. Currently, when the user manually force closes or the link fails, the gained info is very low; adding the HTLC action map could reveal some potential problems in hindsight.
Example: the user cannot co-op close a channel because of an unclean channel state, and now they do not want to wait until an HTLC times out but rather go for the manual force close of the channel. Let's say there was a remote dangling HTLC; one would not see this in the resolutions which already exist in the close_summary.

Moreover it would also give us the possibility to enhance the information we show when channels are in a pending state later on. Wdyt?

@Chinwendu20 (Contributor, Author)

Are you saying that beyond indicating (via a bool) that a force close was caused by the user, and indicating the link failure error when the cause of the force close is a link failure, the localForceCloseInfo should also provide the HTLCs associated with the channel as well?

Exactly, why not record this info as well when it's there? It could also paint a more detailed picture when a link_error happens for whatever reason. Currently, when the user manually force closes or the link fails, the gained info is very low; adding the HTLC action map could reveal some potential problems in hindsight. Example: the user cannot co-op close a channel because of an unclean channel state, and now they do not want to wait until an HTLC times out but rather go for the manual force close of the channel. Let's say there was a remote dangling HTLC; one would not see this in the resolutions which already exist in the close_summary.

Moreover it would also give us the possibility to enhance the information we show when channels are in a pending state later on. Wdyt?

I would see about implementing this.

@Chinwendu20 (Contributor, Author)

Another point we should add to the FC info is the broadcast height of the FC, because that would fill the HTLC action map with more life, under the condition that we add more info about each HTLC in the summary than just the hash (mainly the timeout block heights).

Please confirm if this is the same as the broadcast height you refer to. It is present in the closed channel response already:
https://github.com/lightningnetwork/lnd/blob/e3c10c110f9d4946ea62aebb854c0eddc28fc888/lnrpc/lightning.proto#L1649-L1650

@ziggie1984 (Collaborator) commented Oct 18, 2023

Please confirm if this is the same as the broadcast height you refer to. It is present in the closed channel response already:

Not quite, this is the height when the channel close transaction was actually confirmed in the blockchain. I was referring to the height the close tx was broadcast to the network, because this is the height we are most interested in when we analyse the HTLC action map in the future. The spend can sometimes happen several blocks later, especially for STATIC_REMOTE channels.

@guggero guggero removed their request for review October 27, 2023 09:38
@Chinwendu20 (Contributor, Author) commented Oct 28, 2023

Now when we combine different infos, we see clearly that the current approach would always require a new DB transaction.

Could you please explain more about this? How would we always require a new DB transaction? Adding your new suggestion of including the HTLCs and the broadcast height to this struct, I would add a new struct, perhaps named ForceCloseInfoSummary:

type ForceCloseInfoSummary struct {
	HTLCActionMap map[string][]HTLC

	BroadcastHeight int64
}

type OptionalForceCloseInfo struct {
	userInitiated *bool

	linkFailureError *string

	chainAction *bool

	summary ForceCloseInfoSummary
}

Of userInitiated, linkFailureError and chainAction, only one would be present in the struct and consequently in the RPC response. So the way I envision it, there would be these DB transactions:

  1. Log whether the force close is due to userInitiated, linkFailureError or chainAction.
  2. Log the HTLCs for the user trigger.

(For chainAction these two steps could be combined, as they both happen at the same site. We would know the force close is due to a chainAction at the same site where we would log the HTLCs, unlike the others.)

  3. Log the broadcast height for the close tx.

This is the design that I am looking at; what are your thoughts?

@ziggie1984 (Collaborator)

I do not 100% understand your proposal. Do you maybe already have something to look at in your local repo? Happy to look into your approach if you have something up.

I thought I'd lay out my thoughts in more detail in the following (maybe it helps):

So I contemplated this PR; maybe some thoughts which might be helpful:

  1. Because we cannot log the force-close info all in one place without passing arguments back and forth between functions, I think your initial idea of always logging the info where needed makes sense; otherwise we would need to introduce some variable and a mutex alongside it, which IMO is more complex than just hitting the db a couple of times. Moreover, we are talking about a code path which is only used when a channel is force closed, so adding more db transactions shouldn't be a problem.
    The current approach has a downside, because it will always overwrite the old bucket_entry as soon as we try to log other information. If we want to make the local_force_close info more flexible, we should make sure we can log parts of the force-close info more than once without deleting the previous entry. We can do this in one of the following ways:

  2. Create a Nested Bucket for the LocalForceClose Info and for every different info we introduce a different key

  3. Or before writing into the logs we check if we have something already available and only update the other part.

I tend to go with 2, because then we can reuse the serialization and deserialization already implemented ?

Based on choice 2, part of the code would look like this:

channeldb.go

type OptionalForceCloseInfo struct {
	// userInitiated describes whether a close was triggered by the user.
	userInitiated bool
	// linkFailureError ...
	linkFailureError string

	htlcActions map[string][]HTLC

	broadcastHeight uint64
}

func (f *OptionalForceCloseInfo) Apply(optionalForceCloseFields ...OptionalForceCloseField) {
	for _, optionalForceCloseField := range optionalForceCloseFields {
		optionalForceCloseField(f)
	}
}

type OptionalForceCloseField func(*OptionalForceCloseInfo)

func UserInitiated() OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.userInitiated = true
	}
}

func LinkFailureError(linkFailureError string) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.linkFailureError = linkFailureError
	}
}

func HtlcActions(htlcActions map[string][]HTLC) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.htlcActions = htlcActions
	}
}

func BroadCastHeight(broadcastHeight uint64) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.broadcastHeight = broadcastHeight
	}
}

briefcase.go

func (b *boltArbitratorLog) LogLocalForceCloseInfo(
	info ...channeldb.OptionalForceCloseField) error {

	return kvdb.Update(b.db, func(tx kvdb.RwTx) error {
		scopeBucket, err := tx.CreateTopLevelBucket(b.scopeKey[:])
		if err != nil {
			return err
		}

		// Check if we already have a value available.
		infoBytes := scopeBucket.Get(localForceCloseInfoKey)
		var forceCloseInfo channeldb.OptionalForceCloseInfo
		if infoBytes != nil {
			infoReader := bytes.NewReader(infoBytes)
			forceCloseInfo, err = channeldb.DeserializeLocalCloseInfo(
				infoReader)
			if err != nil {
				return err
			}
		}

		forceCloseInfo.Apply(info...)

		var buf bytes.Buffer
		if err := channeldb.SerializeLocalCloseInfo(&buf, &forceCloseInfo); err != nil {
			return err
		}

		return scopeBucket.Put(localForceCloseInfoKey, buf.Bytes())
	}, func() {})
}

...

Then call this function with the appropriate Infos where appropriate:

Example:

channel_arbitrator.go

	// If this is a chain trigger, then we'll go straight to the
		// next state, as we still need to broadcast the commitment
		// transaction.
		case chainTrigger:
			htlcMap := make(map[string][]channeldb.HTLC)
			for action, htlcs := range chainActions {
				htlcMap[action.String()] = htlcs
			}
			c.log.LogLocalForceCloseInfo(
				channeldb.HtlcActions(htlcMap),
				channeldb.BroadCastHeight(uint64(triggerHeight)))
			fallthrough
		case userTrigger:
			nextState = StateBroadcastCommit

chain_arbitrator.go

func (c *ChainArbitrator) ForceCloseContract(chanPoint wire.OutPoint,
	opt ...channeldb.OptionalForceCloseField) (*wire.MsgTx, error) {

	c.Lock()
	arbitrator, ok := c.activeChannels[chanPoint]
	c.Unlock()
	if !ok {
		return nil, fmt.Errorf("unable to find arbitrator")
	}

	log.Infof("Attempting to force close ChannelPoint(%v)", chanPoint)
	arbitrator.log.LogLocalForceCloseInfo(opt...)

....

and the calls in for example rpcserver.go:

closingTx, err := chainArbitrator.ForceCloseContract(
			*chanPoint,
			channeldb.UserInitiated(),
		)

@Chinwendu20 (Contributor, Author) commented Oct 30, 2023

Thanks a lot for the explanation @ziggie1984

I do not 100% understand your proposal. Do you maybe already have something to look at in your local repo? Happy to look into your approach if you have something up.

I thought I'd lay out my thoughts in more detail in the following (maybe it helps):

So I contemplated this PR; maybe some thoughts which might be helpful:

  1. Because we cannot log the force-close info all in one place without passing arguments back and forth between functions, I think your initial idea of always logging the info where needed makes sense; otherwise we would need to introduce some variable and a mutex alongside it, which IMO is more complex than just hitting the db a couple of times. Moreover, we are talking about a code path which is only used when a channel is force closed, so adding more db transactions shouldn't be a problem.
    The current approach has a downside, because it will always overwrite the old bucket_entry as soon as we try to log other information. If we want to make the local_force_close info more flexible, we should make sure we can log parts of the force-close info more than once without deleting the previous entry. We can do this in one of the following ways:
  2. Create a Nested Bucket for the LocalForceClose Info and for every different info we introduce a different key
  3. Or before writing into the logs we check if we have something already available and only update the other part.

I tend to go with 2, because then we can reuse the serialization and deserialization already implemented ?

Based on choice 2, part of the code would look like this:

channeldb.go

type OptionalForceCloseInfo struct {
	// userInitiated describes whether a close was triggered by the user.
	userInitiated bool
	// linkFailureError ...
	linkFailureError string

	htlcActions map[string][]HTLC

	broadcastHeight uint64
}

I agree with this except for the structure of the struct; we would still have to indicate if a force close was due to chain actions.

func (f *OptionalForceCloseInfo) Apply(optionalForceCloseFields ...OptionalForceCloseField) {
	for _, optionalForceCloseField := range optionalForceCloseFields {
		optionalForceCloseField(f)
	}
}

It is not clear to me how this would be used. Since there would only be one OptionalForceCloseInfo anyway, as the force close of a channel only happens once, would this be necessary?
This is one of the things I talked about in my previous comment as I did not understand what you meant about taking another approach to reduce DB transactions.

type OptionalForceCloseField func(*OptionalForceCloseInfo)

func UserInitiated() OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.userInitiated = true
	}
}

func LinkFailureError(linkFailureError string) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.linkFailureError = linkFailureError
	}
}

func HtlcActions(htlcActions map[string][]HTLC) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.htlcActions = htlcActions
	}
}

func BroadCastHeight(broadcastHeight uint64) OptionalForceCloseField {
	return func(f *OptionalForceCloseInfo) {
		f.broadcastHeight = broadcastHeight
	}
}


briefcase.go

func (b *boltArbitratorLog) LogLocalForceCloseInfo(
	info ...channeldb.OptionalForceCloseField) error {

	return kvdb.Update(b.db, func(tx kvdb.RwTx) error {
		scopeBucket, err := tx.CreateTopLevelBucket(b.scopeKey[:])
		if err != nil {
			return err
		}

		// Check if we already have a value available.
		infoBytes := scopeBucket.Get(localForceCloseInfoKey)
		var forceCloseInfo channeldb.OptionalForceCloseInfo
		if infoBytes != nil {
			infoReader := bytes.NewReader(infoBytes)
			forceCloseInfo, err = channeldb.DeserializeLocalCloseInfo(
				infoReader)
			if err != nil {
				return err
			}
		}

		forceCloseInfo.Apply(info...)

		var buf bytes.Buffer
		if err := channeldb.SerializeLocalCloseInfo(&buf, &forceCloseInfo); err != nil {
			return err
		}

		return scopeBucket.Put(localForceCloseInfoKey, buf.Bytes())
	}, func() {})
}

...


Then call this function with the appropriate Infos where appropriate:

Example:

channel_arbitrator.go

		// If this is a chain trigger, then we'll go straight to the
		// next state, as we still need to broadcast the commitment
		// transaction.
		case chainTrigger:
			htlcMap := make(map[string][]channeldb.HTLC)
			for action, htlcs := range chainActions {
				htlcMap[action.String()] = htlcs
			}
			c.log.LogLocalForceCloseInfo(
				channeldb.HtlcActions(htlcMap),
				channeldb.BroadCastHeight(uint64(triggerHeight)))
			fallthrough

One of the things I was trying to pass by you in my previous comment is that we might be adding a function here as well to log HTLCs due to userTrigger.

  case userTrigger:
  	nextState = StateBroadcastCommit

I was trying to understand what you were saying about this current approach needing multiple DB transactions when we include htlcActions for userTrigger and broadcast height as part of the info. I was just pointing out the number of times we would be logging into the localInfo store:

  1. Log whether the force close is due to userInitiated, linkFailureError or chainAction.
  2. Log the HTLCs for the user trigger or chain trigger.
    (For chainAction these two steps could be combined, as they both happen at the same site. We would know the force close is due to a chainAction at the same site where we would log the HTLCs, unlike the others.)

  3. Log the broadcast height for the close tx.

All these happen at different sites and that is why we cannot log multiple parts of the info at the same time.

@ziggie1984 (Collaborator)

Ok, understood. So how do you want to make sure you do not overwrite infos you already saved into the logs in a prior state? Because the current approach flushes the LocalForceCloseInfo as a whole every time it's called.

I think you mentioned before that there are possibilities where you can log more than one info at a time, for example the chainActions + broadcastHeight (it's not exactly the height where the tx is published, but the point in time where we decided to switch away from the default value).

That's why I came up with this design:

	c.log.LogLocalForceCloseInfo(
				channeldb.HtlcActions(htlcMap),
				channeldb.BroadCastHeight(uint64(triggerHeight)))

Only an idea from my side, feel free to follow your ideas.

Yeah I think the chainAction boolean makes sense besides the other ones.

@Chinwendu20 (Contributor, Author)

Ok, understood. So how do you want to make sure you do not overwrite infos you already saved into the logs in a prior state? Because the current approach flushes the LocalForceCloseInfo as a whole every time it's called.

I think I will explore your idea of creating a different key for each info, then see how it goes.

I think you mentioned before that there are possibilities where you can log more than one info at a time, for example the chainActions + broadcastHeight (it's not exactly the height where the tx is published, but the point in time where we decided to switch away from the default value).

chainActions bool + HTLC map

That's why I came up with this design:

	c.log.LogLocalForceCloseInfo(
				channeldb.HtlcActions(htlcMap),
				channeldb.BroadCastHeight(uint64(triggerHeight)))

Only an idea from my side, feel free to follow your ideas.

Yeah I think the chainAction boolean makes sense besides the other ones.

Thanks a lot, I think I will start working on the PR now and then get more feedback from you. I just wanted to be sure that I am not completely off track. Thanks a lot.

// LocalForceCloseInsights holds information about a local force close.
type LocalForceCloseInsights struct {
	// Initiator indicates what initiated the local force close.
	Initiator LocalForceCloseInitiator
Collaborator:

Why is it a "Local" ForceCloseInitiator?

Contributor Author:

Because it is only for local force close. A follow-up PR could be for remote force close.

Member:

In that case we should call this ForceCloseInsights and ForceCloseInitiator, since we're also going to save the info for remote force close?


// fetchLocalForceCloseInsights fetches the local force close insights
// previously stored when commitment was broadcast.
func fetchLocalForceCloseInsights(chanBucket kvdb.RBucket,
Collaborator:

This function should just return (*LocalForceCloseInsights, error) and not take the *OpenChannel argument. Favor returning things via expressions rather than mutating data structures.

Contributor Author:

Looking at the code that calls this function, you would see that this is how the other functions look; I followed the pattern for uniformity. I think, though, that if there is an issue with it, it should be raised in a new issue:
https://github.com/lightningnetwork/lnd/blob/3906a44f17abfb4b7ffb3963de6515accec89017/channeldb/channel.go#L1856-L1886

Collaborator:

Oof. Yeah I see what you mean. I think this is a pretty bad pattern fwiw but I guess we can leave it as is. I'd request you rename it from fetchLocal... to attachLocal...

// fetchLocalForceCloseInsight fetches the details stored about the local
// force close previously stored in the arbitrator log and returns it as
// LocalForceCloseInsights.
func (c *ChannelArbitrator) fetchLocalForceCloseInsights() *channeldb.
Collaborator:

So this one actually seems to behave the way I would expect. Why do we have two functions named the same thing that do conceptually different things?

https://github.com/lightningnetwork/lnd/pull/8072/files#r1417996512

Contributor Author:

They are in two different packages

// AssertChainAction asserts that the passed chainActions is of expected
// length and contains all the items in the actions slice as keys in the
// chainActions map.
func AssertChainAction(chainActions map[string]*lnrpc.ListHTLCHash,
Collaborator:

Please make this name more descriptive. It seems that it is asserting that the ChainActions match the supplied slice. Maybe call it AssertChainActionsMatch.

// AssertLocalFCInsightsForClosedChannels asserts that the passed chanPoint
// is found in the list of passed closedChannels and also asserts that the
// local force close insights is as expected.
func (h *HarnessTest) AssertLocalFCInsightsForClosedChannels(
Collaborator:

Please make this describe what it is actually asserting. Maybe AssertCloseSummaryContainsLocalFCInsights

Comment on lines +3563 to +3589
// Append local force close insights if present.
if pendingClose.LocalFCInsights != nil {
	htlcActions := make(map[string]*lnrpc.ListHTLCHash)
	for action, htlcs := range pendingClose.LocalFCInsights.
		ChainActions {
		htlcHashes := make([][]byte, len(htlcs))
		for i, htlc := range htlcs {
			htlcHashes[i] = htlc.RHash[:]
		}
		htlcHashList := &lnrpc.ListHTLCHash{
			Hash: htlcHashes,
		}
		htlcActions[action] = htlcHashList
	}

	channel.LocalFirstCloseInsights = &lnrpc.
		LocalForceCloseInsights{
		LocalForceCloseInitiator: pendingClose.
			LocalFCInsights.Initiator.String(),
		LocalForceCloseBroadcastHeight: pendingClose.
			LocalFCInsights.BroadcastHeight,
		LocalForceCloseChainActions: htlcActions,
	}
}
Collaborator:

👍🏼

Comment on lines +3830 to +3856
// Append local force close insights if present.
if waitingClose.LocalFCInsights != nil {
	htlcActions := make(map[string]*lnrpc.ListHTLCHash)
	for action, htlcs := range waitingClose.LocalFCInsights.
		ChainActions {
		htlcHashes := make([][]byte, len(htlcs))
		for i, htlc := range htlcs {
			htlcHashes[i] = htlc.RHash[:]
		}
		htlcHashList := &lnrpc.ListHTLCHash{
			Hash: htlcHashes,
		}
		htlcActions[action] = htlcHashList
	}

	channel.LocalFirstCloseInsights = &lnrpc.
		LocalForceCloseInsights{
		LocalForceCloseInitiator: waitingClose.
			LocalFCInsights.Initiator.String(),
		LocalForceCloseBroadcastHeight: waitingClose.
			LocalFCInsights.BroadcastHeight,
		LocalForceCloseChainActions: htlcActions,
	}
}
Collaborator:

👍🏼

Comment on lines +4569 to +4594
// Append LocalFCInsights if present.
if dbChannel.LocalFCInsights != nil {
	htlcActions := make(map[string]*lnrpc.ListHTLCHash)
	for action, htlcs := range dbChannel.LocalFCInsights.
		ChainActions {
		htlcHashes := make([][]byte, len(htlcs))
		for i, htlc := range htlcs {
			htlcHashes[i] = htlc.RHash[:]
		}
		htlcHashList := &lnrpc.ListHTLCHash{
			Hash: htlcHashes,
		}
		htlcActions[action] = htlcHashList
	}

	channel.LocalFirstCloseInsights = &lnrpc.
		LocalForceCloseInsights{
		LocalForceCloseInitiator: dbChannel.
			LocalFCInsights.Initiator.String(),
		LocalForceCloseBroadcastHeight: dbChannel.
			LocalFCInsights.BroadcastHeight,
		LocalForceCloseChainActions: htlcActions,
	}
Collaborator:

👍🏼

Contributor Author:

thanks.

@ProofOfKeags (Collaborator)

Upon a second review I think the approach seems fine for the most part. I do want to echo @ziggie1984's concern that, given we are changing the OpenChannel data structure, extra care must be taken. Specifically, I wonder if this change lacks a necessary migration. However, I am unsure under what circumstances one would be required. CC @yyforyongyu

Left some comments on specific code but I think this is pretty close to good to go.

@Chinwendu20 (Contributor, Author)

Upon a second review I think the approach seems fine for the most part. I do want to echo @ziggie1984's concern that, given we are changing the OpenChannel data structure, extra care must be taken. Specifically, I wonder if this change lacks a necessary migration. However, I am unsure under what circumstances one would be required. CC @yyforyongyu

Left some comments on specific code but I think this is pretty close to good to go.

Thanks for the review. I think a migration would be needed if we wanted to backfill the local force close reason for existing channels, but in this case there would be no way to get the reason for a local force close for channels that existed before this change. So I guess that field would just be empty for such channels.

@Chinwendu20 Chinwendu20 force-pushed the forcetemp branch 2 times, most recently from 8b5f8bd to 86ff79c Compare December 17, 2023 06:34
shaurya947 and others added 7 commits December 17, 2023 16:10
LocalForceCloseInitiator type indicates what initiated the
local force close. It could be:
- User initiated
- Initiated due to link failure errors
- chain trigger initiated
This commit adds the LocalForceCloseInitiator type and logs it
in tlv format to the channel's ArbitratorLog.
In this commit we use the functional option for specifying the
local force close initiator that we built in the previous commit.
The option can be for user-initiated FC, or link failure or
chain trigger FC.
This commit logs the Local Force Close insights when
the closing commitment transaction has been broadcast and
persists it in the channel's bucket. The `OpenChannel`
and `CloseChannelSummary` is also updated with the
LocalForceCloseInsights field so that the insights can be
fetched during a call to fetch `WaitingCloseChannels`,
`PendingCloseChannels` and `ClosedChannels`.

Signed-off-by: Ononiwu Maureen <[email protected]>
We add a new proto message to populate the local force close insights
in the ClosedChannels, pendingCloseChannels and waitingCloseChannels
response and generate the new RPC files.
@Chinwendu20 (Contributor, Author)

I have updated this PR.

@yyforyongyu (Member) left a comment


Thanks for the PR! I think the info should not be stored in openChannelBucket since not every channel is force closed, plus that bucket will be deleted anyway once the FC process is finished. Instead, it should either be stored in historicalChannelBucket or closedChannelBucket. Another option is to store it in ContractReport, since we would fetch the contracts via arbitratorPopulateForceCloseResp for all force-closed channels:

lnd/rpcserver.go, line 3866 (0df507e):

func (r *rpcServer) arbitratorPopulateForceCloseResp(chanPoint *wire.OutPoint,

I think we need a pros and cons analysis to decide which approach is better.

Aside from that, I'd recommend reading this guideline first as there are some places violating our code formatting rules.

And what is the plan for logging the remote force close? That being said, I'm not sure how much this #7696 would help - for instance, if an FC happened, can the data saved here be more helpful than looking at the debug logs?

@@ -3252,6 +3252,96 @@ const (
Abandoned ClosureType = 5
)

// LocalForceCloseInitiator is a type that gives information about what led to a
// channel being force closed locally.
type LocalForceCloseInitiator string
Member:

We should use uint8 plus a String method to make this more space-efficient, something like

lnd/feature/set.go, lines 49 to 64 (0df507e):

func (s Set) String() string {
	switch s {
	case SetInit:
		return "SetInit"
	case SetLegacyGlobal:
		return "SetLegacyGlobal"
	case SetNodeAnn:
		return "SetNodeAnn"
	case SetInvoice:
		return "SetInvoice"
	case SetInvoiceAmp:
		return "SetInvoiceAmp"
	default:
		return "SetUnknown"
	}
}
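Applied to this PR's type, a hedged sketch could look like the following (the constant names and string values are illustrative, not the PR's actual definitions; the link failure error itself would have to be stored separately):

// A sketch of a uint8-based initiator enum for the channeldb package.
package channeldb

// LocalForceCloseInitiator encodes what initiated a local force close as a
// compact integer rather than a free-form string.
type LocalForceCloseInitiator uint8

const (
	// InitiatorUnknown is the zero value, used for channels closed
	// before this field existed.
	InitiatorUnknown LocalForceCloseInitiator = iota

	// InitiatorUser means the user manually requested the force close.
	InitiatorUser

	// InitiatorLinkFailure means a link failure triggered the close.
	InitiatorLinkFailure

	// InitiatorChainTrigger means an on-chain event triggered the close.
	InitiatorChainTrigger
)

// String renders the initiator for RPC responses.
func (i LocalForceCloseInitiator) String() string {
	switch i {
	case InitiatorUser:
		return "user_initiated"
	case InitiatorLinkFailure:
		return "link_failure"
	case InitiatorChainTrigger:
		return "chain_trigger"
	default:
		return "unknown"
	}
}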

Member:

Also this should be called ForceCloseInitiator?

Contributor Author:

Also, my initial approach was to do just as you have stated concerning making it uint8 instead of a string, but that would mean creating a LocalForceCloseInitiator type for each of the LinkFailureError messages. Making the initiator a string, I can easily convert any link failure error message to a LocalForceCloseInitiator on the fly instead of duplicating all the LinkFailure error message types in the channel.go file.

Contributor Author:

Ref: the structure of the LinkError type:

type LinkError struct {
	// msg returns the wire failure associated with the error.
	// This value should *not* be nil, because we should always
	// know the failure type for failures which occur at our own
	// node.
	msg lnwire.FailureMessage

	// FailureDetail enriches the wire error with additional information.
	FailureDetail
}

}

// HtlcActionDecoder is a custom TLV decoder for the htlcAction record.
func HtlcActionDecoder(r io.Reader, val interface{}, buf *[8]byte,
Member:

Why do we use interface value here? I think these encoder/decoder methods are only used for HTLC?

@Chinwendu20 (Contributor, Author) Dec 18, 2023

This is where that function is used: https://github.com/lightningnetwork/lnd/blob/b22d72d44d5c71f732f15e5a796f5de7fb39a451/channeldb/channel.go#L3498-L3503C5
Also take a look at the function parameters for that function:

lnd/tlv/record.go

Lines 186 to 187 in 0df507e

func MakeDynamicRecord(typ Type, val interface{}, sizeFunc SizeFunc,
encoder Encoder, decoder Decoder) Record {

This is the function signature for Decoder type:
type Decoder func(r io.Reader, val interface{}, buf *[8]byte, l uint64) error

@@ -965,10 +965,10 @@ func (c *ChannelArbitrator) stateStep(
// Ignore errors since logging force close info is not a
// critical part of the force close flow.
ChainTriggerForceClose(chainActions)(c)
fallthrough
nextState = StateBroadcastCommit
Member:

why this change?

Contributor Author:

Because we do not want the function at https://github.com/lightningnetwork/lnd/pull/8072/files/c60ea993c2a31fe62c3e481cd40618455bf60afd..b22d72d44d5c71f732f15e5a796f5de7fb39a451#diff-a0b8064876b1b1d6085fa7ffdbfd38c81cb06c1ca3f34a08dbaacba203cda3ebL969-L970 (line 970) to be called for a chain trigger force close. The chain actions would be recorded twice in that case.

@@ -121,6 +121,10 @@ var (
// broadcasted when moving the channel to state CoopBroadcasted.
coopCloseTxKey = []byte("coop-closing-tx-key")

// localFCInsightsKe points to the insights gotten about a local
Member:

Suggested change
// localFCInsightsKe points to the insights gotten about a local
// localFCInsightsKey points to the insights gotten about a local

Contributor Author:

Nice catch


return err
}

err = binary.Write(w, byteOrder, uint64(b.Len()))
Member:

Most likely due to old practice - we've switched to using varint as it's more space-efficient.
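As a hedged illustration of that suggestion (the helper function names are hypothetical; this assumes the WriteVarInt/ReadVarInt helpers from lnd's tlv package), the length prefix could be written as a varint instead of a fixed 8-byte integer:

package channeldb

import (
	"io"

	"github.com/lightningnetwork/lnd/tlv"
)

// writeInsightsLen writes a length prefix as a varint instead of a fixed
// 8-byte big-endian integer, so small lengths only take a single byte.
func writeInsightsLen(w io.Writer, length uint64) error {
	var buf [8]byte
	return tlv.WriteVarInt(w, length, &buf)
}

// readInsightsLen reads back the varint length prefix written above.
func readInsightsLen(r io.Reader) (uint64, error) {
	var buf [8]byte
	return tlv.ReadVarInt(r, &buf)
}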

channel *OpenChannel) error {

insightBytes := chanBucket.Get(localFCInsightsKey)
if insightBytes == nil {
Member:

How could this be nil? Since it can only end up here when it has ChanStatusCommitBroadcasted? Unless it's a remote FC?

Contributor Author:

I think even if it is a remote FC, the channel would not have the ChanStatusCommitBroadcasted bit in the status. However, while fetching historical buckets we also use this function; there the ChanStatusCommitBroadcasted bit would be in the status, but we do not store the force close insights in that bucket, so it would be nil. I will update the comment on this line to reflect that.


// attachLocalForceCloseInsights fetches the local force close insights
// previously stored when commitment was broadcast.
func attachLocalForceCloseInsights(chanBucket kvdb.RBucket,
Member:

Looks like this can be a method on OpenChannel?

Contributor Author:

I could do that, but I was just following the signature of all the other functions called by the caller of this one; they are not methods either, though they modify the OpenChannel.

if cs.LocalFCInsights != nil {
	if err := serializeLocalForceCloseInsights(w,
		cs.LocalFCInsights); err != nil {
		log.Warnf("Error serializing channel close "+
Member:

Also these errors from deserialization and serialization should be returned instead of logged.

@@ -1057,13 +1057,77 @@ type forceCloseReq struct {
closeTx chan *wire.MsgTx
}

// ForceCloseContractOption is used for providing functional options to callers
// of ForceCloseContract().
type ForceCloseContractOption func(*ChannelArbitrator)
Member:

I also don't think it's the right approach: 1) it's not really an option, but an extra step in ForceCloseContract where we save the info; 2) this can be expanded indefinitely, whereas we should limit the function scope here. I think you could define a ForceCloseReason struct here and let other callers use it in ForceCloseContract. Just a rough thought, need to think about it more.
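A minimal sketch of that rough idea (ForceCloseReason and its fields are hypothetical, only meant to illustrate the shape of the suggestion; it assumes the channeldb.LocalForceCloseInitiator type from this PR):

package contractcourt

import (
	"github.com/lightningnetwork/lnd/channeldb"
)

// ForceCloseReason is a hypothetical, fixed-scope description of why a local
// force close was requested. Callers of ForceCloseContract would pass a
// populated ForceCloseReason instead of open-ended functional options.
type ForceCloseReason struct {
	// Initiator records who or what triggered the close (user, link
	// failure, or chain trigger).
	Initiator channeldb.LocalForceCloseInitiator

	// LinkErr optionally carries the link failure error, if any.
	LinkErr string
}

A caller such as the rpcserver would then hand over a populated ForceCloseReason rather than channeldb.UserInitiated().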

@Chinwendu20 (Contributor, Author)

Thanks for the PR! I think the info should not be stored in openChannelBucket since not every channel is force closed, plus that bucket will be deleted anyway once the FC process is finished. Instead, it should either be stored in historicalChannelBucket or closedChannelBucket. Another option is to store it in ContractReport, since we would fetch the contracts via arbitratorPopulateForceCloseResp for all force-closed channels:

lnd/rpcserver.go, line 3866 (0df507e):

func (r *rpcServer) arbitratorPopulateForceCloseResp(chanPoint *wire.OutPoint,

Thanks for the review! This was initially stored only in the closedChannel bucket, but this review comment (#8072 (comment)) stated that the force close reason should be included for "waiting close channels", which are essentially still in the openChannel bucket. I am aware that the bucket will be deleted once the FC process is complete; that is why, when the channel is marked as closed, the info is stored along with the channel CloseSummary: https://github.com/lightningnetwork/lnd/pull/8072/files#diff-793204e241c4f89dc7184266600f1aca57cb8e3970ab714f0c65fba1a6826066L3647-L3648

So the force close reason is displayed for waiting close channels, pending close channels and closed channels, just like the review comment that I linked suggested.

I think we need a pros and cons analysis to decide which approach is better.

You mean rethinking the entire approach?

Aside from that, I'd recommend reading this guideline first as there are some places violating our code formatting rules.

Thanks, I will look into it.

And what is the plan for logging the remote force close? That being said, I'm not sure how much this #7696 would help - for instance, if an FC happened, can the data saved here be more helpful than looking at the debug logs?

Reading through the conversation of the issue linked to this PR, you would see that the original author suggested that this could be done in a follow-up PR: #7696 (comment)
I think there were preliminary conversations about the usefulness of this issue even before the initial PR was opened; I think you were a part of them: #7696

@Chinwendu20 (Contributor, Author)

Thanks for the PR! I think the info should not be stored in openChannelBucket since not every channel is force closed, plus that bucket will be deleted anyway once the FC process is finished. Instead, it should either be stored in historicalChannelBucket or closedChannelBucket. Another option is to store it in ContractReport, since we would fetch the contracts via arbitratorPopulateForceCloseResp for all force-closed channels:

lnd/rpcserver.go, line 3866 (0df507e):

func (r *rpcServer) arbitratorPopulateForceCloseResp(chanPoint *wire.OutPoint,

I think we need a pros and cons analysis to decide which approach is better.

Aside from that, I'd recommend reading this guideline first as there are some places violating our code formatting rules.

And what is the plan for logging the remote force close? That being said, I'm not sure how much this #7696 would help - for instance, if an FC happened, can the data saved here be more helpful than looking at the debug logs?

Are you saying we should just display the insights in the debug logs instead of saving them? I think the point is to make them available during rpcserver calls, but you can advise on what would be more useful. @saubyk what do you think?

@saubyk (Collaborator) commented Jan 18, 2024

I'm not sure how much this #7696 would help - for instance, if an FC happened, can the data saved here be more helpful than looking at the debug logs

@yyforyongyu looking at the debug logs is not an option every user has or is technically inclined to use, which makes it difficult to discern why the channel was closed. If we have this information available in the database and an RPC to pull it, it can easily be utilized by an application to inform users of the possible reason for force closures.

@saubyk saubyk removed this from the v0.18.0 milestone Jan 28, 2024
@lightninglabs-deploy

@hieblmi: review reminder
@ziggie1984: review reminder
@Chinwendu20, remember to re-request review from reviewers when ready

@ziggie1984 (Collaborator)

!lightninglabs-deploy mute

@hieblmi hieblmi removed their request for review August 14, 2024 08:35
@ziggie1984 ziggie1984 removed their request for review November 22, 2024 08:17
Successfully merging this pull request may close these issues.

[feature]: Store the channel force close error message in the closed channels database
9 participants