-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(relayer): prevent tx1 resubmission #28
Conversation
refactor notifier init merge reporter block handles
submitter/relayer/relayer.go
Outdated
} | ||
tx1 := rl.lastSubmittedCheckpoint.Tx1 | ||
// prevent resending tx1 if it was successful | ||
if rl.lastSubmittedCheckpoint.Tx1 == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relevant changes here, cache the TX1 if successful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any chance for rl.lastSubmittedCheckpoint.Tx1
be loaded with the tx1 sent from the previous epoch?
submitter/relayer/relayer.go
Outdated
rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch) | ||
|
||
if rl.lastSubmittedCheckpoint == nil { | ||
rl.lastSubmittedCheckpoint = &types.CheckpointInfo{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Init the lastSubmittedCheckpoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add this to the default structure at New(
vigilante/submitter/relayer/relayer.go
Line 47 in c8a5c36
wallet btcclient.BTCWallet, |
and then simplify the check by removing rl.lastSubmittedCheckpoint == nil
btcclient/client_wallet.go
Outdated
@@ -35,7 +35,7 @@ func NewWallet(cfg *config.BTCConfig, parentLogger *zap.Logger) (*Client, error) | |||
HTTPPostMode: true, | |||
User: cfg.Username, | |||
Pass: cfg.Password, | |||
DisableTLS: cfg.DisableClientTLS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove from config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait this was confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why this appeared to me as code change at first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was in previous PR, basically we are not using it in bitcoind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking an alternative solution.
if we fail tx2 and retry the submission of both tx1 and tx2, tx1 would actually fail because the tx1 is already in mempool. In our impl, we would return error in this case so that tx2 will not be sent. So, I was thinking maybe we can achieve the goal by identifying the already in mempool
error of tx1 and ignore it. Wdyt? Probably we need e2e to test this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-wise lgtm! My major comment is that we kinda mixed the logic of handling the two cases. It would be good to separate the two cases.
Also, better to have both unit and e2e tests for this but can be done in a separate pr
submitter/relayer/relayer.go
Outdated
if rl.lastSubmittedCheckpoint == nil || | ||
rl.lastSubmittedCheckpoint.Tx1 == nil || | ||
rl.lastSubmittedCheckpoint.Epoch < ckptEpoch { | ||
rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch) | ||
|
||
if rl.lastSubmittedCheckpoint == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate the two cases? if shouldSendCompleCheckpoint()
--> convertCkptToTwoTxAndSubmit
, else if shouldSendTx2()
--> retrySendTx2
. I think this would be logically cleaner and some low-level funtions can be reused.
submitter/relayer/relayer.go
Outdated
sendCompleteCkpt := rl.lastSubmittedCheckpoint.Tx1 == nil || | ||
rl.lastSubmittedCheckpoint.Epoch < ckptEpoch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we should check whether
rl.lastSubmittedCheckpoint
isnil
first. Otherwise, this might - it would be good to have method for this like
shouldSendCompleteCkpt()
to hide details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should check whether rl.lastSubmittedCheckpoint is nil first. Otherwise, this might
-> lastSubmittedCheckpoint is now initialized in the constructor so, it shouldn't panic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. All good then
submitter/relayer/relayer.go
Outdated
shouldSendTx2 := (rl.lastSubmittedCheckpoint.Tx1 != nil || rl.lastSubmittedCheckpoint.Epoch < ckptEpoch) && | ||
rl.lastSubmittedCheckpoint.Tx2 == nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to have a method for this to hide details
submitter/relayer/relayer.go
Outdated
shouldSendTx2 := (rl.lastSubmittedCheckpoint.Tx1 != nil || rl.lastSubmittedCheckpoint.Epoch < ckptEpoch) && | ||
rl.lastSubmittedCheckpoint.Tx2 == nil | ||
|
||
if sendCompleteCkpt { | ||
rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch) | |
rl.logger.Infof("Submitting a raw checkpoint for epoch %v", ckptEpoch) |
maybe it's not the first time due to failure
submitter/relayer/relayer.go
Outdated
// this is to wait for btcwallet to update utxo database so that | ||
// the tx that tx1 consumes will not appear in the next unspent txs lit | ||
// todo(Lazar): is the arbitrary timeout here necessary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not after switching to using FundRawTransaction
. Let's check it in e2e
} | ||
|
||
tx1 := rl.lastSubmittedCheckpoint.Tx1 | ||
if tx1 == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to have a doc string for this function saying that the tx1 should not be nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this function uses tx1 implicitly through state, I'll add the doc string above, but I would live the check in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep yep, keeping the check is good here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good ! (though lets wait @gitferry approval)
One question I have, do you guys think we should have some persistence/check on BTC chain whether checkpoint was not submitted already ?
Two separate cases I have in mind:
- we send checkpoint to the mempool but we crashed. Now after restart we will send both transactions once again which will lead to money loss. (as will use different inputs for those transactions, those will be different transactions with the same data)
- if there are multiple vigilante reporters in the network, the checkpoint may be already BTC chain but not yet submitted to Babylon. In this case we will also probably lose money.
I think this can be solved by storing the last submitted epoch in db. We need it for quick bootstrapping anyway
This is a long-term thinking. Basically the submitter can have a separate goroutine looking for submitted checkpoints from new BTC blocks and decide whether the submit it So maybe we can start by adding a db because for phase-2 we are probably still the only org running vigilantes? |
Hmm agreed that solving the case Even if we do not solve the case So this is will be extension of this task if I am correct: babylonchain/vigilante#156. We need persistent storage to:
cc: @Lazar955 |
Yep, babylonchain/vigilante#156 was raised mostly for the monitor bootstrapping but I guess the submitter also needs it |
In case of a restart, we want to avoid wasting fees if we have already submitted transactions for a certain epoch. Basic idea: A crash occurs after sending both checkpoint transactions for epoch `n` to the BTC and recording them in the database. Upon restarting, we find that epoch `n` is still marked as sealed. Before resending the transactions we first check our database and confirm that the transactions for this epoch have already been sent. Since the transactions were previously sent, the next step is to verify their status on our Bitcoin node. If the Bitcoin node is aware of the transactions and they are present in the mempool, no further action is needed. However, if the node does not recognize the transactions, this indicates they were lost, and we must resend them to the network. [References](#28 (comment))
If tx2 fails and we are retrying, we shouldn't resend tx1 again.
Referencing issue