chore(relayer): prevent tx1 resubmission #28

Lazar955 · 2024-08-28T13:09:39Z

If tx2 fails and we are retrying, we shouldn't resend tx1 again.

Referencing issue

refactor notifier init merge reporter block handles

Lazar955 · 2024-08-28T13:12:15Z

submitter/relayer/relayer.go

-	}
+	tx1 := rl.lastSubmittedCheckpoint.Tx1
+	// prevent resending tx1 if it was successful
+	if rl.lastSubmittedCheckpoint.Tx1 == nil {


Relevant changes here, cache the TX1 if successful

Is there any chance that rl.lastSubmittedCheckpoint.Tx1 is loaded with the tx1 sent in the previous epoch?

Lazar955 · 2024-08-28T13:12:48Z

submitter/relayer/relayer.go

 		rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch)

+		if rl.lastSubmittedCheckpoint == nil {
+			rl.lastSubmittedCheckpoint = &types.CheckpointInfo{}


Init the lastSubmittedCheckpoint

Can we add this to the default structure at New(

vigilante/submitter/relayer/relayer.go

Line 47 in c8a5c36

wallet btcclient.BTCWallet,

and then simplify the check by removing rl.lastSubmittedCheckpoint == nil
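
A minimal sketch of this suggestion, assuming heavily reduced stand-in types (the `Relayer` and `CheckpointInfo` below are hypothetical simplifications, not the real vigilante structs, and the `*string` fields only stand in for the stored transaction info). Initializing the field in the constructor lets later code read it without a nil guard on the pointer itself:

```go
package main

import "fmt"

// CheckpointInfo is a simplified stand-in for types.CheckpointInfo.
// The field types are assumptions made for this sketch.
type CheckpointInfo struct {
	Epoch uint64
	Tx1   *string // stand-in for the cached tx1 info
	Tx2   *string // stand-in for the cached tx2 info
}

// Relayer is a reduced stand-in for the real relayer struct.
type Relayer struct {
	lastSubmittedCheckpoint *CheckpointInfo
}

// New initializes lastSubmittedCheckpoint to an empty value so that
// callers never have to check the pointer itself for nil.
func New() *Relayer {
	return &Relayer{lastSubmittedCheckpoint: &CheckpointInfo{}}
}

func main() {
	rl := New()
	// Safe without a nil check on rl.lastSubmittedCheckpoint.
	fmt.Println("tx1 cached:", rl.lastSubmittedCheckpoint.Tx1 != nil)
}
```

With this in place, the remaining checks only need to look at `Tx1`, `Tx2` and `Epoch`.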

RafilxTenfen · 2024-08-29T18:46:34Z

btcclient/client_wallet.go

@@ -35,7 +35,7 @@ func NewWallet(cfg *config.BTCConfig, parentLogger *zap.Logger) (*Client, error)
 		HTTPPostMode: true,
 		User:         cfg.Username,
 		Pass:         cfg.Password,
-		DisableTLS:   cfg.DisableClientTLS,


why remove from config?

wait this was confusing

not sure why this appeared to me as code change at first

This was in a previous PR; basically, we are not using it with bitcoind.

gitferry

I was thinking of an alternative solution.
If tx2 fails and we retry the submission of both tx1 and tx2, tx1 would actually fail because it is already in the mempool. In our implementation we return an error in this case, so tx2 will not be sent. So I was thinking we could achieve the goal by identifying the already-in-mempool error for tx1 and ignoring it. Wdyt? We probably need an e2e test for this case.
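
A rough sketch of this alternative, not what the PR implements. `errAlreadyInMempool`, `errRejected`, `sendFn` and `submitBoth` are all hypothetical names; a real client would have to map whatever error the BTC node returns onto such a sentinel. It also presumes the retried tx1 is the very same transaction, which may not hold if it is rebuilt with different inputs.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for this sketch. A real client would need to
// translate the node's RPC error into errAlreadyInMempool.
var (
	errAlreadyInMempool = errors.New("tx already in mempool")
	errRejected         = errors.New("tx rejected")
)

// sendFn is a hypothetical stand-in for the call that broadcasts one tx.
type sendFn func() error

// submitBoth sends tx1 and then tx2. An "already in mempool" error for tx1
// is treated as success so that a retry can still reach tx2.
func submitBoth(sendTx1, sendTx2 sendFn) error {
	if err := sendTx1(); err != nil && !errors.Is(err, errAlreadyInMempool) {
		return fmt.Errorf("failed to send tx1: %w", err)
	}
	if err := sendTx2(); err != nil {
		return fmt.Errorf("failed to send tx2: %w", err)
	}
	return nil
}

func main() {
	// Simulate a retry: tx1 is already known to the node, tx2 goes through.
	tx1 := func() error { return fmt.Errorf("rpc: %w", errAlreadyInMempool) }
	tx2 := func() error { return nil }
	fmt.Println("retry succeeded:", submitBoth(tx1, tx2) == nil)
}
```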

gitferry

Code-wise lgtm! My major comment is that we kinda mixed the logic of handling the two cases. It would be good to separate the two cases.

Also, better to have both unit and e2e tests for this but can be done in a separate pr

gitferry · 2024-09-02T01:42:48Z

submitter/relayer/relayer.go

+	if rl.lastSubmittedCheckpoint == nil ||
+		rl.lastSubmittedCheckpoint.Tx1 == nil ||
+		rl.lastSubmittedCheckpoint.Epoch < ckptEpoch {
 		rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch)

+		if rl.lastSubmittedCheckpoint == nil {


Can we separate the two cases? if shouldSendCompleteCheckpoint() --> convertCkptToTwoTxAndSubmit, else if shouldSendTx2() --> retrySendTx2. I think this would be logically cleaner, and some low-level functions can be reused.
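
One way the separation could look, as a sketch only. The two predicates reuse the conditions that appear in the later diff hunks of this PR; the struct types are simplified stand-ins, and `nextAction` is a hypothetical helper that merely returns the name of the function that would be called.

```go
package main

import "fmt"

// CheckpointInfo and Relayer are simplified stand-ins for the real types.
type CheckpointInfo struct {
	Epoch uint64
	Tx1   *string
	Tx2   *string
}

type Relayer struct {
	lastSubmittedCheckpoint *CheckpointInfo
}

// shouldSendCompleteCkpt is true when no tx1 is cached or the cached state
// belongs to an older epoch.
func (rl *Relayer) shouldSendCompleteCkpt(ckptEpoch uint64) bool {
	last := rl.lastSubmittedCheckpoint
	return last.Tx1 == nil || last.Epoch < ckptEpoch
}

// shouldSendTx2 mirrors the condition shown in the diff: tx1 is cached (or
// the state is from an older epoch) and tx2 has not been recorded yet.
func (rl *Relayer) shouldSendTx2(ckptEpoch uint64) bool {
	last := rl.lastSubmittedCheckpoint
	return (last.Tx1 != nil || last.Epoch < ckptEpoch) && last.Tx2 == nil
}

// nextAction is a hypothetical dispatcher that keeps the two cases apart.
func (rl *Relayer) nextAction(ckptEpoch uint64) string {
	switch {
	case rl.shouldSendCompleteCkpt(ckptEpoch):
		return "convertCkptToTwoTxAndSubmit"
	case rl.shouldSendTx2(ckptEpoch):
		return "retrySendTx2"
	default:
		return "nothing to do"
	}
}

func main() {
	tx1 := "tx1"
	rl := &Relayer{lastSubmittedCheckpoint: &CheckpointInfo{Epoch: 5, Tx1: &tx1}}
	fmt.Println(rl.nextAction(5)) // tx1 cached for this epoch, tx2 missing
	fmt.Println(rl.nextAction(6)) // new epoch, send both transactions
}
```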

gitferry · 2024-09-02T11:51:22Z

submitter/relayer/relayer.go

+	sendCompleteCkpt := rl.lastSubmittedCheckpoint.Tx1 == nil ||
+		rl.lastSubmittedCheckpoint.Epoch < ckptEpoch


we should check whether rl.lastSubmittedCheckpoint is nil first. Otherwise, this might

it would be good to have method for this like shouldSendCompleteCkpt() to hide details

we should check whether rl.lastSubmittedCheckpoint is nil first. Otherwise, this might

-> lastSubmittedCheckpoint is now initialized in the constructor, so it shouldn't panic

You are right. All good then

gitferry · 2024-09-02T11:52:59Z

submitter/relayer/relayer.go

+	shouldSendTx2 := (rl.lastSubmittedCheckpoint.Tx1 != nil || rl.lastSubmittedCheckpoint.Epoch < ckptEpoch) &&
+		rl.lastSubmittedCheckpoint.Tx2 == nil


would be good to have a method for this to hide details

gitferry · 2024-09-02T11:55:27Z

submitter/relayer/relayer.go

+	shouldSendTx2 := (rl.lastSubmittedCheckpoint.Tx1 != nil || rl.lastSubmittedCheckpoint.Epoch < ckptEpoch) &&
+		rl.lastSubmittedCheckpoint.Tx2 == nil
+
+	if sendCompleteCkpt {
 		rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch)


Suggested change

-		rl.logger.Infof("Submitting a raw checkpoint for epoch %v for the first time", ckptEpoch)
+		rl.logger.Infof("Submitting a raw checkpoint for epoch %v", ckptEpoch)

Maybe it's not the first time, since an earlier attempt may have failed.

gitferry · 2024-09-02T12:01:13Z

submitter/relayer/relayer.go

 	// this is to wait for btcwallet to update utxo database so that
 	// the tx that tx1 consumes will not appear in the next unspent txs lit
+	// todo(Lazar): is the arbitrary timeout here necessary?


Maybe not after switching to using FundRawTransaction. Let's check it in e2e

gitferry · 2024-09-02T12:03:32Z

submitter/relayer/relayer.go

+	}
+
+	tx1 := rl.lastSubmittedCheckpoint.Tx1
+	if tx1 == nil {


better to have a doc string for this function saying that the tx1 should not be nil

But this function uses tx1 implicitly through state. I'll add the doc string above, but I would leave the check in.

yep yep, keeping the check is good here

gitferry

Great work!

KonradStaniec

Code looks good! (Though let's wait for @gitferry's approval.)

One question: do you think we should have some persistence, or a check on the BTC chain, for whether the checkpoint was already submitted?

Two separate cases I have in mind:

1. We send the checkpoint to the mempool but then crash. After restart we will send both transactions once again, which will lead to money loss (since we will use different inputs, these will be different transactions carrying the same data).
2. If there are multiple vigilante reporters in the network, the checkpoint may already be on the BTC chain but not yet submitted to Babylon. In this case we will also probably lose money.

gitferry · 2024-09-02T13:15:57Z

We send the checkpoint to the mempool but then crash. After restart we will send both transactions once again, which will lead to money loss (since we will use different inputs, these will be different transactions carrying the same data).

I think this can be solved by storing the last submitted epoch in db. We need it for quick bootstrapping anyway

If there are multiple vigilante reporters in the network, the checkpoint may already be on the BTC chain but not yet submitted to Babylon. In this case we will also probably lose money.

This is long-term thinking. Basically, the submitter can have a separate goroutine that looks for submitted checkpoints in new BTC blocks and decides whether to submit.
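
To make the idea concrete, here is a small sketch under strong assumptions: `block` is a hypothetical, heavily reduced view of a BTC block that already lists the epochs of any checkpoints found in it, and `seenCheckpoints` is an invented helper. Parsing real blocks and subscribing to block notifications are out of scope here.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical reduced view of a BTC block: only the epochs of
// the checkpoints it contains.
type block struct {
	ckptEpochs []uint64
}

// seenCheckpoints records which epochs already have a checkpoint on BTC.
type seenCheckpoints struct {
	mu     sync.RWMutex
	epochs map[uint64]struct{}
}

func newSeenCheckpoints() *seenCheckpoints {
	return &seenCheckpoints{epochs: make(map[uint64]struct{})}
}

// watch starts a goroutine that consumes new blocks until the channel is
// closed. The returned channel is closed when the goroutine exits.
func (s *seenCheckpoints) watch(blocks <-chan block) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for b := range blocks {
			s.mu.Lock()
			for _, e := range b.ckptEpochs {
				s.epochs[e] = struct{}{}
			}
			s.mu.Unlock()
		}
	}()
	return done
}

// shouldSubmit reports whether no checkpoint for the epoch has been seen.
func (s *seenCheckpoints) shouldSubmit(epoch uint64) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, seen := s.epochs[epoch]
	return !seen
}

func main() {
	seen := newSeenCheckpoints()
	blocks := make(chan block)
	done := seen.watch(blocks)

	blocks <- block{ckptEpochs: []uint64{7}} // another reporter covered epoch 7
	close(blocks)
	<-done // wait until the watcher has processed everything

	fmt.Println("submit epoch 7:", seen.shouldSubmit(7))
	fmt.Println("submit epoch 8:", seen.shouldSubmit(8))
}
```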

So maybe we can start by adding a db because for phase-2 we are probably still the only org running vigilantes?

KonradStaniec · 2024-09-02T13:48:56Z

This is long-term thinking. Basically, the submitter can have a separate goroutine that looks for submitted checkpoints in new BTC blocks and decides whether to submit.

So maybe we can start by adding a db because for phase-2 we are probably still the only org running vigilantes?

Hmm, agreed that solving case 1 is more important.

Even if we do not solve case 2 before mainnet launch, if we solve case 1 our vigilante losses are bounded by one submission per epoch. This is not ideal, but imo we can live with it.

So this will be an extension of this task, if I am correct: https://github.com/babylonchain/vigilante/issues/156. We need persistent storage to:

save bootstrapping time after recovery
do not waste fees if we already submitted transactions for certain epochs

cc: @Lazar955

gitferry · 2024-09-02T13:51:38Z

Yep, https://github.com/babylonchain/vigilante/issues/156 was raised mostly for the monitor bootstrapping but I guess the submitter also needs it

In case of a restart, we want to avoid wasting fees if we have already submitted transactions for a certain epoch. Basic idea: a crash occurs after sending both checkpoint transactions for epoch `n` to BTC and recording them in the database. Upon restarting, we find that epoch `n` is still marked as sealed. Before resending the transactions, we first check our database and confirm that the transactions for this epoch have already been sent. Since the transactions were previously sent, the next step is to verify their status on our Bitcoin node. If the Bitcoin node is aware of the transactions and they are present in the mempool, no further action is needed. However, if the node does not recognize the transactions, this indicates they were lost, and we must resend them to the network. References: #28 (comment)
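
The restart check described above could be sketched roughly as follows. Everything here is an assumption made for illustration: `submissionStore`, `btcNode`, `needsResubmission` and the in-memory fakes are hypothetical and do not reflect the interfaces used in the stateful submitter.

```go
package main

import "fmt"

// submissionStore is a hypothetical view of the database of sent checkpoints.
type submissionStore interface {
	// SubmittedTxs returns the tx IDs recorded for an epoch, if any.
	SubmittedTxs(epoch uint64) (txIDs []string, found bool)
}

// btcNode is a hypothetical view of our Bitcoin node.
type btcNode interface {
	// KnowsTx reports whether the node is aware of the transaction.
	KnowsTx(txID string) bool
}

// needsResubmission decides, after a restart, whether the checkpoint
// transactions for a sealed epoch have to be sent (again).
func needsResubmission(db submissionStore, node btcNode, epoch uint64) bool {
	txIDs, found := db.SubmittedTxs(epoch)
	if !found || len(txIDs) == 0 {
		return true // nothing recorded: the checkpoint was never sent
	}
	for _, id := range txIDs {
		if !node.KnowsTx(id) {
			return true // recorded but unknown to the node: treat as lost
		}
	}
	return false // all recorded transactions are known: nothing to do
}

// In-memory fakes used only to exercise the sketch.
type memStore map[uint64][]string

func (m memStore) SubmittedTxs(epoch uint64) ([]string, bool) {
	ids, ok := m[epoch]
	return ids, ok
}

type memNode map[string]bool

func (m memNode) KnowsTx(id string) bool { return m[id] }

func main() {
	db := memStore{10: {"tx1-a", "tx2-a"}}
	node := memNode{"tx1-a": true, "tx2-a": true}
	fmt.Println("resend epoch 10:", needsResubmission(db, node, 10))
	fmt.Println("resend epoch 11:", needsResubmission(db, node, 11))
}
```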

Lazar955 added 22 commits August 22, 2024 13:22

reporter uses notifier

3f0bb19

monitor uses notifier

1991b21

use btc client without subscription

55c846d

remove zmq impl

eaa9900

update mocks

8575dd1

cleanup unused code

998cee9

cleanup

376bc3a

fix test

93f04e3

build flag

3d19173

rm commented out code

e3eb8b7

pr comments

bdf4f9e

refactor notifier init merge reporter block handles

btc scanner merge handle block logic

f7335e2

rm unused code

df198b3

combine booststrap and blockevent handle

e47d580

remove btcd handling in code

4fd5176

update readme

10de19b

merge

ec26580

update readme

db30811

update readme with bitcoind cmds

f2ca5f9

update readme link

066a526

update bitcoind cmds

f74d816

prevent tx1 resubmission if tx2 fails

c8a5c36

Lazar955 marked this pull request as ready for review August 28, 2024 14:29

Lazar955 requested review from KonradStaniec, gitferry and RafilxTenfen August 29, 2024 07:48

merge dev

543812d

RafilxTenfen reviewed Aug 29, 2024

gitferry reviewed Aug 30, 2024

cleaner code

d90117d

gitferry reviewed Sep 2, 2024

code and readibility cleanup

c773f7d

Lazar955 requested a review from gitferry September 2, 2024 11:11

gitferry reviewed Sep 2, 2024

more cleanup

a6a0da0

gitferry approved these changes Sep 2, 2024

KonradStaniec approved these changes Sep 2, 2024

Lazar955 merged commit 489293d into dev Sep 2, 2024
8 checks passed

Lazar955 deleted the lazar/handle_tx2_fail branch September 2, 2024 13:29

Lazar955 mentioned this pull request Sep 13, 2024

feat(submitter): stateful submitter #43

Merged
