Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

340 add handling for non recoverable stellar submission errors #391

Merged

Conversation

adelarja
Copy link
Contributor

@adelarja adelarja commented Sep 1, 2023

The aim of this PR is to add error handling logic for failed submissions.

Closes #340

General overview of the changes:

  1. A few method renames;
  2. Introduce new file: resubmissions.rs, dedicated for resubmission and handling of errors. Bulk of the code is here
  3. Introduced new file: mock.rs, contains the function helpers for the tests cases
  4. Also updated the resubmit_transactions_from_cache(), with a parameter that helps handle failed tasks

How to begin the review:

  • clients/wallet/src/resubmissions.rs
    • type FailedTasks = essential for removing transaction envelopes in cache for failed resubmissions
      • ⭐ instead of complicating things with channels, I just immediately call the remove_tx_envelope_from_cache() if the error is not viable for resubmission
    • fn resubmit_transactions_from_cache() = the only public method. Every 30 minutes it will read from cache and resubmit existing ones
    • fn _resubmit_transactions_from_cache()
      1. Collect all envelopes from cache.
      2. Loops through all envelopes and resubmit them
      3. ^ collects errors from the loop and tries to handle them
      4. if auto check is set to true, check fn auto_check_running_tasks()
    • fn handle_errors()
      • loops through all the errors and handle them one by one
      • ⭐ for errors that cannot be resubmitted again, its corresponding envelope is removed from cache
    • fn handle_error() = resubmits transactions with the ff errors:
      • tx_bad_seq
      • tx_internal_error
      • CacheErrorKind::SequenceNumberAlreadyUsed
  • fn handle_tx_internal_error() = resubmits a transaction with the error tx_internal_error
  • fn handle_tx_bad_seq_error_with_xdr() = resubmits a transaction with the error tx_bad_seq
  • fn bump_sequence_number_and_submit() = bump the sequence number of a transaction and resubmits it
  • fn is_transaction_already_submitted() = a costly checking if a transaction already exists. This is currently limited to just 10 pages
  • fn auto_check_running_tasks() = auto checks for failed resubmissions and forcefully remove transactions from the cache

@adelarja adelarja linked an issue Sep 1, 2023 that may be closed by this pull request
@adelarja adelarja requested review from ebma and b-yap September 1, 2023 15:06
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
Copy link
Member

@ebma ebma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. We'd still need to add some very descriptive logging of course but the core logic seems on track

clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
@b-yap b-yap force-pushed the 340-add-handling-for-non-recoverable-stellar-submission-errors branch from 629f78a to 38f5f5e Compare October 25, 2023 10:54
@b-yap b-yap force-pushed the 340-add-handling-for-non-recoverable-stellar-submission-errors branch from ec36c10 to e025bdc Compare October 27, 2023 14:38
@b-yap b-yap marked this pull request as ready for review November 2, 2023 11:12
Copy link
Member

@ebma ebma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some minor comments but I really like the new structure overall and I think it should do the trick. Good job @b-yap 👍

clients/wallet/src/stellar_wallet.rs Outdated Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
Comment on lines 139 to 145
// pause process for 20 minutes
#[cfg(not(test))]
pause_process_in_secs(1200).await;

// for testing purpose, set to 5 seconds
#[cfg(test)]
pause_process_in_secs(5).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to pause it? We are already pausing in the loop of the outer function, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed


// Remove original transaction.
// The same envelope will be saved again using a different sequence number
self.remove_tx_envelope_from_cache(tx.seq_num);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to keep remove this here and keep it in line 270. In the (very rare/unlikely) case that our vault client shuts down when executing the logic between line 260 and 264, the actual resubmission might not have been happened (because the client dies) but the sequence number and envelope got removed from cache, so upon restart the vault client would not try that one again, which is bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually, I'm wrong. I missed that we might return from this function in line 264 already. Let's move this line into the if statement of lines 263 to 265.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

Comment on lines 262 to 222
// Check if we already submitted this transaction
if !self.is_transaction_already_submitted(&tx).await {
return self.bump_sequence_number_and_submit(tx).await
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Check if we already submitted this transaction
if !self.is_transaction_already_submitted(&tx).await {
return self.bump_sequence_number_and_submit(tx).await
}
// Check if we already submitted this transaction
if !self.is_transaction_already_submitted(&tx).await {
self.bump_sequence_number_and_submit(tx).await
self.remove_tx_envelope_from_cache(tx.seq_num);
return;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i called remove_tx_envelope_from_cache() first, before the bump_sequence_number_and_submit()

clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
@b-yap b-yap force-pushed the 340-add-handling-for-non-recoverable-stellar-submission-errors branch from c0933ca to b6f4695 Compare November 6, 2023 08:56
Copy link
Member

@ebma ebma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good overall, I just added some minor suggestions


#[cfg_attr(test, mockable)]
impl StellarWallet {
pub async fn resubmit_transactions_from_cache(&self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename this function to indicate that this will actually start scheduling a periodic resubmission? Maybe we should also pass in the resubmission interval into this function call instead. Something like: start_periodic_resubmission_of_transactions_from_cache(&self, interval_in_seconds: u64)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed, and added the interval param

clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Outdated Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Show resolved Hide resolved
clients/wallet/src/resubmissions.rs Show resolved Hide resolved
Copy link
Member

@ebma ebma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go now. At least I don't see any other immediate problems with this implementation and the error handling seems to be the way we need it. Good job @b-yap and ready for merge (once the CI errors are gone).

@b-yap b-yap merged commit 82f9e43 into main Nov 14, 2023
2 checks passed
@b-yap b-yap deleted the 340-add-handling-for-non-recoverable-stellar-submission-errors branch November 14, 2023 05:42
@TorstenStueber TorstenStueber changed the title [WIP] 340 add handling for non recoverable stellar submission errors 340 add handling for non recoverable stellar submission errors Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add handling for non-recoverable Stellar submission errors
3 participants