Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add linera sync-validator CLI command #3156

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jvff
Copy link
Contributor

@jvff jvff commented Jan 21, 2025

Motivation

Some testnet validators started lagging, and that made clients stop sending certificates to them, which made them lag even more.

Proposal

As a quick-fix, a linera sync-validator CLI command was added that pushes local certificates that are missing from a specified validator.

Test Plan

A simple end-to-end test was written to ensure that it behaves as expected.

Release Plan

  • Nothing to do / These changes follow the usual release cycle, because it adds a new feature without breaking any backward compatibility.

Links

jvff added 7 commits January 21, 2025 02:58
Upload missing certificates to the validator.
The command should help to update lagging validators.
Prepare to test the `sync-validator` CLI command.
Stop the validator just by specifying its index.
Make inline usage of them shorter.
A helper method to obtain a `linera_rpc::Client` to interface directly
with a specific validator.
Ensure that it updates a lagging validator.
@jvff jvff added the enhancement New feature or request label Jan 21, 2025
@jvff jvff added this to the Testnet #2 milestone Jan 21, 2025
@jvff jvff self-assigned this Jan 21, 2025
let Some(missing_certificate_count) = local_chain_state
.next_block_height
.0
.checked_sub(validator_chain_state.next_block_height.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also return if it's zero?

Suggested change
.checked_sub(validator_chain_state.next_block_height.0)
.checked_sub(validator_chain_state.next_block_height.0)
.filter(|count| count > 0)


let first_chain_id = chains[0];
let first_chain = context.make_chain_client(first_chain_id)?;
let committee = first_chain.local_committee().await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Alternatively, we could try the admin chain's committee. Or specify an address rather than a name on the command line? Not sure what's best.)

/// Synchronizes a validator with the local state of chains.
SyncValidator {
/// The public key of the validator to synchronize.
name: ValidatorName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to make this optional? If not specified, just try to update all validators?

Copy link
Contributor

@MathieuDutSik MathieuDutSik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for implementing this needed feature.

let (mut net, client) = config.instantiate().await?;

// Stop a validator to force it to lag behind the others
net.stop_validator(0).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe introduce a let LAGGING_VALIDATOR = 0 to make the code a little clearer.

Comment on lines +650 to +654
let schema = match self.network.internal {
Network::Grpc | Network::Grpcs => "grpc",
Network::Tcp => "tcp",
Network::Udp => "udp",
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should not be here, but instead in the cli_wrapper/mod.rs implementation of Network.
We have plenty of such functions, but I think those functions should be in the same place.

Comment on lines +634 to +636
if let Some(mut validator) = self.running_validators.remove(&validator) {
validator.terminate().await?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer an error to be reported here.

Comment on lines +3405 to +3409
let certificate = self
.client
.storage
.read_certificate(certificate_hash)
.await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The certificates can probably be downloaded in one single operation.
Though I agree that the handle_confirmed certificate is likely more time limiting.

Comment on lines +852 to +859
// #[cfg_attr(feature = "scylladb", test_case(LocalNetConfig::new_test(Database::ScyllaDb, Network::Udp) ; "scylladb_udp"))]
#[cfg_attr(feature = "scylladb", test_case(LocalNetConfig::new_test(Database::ScyllaDb, Network::Grpc) ; "scylladb_grpc"))]
#[cfg_attr(feature = "storage-service", test_case(LocalNetConfig::new_test(Database::Service, Network::Grpc) ; "storage_service_grpc"))]
// #[cfg_attr(feature = "storage-service", test_case(LocalNetConfig::new_test(Database::Service, Network::Tcp) ; "storage_service_tcp"))]
#[cfg_attr(feature = "dynamodb", test_case(LocalNetConfig::new_test(Database::DynamoDb, Network::Grpc) ; "aws_grpc"))]
// #[cfg_attr(feature = "scylladb", test_case(LocalNetConfig::new_test(Database::ScyllaDb, Network::Tcp) ; "scylladb_tcp"))]
// #[cfg_attr(feature = "dynamodb", test_case(LocalNetConfig::new_test(Database::DynamoDb, Network::Tcp) ; "aws_tcp"))]
// #[cfg_attr(feature = "dynamodb", test_case(LocalNetConfig::new_test(Database::DynamoDb, Network::Udp) ; "aws_udp"))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the outcommented stuff that seems to come from copy paste.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

3 participants