Skip to content

Trust Quorum: Prepare phase retries and testing #8000

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 38 commits into from
Apr 22, 2025
Merged

Conversation

andrewjstone
Copy link
Contributor

This PR implements the ability for a trust quorum node to handle prepare acknowledgements and send retries when time has advanced via Node::tick calls.

The vast majority of the code is test code. coordinator.rs is the start of a property based test to test the behavior of a node that is coordinating reconfigurations. The coordinating node itself is the system under test (SUT), and there is an abstract model that keeps enough information to allow asserting properties about the behavior of the SUT. A TestInput is generated which contains an initial configuration for the coordinating node and a generated list of abstract Actions to be executed by the test. Each action has a corresponding method on the TestState for handling it. These methods update the model state, SUT state, and then verify any properties they can.

The Action enum is going to grow in the next few PRs such that reconfigurations beyond the initial configuration will run and messages can be dropped. These will correspond with an expansion of the Node implementation to allow recovering key shares from past committed configuration and the ability to handle Commit and Cancel API calls which ultimately are triggered from Nexus, as described in RFD 238.

andrewjstone and others added 30 commits March 21, 2025 23:18
This is the initial commit of the "real" trust quorum code as described
in RFD 238. This commit provides some of the foundation needed to
implement the trust quorum reconfiguration protocol.

The protocol itself will be implemented in a `Node` type in a
[sans-io](https://sans-io.readthedocs.io/) style with property based
tests for the full protocol, similar to LRTQ. Async code will utilize
the protocol and forward messages over sprockets channels, and handle
requests from Nexus.

This initial code was split out to keep the PR small, although
it has been used in some preliminary protocol code already that will
come in a follow up PR.
This change removes the requirement to carry encrypted rack shares in
every `Prepare` message. gfss can generate them given enough encrypted
shares because its interpolate method is more generic than other
implementations and doesn't only calculate the secret.

Also fix up based on some other review comments. This removes the
`Error` type altogether as it's no longer used.
This code builds upon #7859, which itself builds upon #7891. Those
need to be merged in order first.

A `Node` is the sans-io entity driving the trust quorum protocol. This
PR starts the work of creating the `Node` type and coordinating an
initial configuration.

There's a property based test that generates an initial `ReconfigureMsg`
that is then used to call `Node::coordinate_reconfiguration`, which will
internally setup a `CoordinatorState` and create a bunch of messages
to be sent. We verify that a new `PersistentState` is returned and that
messages are sent.

This needs a bit more documentation and testing, so it's still WIP.
This PR implements the ability for a trust quorum node to handle
prepare acknowledgements and send retries when time has advanced
via `Node::tick` calls.

The vast majority of the code is test code. `coordinator.rs` is the
start of a property based test to test the behavior of a node that
is coordinating reconfigurations. The coordinating node itself is the
system under test (SUT), and there is an abstract model that keeps
enough information to allow asserting properties about the behavior
of the SUT. A `TestInput` is generated which contains an initial
configuration for the coordinating node and a generated list of abstract
`Action`s to be executed by the test. Each action has a corresponding
method on the `TestState` for handling it. These methods update the
model state, SUT state, and then verify any properties they can.

The `Action` enum is going to grow in the next few PRs such that
reconfigurations beyond the initial configuration will run and messages
can be dropped. These will correspond with an expansion of the `Node`
implementation to allow recovering key shares from past committed
configuration and the ability to handle `Commit` and `Cancel` API calls
which ultimately are triggered from Nexus, as described in RFD 238.
@andrewjstone andrewjstone requested a review from sunshowers April 17, 2025 20:49
@@ -18,9 +18,6 @@ use std::collections::BTreeMap;
/// All the persistent state for this protocol
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct PersistentState {
// Ledger generation
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason that the trust quorum protocol code should know how the persistent state is stored.

Copy link
Contributor

@sunshowers sunshowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some minor comments.

/// When the reconfiguration started
#[allow(unused)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[allow(unused)]
#[expect(unused)]

Presumably this is useful for debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even know about expect until your last PR. Thanks!

@@ -49,6 +52,7 @@ impl CoordinatorState {
/// Return the newly constructed `CoordinatorState` along with this node's
/// `PrepareMsg` so that it can be persisted.
pub fn new_uninitialized(
log: Logger,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this add a component context via log.new indicating that it is the coordinator_state component?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 50ebf64

@@ -86,6 +90,7 @@ impl CoordinatorState {

/// A reconfiguration from one group to another
pub fn new_reconfiguration(
log: Logger,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same q here -- add component context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 50ebf64

@@ -101,18 +106,21 @@ impl CoordinatorState {
new_shares,
};

Ok(CoordinatorState::new(now, msg, config, op))
Ok(CoordinatorState::new(log, now, msg, config, op))
}

// Intentionallly private!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Intentionallly private!
// Intentionally private!

I guess this is because the public constructors do error checking? If so could this comment briefly mention that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 50ebf64

@@ -137,39 +145,76 @@ impl CoordinatorState {
// This method is "in progress" - allow unused parameters for now
#[allow(unused)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[allow(unused)]
#[expect(unused)]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 50ebf64

#[weight(50)]
Tick(
#[strategy(
(RETRY_TIMEOUT_MS/4..RETRY_TIMEOUT_MS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go past RETRY_TIMEOUT_MS to something like 5/4 times that value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 50ebf64

Tick(
#[strategy(
(RETRY_TIMEOUT_MS/4..RETRY_TIMEOUT_MS)
.prop_map(|ms| Duration::from_millis(ms))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be

Suggested change
.prop_map(|ms| Duration::from_millis(ms))
.prop_map(Duration::from_millis)

I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed in 50ebf64

Comment on lines +555 to +557
/// still be duplicated due to the shift implementation used. Therefore we
/// instead just choose from a constrained set of usize values that we can
/// use directly as indexes into our fixed size structure for all tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Your idea to choose from a fixed size universe was super helpful!

Comment on lines +563 to +565
/// very efficient as opposed to a `prop_flat_map`. When we calculate the
/// threshold from the index we use max(2, Index), since the minimum
/// threshold is always 2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does bias the index slightly away from being uniformly at random -- I guess it isn't a huge deal but another way to do it might be 2 + index.index(...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we do it that way, we could end up going over the max index. I'm not too worried about this distribution :D

Comment on lines +635 to +636
let threshold =
Threshold(usize::max(2, config.threshold.index(members.len())) as u8);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I thought index.index panicked if the value passed in was 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but we never generate a set of members smaller than 3 (MIN_CLUSTER_SIZE).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, we just don't want the index to be 0 or 1. The length of the set is never smaller than 3.

Base automatically changed from ajs/realtq-2 to main April 21, 2025 22:28
@andrewjstone andrewjstone enabled auto-merge (squash) April 22, 2025 00:21
@andrewjstone andrewjstone merged commit 0bebe3a into main Apr 22, 2025
16 checks passed
@andrewjstone andrewjstone deleted the ajs/realtq-3 branch April 22, 2025 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants