Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement SSV Handshake Protocol #125

Open
wants to merge 41 commits into
base: unstable
Choose a base branch
from

Conversation

diegomrsantos
Copy link

@diegomrsantos diegomrsantos commented Feb 4, 2025

Issue Addressed

Implements #116

Proposed Changes

  • Dependencies:

    • Updated dependencies in Cargo.toml including async-trait, quick-protobuf, serde_json, and thiserror.
  • Handshake Protocol:

    • Added a new module handshake with multiple new files including README.md, codec.rs, envelope.proto, envelope.rs, mod.rs, and node_info.rs.
  • Network Behaviour:

    • Enhanced behaviour.rs to include handshake::Behaviour and added a new field handshake to AnchorBehaviour.
  • Network Implementation:

    • Enhanced network.rs:
    • Updated to use the Handshake Behaviour
    • Created DefaultNodeInfoProvider to provide NodeInfo.

Additional Info

There's a spec in README.md file

false,
);

let domain = format!("0x{}", hex::encode(vec![0x0, 0x0, 0x5, 0x2]));
Copy link
Author

@diegomrsantos diegomrsantos Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be configurable per network - so it should be integrated into the ssv_network_config stuff

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@jxs jxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Diego, thanks for starting this.
Left some comments

task_executor = { workspace = true }
tokio = { workspace = true }
tracing = { workspace = true }
types = { workspace = true }
version = { workspace = true }
serde_json = "1.0.137"
thiserror = "1.0.69"
prost = "0.13.4"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use quick-protobuf in libp2p which is also available in the dependency chain, so you shouldn't need to depend on prost which is also helpful as you don't need to specifiy prost-build in the build dependencies

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to quick-protobuf. But I found prost much easier to use and straightforward. It also seems to be more actively maintained.

anchor/network/src/behaviour.rs Outdated Show resolved Hide resolved
anchor/network/src/handshake/behaviour.rs Outdated Show resolved Hide resolved
anchor/network/src/handshake/behaviour.rs Outdated Show resolved Hide resolved
anchor/network/src/handshake/behaviour.rs Outdated Show resolved Hide resolved
}

/// Network behaviour handling the handshake protocol.
pub struct HandshakeBehaviour {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skimming through the NetworkBehaviour implementation, and without knowing much about the requirements I wonder if we can suffice our needs by just implementing the Codec trait and not needing to have request-response as a sub behaviour

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would this approach look like?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the request-response Behaviour and then doing the validation envelope signing validation logic in the Codec implementation.
Does that make sense Diego?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, having the validation in the codec is the way to go. About only one behavior, I'll need to think more about it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice to keep the behavior and centralize everything related to the handshake there, exposing only the events to the main behavior.

anchor/network/src/handshake/error.rs Outdated Show resolved Hide resolved
use thiserror::Error;

#[derive(Error, Debug)]
pub enum HandshakeError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need thiserror::Error for our Errors?
We are probably fine having an Enum and impl'ing Display and Error for it like request_response::OutboundFailure does it for example.
Nonetheless since it's probably already in the dependency chain feel free to ignore this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be revisited, but It'll probably be removed.

Copy link
Author

@diegomrsantos diegomrsantos Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another advantage is the #[from] see dc79ba7

anchor/network/src/handshake/codec.rs Outdated Show resolved Hide resolved
anchor/network/src/handshake/record/record.rs Outdated Show resolved Hide resolved
@diegomrsantos diegomrsantos force-pushed the handshake branch 5 times, most recently from fdd1b40 to 949101a Compare February 7, 2025 11:35
@diegomrsantos diegomrsantos marked this pull request as ready for review February 10, 2025 13:12
@diegomrsantos diegomrsantos requested a review from jxs February 10, 2025 13:12
@diegomrsantos diegomrsantos changed the title Handshake Implement SSV Handshake Protocol Feb 10, 2025
ssz_types = "0.10"
subnet_tracker = { workspace = true }
task_executor = { workspace = true }
thiserror = "1.0.69"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use workspace dependencies

Copy link
Author

@diegomrsantos diegomrsantos Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not defined in the root. Should it be defined there and inherited even if it's used only here for now?

Copy link
Member

@dknopik dknopik Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, good question. We should align with the others on that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rule of thumb I use: if it's only used by a subcrate import it there, when it becomes used by two or more, move it to workspace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been taking the approach to keep it local if its unlikely to be used in other crates.

If its a general crate, its generally best to put in the workspace dependencies, in case for some reason some other crate decides to use it.

For this specific one, it seems fairly general and conceivable that another crate MIGHT use it in the future, so might be worth chucking in workspace deps.

anchor/network/src/behaviour.rs Outdated Show resolved Hide resolved
anchor/network/src/handshake/node_info.rs Outdated Show resolved Hide resolved
false,
);

let domain = format!("0x{}", hex::encode(vec![0x0, 0x0, 0x5, 0x2]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be configurable per network - so it should be integrated into the ssv_network_config stuff

Comment on lines 368 to 386
pub struct DefaultNodeInfoProvider {
node_info: Arc<Mutex<NodeInfo>>,
}

impl DefaultNodeInfoProvider {
pub fn new(node_info: NodeInfo) -> Self {
Self {
node_info: Arc::new(Mutex::new(node_info)),
}
}
}

impl NodeInfoProvider for DefaultNodeInfoProvider {
fn get_node_info(&self) -> NodeInfo {
// In a real implementation, consider handling lock poisoning.
self.node_info.lock().unwrap().clone()
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a todo for another PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was going to ask similar, do we need the NodeInfoProvider trait for this? can't we just use NodeInfo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to do a read/write mutex if we are doing lots of reads. From this snippet of code, it looks like we can't deadlock.

Copy link
Member

@jxs jxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Diego, thanks for driving this! Left some comments

ssz_types = "0.10"
subnet_tracker = { workspace = true }
task_executor = { workspace = true }
thiserror = "1.0.69"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rule of thumb I use: if it's only used by a subcrate import it there, when it becomes used by two or more, move it to workspace

@@ -0,0 +1,272 @@
# SSV NodeInfo Handshake Protocol Specification
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is awesome work Diego, thanks for the time putting in elaborating this document ❤️
This feels a spec, what do you think about upstreaming it to the spec repo so that it lives there?
So that we proper PR this document and it becomes canon

}
}

/// A `Codec` that reads/writes an **`Envelope`**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// A `Codec` that reads/writes an **`Envelope`**
/// A `Codec` that reads/writes an **`Envelope`**.

pub struct Codec;

#[async_trait]
impl RequestResponseCodec for Codec {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
impl RequestResponseCodec for Codec {
impl request_response::Codec for Codec {


const MAXIMUM_SIZE: u64 = 1024;

impl From<envelope::Error> for io::Error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this conversion?

where
T: AsyncRead + Unpin + Send,
{
debug!("reading handsake request");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, we probably can lower these ones to trace! and leave a single debug! logging the request received

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. My idea for now is to leave everything as debug as it makes debugging while developing easier and when the system is more stable revisit the logs. What do you think?

@@ -0,0 +1,8 @@
syntax = "proto3";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we create a dir generated that contains the protobuf file and the generated rust code?
So that it follows the upstream protocol conventions?

Comment on lines 368 to 386
pub struct DefaultNodeInfoProvider {
node_info: Arc<Mutex<NodeInfo>>,
}

impl DefaultNodeInfoProvider {
pub fn new(node_info: NodeInfo) -> Self {
Self {
node_info: Arc::new(Mutex::new(node_info)),
}
}
}

impl NodeInfoProvider for DefaultNodeInfoProvider {
fn get_node_info(&self) -> NodeInfo {
// In a real implementation, consider handling lock poisoning.
self.node_info.lock().unwrap().clone()
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was going to ask similar, do we need the NodeInfoProvider trait for this? can't we just use NodeInfo?

.on_connection_handler_event(peer_id, connection_id, event);
}

fn poll(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementing poll complicates things where I feel it's not needed.
following what I wrote on to what I wrote in #125 (comment), for example, by directly calling request_response::poll we need to bubble up all it's ToSwarm events, therefore we cannot have the match wildcard arm with empty curlies as it's being done.

Here we are only listening on request_response::Message::Request and request_response::Message::Response, then we bubble up RequestResponseEvent::OutboundFailure and RequestResponseEvent::InboundFailure re written as Error.

We can still isolate the validation and Codec logic here both if we:

  • bubble up request-response::Message and call handshake::handle_handshake_request and handshake::handle_handshake_response in the network::handle_handshake_event that you already wrote

or

  • add the Keypair and NodeInfo to the Codec and have the validation logic there

) {
// Handle incoming request: send response then verify
let response = self.sealed_node_record();
let _ = self.behaviour.send_response(channel, response.clone()); // Any error here is handled by the InboundFailure handler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we responding here instead of on Codec::write_response?

Codec::write_response is called right after we receive a request, see here.

write_response accepts the response that you are sending here via ResponseChannel piggybacking on the suggestion above wrt to poll we can move the sealed_node_record function to the Codec wdyt?

@@ -16,7 +16,7 @@ pub const DEFAULT_EXECUTION_NODE_WS: &str = "ws://localhost:8545/";
/// The default Data directory, relative to the users home directory
pub const DEFAULT_ROOT_DIR: &str = ".anchor";
/// Default network, used to partition the data storage
pub const DEFAULT_HARDCODED_NETWORK: &str = "mainnet";
pub const DEFAULT_HARDCODED_NETWORK: &str = "holesky";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not push this to the main branch

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the default network be mainnet while in development?

SsvNetworkConfig::load(testnet_dir.clone())
} else {
println!("Loading default network {:?}", cli_args.network);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tracing.

Also, the network is not necessarily the default network, but can also be the network passed via --network.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean to commit it, it was a mistake. I just wanted to see what the default network was. Do we need a log here?

@@ -32,11 +33,12 @@ pub struct SsvNetworkConfig {
pub ssv_boot_nodes: Option<Vec<Enr<CombinedKey>>>,
pub ssv_contract: Address,
pub ssv_contract_block: u64,
pub ssv_domain_type: String
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it would be nicer to convert to our target type here already: a [u8; 4]. Then we catch config mistakes early. :)

Copy link
Author

@diegomrsantos diegomrsantos Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more on the target type?

@@ -76,6 +81,7 @@ impl SsvNetworkConfig {
ssv_boot_nodes,
ssv_contract: read(&base_dir.join("ssv_contract_address.txt"))?,
ssv_contract_block: read(&base_dir.join("ssv_contract_block.txt"))?,
ssv_domain_type: read(&base_dir.join("domain_type.txt"))?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ssv_domain_type.txt

@@ -66,6 +66,8 @@ pub struct Config {

/// Target number of connected peers.
pub target_peers: usize,

pub domain_type: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to pass the SsvNetworkConfig into the network crate instead of duplicating the field here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants