fix(l1): peers tcp connections #1885
Conversation
This PR fixes peer TCP connections and refactors some methods that were locking the table too much.
Will you make a PR with the tooling? I think it would be a good thing to have in the repo.
@@ -693,7 +694,7 @@ pub(super) mod tests {
    broadcast,
};

-let discv4 = Discv4Server::try_new(ctx).await?;
+let discv4 = Discv4Server::try_new(ctx.clone()).await?;
Is `ctx` safe to clone? How does `storage` behave?
Yes, it is safe! You can see the `Store` struct here; cloning just increments the reference count.
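A minimal sketch of why that clone is cheap, assuming the struct only wraps an `Arc` around the storage backend (the field name and backing type here are placeholders, not the actual ethrex `Store`):

```rust
use std::sync::Arc;

// Assumed shape: a handle that only holds an Arc to the underlying storage.
#[derive(Clone, Default)]
struct Store {
    inner: Arc<Vec<u8>>, // stand-in for the real engine
}

fn main() {
    let store = Store::default();
    let copy = store.clone(); // bumps the reference count, no data is copied
    assert_eq!(Arc::strong_count(&store.inner), 2);
    assert!(Arc::ptr_eq(&store.inner, &copy.inner)); // both handles share the same storage
}
```

Cloning `ctx` therefore shares the same underlying state rather than duplicating it.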
/// Inserts a node into the table, even if the bucket is filled.
/// # Returns
/// A tuple containing:
/// 1. PeerData: none if the peer was already in the table or as a potential replacement
I would expect the opposite here, that is, if it was in the table I may want to check the old value. How is it currently used?
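For reference, this mirrors the standard-library convention: `HashMap::insert` returns the previous value when the key was already present, so the caller can inspect what was replaced (illustrative only, not the `KademliaTable` API):

```rust
use std::collections::HashMap;

fn main() {
    let mut peers: HashMap<u64, &str> = HashMap::new();
    assert_eq!(peers.insert(1, "new-peer"), None);            // nothing was replaced
    assert_eq!(peers.insert(1, "updated"), Some("new-peer")); // old value handed back
}
```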
I have a few high level questions:
@@ -122,10 +122,6 @@ impl Decoder for RLPxCodec {
let mut temp_ingress_aes = self.ingress_aes.clone();
temp_ingress_aes.apply_keystream(header_text);

-// header-data = [capability-id, context-id]
-// Both are unused, and always zero
-assert_eq!(&header_text[3..6], &(0_u8, 0_u8).encode_to_vec());
Was this check a bug?
This check is OK, and it aligns with the spec (see here), but:
- The fields are not used anywhere in the protocol right now.
- Via the logs, I noticed that some clients may send a different value, yet the connection would still work correctly.
- Additionally, I found that the other execution clients (at least geth, besu, and nethermind) don't enforce this header to be RLP-encoded zero, so I preferred to simply remove it.

Most of it can be read in the spec here.
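As a rough sketch of what the decoder still needs from the decrypted 16-byte header once the assertion is gone (the function name and exact handling are assumptions, not the ethrex code): only the 3-byte frame size is read, and bytes 3..6, which the spec defines as `rlp([capability-id, context-id])`, are simply ignored.

```rust
// Sketch: parse the frame size and ignore header-data instead of asserting on it.
fn parse_frame_size(header_text: &[u8; 16]) -> usize {
    // frame-size: big-endian 24-bit integer in the first three bytes
    ((header_text[0] as usize) << 16)
        | ((header_text[1] as usize) << 8)
        | (header_text[2] as usize)
    // header_text[3..6] held rlp([capability-id, context-id]); some clients do not
    // send the RLP-encoded zeros the old assert expected, so it is no longer checked.
}

fn main() {
    let mut header_text = [0u8; 16];
    header_text[2] = 42; // pretend the peer announced a 42-byte frame
    assert_eq!(parse_frame_size(&header_text), 42);
}
```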
Motivation
We observed that our peer count was significantly lower than that of other execution clients.
Description
This PR refines the exploratory work done in #1875. While debugging, we developed tools to track individual peer traces (see here if interested). Through this, we discovered that although we were exchanging messages with many peers, they were not appearing in the stats. This indicated that they were not being inserted into the `KademliaTable`. The root cause was that the bucket was already full, which happens quickly since each bucket only holds 10 entries. Below is a list of changes that improved the peer situation:
- The check on `header-data` was failing. While the spec dictates that it should be RLP-encoded zero, some clients would not send the encoded values. Additionally, we found that no other clients seem to enforce verification on this field.

Other Findings
- Some peers were disconnecting from us (citing `Too many peers` as the reason).

Closes #1811
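For context on the bucket-full root cause described above, here is a minimal sketch of a fixed-capacity bucket (the names and the replacement handling are assumptions, not the actual `KademliaTable`):

```rust
const BUCKET_SIZE: usize = 10;

struct Bucket<T> {
    entries: Vec<T>,
}

impl<T> Bucket<T> {
    // Hands the peer back to the caller when the bucket is already full,
    // so it can be dropped or kept as a replacement candidate.
    fn try_insert(&mut self, peer: T) -> Result<(), T> {
        if self.entries.len() < BUCKET_SIZE {
            self.entries.push(peer);
            Ok(())
        } else {
            Err(peer)
        }
    }
}

fn main() {
    let mut bucket = Bucket { entries: Vec::new() };
    for id in 0..12u32 {
        let _ = bucket.try_insert(id);
    }
    assert_eq!(bucket.entries.len(), BUCKET_SIZE); // peers past the 10th were rejected
}
```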