p2p: smart discovery (dead/solo remedy subnets) #1949

iurii-ssv · 2024-12-18T11:23:57Z

Replaces #1917 additionally improving a bunch of p2p-handling code, with the main contributing factor being "selective discovery"

Before merging:

remove/resolve TODOs used for testing
(when testing) make sure this PR doesn't introduce memory-leaks or super-high CPU consumption type of issues
do we want to set MinConnectivitySubnets to higher-than-0 value (say 3) ? There isn't really a good way to tell how it affects network connectivity (but it's almost certainly gonna improve it, not worsen it); we removed this feature out of this PR for now (for simplicity)

network/discovery/dv5_service.go

network/peers/conn_manager.go

network/discovery/dv5_service.go

network/p2p/p2p_setup.go

iurii-ssv · 2024-12-20T13:19:39Z

Another thing I've realized is happening - because we only trim once we've reached MaxPeers limit, at some point we might stop trimming completely - for example, we've got to max amount of the incoming connections we allow via 0.5 ratio (say, 30) + we've discovered some outgoing peers but not enough to reach MaxPeers limit,

in other words, we've discovered everything we could (and we keep discovering, just at a very slow rate - due to quite an aggressive filter there isn't many connections we are interested in) but that also means we stopped re-cycling(trimming) our incoming connections periodically because we don't trim incoming/outgoing connections separately

I've added a commit to address that - d230c4a - it's pretty "hacky/ad-hoc" way do the job - ideally (to simplify this) I think we probably want to have separate limits per incoming/outgoing connections and somewhat separate trimming logic, WDYT ?

network/discovery/dv5_service.go

network/p2p/p2p.go

network/discovery/dv5_service.go

codecov · 2025-01-09T14:09:37Z

Codecov Report

Attention: Patch coverage is 29.30801% with 521 lines in your changes missing coverage. Please review.

Project coverage is 47.2%. Comparing base (13fbc84) to head (8d21ce7).
Report is 1 commits behind head on stage.

Files with missing lines	Patch %	Lines
network/p2p/p2p.go	9.3%	304 Missing and 7 partials ⚠️
network/commons/subnets.go	54.6%	37 Missing and 7 partials ⚠️
network/p2p/p2p_pubsub.go	3.2%	30 Missing ⚠️
network/peers/connections/conn_gater.go	0.0%	19 Missing ⚠️
network/peers/connections/conn_handler.go	0.0%	19 Missing ⚠️
network/p2p/p2p_setup.go	72.1%	14 Missing and 3 partials ⚠️
network/peers/conn_manager.go	0.0%	16 Missing ⚠️
network/p2p/observability.go	6.2%	15 Missing ⚠️
network/discovery/dv5_filters.go	50.0%	9 Missing and 4 partials ⚠️
network/discovery/dv5_service.go	71.0%	8 Missing and 3 partials ⚠️
... and 10 more

Additional details and impacted files

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

beacon/goclient/goclient.go

network/discovery/dv5_bootnode.go

network/p2p/p2p.go

network/commons/subnets.go

network/p2p/p2p.go

Co-authored-by: rehs0y <[email protected]>

…ersPool

network/peers/conn_manager.go

operator/validator/controller.go

y0sher

small changes :)

network/p2p/p2p.go

zippoxer · 2025-02-27T16:26:24Z

network/p2p/p2p.go

+	// start go-routine to choose the best peer(s) from the pool of discovered peers to propose
+	// these as outbound connections, all discovered peers are kept in a pool so we can choose
+	// from that same pool across different runs of this interval-func
+	async.Interval(n.ctx, 15*time.Second, func() {


i believe its time to extract this 200 lines func into its own method 😅

@y0sher @iurii-ssv this is an improvement, but i still think this is just hard to read compared to what it should do

1: there are too many comments and some are way longer than they could be

comments should be added when it's hard to infer what's happening from the code (non-obvious things), or for visual separation of steps (and in the latter case it should be a name/title, not an explanation)

comments are really useful for both cases (and likely more), but when there are too many less useful comments, then the reader just avoids them entirely as they become noise unfortunately :/

(applies to the rest of the PR as well)

i believe its time to extract this 200 lines func into its own method 😅

I don't think "just splitting code up" does anything by itself, it should be a "logical unit" to be a separate meaningful function, do you have a specific suggestion what should be put in a separate func ?

Eg. main func is typically ~1000 lines long and its fine for it to be that long,

as for this function here - all it's doing is choosing/filtering discovered peers (it's 1 single task, there is nothing to split off I don't think)

comments should only be added where it's hard or impossible to infer what's happening from the code, or for visual separation of steps (and in the latter case it should be a name/title, not an explanation)

I try to leave only comments that have "useful" info in them, I think it's better to have more comments than almost no comments

but I agree it's hard to write concise and complete comments, do you have specific suggestions on what to remove/rephrase ?

2: i might be wrong, but the scoring mechanism looks more complex than it has to be

@y0sher @iurii-ssv curious to hear arguments why it should be more complex than what we had in the simulation?

I don't think "just splitting code up" does anything by itself, it should be a "logical unit" to be a separate meaningful function, do you have a specific suggestion what should be put in a separate func ?

i agree, i also prefer everything under the same purpose to be in the same place (and often get questioned for my big functions)

however in this case we're under the p2p.Start method which does many things, not one, so it'd make sense to extract the connection-making task to its own method, such as the tasks below it already are:

ssv/network/p2p/p2p.go

Lines 475 to 482 in 4895ae1

async.Interval(n.ctx, peersTrimmingInterval, n.peersTrimming(logger))

async.Interval(n.ctx, peersReportingInterval, recordPeerCount(n.ctx, logger, n.host))

async.Interval(n.ctx, peerIdentitiesReportingInterval, recordPeerIdentities(n.ctx, n.host, n.idx))

async.Interval(n.ctx, topicsReportingInterval, recordPeerCountPerTopic(n.ctx, logger, n.topicsCtrl, 2))

so we're not splitting this logic in any sense

wdyt?

but the scoring mechanism looks more complex than it has to be

The func closest to what we had in simulation is this one -

ssv/network/p2p/p2p.go

Line 845 in 4895ae1

func (n *p2pNetwork) score(peerID peer.ID, subnet int) float64 {

but it doesn't map in straightforward way onto this solution we've decided to go with (it only works with our own peers to decided if connected peer is good or not):

and so it's only used when trimming currently connected peers,

while scoreSubnetSum is the most straightforward implementation of the algo we want for discovery

none of the comments here are needed because they're obvious from reading the variable names

yeah, maybe it's a bit too much, removed redundant comments here - 81c8268

I agree splitting out of the Start function could make both the Start function and the scoring function easier to read.

Also on the comments part, I partially agree with @moshe-blox , I think it might be worth to "sum up" some comments so instead of comment every line we can "prepare the reader" to what the upcoming code will do, and trust the the variable names will talk for themselves. unless of course the code is a bit more complex so it needs further explaining.

however in this case we're under the p2p.Start method which does many things, not one, so it'd make sense to extract the connection-making task to its own method, such as the tasks below it already are:

ssv/network/p2p/p2p.go

Lines 475 to 482 in 4895ae1

async.Interval(n.ctx, peersTrimmingInterval, n.peersTrimming(logger))

async.Interval(n.ctx, peersReportingInterval, recordPeerCount(n.ctx, logger, n.host))

async.Interval(n.ctx, peerIdentitiesReportingInterval, recordPeerIdentities(n.ctx, n.host, n.idx))

async.Interval(n.ctx, topicsReportingInterval, recordPeerCountPerTopic(n.ctx, logger, n.topicsCtrl, 2))

so we're not splitting this logic in any sense

I wasn't thinking about Start func itself - it's a kind of refactoring I would usually prefer to do on top (later on in a separate PR), but if you think it simplifies (not complicates) review - maybe we can move discovery-related parts into separate func - let me see what I can do about that

update: split off discovery-related parts into separate method - 7b119d6 and also reviewed comments (I think whatever left is useful in case we need to get back to this code and understand it / update it in the future), happy to apply other suggestions if you have any

Co-authored-by: Zippo <[email protected]>

network/p2p/p2p.go

moshe-blox · 2025-02-27T19:33:16Z

network/p2p/p2p.go

+			const (
+				duoSubnetPriority  = 1
+				soloSubnetPriority = 3
+				deadSubnetPriority = 9
+
+				maxPossibleScore = commons.SubnetsCount * deadSubnetPriority
+			)
+
+			deadSubnetCnt := 0
+			soloSubnetCnt := 0
+			duoSubnetCnt := 0
+			for i := 0; i < commons.SubnetsCount; i++ {
+				if s[i] == 0 {
+					deadSubnetCnt++
+				}
+				if s[i] == 1 {
+					soloSubnetCnt++
+				}
+				if s[i] == 2 {
+					duoSubnetCnt++
+				}
+			}
+			return float64(maxPossibleScore - (deadSubnetPriority*deadSubnetCnt + soloSubnetPriority*soloSubnetCnt + duoSubnetPriority*duoSubnetCnt))
+		}
+


potentially we can change this func to accept suggested peer for score computation:

scoreSubnetPeers := func(current SubnetPeers, added SubnetPeers) (score float64) { const ( duoSubnetPriority = 1 soloSubnetPriority = 3 deadSubnetPriority = 9 maxPossibleScore = commons.SubnetsCount * deadSubnetPriority ) v := 0 for i := 0; i < commons.SubnetsCount; i++ { if added[i] == 1 { switch s[i] { case 0: v += deadSubnetPriority case 1: v += soloSubnetPriority case 2: v += duoSubnetPriority } } } return float64(maxPossibleScore - v) }

so only if the peer adds to a dead/solo subnet, it would receive more score

Yeah, what you suggest will make it a bit more readable, so I've changed the scoreSubnetPeers func to be a method of SubnetSum:

// scoreSubnetVector estimates the score of subnet-vector v (each byte in v is 0 or 1) based on // how many dead/solo/duo subnets it contributes to the current subnet-sum (higher score is better) func (s *SubnetSum) scoreSubnetVector(v commons.Subnets) (score float64) { const ( duoSubnetPriority = 1 soloSubnetPriority = 3 deadSubnetPriority = 9 ) result := 0.0 for i := 0; i < commons.SubnetsCount; i++ { if v[i] != 1 { continue } if s[i] == 0 { result += deadSubnetPriority } if s[i] == 1 { result += soloSubnetPriority } if s[i] == 2 { result += duoSubnetPriority } } return result }

and also refactored the way we work with SubnetSum a bit - 690ee3e

let me know if that's in line with what you had in mind :)

iurii-ssv mentioned this pull request Dec 18, 2024

p2p: dead/solo subnets remedy #1917

Closed

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch 2 times, most recently from 5dac0bc to a364c2c Compare December 18, 2024 13:31

iurii-ssv commented Dec 18, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

iurii-ssv commented Dec 18, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

iurii-ssv commented Dec 18, 2024

View reviewed changes

network/peers/conn_manager.go Outdated Show resolved Hide resolved

iurii-ssv commented Dec 18, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

iurii-ssv commented Dec 18, 2024

View reviewed changes

network/p2p/p2p_setup.go Outdated Show resolved Hide resolved

iurii-ssv mentioned this pull request Dec 18, 2024

p2p: dead/solo subnets investigation #1896

Closed

y0sher reviewed Dec 22, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

y0sher reviewed Dec 22, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

y0sher reviewed Dec 22, 2024

View reviewed changes

network/p2p/p2p.go Show resolved Hide resolved

moshe-blox reviewed Dec 24, 2024

View reviewed changes

network/discovery/dv5_service.go Outdated Show resolved Hide resolved

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch 2 times, most recently from 88a752c to 44665d0 Compare January 9, 2025 13:58

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch 4 times, most recently from 04f69b3 to 846688f Compare January 17, 2025 16:41

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch from 846688f to 13b33d0 Compare January 27, 2025 10:02

iurii-ssv changed the title ~~p2p: dead/solo subnets remedy (enhanced)~~ p2p: smart discovery (dead/solo remedy subnets) Jan 27, 2025

y0sher reviewed Jan 28, 2025

View reviewed changes

beacon/goclient/goclient.go Show resolved Hide resolved

y0sher reviewed Jan 28, 2025

View reviewed changes

network/discovery/dv5_bootnode.go Show resolved Hide resolved

y0sher reviewed Feb 19, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

y0sher reviewed Feb 19, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

iurii-ssv commented Feb 21, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

MatusKysel reviewed Feb 25, 2025

View reviewed changes

network/commons/subnets.go Outdated Show resolved Hide resolved

MatusKysel reviewed Feb 25, 2025

View reviewed changes

network/commons/subnets.go Show resolved Hide resolved

log min/max score in a way that makes more sense

8a5f703

iurii-ssv marked this pull request as draft February 26, 2025 13:31

y0sher reviewed Feb 26, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

y0sher reviewed Feb 26, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

iurii-ssv and others added 2 commits February 26, 2025 17:56

Update network/p2p/p2p.go

198f2c3

Co-authored-by: rehs0y <[email protected]>

clarify how peer retries reset to 0 through removal from DiscoveredPe…

ca56aa5

…ersPool

y0sher reviewed Feb 26, 2025

View reviewed changes

network/peers/conn_manager.go Outdated Show resolved Hide resolved

y0sher reviewed Feb 26, 2025

View reviewed changes

operator/validator/controller.go Outdated Show resolved Hide resolved

y0sher requested changes Feb 26, 2025

View reviewed changes

get rid of global variables

e4ae99e

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch from 6b34e3a to e4ae99e Compare February 26, 2025 18:36

fix unit-tests

5d3304e

y0sher reviewed Feb 27, 2025

View reviewed changes

network/p2p/p2p.go Show resolved Hide resolved

remove MinSubnets (filler subnets) as unused

8d67719

MatusKysel approved these changes Feb 27, 2025

View reviewed changes

initialize trimmedRecently for connManager

4895ae1

zippoxer reviewed Feb 27, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

zippoxer reviewed Feb 27, 2025

View reviewed changes

iurii-ssv and others added 4 commits February 27, 2025 19:15

remove redundant comments

81c8268

move discovery-related parts into separate method

7b119d6

Update network/p2p/p2p.go

8d21ce7

Co-authored-by: Zippo <[email protected]>

comments

eb0fc16

moshe-blox reviewed Feb 27, 2025

View reviewed changes

network/p2p/p2p.go Outdated Show resolved Hide resolved

iurii-ssv added 4 commits February 27, 2025 21:09

Clarify comments, fix compilation issues

a35538d

Simplify the way we find all connected peers of ours

bd189c1

minor clarification

7774a3e

minor clarification

64f968e

moshe-blox reviewed Feb 27, 2025

View reviewed changes

refactor SubnetSum and corresponding scoring function

690ee3e

iurii-ssv force-pushed the testing/deploy/deadsub-remedy branch from 808b302 to 690ee3e Compare February 28, 2025 08:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

p2p: smart discovery (dead/solo remedy subnets) #1949

p2p: smart discovery (dead/solo remedy subnets) #1949

iurii-ssv commented Dec 18, 2024 •

edited

Loading

iurii-ssv commented Dec 20, 2024

codecov bot commented Jan 9, 2025 •

edited

Loading

y0sher left a comment

zippoxer Feb 27, 2025

zippoxer Feb 27, 2025 •

edited

Loading

zippoxer Feb 27, 2025 •

edited by moshe-blox

Loading

iurii-ssv Feb 27, 2025 •

edited

Loading

moshe-blox Feb 27, 2025 •

edited

Loading

moshe-blox Feb 27, 2025 •

edited

Loading

iurii-ssv Feb 27, 2025 •

edited

Loading

y0sher Feb 27, 2025

y0sher Feb 27, 2025

iurii-ssv Feb 27, 2025 •

edited

Loading

moshe-blox Feb 27, 2025

iurii-ssv Feb 28, 2025


	async.Interval(n.ctx, peersTrimmingInterval, n.peersTrimming(logger))

	async.Interval(n.ctx, peersReportingInterval, recordPeerCount(n.ctx, logger, n.host))

	async.Interval(n.ctx, peerIdentitiesReportingInterval, recordPeerIdentities(n.ctx, n.host, n.idx))

	async.Interval(n.ctx, topicsReportingInterval, recordPeerCountPerTopic(n.ctx, logger, n.topicsCtrl, 2))

p2p: smart discovery (dead/solo remedy subnets) #1949

Are you sure you want to change the base?

p2p: smart discovery (dead/solo remedy subnets) #1949

Conversation

iurii-ssv commented Dec 18, 2024 • edited Loading

iurii-ssv commented Dec 20, 2024

codecov bot commented Jan 9, 2025 • edited Loading

Codecov Report

y0sher left a comment

Choose a reason for hiding this comment

zippoxer Feb 27, 2025

Choose a reason for hiding this comment

zippoxer Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

zippoxer Feb 27, 2025 • edited by moshe-blox Loading

Choose a reason for hiding this comment

iurii-ssv Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

moshe-blox Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

moshe-blox Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

iurii-ssv Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

y0sher Feb 27, 2025

Choose a reason for hiding this comment

y0sher Feb 27, 2025

Choose a reason for hiding this comment

iurii-ssv Feb 27, 2025 • edited Loading

Choose a reason for hiding this comment

moshe-blox Feb 27, 2025

Choose a reason for hiding this comment

iurii-ssv Feb 28, 2025

Choose a reason for hiding this comment

iurii-ssv commented Dec 18, 2024 •

edited

Loading

codecov bot commented Jan 9, 2025 •

edited

Loading

zippoxer Feb 27, 2025 •

edited

Loading

zippoxer Feb 27, 2025 •

edited by moshe-blox

Loading

iurii-ssv Feb 27, 2025 •

edited

Loading

moshe-blox Feb 27, 2025 •

edited

Loading

moshe-blox Feb 27, 2025 •

edited

Loading

iurii-ssv Feb 27, 2025 •

edited

Loading

iurii-ssv Feb 27, 2025 •

edited

Loading