
Network Backend: libp2p is faster than litep2p #7183

Open
AndreiEres opened this issue Jan 15, 2025 · 9 comments
Labels
T0-node (This PR/Issue is related to the topic “node”), T8-polkadot (This PR/Issue is related to/affects the Polkadot network), T12-benchmarks (This PR/Issue is related to benchmarking and weights)

Comments

@AndreiEres
Contributor

AndreiEres commented Jan 15, 2025

Recently, we began running benchmarks to compare the libp2p and litep2p implementations of NetworkBackend. We tested both implementations in two modes:

  • Serially: we send a new message or request only after the previous one has been received.
  • With backpressure: we send all messages at once and rely on the libraries' internal backpressure; requests do not have backpressure.

The second mode is intended to resemble real-world usage more closely.
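
A minimal sketch of the two modes, using tokio channels as a stand-in for the real notification sink (the actual polkadot-sdk benchmark drives libp2p/litep2p peers; the names below are illustrative, not the benchmark's API):

```rust
use tokio::sync::mpsc;

/// Serial mode: send the next message only after the previous one was acknowledged.
async fn run_serially(tx: mpsc::Sender<Vec<u8>>, mut ack: mpsc::Receiver<()>, msgs: Vec<Vec<u8>>) {
    for msg in msgs {
        tx.send(msg).await.unwrap();
        ack.recv().await.unwrap();
    }
}

/// Backpressure mode: submit everything up front; the bounded channel (standing in
/// for the library's internal buffers) only makes `send` wait when it is full.
async fn run_with_backpressure(tx: mpsc::Sender<Vec<u8>>, mut ack: mpsc::Receiver<()>, msgs: Vec<Vec<u8>>) {
    let n = msgs.len();
    for msg in msgs {
        tx.send(msg).await.unwrap();
    }
    // Drain the acknowledgements afterwards.
    for _ in 0..n {
        ack.recv().await.unwrap();
    }
}
```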

According to initial results from the notifications protocol, litep2p is faster in the serial mode, while libp2p performs better with backpressure. This means that either our benchmarks are incorrect, or gossiping could slow down after switching nodes to litep2p.

Benchmarks:

Charts:

[image] Time to process a message 64B, notifications protocol (lower is better)

@paritytech/networking

@AndreiEres added the T0-node, T12-benchmarks, and T8-polkadot labels on Jan 15, 2025
@sandreim
Contributor

@lexnv is this a blocker for litep2p deployment?

@sandreim
Contributor

@AndreiEres can we re-run one of these benchmarks (perhaps the 64KB and 256KB ones) and measure:

  • send/receive 1k, 5k, 10k messages in the backpressure scenario (the load observed live in production)
  • avg time per message

Kusama validators notifications/s chart:
[image]

@lexnv
Contributor

lexnv commented Jan 15, 2025

Gathering more data from our Kusama validators to see how this stacks up in production.

Libp2p Data

Data obtained from a single Kusama validator running libp2p:

[image]

I've asked DevOps to reboot this node with litep2p enabled; I will have the data shortly 🙏

Grafana Link: https://grafana.teleport.parity.io/goto/aKLWk8DNg?orgId=1

@AndreiEres
Contributor Author

@lexnv @dmitry-markin
On our real nodes, we have several notification protocols, such as block_announce, validation, and collation. Can a network worker with multiple notification services process more messages in parallel than a worker with only one protocol?

@AndreiEres
Contributor Author

@sandreim
I executed the benchmarks for the notifications protocol on our CI benchmark runners. We tested three different loads: 1k, 5k, and 10k messages. Here are the results.

Avg throughput per second (higher is better)

| Payload | Backend | 1k messages | 5k messages | 10k messages |
| --- | --- | --- | --- | --- |
| 64KB | libp2p | 22.5K/s | 22.7K/s | 22.5K/s |
| 64KB | litep2p | 21.3K/s | 20.2K/s | 20.6K/s |
| 256KB | libp2p | 2.8K/s | 2.8K/s | 2.8K/s |
| 256KB | litep2p | 2.4K/s | 2.5K/s | 2.6K/s |

Avg time per message (lower is better)

| Payload | Backend | 1k messages | 5k messages | 10k messages |
| --- | --- | --- | --- | --- |
| 64KB | libp2p | 0.044ms | 0.044ms | 0.044ms |
| 64KB | litep2p | 0.047ms | 0.049ms | 0.049ms |
| 256KB | libp2p | 0.360ms | 0.363ms | 0.356ms |
| 256KB | litep2p | 0.413ms | 0.402ms | 0.383ms |
Raw data from benchmarks
1k messages
notifications_protocol/libp2p/with_backpressure/64KB:    44421256 ns/iter (+/- 309026)
notifications_protocol/libp2p/with_backpressure/256KB:   359526580 ns/iter (+/- 2611825)
notifications_protocol/litep2p/with_backpressure/64KB:    46879477 ns/iter (+/- 327248)
notifications_protocol/litep2p/with_backpressure/256KB:   413041795 ns/iter (+/- 4181491)
5k messages
notifications_protocol/libp2p/with_backpressure/64KB:   220405786 ns/iter (+/- 1849544)
notifications_protocol/libp2p/with_backpressure/256KB:  1814364552 ns/iter (+/- 18384018)
notifications_protocol/litep2p/with_backpressure/64KB:   247287037 ns/iter (+/- 8365376)
notifications_protocol/litep2p/with_backpressure/256KB:  2008662039 ns/iter (+/- 78304950)
10k messages
notifications_protocol/libp2p/with_backpressure/64KB:   443843240 ns/iter (+/- 1671193)
notifications_protocol/libp2p/with_backpressure/256KB:  3564711134 ns/iter (+/- 18137773)
notifications_protocol/litep2p/with_backpressure/64KB:   485836412 ns/iter (+/- 4307439)
notifications_protocol/litep2p/with_backpressure/256KB:  3834542336 ns/iter (+/- 316935009)
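
The summary tables are derived directly from these `ns/iter` figures: each iteration processes the whole batch, so dividing by the batch size gives the per-message time, and its inverse gives the throughput. A worked example (not part of the benchmark code):

```rust
// libp2p, 64KB payload, 1k messages per iteration (first raw figure above).
fn main() {
    let total_ns = 44_421_256.0; // ns/iter
    let messages = 1_000.0;      // messages processed per iteration

    let ms_per_message = total_ns / messages / 1_000_000.0; // ≈ 0.044 ms
    let throughput = messages / (total_ns / 1e9);            // ≈ 22.5K messages/s

    println!("{ms_per_message:.3} ms/message, {throughput:.0} messages/s");
}
```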

@sandreim
Contributor

Thanks @AndreiEres. Looks like libp2p is roughly 10% faster. I would not consider this a blocker, since the difference is not that large and we still reap the benefits of lower CPU usage.

@lexnv WDYT ?

@AndreiEres
Contributor Author

[image]

Yes, at 64KB libp2p and litep2p are very close to each other.

@lexnv
Contributor

lexnv commented Jan 15, 2025

The data looks good! Thanks @AndreiEres! 🙏

Indeed, I don't consider this a major blocker, especially since we're only about 10% behind libp2p.
Additionally, litep2p brings a significant CPU improvement and faster request-response protocol times, based on https://paritytech.github.io/polkadot-sdk/bench/request_response_protocol/.
That said, we should still have a look at improving this to close the gap.

Attaching one more data point: from around 15:25 UTC the validator was restarted with litep2p.

[image]

Overall, litep2p handled 13-21% more inbound notifications (1927 litep2p vs. 1588 libp2p), while libp2p managed about 15% more outbound notifications (1228 litep2p vs. 1430 libp2p). We should also factor in the randomness of the network when interpreting these results.

@lexnv
Contributor

lexnv commented Jan 30, 2025

Coming back with more data: the litep2p yamux component was updated to eliminate any incompatibilities between libp2p and litep2p. The update brings significant improvements in upstream benchmarks (in the libp2p realm).

I've only tested this with 1k messages and would love to run some more tests:

Before

  • collected by Andrei above
1k messages
notifications_protocol/libp2p/with_backpressure/64KB:    44421256 ns/iter (+/- 309026)
notifications_protocol/libp2p/with_backpressure/256KB:   359526580 ns/iter (+/- 2611825)

notifications_protocol/litep2p/with_backpressure/64KB:    46879477 ns/iter (+/- 327248)
notifications_protocol/litep2p/with_backpressure/256KB:   413041795 ns/iter (+/- 4181491)

After

notifications_protocol/libp2p/with_backpressure/64KB ... bench:    44405699 ns/iter (+/- 268459)
notifications_protocol/libp2p/with_backpressure/256KB ... bench:   353299153 ns/iter (+/- 2417926)
 
notifications_protocol/litep2p/with_backpressure/64KB ... bench:    47341171 ns/iter (+/- 398339)
notifications_protocol/litep2p/with_backpressure/256KB ... bench:   342768895 ns/iter (+/- 3704768)

Interpreting Data

  • We see a significant boost in performance on the litep2p/with_backpressure/256KB benchmark. The improvement makes litep2p 20% faster than before and 3% faster than libp2p (see the worked calculation below).
  • On the litep2p/with_backpressure/64KB dimension, no significant improvement over the baseline was observed; libp2p is ~7% faster in this regard. There's still room for improvement here, and we could further fine-tune some backpressure parameters. However, we'd need to take into account the real payload sizes, since that's what we are aiming to improve. Definitely something to ponder.
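
For reference, the percentages above can be checked against the raw 256KB numbers (a quick sanity check, not benchmark code):

```rust
fn main() {
    // ns/iter for the 256KB with_backpressure benchmarks quoted above.
    let litep2p_before = 413_041_795.0;
    let litep2p_after = 342_768_895.0;
    let libp2p_after = 353_299_153.0;

    // "X% faster" here means old time / new time - 1.
    let vs_baseline = (litep2p_before / litep2p_after - 1.0) * 100.0; // ≈ 20.5%
    let vs_libp2p = (libp2p_after / litep2p_after - 1.0) * 100.0;     // ≈ 3.1%

    println!("litep2p vs. its previous baseline: {vs_baseline:.1}% faster");
    println!("litep2p vs. libp2p:                {vs_libp2p:.1}% faster");
}
```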
