Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Substitute poll() with epoll() in libfrr to boost performance in BGP anomalies simulation #12852

Open
ponedo opened this issue Feb 21, 2023 · 7 comments

Comments

@ponedo
Copy link
Contributor

ponedo commented Feb 21, 2023

Problem overview

In all currently available FRR version, thread.c and thread.h in libfrr used poll() function to poll file descriptors. We advise to substitute it with epoll() to enhance scalability.

Motivation

We are trying run BGP abnormal events simulation with FRR routers. In our scenario, we aim to run plenty of (hundreds/thousands of) FRR routers, which are mutually TCP connected to simulate BGP peer topology in reality, on a COTS server.

Experiment

Currently we have not simulate BGP peer topology in reality yet. Rather, we configure a fullmesh BGP connection between all FRR routers for test purpose. Also we have not launch route announcement and withdrawal yet. Only BGP keepalive packets are sent between FRR BGP peers.
Environment: AMD EPYC 7742 64-Core Processor(Only 16 cores used); 660GB RAM; CentOS7; Linux kernel version 5.17.3-1.el7.elrepo.x86_64; FRR version 8.0.
FRR BGP Configuration: fullmesh connection; keepalive interval: 1s; holdtime: 3s.

Observation

We found poor scalability when deploying increasing number of FRR routers. As seen from figure below, CPU util increases quadratically at the early stage of scaling. After 250 peers deployed, the CPU util increasing slows down. However at this point FRR peers become stressful. Finally, disconnection between peers happens after 384 peers being deployed.

image

Analysis

We drew flamegraph of BGP threads of FRR, and found poll() function was the bottleneck. As shown below

image

poll() function scans all file descriptors every time it is called. In our case, these file descriptors are mostly TCP sockets established with BGP peers. As the number of peers deployed (denoted by n) increases, every poll() call consumes O(n) time, leading to O(n) CPU util for each peer. Multiplied by n(number of BGP threads doing poll()), the system-wide CPU util is O(n^2), matching the observation above. To comfirm tha poll() introduces great overhead, we measured average running time of per poll() call. As shown below, the call time did increase linearly with peer number.

image

However, scanning all TCP sockets is a waste in this scenario, since keepalive interval (1s) is long. At most time, each TCP socket is not ready for reading. Therefore, substituting poll() with epoll APIs will help reduce CPU overhead significantly.

Solution (prototype)

We re-wrote thread.c and thread.h in libfrr with epoll APIs, and ran experiement again. As shown below, CPU overhead is greatly reduced (nearly 67%).

image

We provide our code here. However our code was just a prototype, since Vtysh command such as show_thread_poll_cmd was not implemented, and bugs may exist. If the community feels this issue important, we can complete the code.
code.zip

@ton31337
Copy link
Member

Great analysis! Could you create a pull request following our guide?

@ponedo
Copy link
Contributor Author

ponedo commented Feb 21, 2023

Sure. But I haven't implement show_thread_epoll vtysh command and there may exist bugs. Is that ok?

@ton31337
Copy link
Member

But I haven't implement show_thread_epoll vtysh command and there may exist bugs

It would be better if we could see some debugging information.

@eqvinox
Copy link
Contributor

eqvinox commented Feb 21, 2023

Thanks a lot for the cool analysis & data! Getting input like this is very useful.

We re-wrote thread.c and thread.h in libfrr with epoll APIs

Could you share your changes? You can open a pull request in "draft" state, it absolutely does not need to be "perfect". It would be great to allow others to make some tests / comparisons too.

@ponedo
Copy link
Contributor Author

ponedo commented Feb 21, 2023

I have opened PR at https://github.com/FRRouting/frr/pull/12855. It seemed failing some checks. I will update according to guidelines soon.

@ponedo
Copy link
Contributor Author

ponedo commented Feb 22, 2023

@ton31337 @eqvinox
The PR link above is wrong, and 12855 PR was discarded mistakenly somehow.
I have opened a new PR (#12862). I encoutered a compatibility problem and I will explain in the PR page.

@ponedo
Copy link
Contributor Author

ponedo commented Nov 20, 2023

Hello, dear participants @ton31337 @eqvinox @qlyoung,
Recently we are using FRRouting for routing system emulation as mentioned above, and still observing high cpu usage of ppoll system call when the scale of emulated network is large:

17b531aef66e75a357582b0825c8779

The PR has been opened for 9 months and seems to be a little bit slow in terms of progressing. But I think it will help a lot for FRR to be used in more scenarios, such as network emulation, if we could continue with this PR to optimize the polling method. I'd really appreciate it if you could review the code soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants