-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Substitute poll() with epoll() in libfrr to boost performance in BGP anomalies simulation #12852
Comments
Great analysis! Could you create a pull request following our guide? |
Sure. But I haven't implement show_thread_epoll vtysh command and there may exist bugs. Is that ok? |
It would be better if we could see some debugging information. |
Thanks a lot for the cool analysis & data! Getting input like this is very useful.
Could you share your changes? You can open a pull request in "draft" state, it absolutely does not need to be "perfect". It would be great to allow others to make some tests / comparisons too. |
I have opened PR at https://github.com/FRRouting/frr/pull/12855. It seemed failing some checks. I will update according to guidelines soon. |
Hello, dear participants @ton31337 @eqvinox @qlyoung, The PR has been opened for 9 months and seems to be a little bit slow in terms of progressing. But I think it will help a lot for FRR to be used in more scenarios, such as network emulation, if we could continue with this PR to optimize the polling method. I'd really appreciate it if you could review the code soon. |
Problem overview
In all currently available FRR version, thread.c and thread.h in libfrr used poll() function to poll file descriptors. We advise to substitute it with epoll() to enhance scalability.
Motivation
We are trying run BGP abnormal events simulation with FRR routers. In our scenario, we aim to run plenty of (hundreds/thousands of) FRR routers, which are mutually TCP connected to simulate BGP peer topology in reality, on a COTS server.
Experiment
Currently we have not simulate BGP peer topology in reality yet. Rather, we configure a fullmesh BGP connection between all FRR routers for test purpose. Also we have not launch route announcement and withdrawal yet. Only BGP keepalive packets are sent between FRR BGP peers.
Environment: AMD EPYC 7742 64-Core Processor(Only 16 cores used); 660GB RAM; CentOS7; Linux kernel version 5.17.3-1.el7.elrepo.x86_64; FRR version 8.0.
FRR BGP Configuration: fullmesh connection; keepalive interval: 1s; holdtime: 3s.
Observation
We found poor scalability when deploying increasing number of FRR routers. As seen from figure below, CPU util increases quadratically at the early stage of scaling. After 250 peers deployed, the CPU util increasing slows down. However at this point FRR peers become stressful. Finally, disconnection between peers happens after 384 peers being deployed.
Analysis
We drew flamegraph of BGP threads of FRR, and found poll() function was the bottleneck. As shown below
poll() function scans all file descriptors every time it is called. In our case, these file descriptors are mostly TCP sockets established with BGP peers. As the number of peers deployed (denoted by n) increases, every poll() call consumes O(n) time, leading to O(n) CPU util for each peer. Multiplied by n(number of BGP threads doing poll()), the system-wide CPU util is O(n^2), matching the observation above. To comfirm tha poll() introduces great overhead, we measured average running time of per poll() call. As shown below, the call time did increase linearly with peer number.
However, scanning all TCP sockets is a waste in this scenario, since keepalive interval (1s) is long. At most time, each TCP socket is not ready for reading. Therefore, substituting poll() with epoll APIs will help reduce CPU overhead significantly.
Solution (prototype)
We re-wrote thread.c and thread.h in libfrr with epoll APIs, and ran experiement again. As shown below, CPU overhead is greatly reduced (nearly 67%).
We provide our code here. However our code was just a prototype, since Vtysh command such as show_thread_poll_cmd was not implemented, and bugs may exist. If the community feels this issue important, we can complete the code.
code.zip
The text was updated successfully, but these errors were encountered: