-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backpressure bgp zebra client #15524
backpressure bgp zebra client #15524
Conversation
6562f20
to
0c698c5
Compare
I am converting this to a draft because I believe we are discussing some problems found internally |
EVPN code changes have issues which are being discussing internally. This review is Non evpn code change. EVPN will be either as a seperate commit in this review or a different MR on top of this code base. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we stop the thread (t_bgp_zebra_route
) on BGP termination/stop?
0c698c5
to
4cc141b
Compare
Agree.. i missed this . Fixing this now |
RUN_CI |
cac4b07
to
e458144
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, waiting on other's comments
Internal Test for ref: OBSERVATION:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question on bgp_path_info_lock()
, see inline.
one nit (not a blocker): Just not sure about the naming of bgp_zebra_route_install()
. Previously it was easier to read with bgp_zebra_announce()
, and bgp_zebra_withdraw()
. Can't we preserve the naming it was before? By converting to macros at least:
#define bgp_zebra_announce(...) bgp_zebra_route_install(..., true)
#define bgp_zebra_withdraw(...) bgp_zebra_route_install(..., false)
Other than that - LGTM.
e458144
to
27d5cdb
Compare
In the next MR, the changes will be for evpn and we will use evpn_zebra_install and evpn_zebra_uninstall in this function.. |
RUN_CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modify the bgp master to hold a type safe list for bgp_dests that need to be passed to zebra. Future commits will use this. Ticket: #3390099 Signed-off-by: Donald Sharp <[email protected]> Signed-off-by: Rajasekar Raja <[email protected]>
Since installing/withdrawing routes into zebra is going to be changed around to be dest based in a list, - Retrieve the afi/safi to use based upon the dest's afi/safi instead of passing it in. - Prefix is known by the dest. Remove this arg as well Ticket: #3390099 Signed-off-by: Donald Sharp <[email protected]> Signed-off-by: Rajasekar Raja <[email protected]>
27d5cdb
to
630a2ce
Compare
BGP is now keeping a list of dests with the dest having a pointer to the bgp_path_info that it will be working on. 1) When bgp receives a prefix, process it, add the bgp_dest of the prefix into the new Fifo list if not present, update the flags (Ex: earlier if the prefix was advertised and now it is a withdrawn), increment the ref_count and DO NOT advertise the install/withdraw to zebra yet. 2) Schedule an event to wake up to invoke the new function which will walk the list one by one and installs/withdraws the routes into zebra. a) if BUFFER_EMPTY, process the next item on the list b) if BUFFER_PENDING, bail out and the callback in zclient_flush_data() will invoke the same function when BUFFER_EMPTY Changes - rename old bgp_zebra_announce to bgp_zebra_announce_actual - rename old bgp_zebra_withdrw to bgp_zebra_withdraw_actual - Handle new fifo list cleanup in bgp_exit() - New funcs: bgp_handle_route_announcements_to_zebra() and bgp_zebra_route_install() - Define a callback function to invoke bgp_handle_route_announcements_to_zebra() when BUFFER_EMPTY in zclient_flush_data() The current change deals with bgp installing routes via bgp_process_main_one() Ticket: #3390099 Signed-off-by: Donald Sharp <[email protected]> Signed-off-by: Rajasekar Raja <[email protected]>
630a2ce
to
ccfe452
Compare
…rns (#19717) Added the below patches which are part of BGP Zebra back pressure feature required to keep the memory usage in check during route churns How I did it New patches that were added: Patch FRR Pull request 0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch FRRouting/frr#15411 0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch FRRouting/frr#15524 0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch FRRouting/frr#15326 0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch FRRouting/frr#15524 0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch FRRouting/frr#15524 0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch FRRouting/frr#15624 0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch FRRouting/frr#15728 0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch FRRouting/frr#15727 0038-zebra-Actually-display-I-O-buffer-sizes.patch FRRouting/frr#15708 0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch FRRouting/frr#15769 0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch FRRouting/frr#16034 0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch FRRouting/frr#16035 0042-zebra-Use-built-in-data-structure-counter.patch FRRouting/frr#16221 0043-zebra-Use-the-ctx-queue-counters.patch FRRouting/frr#16220 0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch FRRouting/frr#16220 0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch FRRouting/frr#16220 0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch FRRouting/frr#16220 0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch FRRouting/frr#16234 0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch FRRouting/frr#16368 0049-bgpd-backpressure-Improve-debuggability.patch FRRouting/frr#16368 0050-bgpd-backpressure-Avoid-use-after-free.patch FRRouting/frr#16437 0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch FRRouting/frr#16416 0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch FRRouting/frr#16416
…rns (sonic-net#19717) Added the below patches which are part of BGP Zebra back pressure feature required to keep the memory usage in check during route churns How I did it New patches that were added: Patch FRR Pull request 0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch FRRouting/frr#15411 0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch FRRouting/frr#15524 0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch FRRouting/frr#15326 0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch FRRouting/frr#15524 0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch FRRouting/frr#15524 0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch FRRouting/frr#15624 0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch FRRouting/frr#15728 0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch FRRouting/frr#15727 0038-zebra-Actually-display-I-O-buffer-sizes.patch FRRouting/frr#15708 0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch FRRouting/frr#15769 0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch FRRouting/frr#16034 0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch FRRouting/frr#16035 0042-zebra-Use-built-in-data-structure-counter.patch FRRouting/frr#16221 0043-zebra-Use-the-ctx-queue-counters.patch FRRouting/frr#16220 0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch FRRouting/frr#16220 0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch FRRouting/frr#16220 0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch FRRouting/frr#16220 0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch FRRouting/frr#16234 0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch FRRouting/frr#16368 0049-bgpd-backpressure-Improve-debuggability.patch FRRouting/frr#16368 0050-bgpd-backpressure-Avoid-use-after-free.patch FRRouting/frr#16437 0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch FRRouting/frr#16416 0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch FRRouting/frr#16416
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD/DEL) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864371 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L3 VNI ADD/DEL from zebra, - Walking the entire global routing table per L3VNI is very expensive. - The next read (say of another VNI ADD/DEL) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L3VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the BGP-VRF off the struct bgp_master and maintain a struct bgp FIFO list which is processed later on, where we walk a chunk of BGP-VRFs and do the remote route install/uninstall. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864371 Signed-off-by: Rajasekar Raja <[email protected]>
…rns (sonic-net#19717) Added the below patches which are part of BGP Zebra back pressure feature required to keep the memory usage in check during route churns How I did it New patches that were added: Patch FRR Pull request 0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch FRRouting/frr#15411 0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch FRRouting/frr#15524 0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch FRRouting/frr#15326 0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch FRRouting/frr#15524 0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch FRRouting/frr#15524 0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch FRRouting/frr#15624 0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch FRRouting/frr#15728 0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch FRRouting/frr#15727 0038-zebra-Actually-display-I-O-buffer-sizes.patch FRRouting/frr#15708 0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch FRRouting/frr#15769 0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch FRRouting/frr#16034 0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch FRRouting/frr#16035 0042-zebra-Use-built-in-data-structure-counter.patch FRRouting/frr#16221 0043-zebra-Use-the-ctx-queue-counters.patch FRRouting/frr#16220 0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch FRRouting/frr#16220 0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch FRRouting/frr#16220 0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch FRRouting/frr#16220 0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch FRRouting/frr#16234 0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch FRRouting/frr#16368 0049-bgpd-backpressure-Improve-debuggability.patch FRRouting/frr#16368 0050-bgpd-backpressure-Avoid-use-after-free.patch FRRouting/frr#16437 0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch FRRouting/frr#16416 0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch FRRouting/frr#16416
…rns (#19717) Added the below patches which are part of BGP Zebra back pressure feature required to keep the memory usage in check during route churns How I did it New patches that were added: Patch FRR Pull request 0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch FRRouting/frr#15411 0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch FRRouting/frr#15524 0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch FRRouting/frr#15326 0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch FRRouting/frr#15524 0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch FRRouting/frr#15524 0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch FRRouting/frr#15624 0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch FRRouting/frr#15728 0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch FRRouting/frr#15727 0038-zebra-Actually-display-I-O-buffer-sizes.patch FRRouting/frr#15708 0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch FRRouting/frr#15769 0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch FRRouting/frr#16034 0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch FRRouting/frr#16035 0042-zebra-Use-built-in-data-structure-counter.patch FRRouting/frr#16221 0043-zebra-Use-the-ctx-queue-counters.patch FRRouting/frr#16220 0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch FRRouting/frr#16220 0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch FRRouting/frr#16220 0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch FRRouting/frr#16220 0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch FRRouting/frr#16234 0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch FRRouting/frr#16368 0049-bgpd-backpressure-Improve-debuggability.patch FRRouting/frr#16368 0050-bgpd-backpressure-Avoid-use-after-free.patch FRRouting/frr#16437 0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch FRRouting/frr#16416 0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch FRRouting/frr#16416
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Anytime BGP gets a L2 VNI ADD from zebra, - Walking the entire global routing table per L2VNI is very expensive. - The next read (say of another VNI ADD) from the socket does not proceed unless this walk is complete. So for triggers where a bulk of L2VNI's are flapped, this results in huge output buffer FIFO growth spiking up the memory in zebra since bgp is slow/busy processing the first message. To avoid this, idea is to hookup the VPN off the bgp_master struct and maintain a VPN FIFO list which is processed later on, where we walk a chunk of VPNs and do the remote route install. Note: So far in the L3 backpressure cases(FRRouting#15524), we have considered the fact that zebra is slow, and the buffer grows in the BGP. However this is the reverse i.e. BGP is very busy processing the first ZAPI message from zebra due to which the buffer grows huge in zebra and memory spikes up. Ticket :#3864372 Signed-off-by: Rajasekar Raja <[email protected]>
Modify the bgp master to hold a type safe list for bgp_dests that need to be passed to zebra.
Since installing/withdrawing routes into zebra is going to be changed around to be dest based in a list,
BGP is now keeping a list of dests with the dest having a pointer to the bgp_path_info that it will be working on.
When bgp receives a prefix, process it, add the bgp_dest of the prefix into the new Fifo list if not present, update the flags (Ex: earlier if the prefix was advertised and now it is a withdrawn), increment the ref_count and DO NOT advertise the install/withdraw to zebra yet.
Schedule an event to wake up to invoke the new function which will walk the list one by one and installs/withdraws the routes into zebra.
a) if BUFFER_EMPTY, process the next item on the list
b) if BUFFER_PENDING, bail out and the callback in
zclient_flush_data() will invoke the same function when BUFFER_EMPTY
Changes
The current change deals with bgp installing routes via bgp_process_main_one()
Will add EVPN changes in a seperate commit shortly.