-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BMP rework - 2 BMP fixes + some rework #17307
Conversation
20d62ea
to
e8eb315
Compare
When a BGP listener configured with BMP receives the first BGP IPv6 update from a connected BGP IPv6 peer, the BMP collector receives a withdraw post-policy message. > {"peer_type": "route distinguisher instance", "policy": "post-policy", > "ipv6": true, "peer_ip": "192:167::3", "peer_distinguisher": "444:1", > "peer_asn": 65501, "peer_bgp_id": "192.168.1.3", "timestamp": > "2024-10-29 11:44:47.111962", "bmp_log_type": "withdraw", "afi": 2, > "safi": 1, "ip_prefix": "2001::1125/128", "seq": 22} > {"peer_type": "route distinguisher instance", "policy": "pre-policy", > "ipv6": true, "peer_ip": "192:167::3", "peer_distinguisher": "444:1", > "peer_asn": 65501, "peer_bgp_id": "192.168.1.3", "timestamp": > "2024-10-29 11:44:47.111963", "bmp_log_type": "update", "origin": > "IGP", "as_path": "", "afi": 2, "safi": 1, "nxhp_ip": "192:167::3", > "nxhp_link-local": "fe80::7063:d8ff:fedb:9e11", "ip_prefix": "2001::1125/128", "seq": 23} Actually, the BGP update is not valid, and BMP considers it as a withdraw message. The BGP upate is not valid, because the nexthop reachability is unknown at the time of reception, and no other BMP message is sent. Fix this by re-sending a BMP post update message when nexthop tracking becomes successfull. Generalise the re-sending of messages when nexthop tracking changes. Signed-off-by: Philippe Guibert <[email protected]>
There is no need to create the bgp bmp instance, while there is no bmp target to send data to. Rewrite the code by using bgp_bmp_find() API instead. Signed-off-by: Philippe Guibert <[email protected]>
The bgp_bmp_get() call is not necessary if the buffer size limit of bmp is the default value. Reversely, the no form of the config command should not allocate an unnecessary bgpbmp structure. Fix this by controlling the passed value before doing the allocation. Signed-off-by: Philippe Guibert <[email protected]>
The bmpbgp structure is used to store the mirror limit and the bmp target config of a given bgp instance (lets say X). It is wishable to share the bmpbgp structure with a separate BGP instance (lets say Y) that wants to import the BMP information from X. In the case where Y is deleted, the reference to the bmpbgp structure of X can be deleted, but not the bmpbgp structure of X. Fix this by using a reference counter that will be used for each usage: mirror limit, bmp target list, and later vrf importation. Signed-off-by: Philippe Guibert <[email protected]>
The following ASAN error can be seen. > ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x608000036c20 > #0 0x7f3d7a4b5425 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:198 > FRRouting#1 0x7f3d7a426a16 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common > /sanitizer_stacktrace.h:122 > FRRouting#2 0x7f3d7a426a16 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1074 > FRRouting#3 0x7f3d7a03f330 in mt_count_free lib/memory.c:78 > FRRouting#4 0x7f3d7a03f330 in qfree lib/memory.c:130 > FRRouting#5 0x7f3d76ccf89b in bmp_peer_status_changed bgpd/bgp_bmp.c:982 > FRRouting#6 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47 > FRRouting#7 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287 > FRRouting#8 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777 > FRRouting#9 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140 > FRRouting#10 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764 > FRRouting#11 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003 > FRRouting#12 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062 > FRRouting#13 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228 > FRRouting#14 0x7f3d7a107b53 in vty_command lib/vty.c:625 > FRRouting#15 0x7f3d7a109902 in vty_execute lib/vty.c:1388 > FRRouting#16 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400 > FRRouting#17 0x7f3d7a0f848b in event_call lib/event.c:2019 > FRRouting#18 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232 > FRRouting#19 0x560ae29e0037 in main bgpd/bgp_main.c:555 > FRRouting#20 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > FRRouting#21 0x7f3d79a29e3f in __libc_start_main_impl ../csu/libc-start.c:392 > FRRouting#22 0x560ae29e4ef4 in _start (/usr/lib/frr/bgpd+0x2eeef4) > > 0x608000036c20 is located 0 bytes inside of 81-byte region [0x608000036c20,0x608000036c71) > freed by thread T0 here: > #0 0x7f3d7a4b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127 > FRRouting#1 0x7f3d76ccf85f in bmp_peer_status_changed bgpd/bgp_bmp.c:981 > FRRouting#2 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47 > FRRouting#3 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287 > FRRouting#4 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777 > FRRouting#5 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140 > FRRouting#6 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764 > FRRouting#7 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003 > FRRouting#8 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062 > FRRouting#9 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228 > FRRouting#10 0x7f3d7a107b53 in vty_command lib/vty.c:625 > FRRouting#11 0x7f3d7a109902 in vty_execute lib/vty.c:1388 > FRRouting#12 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400 > FRRouting#13 0x7f3d7a0f848b in event_call lib/event.c:2019 > FRRouting#14 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232 > FRRouting#15 0x560ae29e0037 in main bgpd/bgp_main.c:555 > FRRouting#16 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 > > previously allocated by thread T0 here: > #0 0x7f3d7a4b4887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 > FRRouting#1 0x7f3d7a03f0e9 in qmalloc lib/memory.c:101 > FRRouting#2 0x7f3d76cd0166 in bmp_bgp_peer_vrf bgpd/bgp_bmp.c:2194 > FRRouting#3 0x7f3d76cd0166 in bmp_bgp_update_vrf_status bgpd/bgp_bmp.c:2236 > FRRouting#4 0x7f3d76cd29b8 in bmp_vrf_state_changed bgpd/bgp_bmp.c:3479 > FRRouting#5 0x560ae2c45b34 in hook_call_bgp_instance_state bgpd/bgpd.c:88 > FRRouting#6 0x560ae2c4d158 in bgp_instance_up bgpd/bgpd.c:3936 > FRRouting#7 0x560ae29e5ed1 in bgp_vrf_enable bgpd/bgp_main.c:299 > FRRouting#8 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:286 > FRRouting#9 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:275 > FRRouting#10 0x7f3d7a12ab66 in zclient_vrf_add lib/zclient.c:2561 > FRRouting#11 0x7f3d7a12eb43 in zclient_read lib/zclient.c:4624 > FRRouting#12 0x7f3d7a0f848b in event_call lib/event.c:2019 > FRRouting#13 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232 > FRRouting#14 0x560ae29e0037 in main bgpd/bgp_main.c:555 > FRRouting#15 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Signed-off-by: Philippe Guibert <[email protected]>
e8eb315
to
3447077
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so just making the general comment that we have historically done a poor job with refcounting. I think it would be much better to get an understanding of the goal, the proposal that you think requires refcounting, before taking such a change.
if (bmpbgp->mirror_qsizelimit == ~0UL) | ||
return; | ||
bmpbgp->mirror_qsizelimit = ~0UL; | ||
bmp_bgp_put(bmpbgp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so ... won't this deref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have a better design pattern to apply to my use case, you are welcome:
I can try to apply it.
For the reason why I do this:
It is wishable to share the bmpbgp structure with a
separate BGP instance (lets say Y) that wants to import
the BMP information from X.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be a reasonable question, I guess, but it's not a response to the review comment? The review comment is "this code is wrong".
The irony is that the general comment was "this implementation choice is hard to get right", and ... in fact it's not right, even in this first effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, waiting still on Mark's comment.
So some ideas about this:
I'm not sure I understand what "share the bmp structure" means. Do you want to share configuration - from parent to children, that sort of thing? or do you want to use a single bmp connection to multiplex updates from multiple bgp instances? One choice would be to have X know about the set of its children or dependents {Y, A, B, ...} and push changes to those objects. |
Case one single BMP connection to multiplex updates from multiple bgp instances.
Not this choice.
Even if BMP from X is not configured, the instance Y needs to get access from it.
Either I reuse it, or I duplicate vrf_state in the imported list structure. |
I remove the lock idea.. closing this PR, creating new one: #17373 |
Why not update this PR just? |
A separate pull request will be put in place to add the following in BMP:
ability to import BMP information from a separate BGP instance.
This will be useful for L3VPN setups.
Some commits prepare the code for to this feature. It includes the ability to share the bgpbmp structure for various usages
A commit to re-send post-policy BGP updates when nexthop reachability is changed.
A commit related to an ASAN issue encountered with rx_open and tx_open buffers.