BFD session down does not trigger BGP session down #14266
Comments
Did you try explicitly defining an interface for the BFD peer? |
Yes, the result is the same. Extra info: I have a second BFD and BGP peer in a different subnet but reachable in the same L2 network. I assume that would not interfere, right? |
Could you also |
This was captured while BFD is down and BGP is still up. |
Interface names are different: enp0s1f0d3 vs. enp0s1f0d1. Is this expected? |
Interesting. The interface name in the log message seems wrong; it remains the same even if I configure the BFD peer with explicit interface:
Addresses and routes:
|
Can you also show |
|
Are you blocking something? |
As in: firewall? Not that I am aware of. What would have to be blocked to cause this? |
Can you check which packets you see in tcpdump on those interfaces, and whether you see BFD packets at all? |
Yes, but different on both ports.
|
Are these interfaces enp0s1f0d1/enp0s1f0d3 physically separate or is this a breakout cable? (Just trying to gather information before trying to replicate it locally). |
Capture on enp0s1f0d1 looks like broken packets. Let me investigate our data plane. It is non-standard. |
And I see only the packets one-way for BFD... |
The messages were not corrupted; that must have been a problem of tcpdump. In Wireshark they looked OK. Messages were only seen in one direction because of asymmetric routing. |
Hi, I observed this issue in my environment as well and am wondering whether this is expected, since the ticket was closed stating that placing the peers in separate VLANs solves the problem. See below.
computebgp-0# show ip bgp neighbors
For address family: IPv4 Unicast
For address family: IPv6 Unicast
Connections established 34; dropped 33
BFD: Type: single hop
computebgp-0# show bfd peers
My config is:
My use case is very simple: an unnumbered iBGP session between two peers connected via an uplink switch in the same VLAN. I brought BFD down on purpose. However, bringing BFD up or down does not affect the BGP session. Shall we reopen the ticket and fix this issue? Thanks |
The problem has recurred for us as well, in a different topology, even though the interfaces are pairwise isolated on different VLANs. So that was not the solution. In our case there is asymmetric routing, i.e. from FRR's perspective, incoming and outgoing BFD (and BGP) packets are on different logical interfaces associated with the same physical port. While the physical port is up, I continuously get the following message in the log: When I bring down the interface, this log message stops and the BFD peer goes down immediately, but the BGP peer stays up, as reported before. I tried configuring the BFD session as multihop, with the same result. Our FRR version is now 8.5.4. |
Can you share the configuration and show what the interfaces look like? |
Here are the configs and a printout of the interfaces. When I bring down the link on RouterB, BFD goes down immediately, but BGP stays up until the holdtime expires. [Edit] Adding captures of the interfaces: RouterA: RouterB: |
@sbrs3 how is enp0s1f0d4 related to enp0s1f0d1? I'm not sure I understood the topology here. |
I'm not sure yet about the RouterA configuration. Is there a bridge, or what is the actual underlying configuration? Can you show the |
There is a hardware pipeline connecting enp0s1f0d1/4 with the physical port. It forwards outgoing packets transparently. For incoming packets, it rewrites the DMAC from "phy port MAC" to "enp0s1f0d4 MAC". Could that be a problem for BFD? There are no other changes done to the packet. You can also see what arrives on the other end (Router B ens3) in the capture I uploaded above. Router A's control plane module is an ARM platform running Rocky Linux 9.3/kernel 5.15 and FRR 8.5.4 compiled for ARM. I am uploading the Network Manager systemconnection files for the two above interfaces. |
I think I found the cause of the problem in our case. If I comment out the following lines in lib/bfd.c, the coupling between BFD and BGP works:
Now this is obviously a solution for our specific case, with two interfaces for ingress and egress traffic. So I am wondering what the above if-clause is needed for, and what the side effects of removing it are. |
I'm not an expert on BFD, actually, we need to check what RFCs say on such a scenario (if any). Maybe @rzalamena has an opinion here? |
Yes, BFD expects to receive the packet on the same (OS) interface it sent from. The way your OS is configured makes it look, from the BFD point of view, like you are doing LACP (link aggregation), and that is not going to work. We also don't support RFC 7130 yet. A possible hack would be to ignore the interface name as you suggested (e.g. add a configuration knob for it), but I'm not sure what others might think of this option (or how common such a data plane is). You also have the option to implement your own BFD data plane handler using distributed BFD and hide these details from FRR. There is a library to help build distributed BFD data plane handlers here. |
I think you have just given the best argument for such a knob, namely the current lack of support for BFD on LAG. But there could also be other scenarios of asymmetric routing with multiple links that are not in a LAG. So a configuration knob like the following could benefit multiple use cases:
If "ignore-interface-names" is specified, the BFD implementation would simply skip the interface name check in the code that I quoted above. |
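To make the proposal concrete, here is a minimal Python model of the matching step being discussed. It is not FRR code: the class names, the `ignore_ifname` field, and `update_matches` are hypothetical, and the interface names and peer address are only illustrative values taken from this thread. It assumes the check behaves as described above, i.e. an update is dropped when both sides carry an interface name and the names differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegisteredSession:
    """A BFD session registered by a client such as bgpd (hypothetical model)."""
    peer: str
    ifname: Optional[str]        # None means the session is not bound to an interface
    ignore_ifname: bool = False  # hypothetical knob proposed in this thread


@dataclass
class StateUpdate:
    """A BFD state change reported for one peer (hypothetical model)."""
    peer: str
    ifname: Optional[str]        # interface the update is attributed to
    state: str


def update_matches(session: RegisteredSession, update: StateUpdate) -> bool:
    """Return True if the update should be delivered to the session's owner."""
    if session.peer != update.peer:
        return False
    if session.ignore_ifname:
        # Knob enabled: skip the interface name comparison entirely.
        return True
    # Strict mode: names are compared only when both sides have one.
    if session.ifname and update.ifname and session.ifname != update.ifname:
        return False
    return True


# Asymmetric case: session bound to one interface, update attributed to another.
update = StateUpdate(peer="172.16.1.7", ifname="enp0s1f0d4", state="down")
strict = RegisteredSession(peer="172.16.1.7", ifname="enp0s1f0d1")
relaxed = RegisteredSession(peer="172.16.1.7", ifname="enp0s1f0d1", ignore_ifname=True)

print(update_matches(strict, update))   # False: BGP never hears about the down event
print(update_matches(relaxed, update))  # True: the down event reaches BGP
```

In this model the strict session silently drops the update, which is consistent with BFD going down while BGP stays up; with the knob set, the peer address alone decides the match.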
The test is really just comparing interfaces - I think the code that tests the interface name is sort of ... internal, it's not really the right thing to try to expose in a config. Maybe the config should be
|
Could you check the latest master? We pushed one fix regarding BFD/BGP integration (when the session is down/admin down). |
How can this issue be reproduced? Please provide configuration steps. |
Hi @ton31337 Unfortunately, I don't have the environment right now and have moved on to a different project. My config is: My use case is very simple: an unnumbered iBGP session between two peers connected via an uplink switch in the same VLAN. I brought BFD down on purpose. However, bringing BFD up or down does not affect the BGP session. Thanks |
It should be. |
Describe the bug
BFD does not seem to notify BGP when a peer goes down.
I configure a BFD peer with default parameters and use the same peer IP in an eBGP session as follows:
The result is: when the remote peer interface (172.16.1.7) goes down, the BFD session is taken down within 1 second. The BGP session, however, stays up until the BGP hold timer expires (~3 minutes). I see the following messages from BGPD:
It seems that the function zclient_bfd_session_update in bfd.c skips the BGP session. Why?
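To put the reported gap in numbers, the following Python sketch compares BFD detection time with BGP hold-timer expiry. The values are assumptions, not taken from the reporter's configuration: 300 ms BFD intervals with a detect multiplier of 3 (which matches "within 1 second" for default parameters) and a 180 s BGP hold time (which matches "~3 minutes"). The helper name is hypothetical.

```python
def bfd_detection_time_ms(local_rx_ms: int, remote_tx_ms: int, remote_mult: int) -> int:
    """Asynchronous-mode detection time: the remote detect multiplier times
    the negotiated interval (the larger of local RX and remote TX)."""
    return remote_mult * max(local_rx_ms, remote_tx_ms)


# Assumed values, not taken from the reporter's configuration.
ASSUMED_BFD_INTERVAL_MS = 300
ASSUMED_DETECT_MULT = 3
ASSUMED_BGP_HOLD_MS = 180_000

bfd_ms = bfd_detection_time_ms(
    ASSUMED_BFD_INTERVAL_MS, ASSUMED_BFD_INTERVAL_MS, ASSUMED_DETECT_MULT
)
# Worst case: the hold timer was refreshed just before the failure.
extra_outage_ms = ASSUMED_BGP_HOLD_MS - bfd_ms

print(bfd_ms)           # 900
print(extra_outage_ms)  # 179100
```

Under these assumptions, a BFD down event that never reaches BGP can leave stale routes installed for up to roughly 179 seconds longer than necessary.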
Versions