BFD fails when running over OSPF unnumbered interfaces in a non-default VRF #15099
Comments
zebra logs for default VRF for adding 2.2.2.2/32 and 3.3.3.3/32 routes:
nexthop table for default VRF:
zebra logs for Test VRF adding 2.2.2.2/32 and 3.3.3.3/32 routes:
nh table for VRF Test:
Should there be two nexthops in VRF Test? In the logs above, route 2.2.2.2/32 is added, then deleted, and then added again with a different nexthop ID.
Describe the bug
We have run into an issue running BFD on OSPF unnumbered interfaces in a VRF. The problem that we see is that the packets aren't sent on the specific interface, but rather are routed. So, if there is an alternate path with lower cost, the packets take that path. This (understandably) confuses BFD, and in some cases we get no BFD peers ever coming up, and in other cases we get peers flapping.
It's clear that for unnumbered interfaces, BFD has to send the packet on the specific interface, rather than allowing it to be routed.
When BFD runs on OSPF unnumbered interfaces in the default VRF, traffic passes correctly on the expected interfaces and the BFD sessions come up.
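A minimal sketch of what "send on the specific interface" means at the socket level on Linux. This is not FRR's bfdd code; the helper names `interface_bound_options` and `open_interface_bound_socket` are hypothetical, and the choice of `SO_BINDTODEVICE` is an assumption about the kind of socket option involved. The TTL of 255 follows the single-hop BFD requirement in RFC 5881.

```python
import socket


def interface_bound_options(ifname: str):
    """Return (level, option, value) tuples for a single-hop BFD-style socket.

    Hypothetical helper for illustration only; not taken from FRR.
    """
    return [
        # RFC 5881: single-hop BFD control packets are sent with TTL 255.
        (socket.IPPROTO_IP, socket.IP_TTL, 255),
        # Linux-only: tie the socket to one egress interface so the route
        # lookup cannot move the packet to a lower-cost path.
        (socket.SOL_SOCKET, socket.SO_BINDTODEVICE, ifname.encode()),
    ]


def open_interface_bound_socket(ifname: str) -> socket.socket:
    """Create a UDP socket and apply the options above.

    May raise OSError if the interface does not exist or the process
    lacks the privileges the kernel requires for SO_BINDTODEVICE.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for level, option, value in interface_bound_options(ifname):
            sock.setsockopt(level, option, value)
    except OSError:
        sock.close()
        raise
    return sock
```

As noted under "Additional context", setting socket options alone did not resolve the problem in the VRF case, so this only illustrates the intended behaviour, not a fix.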
To Reproduce
I created a topotest scenario that sets up an environment to reproduce the failure:
#15048
This scenario has three FRR routers in a VRF called Test. The OSPF cost between R1-R2 is 50,
whereas the cost between R1-R3 is 20. As a result, all BFD traffic is sent over the R1-R3 link.
If you run this scenario and look at the BFD peers on R1, you will see one BFD session down and one up.
The BFD session that is supposed to run over the r1-eth1 interface is down: that link has the higher OSPF cost,
so the BFD packets for the R1-R2 link are incorrectly sent over r1-eth2.
Running tcpdump -ni r1-eth1, you will see only OSPF packets.
Running tcpdump -ni r1-eth2, you will see OSPF packets and all BFD packets for both peers.
When running in the VRF, we learn both 2.2.2.2 and 3.3.3.3 over r1-eth2, as it has the lower OSPF cost.
With these routes, any BFD packet that is routed rather than pinned to its interface leaves on r1-eth2, which is the wrong interface for the R1-R2 session.
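The effect can be modelled with a small sketch. The link costs R1-R2 = 50 and R1-R3 = 20 come from the scenario above; the R2-R3 cost is not stated in this report, so `R2_R3_COST = 10` is a made-up value chosen only so that the path through R3 is cheaper than the direct link, which matches the routes observed in VRF Test. All names in the block are hypothetical.

```python
# Hypothetical value: the report does not give the R2-R3 link cost.
R2_R3_COST = 10

# Total OSPF cost from R1 to each peer loopback, per egress interface.
CANDIDATE_PATHS = {
    "2.2.2.2": {"r1-eth1": 50, "r1-eth2": 20 + R2_R3_COST},
    "3.3.3.3": {"r1-eth1": 50 + R2_R3_COST, "r1-eth2": 20},
}

# Interface each single-hop BFD session is supposed to use.
EXPECTED_IFACE = {"2.2.2.2": "r1-eth1", "3.3.3.3": "r1-eth2"}


def routed_egress(peer: str) -> str:
    """Interface chosen when the packet is routed by lowest cost."""
    paths = CANDIDATE_PATHS[peer]
    return min(paths, key=paths.get)


def misrouted_peers() -> list:
    """Peers whose routed egress differs from the session's interface."""
    return [
        peer
        for peer, iface in EXPECTED_IFACE.items()
        if routed_egress(peer) != iface
    ]


if __name__ == "__main__":
    for peer in EXPECTED_IFACE:
        print(peer, "expected", EXPECTED_IFACE[peer], "routed", routed_egress(peer))
    print("misrouted:", misrouted_peers())
```

Under these assumed costs, both peers resolve to r1-eth2, so only the session towards 2.2.2.2 ends up on the wrong link, consistent with one session being up and one being down.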
Running the same test environment without VRF Test, I see a routing table that I don't understand,
but that makes BFD work correctly. My question concerns the 2.2.2.2 and 3.3.3.3 routes: why don't they both have
an OSPF cost, as they do in the Test VRF? Is this something special because of unnumbered interfaces?
I would expect us to learn 3.3.3.3 via rt1-eth0, as the OSPF cost would be lower.
With these routes, however, BFD packets are sent on the correct interfaces.
Versions
OS Version: Ubuntu 20.04.5 LTS
Kernel: 5.15.91
FRR Version: 9.1
Additional context
We have tried debugging this issue by adding logs and socket options to make sure that FRR is attempting to send BFD packets out of the correct interfaces.
We tried different socket options in case we were setting them incorrectly, but I have not been able to fix the issue.
It seems to be more of an issue with how the routes are installed in the kernel.