Zebra won't automatically recover after a failed RTM_NEWNEXTHOP (bug analysis) #14481
Comments
@riw777 any PRs you know of?

The problem is still reproducible in the latest master.
After running the script:
Note the absence of Relevant log messages:
The problem is that the link carrier goes down while

I have a similar issue: when an interface flaps, I sometimes get such errors and no routes are installed.

Well, this is good news. I hope it will be merged soon, so I can backport it to my branch.

Hi, I may have run into the same condition: the interface was considered invalid, yet the BGP peerings are up over it, and routes are imported but marked as rejected.

I can reliably reproduce the situation where zebra won't automatically recover after a failed `RTM_NEWNEXTHOP` due to a `Carrier for nexthop device is down` error. I'm using a virtual device here, but we've seen the same problem on a production system after a link flap on a physical NIC.

Here is the latest FRR configuration sufficient to reproduce the behaviour:
Here is the script to reproduce the bug:
After running the script I don't see the 192.168.55.55/32 route in the kernel table.
Log from zebra with debugging enabled: https://gist.github.com/sysoleg/c171e06b7cde67c2c4d06810fcae300e
The reason why `RTM_NEWNEXTHOP` failed is because `netif_carrier_ok(dev)` is false (see `rtm_to_nh_config()` in the Linux kernel source). The NIC driver calls `netif_carrier_off(dev)` and this immediately sets the `__LINK_STATE_NOCARRIER` device state flag. The `RTM_NEWLINK` message is not even sent at this stage, only possibly scheduled (see below). Since there is almost no delay between the `echo 1` and `echo 0`, the flag is set (again) before zebra starts to install nexthops into the kernel. At that moment, zebra thinks that it's OK to install the nexthop object, but it's not, and it doesn't know it.

Now to the reason why automatic recovery fails.
Running the script without the last sleep (`sleep 1`) gives us three `RTM_NEWLINK` events (you can monitor these with the `ip monitor` command):

So only three events (DOWN, UP, UP). If there were four (DOWN, UP, DOWN, UP), automatic recovery would be possible after the last (fourth) UP. But in the three-event scenario zebra won't try to reinstall nexthops after the third UP because it thinks the interface is already UP (having already processed the second UP with problems installing objects into the kernel).
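To make the difference between the two event sequences concrete, here is a minimal Python sketch of an edge-triggered handler. It is a toy model, not zebra's actual code: the function name `replay`, the event tuples, and the `carrier_ok` field are all hypothetical. The only assumption taken from the analysis above is that a nexthop install is attempted on a DOWN-to-UP transition and that an UP received while the interface is already believed to be UP triggers nothing.

```python
def replay(events):
    """Replay link events through a toy edge-triggered handler.

    Each event is (state, carrier_ok), where carrier_ok says whether the
    kernel carrier flag is really on at the moment the handler reacts.
    Returns (install_attempts, nexthop_installed).
    """
    believed_up = True   # the handler's own view of the interface
    installed = True     # whether the nexthop is currently in the kernel
    attempts = 0
    for state, carrier_ok in events:
        if state == "DOWN":
            believed_up = False
            installed = False
        elif not believed_up:
            # DOWN -> UP transition: the only point where we (re)install.
            believed_up = True
            attempts += 1
            installed = carrier_ok  # fails if the carrier is already off again
        # UP while already believed UP: ignored, no retry.
    return attempts, installed


# Three events: the second UP is handled while the carrier is off again.
three = [("DOWN", False), ("UP", False), ("UP", True)]
# Four events: the extra DOWN makes the last UP a real transition.
four = [("DOWN", False), ("UP", False), ("DOWN", False), ("UP", True)]

print(replay(three))
print(replay(four))
```

In this model the three-event sequence ends with one failed install attempt and nothing installed, while the four-event sequence gets a second attempt on the final UP.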
The reason why we only get three events is in the Linux kernel. Looking into `linkwatch_fire_event()`, which is called by `netif_carrier_off(dev)` or `netif_carrier_on(dev)`, we can see that the work that leads to the operstate update, and eventually to the `RTM_NEWLINK` message being sent, is only scheduled immediately if the event is "urgent" (link is up in our case) or if there's no linkwatch event currently pending and we're "wrapped around" (see `linkwatch_schedule_work()`). Otherwise, the work is delayed by up to HZ jiffies (1 second) so that (quote) "runaway driver does not cause a storm of messages on the netlink socket".

If you uncomment the `sleep 1` line in the mentioned script, you'll get four events, because in this case you're delaying the last carrier "on", which allows the delayed work that was created after the previous carrier "off" to update the device link status and eventually send an event while the carrier is still "off". No nexthop problems after the last UP in this case.

So the kernel is working as it should: if not urgent, no more than one event per second. But as we have two consecutive UPs with no DOWN in between, the last UP is ignored by FRR.
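The rate limiting described above can be sketched as a small Python simulation. This is a simplified model and not the kernel's implementation: `simulate_linkwatch`, its parameters, and the timestamps are hypothetical, and the model only assumes what is stated above (carrier "on" is urgent and reported immediately; a non-urgent change is deferred until one second after the previous report; the report carries whatever the carrier state is when the work finally runs).

```python
import math


def simulate_linkwatch(changes, min_interval=1.0):
    """Toy model of linkwatch rate limiting.

    changes: list of (time_in_seconds, carrier_on) sorted by time.
    Returns the sequence of reported states ("UP"/"DOWN").
    """
    sent = []
    carrier_on = True
    next_allowed = -math.inf  # earliest time a non-urgent report may run
    pending_at = None         # time at which deferred work is due, if any

    def run_work(now):
        nonlocal next_allowed, pending_at
        # The report reflects the carrier state at the time the work runs.
        sent.append((now, "UP" if carrier_on else "DOWN"))
        next_allowed = now + min_interval
        pending_at = None

    for t, new_carrier in changes:
        # Deferred work that came due before this change runs first.
        if pending_at is not None and pending_at <= t:
            run_work(pending_at)
        carrier_on = new_carrier
        if carrier_on or t >= next_allowed:
            run_work(t)                # urgent, or outside the rate limit
        elif pending_at is None:
            pending_at = next_allowed  # defer the non-urgent report
    if pending_at is not None:
        run_work(pending_at)
    return [state for _, state in sent]


# Hypothetical timings: off, on, off, on in quick succession.
fast = [(0.000, False), (0.001, True), (0.002, False), (0.003, True)]
# Same, but the last "on" is delayed by one second.
slow = [(0.000, False), (0.001, True), (0.002, False), (1.003, True)]

print(simulate_linkwatch(fast))
print(simulate_linkwatch(slow))
```

With the fast sequence the deferred "off" report is overtaken by the urgent "on", so the model reports DOWN, UP, UP. With the delayed final "on", the deferred work runs while the carrier is still off and the model reports DOWN, UP, DOWN, UP.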
Any thoughts on this?