FRR sometimes starts with incomplete config #15799
Comments
After debugging this some, I see watchfrr is giving up because it takes too long to read the config. From systemd journal:
Essentially, |
It looks like watchfrr has code that's supposed to handle this - I have been able to work around this by setting |
I've been able to work around this, but it is a real bug for people using large configurations. Ideally, there should be a positive acknowledgment once all of the config has been loaded, rather than a semi-fixed, hard-to-debug timeout that triggers when the config is too large.
Description
We are trying to use FRR/bgpd to originate a large number of prefixes and have run into issues where it occasionally starts up with incomplete config, especially when the system is loaded:
On our test system, the issue happens occasionally during a normal restart but is 100% reproducible when the system is CPU loaded (I run one instance of `yes > /dev/null &` for every CPU core).
While trying to debug this, I noticed that the problem goes away if I add `--no-fork` to the `vtysh -b` command in `/usr/lib/frr/frrcommon.sh`.
Version
How to reproduce
Only bgpd is enabled in /etc/frr/daemons:
frr.conf I've been using to repro, with a lot of synthetic test /32s removed:
Expected behavior
FRR should start with a complete set of config (`vtysh -c 'sh run' | wc -l` should return > 24,000 lines).
Actual behavior
FRR starts without all config loaded.
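The load generation and the line-count check from this report can be wrapped in a few shell helpers. This is only a rough sketch, not part of FRR: the function names are made up for illustration, and the `systemctl restart frr` step in the usage comment assumes the stock systemd unit name. The `yes > /dev/null &` loop, the `vtysh -c 'sh run' | wc -l` check, and the 24,000-line figure come from the description above.

```shell
#!/bin/sh
# Hypothetical helpers for reproducing the report; names are illustrative.

LOAD_PIDS=""

# Start one busy loop per CPU core, or $1 loops if a count is given.
start_cpu_load() {
    n="${1:-$(nproc)}"
    i=0
    while [ "$i" -lt "$n" ]; do
        yes > /dev/null &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# Stop the busy loops started by start_cpu_load.
stop_cpu_load() {
    for pid in $LOAD_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    LOAD_PIDS=""
}

# Count the lines of the running config, as done in the report.
running_config_lines() {
    vtysh -c 'sh run' | wc -l
}

# Succeed only if the observed line count ($1) exceeds the minimum ($2).
config_is_complete() {
    [ "$1" -gt "$2" ]
}

# Possible usage on a test machine (assumes the service is named "frr"):
#   start_cpu_load
#   systemctl restart frr
#   stop_cpu_load
#   config_is_complete "$(running_config_lines)" 24000 || echo "incomplete config"
```

Because the helpers only define functions, sourcing the file has no side effects; the restart itself stays a manual step so it can be adapted to however FRR is started on a given system.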
Additional context
No response
Checklist