-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: fix a pile of directory and/or state retention related issues #15243
*: fix a pile of directory and/or state retention related issues #15243
Conversation
As found in the GNU autoconf archive https://git.savannah.gnu.org/gitweb/?p=autoconf-archive.git Signed-off-by: David Lamparter <[email protected]>
The autoconf version can just loop forever, abort instead. Signed-off-by: David Lamparter <[email protected]>
Replace inline expansion loop. (Also correctly handles prefix=NONE and exec_prefix=NONE inside the macro.) Signed-off-by: David Lamparter <[email protected]>
`--sysconfdir` should be `/etc` and `--localstatedir` should be `/var`. The package-specific subdirectory should be added by configure, not given by the user, to match established behavior by other packages. Note that `--bindir`, `--sbindir`, `--libdir` and `--libexecdir` have different established/expected behavior due to distro specific multi-arch support. That's why these are left unchanged. The reason this is getting fixed now is that we need to use `--localstatedir` for its actual value to put things in `/var/lib`. As it is now, being overloaded for `/run`, the configured `/var` path becomes inaccessible. Signed-off-by: David Lamparter <[email protected]>
`--sysconfdir` and `--localstatedir` now align with general autoconf practices. Signed-off-by: David Lamparter <[email protected]>
Also remove frr_init_vtydir(), just initialize to default. Signed-off-by: David Lamparter <[email protected]>
These paths were ignoring the `-N` namespacing option. Signed-off-by: David Lamparter <[email protected]>
This just unnecessarily complicates things by involving autoconf. Signed-off-by: David Lamparter <[email protected]>
This just unnecessarily complicates things by involving autoconf. Signed-off-by: David Lamparter <[email protected]>
This just unnecessarily complicates things by involving autoconf. Signed-off-by: David Lamparter <[email protected]>
This needs to be used for persistent state, which currently is misplaced into `/var/run` / `/run` where it gets deleted across reboots. Signed-off-by: David Lamparter <[email protected]>
These functions load daemon-specific persistent state from `/var/lib/frr` and supersede open-coded variants of similar calls in ospfd, ospf6d and isisd to save GR state and/or sequence numbers. Unlike the open-coded variants, the save call correctly `fsync()`s the saved data to ensure disk contents are consistent. Signed-off-by: David Lamparter <[email protected]>
clang-format doesn't understand FRR_DAEMON_INFO is a long macro where laying out items semantically makes sense. (Also use only one `FRR_DAEMON_INFO(` in isisd so editors don't get confused with the mismatching `( ( )`. Signed-off-by: David Lamparter <[email protected]>
This belongs in `/var/lib`, not `/var/run`. Also the filename was typo'd (`isid-restart.json`). Change to proper location and fall back to previous in case it's the first restart after an FRR update from a version with the bugged path. Signed-off-by: David Lamparter <[email protected]>
This belongs in `/var/lib`, not `/var/run`. Use library facility to load/save, support previous path as fallback, and do proper fsync(). Signed-off-by: David Lamparter <[email protected]>
Unfortunately, `ospf6d` is much worse than `ospfd` and `isisd` regarding its state saving, due to the existence of the auth trailer code. Again, this belongs in `/var/lib`, not `/var/run`. Merge both state files into one, and add reconciliation code for the auth seqno. I'm gonna save my comment on the fact that `ospf6_auth_seqno_nvm_delete` is not in fact used anywhere. Which is now a warning because it's `static`. Well. It probably should be used somewhere, so leave it in. Signed-off-by: David Lamparter <[email protected]>
Both of these belong in `/var/lib`, not `/var/run`. Rather hilariously, the history read in `mgmt_history_read_cmt_record_index` was always failing, because it was doing a `file_exists(MGMTD_COMMIT_FILE_PATH)` check. Which is the wrong macro - it's `.../commit-%s.json`, including the unprocessed `%s`, which would never exist. I guess noone ever tried if this actually works. Cool. On the plus side, this means I don't have to implement legacy compatibility for this, since it never worked to begin with. (SQLite3 DB location is also changed in this commit since it also uses `DAEMON_DB_DIR`.) Signed-off-by: David Lamparter <[email protected]>
This doesn't align with what the code actually loads, remove it. Signed-off-by: David Lamparter <[email protected]>
Use consistent `e_somepath` names for expanded versions of `somepath`. Also remove all paths from `config.h` and put them into `lib/config_paths.h` - this is to make more obvious when someone is doing something probably not quite properly structured. Signed-off-by: David Lamparter <[email protected]>
… frrbot complaining about … that's not something I touched (I guess I touched code nearby) … |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
d5dfc9b
to
bbd8589
Compare
@@ -0,0 +1,56 @@ | |||
# =========================================================================== | |||
# https://www.gnu.org/software/autoconf-archive/ax_recursive_eval.html | |||
# =========================================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any particular reason this commit doesn't replace the licensing with the spx first line while the second commit does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to keep the import 1:1 to the source from the GNU autoconf archive before changing it, so it's visible what exactly changed
@@ -28,20 +28,18 @@ source="$pkgname-$pkgver.tar.gz" | |||
builddir="$srcdir"/$pkgname-$pkgver | |||
|
|||
_sbindir=/usr/lib/frr | |||
_sysconfdir=/etc/frr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lots of developers probably have custom scripts to set this up for themselves. Do we need to have them change anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the configure script detects and removes the /frr
(before re-adding it)
configure.ac
Outdated
@@ -2740,8 +2740,6 @@ AC_DEFINE_UNQUOTED([LDPD_SOCKET], ["$CFG_STATE%s%s/ldpd.sock"], [ldpd control so | |||
AC_DEFINE_UNQUOTED([ZEBRA_SERV_PATH], ["$CFG_STATE%s%s/zserv.api"], [zebra api socket]) | |||
AC_DEFINE_UNQUOTED([BFDD_CONTROL_SOCKET], ["$CFG_STATE%s%s/bfdd.sock"], [bfdd control socket]) | |||
AC_DEFINE_UNQUOTED([OSPFD_GR_STATE], ["$CFG_STATE%s/ospfd-gr.json"], [ospfd GR state information]) | |||
AC_DEFINE_UNQUOTED([MGMTD_FE_SERVER_PATH], ["$CFG_STATE/mgmtd_fe.sock"], [mgmtd frontend server socket]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a genuine bug fix for mgmtd ( and should be backported to 9.0 and 9.1 as well right? ) Should this be in it's own PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I'm still pondering if we shouldn't just backport this entire PR; saving state to /var/run
is such a weird thing to do. But fixing that needs the entire other pile of commits too, which is a bit much…
… and because we have (historically) overloaded --localstatedir
in this stupid way I can't really "undo" or split off that change for a backport, without that /var/lib is kinda "inaccessible" …
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Case in point: if you configure OSPFv3 with authentication trailers and then reboot your box… your sequence number is gone and all the other routers refuse you into the network due to replay protection 🤡
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(regardless, it's not going to be a clean backport ⇒ need manual backport PR anyways ⇒ can adjust/drop things if needed)
To be clear, this is a bugfix PR, not a cleanup PR. It just looks a bit like cleanup PR because I became exasperated enough to clean up the entire thing rather than fixing individual weirdness (and doing so was necessary because |
Adapting to FRRouting#15243 upstream FRR. Signed-off-by: David Lamparter <[email protected]>
All of these are inter-related to some degree, here's the rough grouping:
These 3 just replace some open-coded autoconf hacking with a proper autoconf function.
Main change re.
./configure
parameters:after this,
--sysconfdir=
and--localstatedir=
expect/etc
and/var
(previously/etc/frr
and/var/run/frr
). Both of these now match standard package behavior. (Thefrr
is appended by autoconf.) Thelocalstatedir
one is necessary to correctly encode/var/lib/frr
down the road. Thesysconfdir
one comes along for the ride to make packaging life easier.Note:
--sbindir
--libdir
etc. don't have the same level of standard package behavior, and different distros do different things here anyway (e.g. multi-arch libraries), that's why they're intentionally not getting the same treatment.frr_vtydir
tofrr_runstatedir
Cleanup to make things less confusing.
This was plain wrong,
mgmtd
didn't correctly do-N
.BFDD_CONTROL_SOCKET
ZEBRA_SERV_PATH
LDPD_SOCKET
autoconf simplifications, there's no benefit in plumbing these through autoconf
frr_libstatedir
Main change: start using
/var/lib/frr
.Any state-saving features in FRR need to go to
/var/lib/frr
, not/var/run/frr
./var/run
is cleared out across reboots and therefore unsuitable for these. The daemons will create the directory (with correct permissions, which arefrr:frr 0700
) if it doesn't exist.frr_daemon_state_{load,save}
Extracting common code from isisd, ospfd and ospf6d, and fixing state sync writes with proper sync-to-disk behavior.
frr_daemon_info
indentationIndentation fix, here because I'm touching these in the next commits.
Proper state save and load using libfrr common functions and
/var/lib/frr
.Same kind of problem,
mgmtd
was saving things into places that get erased across reboots. Oops.Just an incorrect printout.
This is, TBFH, mostly a guardrail against future issues of the same kind. The
./configure
paths are moved intolib/config_paths.h
, which is a red flag if you see it included somewhere else, as noted in the file.d5dfc9b doc, build: shorten too long filenameFreebie at the end. This causedRemoved due to conflict.make dist
to fail.I tried actually running FRR for a change. It did not go well. I ended up stepping in a few piles of 💩 in the process.
The specific thing that made this cross the use-a-💩-emoji threshold is in bfd6d8e which clearly never worked to begin with and therefore cannot have been tested at all.