Add ability to handle noprefixroute to zebra #14957

Merged: 3 commits, Dec 7, 2023

Conversation

donaldsharp (Member) commented Dec 6, 2023

See the individual commits for details. An address can be marked noprefixroute, in which case the kernel does not install the connected (prefix) route for it. Let's honor that in zebra.

Closes #14952

The linux kernel can send up a flag that tells us that the
connected address is not a PREFIXROUTE.  Add the ability
to note this and pass it up from the data plane.

Signed-off-by: Donald Sharp <[email protected]>
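
The flag check described in this commit can be sketched as follows. This is only an illustration in Python, not the zebra C code from the PR. The flag values are the ones defined in the Linux UAPI header linux/if_addr.h; the helper names effective_flags and is_prefix_route are hypothetical. Because IFA_F_NOPREFIXROUTE (0x200) does not fit in the 8-bit ifa_flags field of struct ifaddrmsg, the sketch assumes the value is read from the 32-bit IFA_FLAGS netlink attribute when that attribute is present.

```python
from typing import Optional

# Values from the Linux UAPI header <linux/if_addr.h>.
IFA_F_PERMANENT = 0x80
IFA_F_NOPREFIXROUTE = 0x200


def effective_flags(header_flags: int, ifa_flags_attr: Optional[int]) -> int:
    """Prefer the 32-bit IFA_FLAGS attribute; fall back to the 8-bit header field.

    Hypothetical helper: the header field alone can never carry 0x200.
    """
    if ifa_flags_attr is not None:
        return ifa_flags_attr
    return header_flags & 0xFF


def is_prefix_route(flags: int) -> bool:
    """True when the kernel is expected to install a prefix route for the address."""
    return not (flags & IFA_F_NOPREFIXROUTE)


# Address added with "noprefixroute": the attribute carries PERMANENT | NOPREFIXROUTE.
flags_55 = effective_flags(IFA_F_PERMANENT, IFA_F_PERMANENT | IFA_F_NOPREFIXROUTE)
# Ordinary address: no extended attribute in this toy example.
flags_56 = effective_flags(IFA_F_PERMANENT, None)

print(is_prefix_route(flags_55), is_prefix_route(flags_56))
```

The boolean produced here is the piece of information the commit message says is passed up from the data plane.
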
Add ability for the connected routes to know
if they are a prefix route or not.

sharpd@eva:/work/home/sharpd/frr1$ ip addr show dev dummy1
13: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether aa:93:ce:ce:3f:62 brd ff:ff:ff:ff:ff:ff
    inet 192.168.55.1/24 scope global noprefixroute dummy1
       valid_lft forever preferred_lft forever
    inet 192.168.56.1/24 scope global dummy1
       valid_lft forever preferred_lft forever
    inet6 fe80::a893:ceff:fece:3f62/64 scope link
       valid_lft forever preferred_lft forever

sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show int dummy1"
Interface dummy1 is up, line protocol is up
  Link ups:       0    last: (never)
  Link downs:     0    last: (never)
  vrf: default
  index 13 metric 0 mtu 1500 speed 0 txqlen 1000
  flags: <UP,BROADCAST,RUNNING,NOARP>
  Type: Ethernet
  HWaddr: aa:93:ce:ce:3f:62
  inet 192.168.55.1/24 noprefixroute
  inet 192.168.56.1/24
  inet6 fe80::a893:ceff:fece:3f62/64
  Interface Type Other
  Interface Slave Type None
  protodown: off

sharpd@eva:/work/home/sharpd/frr1$ sudo vtysh -c "show ip route"
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:00:08
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr2 linkdown, 00:00:08
L>* 192.168.44.1/32 is directly connected, dummy2, 00:00:08
L>* 192.168.55.1/32 is directly connected, dummy1, 00:00:08
C>* 192.168.56.0/24 is directly connected, dummy1, 00:00:08
L>* 192.168.56.1/32 is directly connected, dummy1, 00:00:08
L>* 192.168.119.205/32 is directly connected, enp13s0, 00:00:08

sharpd@eva:/work/home/sharpd/frr1$ ip route show
default via 192.168.119.1 dev enp13s0 proto dhcp metric 100
169.254.0.0/16 dev virbr2 scope link metric 1000 linkdown
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.45.0/24 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
192.168.56.0/24 dev dummy1 proto kernel scope link src 192.168.56.1
192.168.119.0/24 dev enp13s0 proto kernel scope link src 192.168.119.205 metric 100
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown

sharpd@eva:/work/home/sharpd/frr1$ ip route show table 255
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
local 172.17.0.1 dev docker0 proto kernel scope host src 172.17.0.1
broadcast 172.17.255.255 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
local 192.168.44.1 dev dummy2 proto kernel scope host src 192.168.44.1
broadcast 192.168.44.255 dev dummy2 proto kernel scope link src 192.168.44.1
local 192.168.45.1 dev virbr2 proto kernel scope host src 192.168.45.1
broadcast 192.168.45.255 dev virbr2 proto kernel scope link src 192.168.45.1 linkdown
local 192.168.55.1 dev dummy1 proto kernel scope host src 192.168.55.1
broadcast 192.168.55.255 dev dummy1 proto kernel scope link src 192.168.55.1
local 192.168.56.1 dev dummy1 proto kernel scope host src 192.168.56.1
broadcast 192.168.56.255 dev dummy1 proto kernel scope link src 192.168.56.1
local 192.168.119.205 dev enp13s0 proto kernel scope host src 192.168.119.205
broadcast 192.168.119.255 dev enp13s0 proto kernel scope link src 192.168.119.205
local 192.168.122.1 dev virbr0 proto kernel scope host src 192.168.122.1
broadcast 192.168.122.255 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown

Fixes: FRRouting#14952
Signed-off-by: Donald Sharp <[email protected]>
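
A simplified model of the behaviour visible in the output above, assuming zebra derives a local (L) host route for every address and a connected (C) prefix route only when the address is not flagged noprefixroute. The function synthesise_routes is a hypothetical name used for illustration; the real logic lives in zebra's C code and handles more cases than this sketch.

```python
import ipaddress
from typing import List, Tuple


def synthesise_routes(address: str, noprefixroute: bool = False) -> List[Tuple[str, str]]:
    """Return (code, prefix) pairs modelled on the 'show ip route' output above."""
    iface = ipaddress.ip_interface(address)
    host_len = iface.network.max_prefixlen  # 32 for IPv4, 128 for IPv6
    routes = [("L", f"{iface.ip}/{host_len}")]
    if not noprefixroute:
        # Only addresses that keep their prefix route get a connected route.
        routes.append(("C", str(iface.network)))
    return routes


for addr, flag in (("192.168.55.1/24", True), ("192.168.56.1/24", False)):
    for code, prefix in synthesise_routes(addr, noprefixroute=flag):
        print(f"{code}>* {prefix} is directly connected, dummy1")
```

With the two dummy1 addresses from the example, the sketch yields a local route for each address but a connected route only for 192.168.56.0/24, matching the routing table shown in the commit message.
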
Add a simple test case to ensure that the noprefixroute
code stays working in the future.

Signed-off-by: Donald Sharp <[email protected]>
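
The test added by this commit is not shown on this page. As a rough idea of what such a check could look like, the sketch below parses excerpts of the "show int" and "show ip route" output quoted earlier and asserts that no connected route exists for a noprefixroute address. The helper names and the parsing regexes are assumptions made for this illustration and are not taken from the FRR test suite.

```python
import ipaddress
import re
from typing import Set

# Excerpts copied from the command output in the commit message above.
SHOW_INT = """\
  inet 192.168.55.1/24 noprefixroute
  inet 192.168.56.1/24
  inet6 fe80::a893:ceff:fece:3f62/64
"""

SHOW_ROUTE = """\
L>* 192.168.55.1/32 is directly connected, dummy1, 00:00:08
C>* 192.168.56.0/24 is directly connected, dummy1, 00:00:08
L>* 192.168.56.1/32 is directly connected, dummy1, 00:00:08
"""


def noprefixroute_networks(show_int: str) -> Set[str]:
    """Networks of all addresses that are flagged noprefixroute."""
    nets = set()
    for line in show_int.splitlines():
        m = re.match(r"\s*inet6?\s+(\S+)\s+noprefixroute\b", line)
        if m:
            nets.add(str(ipaddress.ip_interface(m.group(1)).network))
    return nets


def connected_prefixes(show_route: str) -> Set[str]:
    """Prefixes of all routes whose code letter is C (connected)."""
    prefixes = set()
    for line in show_route.splitlines():
        m = re.match(r"C[ >][ *]\s+(\S+)", line)
        if m:
            prefixes.add(m.group(1))
    return prefixes


# The check itself: no connected route may cover a noprefixroute address.
assert noprefixroute_networks(SHOW_INT).isdisjoint(connected_prefixes(SHOW_ROUTE))
```
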
frrbot bot added the bugfix, tests (Topotests, make check, etc) and zebra labels Dec 6, 2023

ton31337 (Member) commented Dec 6, 2023

@Mergifyio backport stable/9.1 stable/9.0

mergify bot commented Dec 6, 2023

backport stable/9.1 stable/9.0

✅ Backports have been created

donaldsharp (Member, Author) commented

Is this more of a feature or a bug?

ton31337 (Member) commented Dec 6, 2023

to me it sounds like a bug 🤷‍♂️

@ton31337 ton31337 merged commit 24869b4 into FRRouting:master Dec 7, 2023
80 checks passed
donaldsharp added a commit that referenced this pull request Dec 7, 2023
Add ability to handle `noprefixroute` to zebra (backport #14957)
donaldsharp added a commit that referenced this pull request Dec 7, 2023
Add ability to handle `noprefixroute` to zebra (backport #14957)
@Jafaral Jafaral added the release-notes should be added to release notes label Jan 3, 2024
@github-actions github-actions bot added the rebase PR needs rebase label Jan 3, 2024
tohojo added a commit to tohojo/frr that referenced this pull request Jul 17, 2024
When importing routes from the kernel, the zebra daemon ignores any routes
marked as 'proto kernel', such as the link-scoped routes that the kernel
generates for addresses assigned to interfaces. Instead, zebra implements its
own logic to synthesise routes for each address assignment, installing them into
the RIB with the ZEBRA_ROUTE_CONNECT proto set.

This behaviour requires zebra to mirror the logic of the kernel, to avoid having
the kernel FIB diverge from the FRR RIB, which can cause routing loops or other
failures. One example of this was the recent addition of support for the
'noprefixroute' flag to zebra[0].

However, attempting to mirror the kernel behaviour this way causes problems when
the mirroring is imperfect. An example of this was seen as a result of the
change mentioned above, where zebra honouring the noprefixroute flag leads to
routes missing from the RIB in some cases. Specifically, this happens when a
network management daemon sets the noprefixroute flag on the address assignment,
but subsequently installs a link-scoped route into the kernel identical to the
prefix route the kernel would have installed automatically. The use case for
this is to enable the network management daemon to atomically change route
attributes (such as route metric) on the prefix route, but otherwise keep the
behaviour identical to the case where the kernel creates the prefix route
itself.

The failure described above was noticed for NetworkManager and reported as a
NetworkManager bug[1] as well as an FRR issue[2]. Other network management
daemons use the noprefixroute flag for similar purposes (e.g.,
systemd-networkd[3]).

[0] FRRouting#14957
[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1452
[2] FRRouting#16101
[3] https://github.com/systemd/systemd/blob/main/src/network/networkd-dhcp4.c#L962

To resolve this discrepancy between the kernel FIB and the FRR RIB, this patch
changes zebra's behaviour to import 'proto kernel' routes instead of ignoring
them, and to treat routes with 'scope link' as ZEBRA_ROUTE_CONNECT routes, just
like the ones synthesised by zebra itself. This allows the noprefixroute flag to
work correctly, while still playing nice with network management daemons that
install a different link-scope route for installed addresses. The change in
behaviour can be seen from the following example:

Kernel config:
5: veth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fe:da:bb:eb:74:17 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.11.1.2/24 scope global veth0
       valid_lft forever preferred_lft forever
    inet 10.12.0.0/24 scope global noprefixroute veth0
       valid_lft forever preferred_lft forever

10.11.0.0/16 via 10.11.1.1 dev veth0
10.11.1.0/24 dev veth0 proto kernel scope link src 10.11.1.2
10.12.0.0/24 dev veth0 proto kernel scope link metric 100

The 10.12.0.0/24 route was manually added with:

Running zebra, pre-patch:

Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 10.11.0.0/16 [0/0] via 10.11.1.1, veth0, 00:00:22
C>* 10.11.1.0/24 is directly connected, veth0, 00:00:22
L>* 10.11.1.2/32 is directly connected, veth0, 00:00:22
L>* 10.12.0.0/32 is directly connected, veth0, 00:00:22

Notice that the 10.12.0.0/24 route is missing from the RIB.

After the patch:

Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 10.11.0.0/16 [0/0] via 10.11.1.1, veth0, 00:00:05
C * 10.11.1.0/24 is directly connected, veth0, 00:00:05
C>* 10.11.1.0/24 is directly connected, veth0, 00:00:05
L>* 10.11.1.2/32 is directly connected, veth0, 00:00:05
C>* 10.12.0.0/24 [0/100] is directly connected, veth0, 00:00:05
L>* 10.12.0.0/32 is directly connected, veth0, 00:00:05

The prefix is now shown as connected (C>), as it should be. Note also that the
other prefix (10.11.1.0/24, without the noprefixroute flag) now appears twice,
because it is both created by zebra from the interface config and imported from
the kernel. This is harmless as the routes are identical, and an arbitrary one
just ends up being selected.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
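
The import decision described in this commit message can be modelled roughly as below. This is a sketch in Python, not the patch itself: parse_route and classify are hypothetical helpers, the parser only handles the simple key/value form of the three example lines (it would not cope with trailing flags such as linkdown), and treating only 'proto kernel' routes with 'scope link' as connected is an assumption about how the two conditions combine.

```python
from typing import Dict, Optional

# The three kernel routes from the example above.
KERNEL_ROUTES = [
    "10.11.0.0/16 via 10.11.1.1 dev veth0",
    "10.11.1.0/24 dev veth0 proto kernel scope link src 10.11.1.2",
    "10.12.0.0/24 dev veth0 proto kernel scope link metric 100",
]


def parse_route(line: str) -> Dict[str, Optional[str]]:
    """Split an 'ip route show' line into the prefix plus key/value pairs."""
    tokens = line.split()
    route: Dict[str, Optional[str]] = {"prefix": tokens[0]}
    it = iter(tokens[1:])
    for key in it:
        route[key] = next(it, None)
    return route


def classify(route: Dict[str, Optional[str]], patched: bool) -> Optional[str]:
    """Return 'K' (kernel), 'C' (connected) or None (route is not imported)."""
    if route.get("proto") == "kernel":
        if not patched:
            return None  # pre-patch: 'proto kernel' routes are ignored
        # post-patch: imported; link-scoped ones are treated as connected
        return "C" if route.get("scope") == "link" else "K"
    return "K"


for patched in (False, True):
    label = "after the patch" if patched else "pre-patch"
    codes = [classify(parse_route(line), patched) for line in KERNEL_ROUTES]
    print(label, codes)
```

In this model the pre-patch import drops both 'proto kernel' routes, so 10.12.0.0/24 can only enter the RIB through zebra's own synthesis, which skips it because of the noprefixroute flag. After the patch both link-scoped routes are imported as connected, which also explains the duplicate 10.11.1.0/24 entry.
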
Labels: backport, bugfix, master, rebase (PR needs rebase), release-notes (should be added to release notes), size/M, tests (Topotests, make check, etc), zebra
Development

Successfully merging this pull request may close these issues.

FRRouting ignores noprefixroute flag and uses the prefix as the connected route
3 participants