Failure to remove IPv6 static route with multiple output paths #7718

Closed
mg964 opened this issue Dec 11, 2020 · 23 comments
Labels
platform Issue in a specific platform staticd vyatta zebra

Comments

@mg964

mg964 commented Dec 11, 2020

Looking for a clue as to why the following is failing. I'm running FRR 7.4 under DANOS, but I've managed to strip away the DANOS portions and am left with a fairly simple repro of the problem.

Two files with the necessary configuration to add and delete the route:

vyatta@vm-ipv6fib-1:~$ cat add-ipv6-route.txt 
ipv6 route 10:2::1/128 dp0p1s1
ipv6 route 10:2::1/128 dp0p1s2.102
ipv6 route 10:2::1/128 dp0p1s2.103
vyatta@vm-ipv6fib-1:~$ cat del-ipv6-route.txt 
no ipv6 route 10:2::1/128 dp0p1s1
no ipv6 route 10:2::1/128 dp0p1s2.102
no ipv6 route 10:2::1/128 dp0p1s2.103
vyatta@vm-ipv6fib-1:~$ 

Run "sudo vtysh -f ~/add-ipv6-route.txt" to add the route and all looks OK:

vyatta@vm-ipv6fib-1:~$ ip -6 route list 10:2::1/128
10:2::1 dev dp0p1s1 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.102 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.103 proto static metric 20 pref medium
vyatta@vm-ipv6fib-1:~$ 

But if you now apply the delete, only the first next-hop gets removed from the kernel:

vyatta@vm-ipv6fib-1:~$ sudo vtysh -f ~/del-ipv6-route.txt
vyatta@vm-ipv6fib-1:~$ ip -6 route list 10:2::1/128
10:2::1 dev dp0p1s2.102 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.103 proto static metric 20 pref medium
vyatta@vm-ipv6fib-1:~$ 
vyatta@vm-ipv6fib-1:~$ sudo vtysh -c "show run"
Building configuration...

Current configuration:
!
frr version 7.4
frr defaults traditional
hostname node
log syslog
no zebra nexthop kernel enable
hostname vm-ipv6fib-1
service integrated-vtysh-config
!
line vty
!
end
vyatta@vm-ipv6fib-1:~$ 

You can clean things up by manually re-adding the routes and re-issuing the delete - repeat for each of the output paths until the route is finally removed from the kernel.

Any ideas where to start? Is it some sort of vtysh issue?

@mg964
Author

mg964 commented Dec 11, 2020

Can confirm the same issue exists with a master build "frr version 7.6-dev-20201209-05-g327c3aad2".

One other data point. You can cut and paste the add sequence directly into vtysh, but not the delete. To delete everything you have to "slowly" issue each individual "no ipv6 route ..." command, i.e. wait a second or two before issuing the next command.
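
A minimal sketch of the "slow" delete workaround described above, assuming Python 3 is available on the box and that vtysh accepts repeated -c options (as with the vtysh -c invocations used in this thread). The helper names vtysh_args and apply_paced and the 2-second default are illustrative only and are not part of FRR. The pause merely masks the symptom; it does not address the cause.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Optional


def vtysh_args(config_line: str) -> List[str]:
    """Build a vtysh command that applies one line in configure mode."""
    return ["vtysh", "-c", "configure terminal", "-c", config_line]


def apply_paced(
    lines: Iterable[str],
    delay_s: float = 2.0,
    run: Optional[Callable[[List[str]], object]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Issue each config line as its own vtysh call, pausing between calls.

    `run` and `sleep` are injectable so the pacing logic can be exercised
    without a live FRR instance (an assumption made for this sketch).
    """
    if run is None:
        # Default: really invoke vtysh; needs sufficient privileges.
        run = lambda args: subprocess.run(args, check=True)

    issued: List[List[str]] = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and "!" comment/separator lines.
        if not line or line.startswith("!"):
            continue
        if issued:
            # Wait only between commands, not before the first one.
            sleep(delay_s)
        args = vtysh_args(line)
        run(args)
        issued.append(args)
    return issued

# Example (run as root, hypothetical path taken from the report):
#   with open("del-ipv6-route.txt") as fh:
#       apply_paced(fh, delay_s=2.0)
```

The delay value comes from the "a second or two" observation in the report, so treat it as a guess rather than a tuned number.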

@mjstapp
Contributor

mjstapp commented Dec 11, 2020

I'm not able to reproduce this "need a delay" issue with master. My 'uninstall' file was:

no ipv6 route 10:1::1/128 ens33
no ipv6 route 10:1::1/128 ANNIE
no ipv6 route 10:1::1/128 BETTY

I didn't have any subinterface config handy, so that may be different? I just ran the daemons and vtysh from a development sandbox: is there some other component of the danos/whatever stack that's involved?

@mg964
Author

mg964 commented Dec 11, 2020

OK thanks.

The only DANOS item left in the test is the dataplane, at the moment I can't see how that would impact the kernel. Is staticd the right place to be poking about for clues at what might be happening?

@mjstapp
Contributor

mjstapp commented Dec 11, 2020

> OK thanks.
>
> The only DANOS item left in the test is the dataplane, at the moment I can't see how that would impact the kernel. Is staticd the right place to be poking about for clues at what might be happening?

Sure: staticd receives the "ipv6 route" CLI input there and processes it to decide what to send to zebra. zebra answers the "show ipv6 route" CLI. Enabling some debugs would give you some insight too:
debug zebra rib detail
debug zebra events
debug zebra kernel

Those should show you what staticd is sending, and when, and then how zebra disposes of the info.

@pjdruddy
Contributor

Hi @mg964 - it might be worth knocking up a quick topotest to check this in a vanilla FRR setting. Can point you in the right direction if you want to pursue this.

@donaldsharp
Member

So the output from the first ip -6 route ... does not show ECMP behavior at all in the Linux kernel. After the route installation, can we see vtysh -c "show ipv6 route"? And can we see the route installation with debug zebra rib detail, debug zebra dplane, and debug zebra kernel turned on? Something is going wrong at install time, IMO.

@mg964
Author

mg964 commented Dec 11, 2020

Does look like the multipath is not present in the kernel. This is a similar test (which works) using ipv4:

vyatta@vm-ipv6fib-1:~$ ip -4 route list 10.2.0.1/32
10.2.0.1 proto static metric 20 
	nexthop dev dp0p1s1 weight 1 
	nexthop dev dp0p1s2.102 weight 1 
	nexthop dev dp0p1s2.103 weight 1 
vyatta@vm-ipv6fib-1:~$ 

Here's a re-run with ipv6:

vyatta@vm-ipv6fib-1:~$ sudo vtysh -c "show ipv6 route"
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure
C>* 1::1/128 is directly connected, lo1, 00:04:55
S>* 10:2::1/128 [1/0] is directly connected, dp0p1s1, weight 1, 00:00:19
  *                   is directly connected, dp0p1s2.102, weight 1, 00:00:19
  *                   is directly connected, dp0p1s2.103, weight 1, 00:00:19
C>* 192:168:1::/64 is directly connected, dp0p1s1, 00:04:51
C>* 192:168:2::/64 is directly connected, dp0p1s2.102, 00:04:49
C>* 192:168:3::/64 is directly connected, dp0p1s2.103, 00:04:50
C * fe80::/64 is directly connected, dp0p1s2.102, 00:04:49
C * fe80::/64 is directly connected, dp0p1s2.103, 00:04:50
C * fe80::/64 is directly connected, dp0p1s1, 00:04:52
C * fe80::/64 is directly connected, dp0p1s3, 00:04:52
C * fe80::/64 is directly connected, dp0p1s2, 00:04:52
C * fe80::/64 is directly connected, lo1, 00:04:56
C>* fe80::/64 is directly connected, ens2, 00:05:10
vyatta@vm-ipv6fib-1:~$ ip -6 route list 10:2::1/128
10:2::1 dev dp0p1s1 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.102 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.103 proto static metric 20 pref medium
vyatta@vm-ipv6fib-1:~$ 
vyatta@vm-ipv6fib-1:~$ sudo vtysh -f ~/del-ipv6-route.txt
vyatta@vm-ipv6fib-1:~$ 
vyatta@vm-ipv6fib-1:~$ sudo vtysh -c "show ipv6 route"
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure
C>* 1::1/128 is directly connected, lo1, 00:17:25
C>* 10:1::/64 is directly connected, dp0p1s3, 00:02:27
C>* 192:168:1::/64 is directly connected, dp0p1s1, 00:17:21
C>* 192:168:2::/64 is directly connected, dp0p1s2.102, 00:17:19
C>* 192:168:3::/64 is directly connected, dp0p1s2.103, 00:17:20
C * fe80::/64 is directly connected, dp0p1s2.102, 00:17:19
C * fe80::/64 is directly connected, dp0p1s2.103, 00:17:20
C * fe80::/64 is directly connected, dp0p1s1, 00:17:22
C * fe80::/64 is directly connected, dp0p1s3, 00:17:22
C * fe80::/64 is directly connected, dp0p1s2, 00:17:22
C * fe80::/64 is directly connected, lo1, 00:17:26
C>* fe80::/64 is directly connected, ens2, 00:17:40
vyatta@vm-ipv6fib-1:~$ ip -6 route list 10:2::1/128
10:2::1 dev dp0p1s2.102 proto static metric 20 pref medium
10:2::1 dev dp0p1s2.103 proto static metric 20 pref medium
vyatta@vm-ipv6fib-1:~$ 

Logging: frr-ipv6-route.txt

@donaldsharp
Member

The output given in the initial message is not v6 ECMP from my perspective. Something has gone wrong, hence the request for more data. Secondly, the IPv4 quote is missing data from your last response.

@donaldsharp
Member

This is what I expect it to look like:

eva(config)# ipv6 route 1::1/128 fdfc:6960:ae84:10::1
eva(config)# ipv6 route 1::1/128 fdfc:6960:ae84:10::2
eva(config)# end
eva# show ipv6 route 1::1
Routing entry for 1::1/128
  Known via "static", distance 1, metric 0, best
  Last update 00:00:06 ago
  * fdfc:6960:ae84:10::1, via enp39s0, weight 1
  * fdfc:6960:ae84:10::2, via enp39s0, weight 1

eva# exit
sharpd@eva ~/f/doc (pbr_ifp_attention)> ip -6 route show 1::1
1::1 nhid 449 proto static metric 20 pref medium
	nexthop via fdfc:6960:ae84:10::1 dev enp39s0 weight 1 
	nexthop via fdfc:6960:ae84:10::2 dev enp39s0 weight 1 
sharpd@eva ~/f/doc (pbr_ifp_attention)> 
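
The difference between the two kernel outputs in this thread (one entry with indented "nexthop" lines versus several independent entries for the same prefix) can be checked mechanically. The following is a rough sketch, assuming the text layout printed by the iproute2 versions shown here; older tools may render IPv6 multipath differently, so the verdict is a heuristic. The function names parse_ip_route and describe are hypothetical.

```python
from typing import Dict, List, Optional


def _value_after(tokens: List[str], key: str) -> Optional[str]:
    """Return the token following `key`, or None if absent."""
    if key in tokens:
        i = tokens.index(key)
        if i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def parse_ip_route(text: str) -> List[Dict[str, object]]:
    """Parse `ip route` text into entries with their nexthop devices."""
    routes: List[Dict[str, object]] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tokens = line.split()
        # Indented "nexthop ..." lines belong to the preceding route entry.
        if line[0].isspace() and tokens[0] == "nexthop":
            if routes:
                routes[-1]["nexthops"].append(_value_after(tokens, "dev"))
            continue
        routes.append(
            {
                "prefix": tokens[0],
                "dev": _value_after(tokens[1:], "dev"),
                "nexthops": [],
            }
        )
    return routes


def describe(text: str, prefix: str) -> str:
    """Classify how `prefix` appears in the kernel's route listing."""
    entries = [r for r in parse_ip_route(text) if r["prefix"] == prefix]
    if not entries:
        return "absent"
    if len(entries) > 1:
        return "separate routes"
    if len(entries[0]["nexthops"]) > 1:
        return "multipath"
    return "single path"
```

Fed the "10:2::1 dev ..." listing from the original report, describe would report separate routes, while the "1::1 nhid 449 ..." listing above would be classified as multipath.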

@mg964
Author

mg964 commented Dec 11, 2020

It's the interface/gateway attribute that is causing problems. Use an address and all works as expected.

@donaldsharp
Member

eva(config)# ipv6 route 2::2/128 dummy302
eva(config)# ipv6 route 2::2/128 enp39s0
eva(config)# do show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure
K>* ::/0 [0/100] via fe80::e063:daff:fe79:1dab, enp39s0, 1d01h54m
S>* 1::1/128 [1/0] via fdfc:6960:ae84:10::1, enp39s0, weight 1, 00:29:24
  *                via fdfc:6960:ae84:10::2, enp39s0, weight 1, 00:29:24
S>* 2::2/128 [1/0] is directly connected, enp39s0, weight 1, 00:00:09
  *                is directly connected, dummy301, weight 1, 00:00:09
  *                is directly connected, dummy302, weight 1, 00:00:09
K>* fdfc:6960:ae84::/48 [0/100] via fe80::e063:daff:fe79:1dab, enp39s0, 1d01h54m
C * fdfc:6960:ae84:10::/64 is directly connected, enp39s0, 01:25:51
C * fdfc:6960:ae84:10::/64 is directly connected, enp39s0, 1d01h17m
K * fdfc:6960:ae84:10::/64 [0/100] is directly connected, enp39s0, 1d01h54m
C * fdfc:6960:ae84:10::/64 is directly connected, enp39s0, 1d01h54m
C * fdfc:6960:ae84:10::/64 is directly connected, enp39s0, 1d01h54m
C>* fdfc:6960:ae84:10::/64 is directly connected, enp39s0, 1d01h54m
C * fe80::/64 is directly connected, dummy302, 00:01:12
C * fe80::/64 is directly connected, dummy301, 1d01h25m
C>* fe80::/64 is directly connected, enp39s0, 1d01h54m
eva(config)# exit
eva# exit
sharpd@eva ~/f/doc (pbr_ifp_attention)> ip -6 route show 2::2
2::2 nhid 467 proto static metric 20 pref medium
	nexthop dev dummy301 weight 1 
	nexthop dev dummy302 weight 1 
	nexthop dev enp39s0 weight 1 

This works for me. Can you please provide the requested data?

@mg964
Author

mg964 commented Dec 11, 2020

> This works for me. Can you please provide the requested data?

Was it just the logging for add AND delete you wanted?

frr-ipv6-route-2.txt

@donaldsharp
Member

No, just the add. Looking at the data now.

@donaldsharp
Member

Can you please provide the kernel you are running, as well as how it is compiled (the boot/config-XXX)? I would also like to see the output of sysctl -a.

@mg964
Author

mg964 commented Dec 14, 2020

This is the kernel:

vyatta@vm-ipv6fib-1:~$ uname -a
Linux vm-ipv6fib-1 5.4.0-trunk-vyatta-amd64 #1 SMP PREEMPT Debian 5.4.81-0vyatta1 (2020-12-07) x86_64 GNU/Linux
vyatta@vm-ipv6fib-1:~$ 

The source can be found here:

https://github.com/danos/linux-vyatta/tree/linux-vyatta-5.4.y

I've attached config (from /proc/config.gz) & sysctl.
sysctl.gz
config.gz

@mg964
Author

mg964 commented Dec 14, 2020

Another datapoint... used "ip -6 monitor" to see what is issued to the kernel. Looks a bit odd. The following is the result of manually/slowly issuing "ipv6 route 10:2::1/128 dp0p1s1" & "ipv6 route 10:2::1/128 dp0p1s2.102":

vyatta@vm-ipv6fib-1:~$ ip -6 monitor 
10:2::1 dev dp0p1s1 proto static metric 20 pref medium
Deleted 10:2::1 dev dp0p1s1 proto static metric 20 pref medium
10:2::1 dev dp0p1s1 proto static metric 20 pref medium

I can see how the first add & subsequent delete makes sense - preparation for the subsequent addition of the second NH interface - but the next add only includes the initial NH interface, i.e. a repeat of the first add. Now watch what happens if I issue "no ipv6 route 10:2::1/128 dp0p1s2.102" followed by "no ipv6 route 10:2::1/128 dp0p1s1" (reverse the order used to add):

Deleted 10:2::1 dev dp0p1s1 proto static metric 20 pref medium
10:2::1 dev dp0p1s1 proto static metric 20 pref medium
Deleted 10:2::1 dev dp0p1s2.102 proto static metric 20 pref medium
2: ens2    inet 192.168.252.50/24 brd 192.168.252.255 scope global dynamic ens2
       valid_lft 3600sec preferred_lft 3600sec
^C
vyatta@vm-ipv6fib-1:~$ ip -6 route show 10:2::1
10:2::1 dev dp0p1s1 proto static metric 20 pref medium
vyatta@vm-ipv6fib-1:~$ 

Feels like there is a "work queue" ("next-hop path list"?) where only the head of the queue is being processed.
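
To make the oddity above easier to see, here is a small sketch that replays "ip -6 monitor" route events and tracks which (prefix, device) pairs the monitor implies are installed. It assumes only the two line shapes seen in this comment ("<prefix> dev <ifname> ..." and the same prefixed with "Deleted "); anything else, such as address events, is ignored. The name replay_monitor is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

RouteKey = Tuple[str, str]  # (prefix, output interface)


def replay_monitor(lines: Iterable[str]) -> Tuple[Set[RouteKey], List[RouteKey]]:
    """Replay monitor events; return (installed routes, unmatched deletes)."""
    installed: Set[RouteKey] = set()
    unmatched_deletes: List[RouteKey] = []
    for raw in lines:
        line = raw.strip()
        deleted = line.startswith("Deleted ")
        if deleted:
            line = line[len("Deleted "):]
        tokens = line.split()
        # Only handle device-only route events: "<prefix> dev <ifname> ...".
        if len(tokens) < 3 or tokens[1] != "dev":
            continue
        key = (tokens[0], tokens[2])
        if deleted:
            if key in installed:
                installed.remove(key)
            else:
                # A delete for a route the monitor never reported as added.
                unmatched_deletes.append(key)
        else:
            installed.add(key)
    return installed, unmatched_deletes
```

Replaying the six route events quoted above leaves only the dp0p1s1 route installed, which agrees with the final "ip -6 route show" output, and flags the dp0p1s2.102 delete as having no matching add in the monitor stream.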

@deastoe
Contributor

deastoe commented Dec 14, 2020

@mg964 it looks like there might be a kernel bug in DANOS. If we take FRR out of the equation and use ip instead, ECMP routes aren't installed.

[edit]
vyatta@vm-dan-1# sudo ip -6 route add 5::5/128 nexthop dev dp0p1s2.3 nexthop dev dp0p1s2.4
[edit]
vyatta@vm-dan-1# ip -6 route show 5::5
5::5 dev dp0p1s2.3 metric 1024 pref medium
5::5 dev dp0p1s2.4 metric 1024 pref medium

@mg964
Author

mg964 commented Dec 14, 2020

Ok found that we (vyatta) have a kernel patch to allow device only routes via the multipath API. See the ipv6-interface-route-ecmp.patch in the above kernel tree. This patch was in response to a 2018 kernel update (b5d2d75):

Really, IPv6 multipath is just FUBAR'ed beyond repair when it comes to
    device only routes, so do not allow it all.

and, as @deastoe pointed out, we must program up the kernel in some slightly different way to FRR. I guess FRR must have a workaround for the shortcomings of IPv6 multipath in the kernel that fails when used in combination with the vyatta patch.

@qlyoung qlyoung added staticd zebra platform Issue in a specific platform vyatta labels Dec 15, 2020
@mg964
Author

mg964 commented Jan 7, 2021

@donaldsharp in your working example, what kernel/FRR versions were you using? I've run a quick test using a stock Debian 10 image together with FRR 7.5:

vyatta@debian:~$ uname -a
Linux debian 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
vyatta@debian:~$ cat /etc/debian_version
10.7
vyatta@debian:~$ sudo vtysh

Hello, this is FRRouting (version 7.5).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

debian# show run
Building configuration...

Current configuration:
!
frr version 7.5
frr defaults traditional
hostname debian
log stdout
log syslog
service integrated-vtysh-config
!
debug zebra events
debug zebra kernel
debug zebra nexthop detail
debug static events
!
line vty
!
end
debian# conf 
debian(config)# ipv6 route 5::1/128 enp2s0
debian(config)# ipv6 route 5::1/128 enp3s0
debian(config)# end
debian# 
vyatta@debian:~$ 
vyatta@debian:~$ 
vyatta@debian:~$ sudo vtysh -c "show ipv6 route"
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup

S>r 5::1/128 [1/0] is directly connected, enp2s0, weight 1, 00:01:04
  r                is directly connected, enp3s0, weight 1, 00:01:04
C>* 192:168:2::/64 is directly connected, enp2s0, 1d02h10m
C>* 192:168:3::/64 is directly connected, enp3s0, 1d02h10m
C>* 192:168:4::/64 is directly connected, enp4s0, 1d02h10m
C * fe80::/64 is directly connected, enp4s0, 1d02h10m
C * fe80::/64 is directly connected, enp3s0, 1d02h10m
C * fe80::/64 is directly connected, enp2s0, 1d02h10m
C>* fe80::/64 is directly connected, enp1s0, 1d02h10m
vyatta@debian:~$ ip -d -6 route show 5::1
vyatta@debian:~$ 

The RIB has tagged the paths as rejected and there is nothing in the kernel, i.e. everything has been left in a "sensible" state. This is the result of the kernel rejecting the attempt to use the multipath API. From the logs:

Jan 07 11:41:59 debian zebra[3286]: netlink_route_multipath_msg_encode: RTM_NEWROUTE 5::1/128 vrf 0(254)
Jan 07 11:41:59 debian zebra[3286]: _netlink_route_build_multipath: (multipath): 5::1/128 nexthop via if 3 vrf default(0)
Jan 07 11:41:59 debian zebra[3286]: _netlink_route_build_multipath: (multipath): 5::1/128 nexthop via if 4 vrf default(0)
Jan 07 11:41:59 debian zebra[3286]: nl_batch_send: netlink-dp (NS 0), batch size=76, msg cnt=1
Jan 07 11:41:59 debian zebra[3286]: Extended Error: Device only routes can not be added for IPv6 using the multipath API.
Jan 07 11:41:59 debian zebra[3286]: [EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWROUTE(24), seq=73, pid=3633315736
Jan 07 11:41:59 debian zebra[3286]: nl_batch_read_resp: netlink error message seq=73
Jan 07 11:41:59 debian zebra[3286]: Nexthop dplane ctx 0x55f8b44dd530, op NH_INSTALL, nexthop ID (62), result SUCCESS
Jan 07 11:41:59 debian zebra[3286]: Nexthop dplane ctx 0x55f8b44de860, op NH_INSTALL, nexthop ID (63), result SUCCESS
Jan 07 11:41:59 debian zebra[3286]: Nexthop dplane ctx 0x55f8b44df200, op NH_INSTALL, nexthop ID (61), result SUCCESS
Jan 07 11:41:59 debian zebra[3286]: default(0:254):5::1/128: Route install failed
Jan 07 11:41:59 debian staticd[3293]: route_notify_owner: Route 5::1/128 failed to install for table: 254

Wondering if you are using a kernel that has fixed the multipath issue or if there is some sort of workaround in FRR.

Cheers,

Mark
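
For anyone checking their own logs for the same rejection, here is a minimal sketch that pulls the kernel's extended error text and the prefixes zebra reports as failed out of syslog-style lines. It assumes the exact message shapes shown in the log excerpt above ("Extended Error: ..." and "<vrf>(<id>:<table>):<prefix>: Route install failed"); other FRR versions may word these differently. The name summarize_failures is hypothetical.

```python
from typing import Dict, Iterable, List

_EXT_ERR = "Extended Error: "
_INSTALL_FAILED = ": Route install failed"


def summarize_failures(log_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect extended netlink errors and failed route prefixes."""
    summary: Dict[str, List[str]] = {"extended_errors": [], "failed_routes": []}
    for line in log_lines:
        # Drop the "<date> <host> <daemon>[<pid>]: " syslog header.
        _, sep, msg = line.strip().partition("]: ")
        if not sep:
            continue
        if msg.startswith(_EXT_ERR):
            summary["extended_errors"].append(msg[len(_EXT_ERR):])
        elif msg.endswith(_INSTALL_FAILED):
            # e.g. "default(0:254):5::1/128: Route install failed"
            target = msg[: -len(_INSTALL_FAILED)]
            # The prefix follows the closing "):" of the vrf/table tag.
            summary["failed_routes"].append(target.split("):", 1)[-1])
    return summary
```

Applied to the excerpt above, this would list 5::1/128 as a failed route alongside the "Device only routes can not be added for IPv6 using the multipath API." message.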

@mg964
Author

mg964 commented Jan 7, 2021

Never mind, looks like we've found a possible solution in DANOS.

@mg964 mg964 closed this as completed Jan 7, 2021
@kwind

kwind commented Apr 1, 2021

@mg964 Hi, I have a similar problem. I can't find the patch (ipv6-interface-route-ecmp.patch). How did you fix it?

@mg964
Author

mg964 commented Apr 6, 2021

Hello @kwind. The solution was specific to DANOS; there was no need to make any changes to FRR. Essentially, we ended up issuing 'zebra nexthop kernel enable'.

@zhangwen-network

> Hello @kwind. The solution was specific to DANOS, there was no need to make any changes to FRR. Essentially, ended up issuing 'zebra nexthop kernel enable'.

Is there any way to add IPv6 multipath routes that are device-only? The kernel version is 5.4.18.
