zebra: fix handle netlink state reachable message
When an NHRP shortcut entry is created between two spokes, the resulting
cache entry very often has its 'used' field set to false, despite
continuous traffic towards the 11.255.255.1 IP address.

> north-vm# show ip nhrp cache
> Iface    Type     Protocol        NBMA            Claimed NBMA    Flags  Identity
> gre1     local    11.255.255.2    10.125.0.2      10.125.0.2             -
> gre1     nhs      11.255.255.1    10.125.0.1      10.125.0.1      T      west     <----

That field actually reflects the reachability of the protocol address.
A neighbor entry is maintained in the kernel. Because an ARP probe only
operates on MAC addresses, NHRP takes the place of ARP and refreshes the
NBMA link IP address every time the neighbor entry enters the PROBE
state. The neighbor entry then moves to the REACHABLE state, and the
NHRP daemon is expected to be notified of that state, which does not
happen.

The dump below shows 3 netlink messages on the gre1 interface:

> # trace from 'ip monitor neigh'
> miss 11.255.255.1 dev gre1 lladdr 10.125.0.1 PROBE proto 191
> 11.255.255.1 dev gre1 lladdr 10.125.0.1 PROBE proto 191
> 11.255.255.1 dev gre1 lladdr 10.125.0.1 REACHABLE proto 191

The NHRP/ZEBRA traces show the first 2 netlink messages being received,
followed by the update of the neighbor entry by the NHRP daemon.
However, nothing is received to confirm that the neighbor state moved
to REACHABLE.
> # trace from nhrp
> 2023/11/06 09:46:21 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
> 2023/11/06 09:46:21 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=80 type=(30) GETNEIGH flags=(0x0001) {REQUEST} seq=0 pid=0]
> 2023/11/06 09:46:21 ZEBRA: [S4WS4-PS3KF] netlink_parse_info: netlink-listen (NS 0) type RTM_GETNEIGH(30), len=80, seq=0, pid=0
> 2023/11/06 09:46:21 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
> 2023/11/06 09:46:21 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=80 type=(28) NEWNEIGH flags=(0x0000) {} seq=0 pid=0]
> 2023/11/06 09:46:21 ZEBRA: [T4YQJ-83R8H] ndm [family=2 (AF_INET) ifindex=10 state=0x0010 {PROBE} flags=0x0000 {} type=1 (UNICAST)]
> 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(1) DST]
> 2023/11/06 09:46:21 ZEBRA: [M8QV4-KY9C0] 11.255.255.1
> 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(2) LLADDR]
> 2023/11/06 09:46:21 ZEBRA: [V74GD-NYS6Y] 0A:7D:00:01
> 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(4) PROBES]
> 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=20 (payload=16) type=(3) CACHEINFO]
> 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=5 (payload=1) type=(12) UNKNOWN]
> 2023/11/06 09:46:21 ZEBRA: [S4WS4-PS3KF] netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0
> 2023/11/06 09:46:21 ZEBRA: [TDS34-MNEJW] Neighbor Entry received is not on a VLAN or a BRIDGE, ignoring
> 2023/11/06 09:46:21 NHRP: [QQ0NK-1H449] Netlink: who-has 11.255.255.1 dev gre1 lladdr 10.125.0.1 nud 0x10 cache used 0 type 5
> 2023/11/06 09:46:21 NHRP: [QVXNM-NVHEQ] Netlink: update binding for 11.255.255.1 dev gre1 from c (unspec) peer.vc.nbma 10.125.0.1 to lladdr 10.125.0.1
> 2023/11/06 09:46:21 NHRP: [QQ0NK-1H449] Netlink: new-neigh 11.255.255.1 dev gre1 lladdr 10.125.0.1 nud 0x10 cache used 1 type 5
> 2023/11/06 09:46:21 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF gre1(10) Neigh 11.255.255.1 Link 10.125.0.1 flags 0x0 state 0x2 ext_flags 0x0
> 2023/11/06 09:46:21 ZEBRA: [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=52, msg cnt=1

The NHRP daemon uses the ZEBRA netlink layer to update the binding, and
the netlink socket used for writing is different from the one used for
reading. When the ZEBRA daemon initiates the write operation, there is
no confirmation that it succeeded. Previously, the write and read
operations were done directly on a netlink socket from NHRP, and this
worked correctly.

Fix this by reading the return status of the netlink write operation
and linking it to the neighbor notification message, so that NHRP is
informed of the REACHABLE state.

> north-vm# show ip nhrp cache
> Iface    Type     Protocol        NBMA            Claimed NBMA    Flags  Identity
> gre1     local    11.255.255.2    10.125.0.2      10.125.0.2             -
> gre1     nhs      11.255.255.1    10.125.0.1      10.125.0.1      UT     west     <----

Link: https://flylib.com/books/3/475/1/html/2/images/0131777203/graphics/15fig06.gif
Fixes: b3b7510 ("nhrpd: link layer registration to notificationas")
Signed-off-by: Philippe Guibert <[email protected]>
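
As a reading aid for the traces above, here is a minimal Python sketch, not part of the patch, that decodes the neighbor state values (nud 0x10, state 0x2) and the LLADDR payload (0A:7D:00:01) seen in the logs. The state bits are the NUD_* values from the Linux UAPI header linux/neighbour.h. The helper names decode_nud and lladdr_to_ipv4 are hypothetical and exist only for this illustration.

```python
import ipaddress

# Neighbour state bits (NUD_*) as defined in <linux/neighbour.h>.
NUD_STATES = {
    0x01: "INCOMPLETE",
    0x02: "REACHABLE",
    0x04: "STALE",
    0x08: "DELAY",
    0x10: "PROBE",
    0x20: "FAILED",
    0x40: "NOARP",
    0x80: "PERMANENT",
}


def decode_nud(state):
    """Return the names of the NUD bits set in an ndm_state value.

    Hypothetical helper for illustration; it is not an FRR function.
    """
    if state == 0:
        return ["NONE"]
    return [name for bit, name in NUD_STATES.items() if state & bit]


def lladdr_to_ipv4(lladdr):
    """Convert a 4-byte, colon-separated hex link-layer address to dotted IPv4.

    On a GRE interface the link-layer address is the NBMA IPv4 address,
    which is why the trace prints it as four hex bytes.
    """
    raw = bytes(int(part, 16) for part in lladdr.split(":"))
    if len(raw) != 4:
        raise ValueError("expected a 4-byte link-layer address")
    return str(ipaddress.IPv4Address(raw))


if __name__ == "__main__":
    print(decode_nud(0x10))               # state reported in the NHRP lines
    print(decode_nud(0x2))                # state written by Tx RTM_NEWNEIGH
    print(lladdr_to_ipv4("0A:7D:00:01"))  # LLADDR attribute from the dump
```

Read this way, the NHRP lines report nud 0x10, which is PROBE, while the Tx RTM_NEWNEIGH line written by ZEBRA carries state 0x2, which is REACHABLE, and the LLADDR bytes correspond to the NBMA address 10.125.0.1.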