-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nhrp maintain used state #14739
base: master
Are you sure you want to change the base?
Nhrp maintain used state #14739
Conversation
@@ -746,7 +746,7 @@ static void show_ip_nhrp_cache(struct nhrp_cache *c, void *pctx) | |||
nhrp_cache_type_str[c->cur.type]); | |||
json_object_string_add(json, "protocol", buf[0]); | |||
json_object_string_add(json, "nbma", buf[1]); | |||
json_object_string_add(json, "claimed_nbma", buf[2]); | |||
json_object_string_add(json, "claimedNbma", buf[2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please go through an appropriate deprecation cycle with this change. Even if it is wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I proposed to keep the old value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we should mark this as a deprecation anyway (claimed_nbma) because it's not compliant with our JSON requirements for the key names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pguibert6WIND can you add deprecation for claimed_nbma
? (CONFDATE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
9378d4a
to
0cae44f
Compare
The main fix is simpler, as it modified the netlink filter so that we are now able to receive NEWNEIGH notifications. |
945b495
to
6f71f4e
Compare
@@ -9,7 +9,6 @@ | |||
"protocol":"10.255.255.2", | |||
"nbma":"10.2.1.2", | |||
"claimed_nbma":"10.2.1.2", | |||
"used":false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this value unconditionally, any reason to remove it from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'used' attribute is dynamic and relies on the state of the
system neighbor entry attached to the protocol address. Periodically,
the value is set to false, when receiving a NUD_PROBE notification
from netlink. Then, NHRP re-programs the link entry and re-sets the
'used ' value to true.
Because that value may be intermittently set to false, it is better to
remove it from the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
851f17c
to
9c34333
Compare
ci:rerun |
2b593e5
to
084b59b
Compare
723b1a7
to
f066b48
Compare
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
f066b48
to
9613b07
Compare
9613b07
to
c30a7a1
Compare
c30a7a1
to
8f24211
Compare
6f386ab
to
1c9720c
Compare
Some json attributes are not compliant with caml format. > # show ip nhrp cache json > [..] > { > "interface":"gre1", > "type":"dynamic", > "protocol":"11.255.255.4", > "nbma":"10.125.0.4", > "claimed_nbma":"10.125.0.4", > [..] Fix this by duping the 'claimed_nbma' attribute with the 'claimedNbma' name. Fixes: 85365e5 ("nhrpd: Add Claimed NBMA field in sh ip nhrp cache output") Signed-off-by: Philippe Guibert <[email protected]>
When creating a shortcut, there is no information on the life cycle of the entry. Add some more fields in the json part of the command. > # show ip nhrp shortcut > [..] > "table":[ > { > "type":"dynamic", > "prefix":"11.255.255.4\/32", > "via":"11.255.255.4", > "identity":"south", > "holdingTimeSecs":1200, <-- added > "routeInstalled":true, <-- added > "expiring":false, <-- added > "remainingTimeSecs":312 <-- added > } The remaining time can be understood with the expiring flag. If true, the timer stands for the remaining time before shortcut is removed. Otherwise, the timer stands for the time before a renewal attempt may happen (initially set to the 2 thirds of the 'holdingTimeSecs' value. Signed-off-by: Philippe Guibert <[email protected]>
There is no information about the active timers in an nhrp cache entry. Dump the cache timer and/or the authentication timer when available. Dump the '*routeInstalled' boolean. > # show ip nhrp cache json > [..] > { > "interface":"gre1", > "type":"nhs", > [..] > "used":true, > "routeInstalled":true, <-- added > "nhrpRouteInstalled":true, <-- added > "timeout":true, > "timeoutRemainingTimeSecs":6644, <-- added > "auth":false, > [..] Signed-off-by: Philippe Guibert <[email protected]>
The 'used' field of NHRP cache entries goes to false every time it receives a netlink STALE message. Consequently, the cache entry appears as unused until the reception of a PROBE message triggers the neighbor entry re-configuration that changes the entry to REACHABLE. > '# show ip nhrp cache' command before receiving STALE message. > Iface Type Protocol NBMA Claimed NBMA Flags Identity > gre1 nhs 11.255.255.1 10.125.0.1 10.125.0.1 UT west > # ip monitor neigh dev gre1 > 10.125.0.1 lladdr de:ed:02:05:3c:b5 STALE > > '# show ip nhrp cache' command after sending STALE message. > Iface Type Protocol NBMA Claimed NBMA Flags Identity > gre1 nhs 11.255.255.1 10.125.0.1 10.125.0.1 T west Fix this by ignoring the STALE state, and do not change the cache entry 'used' flag. Signed-off-by: Philippe Guibert <[email protected]>
When creating an NHRP shortcut entry between two spokes, the cache entry created has very often a field 'used' set to false, despite a continuous traffic towards the 11.255.255.1 IP address. > north-vm# show ip nhrp cache > Iface Type Protocol NBMA Claimed NBMA Flags Identity > gre1 local 11.255.255.2 10.125.0.2 10.125.0.2 - > gre1 nhs 11.255.255.1 10.125.0.1 10.125.0.1 T west <---- Actually, that flag reflects the protocol address reachability. A neighbor entry is maintained in the kernel. As an ARP probe operates only with MAC address, NHRP will take the place of ARP, and will refresh the NBMA link IP address, every time the neighbor entry enters in PROBE state. The neighbor entry state goes in REACHABLE state, and the expectation is that the REACHABLE state is notified to the NHRP daemon, which does not happen. The below dump indicates 3 netlink messages over the gre1 interface: > # trace from 'ip monitor neigh' > miss 11.255.255.1 dev gre1 lladdr 10.125.0.1 PROBE proto 191 > 11.255.255.1 dev gre1 lladdr 10.125.0.1 PROBE proto 191 > 11.255.255.1 dev gre1 lladdr 10.125.0.1 REACHABLE proto 191 The NHRP/ZEBRA traces indicate the first 2 netlink messages received, followed by the update of the neighbor entry by the NHRP daemon. But there is no reception or confirmation that the new neighbor state moved to REACHABLE. > # trace from nhrp > 2023/11/06 09:46:21 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv] > 2023/11/06 09:46:21 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=80 type=(30) GETNEIGH flags=(0x0001) {REQUEST} seq=0 pid=0] > 2023/11/06 09:46:21 ZEBRA: [S4WS4-PS3KF] netlink_parse_info: netlink-listen (NS 0) type RTM_GETNEIGH(30), len=80, seq=0, pid=0 > 2023/11/06 09:46:21 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv] > 2023/11/06 09:46:21 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=80 type=(28) NEWNEIGH flags=(0x0000) {} seq=0 pid=0] > 2023/11/06 09:46:21 ZEBRA: [T4YQJ-83R8H] ndm [family=2 (AF_INET) ifindex=10 state=0x0010 {PROBE} flags=0x0000 {} type=1 (UNICAST)] > 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(1) DST] > 2023/11/06 09:46:21 ZEBRA: [M8QV4-KY9C0] 11.255.255.1 > 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(2) LLADDR] > 2023/11/06 09:46:21 ZEBRA: [V74GD-NYS6Y] 0A:7D:00:01 > 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=8 (payload=4) type=(4) PROBES] > 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=20 (payload=16) type=(3) CACHEINFO] > 2023/11/06 09:46:21 ZEBRA: [KFBSR-XYJV1] rta [len=5 (payload=1) type=(12) UNKNOWN] > 2023/11/06 09:46:21 ZEBRA: [S4WS4-PS3KF] netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0 > 2023/11/06 09:46:21 ZEBRA: [TDS34-MNEJW] Neighbor Entry received is not on a VLAN or a BRIDGE, ignoring > 2023/11/06 09:46:21 NHRP: [QQ0NK-1H449] Netlink: who-has 11.255.255.1 dev gre1 lladdr 10.125.0.1 nud 0x10 cache used 0 type 5 > 2023/11/06 09:46:21 NHRP: [QVXNM-NVHEQ] Netlink: update binding for 11.255.255.1 dev gre1 from c (unspec) peer.vc.nbma 10.125.0.1 to lladdr 10.125.0.1 > 2023/11/06 09:46:21 NHRP: [QQ0NK-1H449] Netlink: new-neigh 11.255.255.1 dev gre1 lladdr 10.125.0.1 nud 0x10 cache used 1 type 5 > 2023/11/06 09:46:21 ZEBRA: [NH6N7-54CD1] Tx RTM_NEWNEIGH family ipv4 IF gre1(10) Neigh 11.255.255.1 Link 10.125.0.1 flags 0x0 state 0x2 ext_flags 0x0 > 2023/11/06 09:46:21 ZEBRA: [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=52, msg cnt=1 The NHRP daemon relies on zebra netlink layer. Read and write operations are done on two different sockets. As a filter is attached to the read socket to prevent from reading notifications from the write socket, the NEWNEIGH operations from NHRP are ignored. Fix this by adding an exception in the netlink filter to autorise NEWNEIGH notifications. Consequently, the REACHABLE state is notified to NHRP. > north-vm# show ip nhrp cache > Iface Type Protocol NBMA Claimed NBMA Flags Identity > gre1 local 11.255.255.2 10.125.0.2 10.125.0.2 - > gre1 nhs 11.255.255.1 10.125.0.1 10.125.0.1 UT west <---- Link: https://flylib.com/books/3/475/1/html/2/images/0131777203/graphics/15fig06.gif Fixes: b3b7510 ("nhrpd: link layer registration to notificationas") Signed-off-by: Philippe Guibert <[email protected]>
Expect expect nhrp used state true. Signed-off-by: Louis Scalbert <[email protected]>
Use the check_ping library function. Signed-off-by: Louis Scalbert <[email protected]>
1c9720c
to
a8c06a5
Compare
The test expected the NHRP shortcut on nhc1 to expire after 10s in the absence of traffic. Previous patches no longer cause shortcuts to expire unless the remote network is down or the kernel has removed neighbor entries to the remote network. Test that the shortcut only expires in these cases. Note that the shortcut mechanism is based on NHRP type 8 (NHRP traffic indication) packets that is not part of RFC2332 but of draft-detienne-dmvpn-01. The tested behavior conforms with: > an NHRP cache entry is created for TunS1 => PubS1. The entry > SHOULD NOT remain in the cache for more than the specified Hold > Time (from the NHRP Resolution Request). This NHRP cache entry > may be 'refreshed' for another hold time period prior to expiry by > receipt of another matching NHRP Resolution Request or by sending > an NHRP Resolution Request and receiving an NHRP Resolution Reply. Signed-off-by: Louis Scalbert <[email protected]>
a8c06a5
to
c06c32e
Compare
ci:rerun |
1 similar comment
ci:rerun |
The problem is due to the 'U'sed flag of dynamic cache entries (shortcut), which is set to false, despite its use.
Consequently, the shortcut associated goes in expiring state.
Fix this by ensuring that the 'U'sed flag is properly set to true, as it is always maintained by NHRP. Only NHRP can decide whether the cache entry is valid or not.