Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protocol next hop group allow recursion #14973

Closed

Conversation

pguibert6WIND
Copy link
Member

ubuntu2204(config)# ip route 172.31.0.0/24 192.0.2.88
ubuntu2204(config)# nexthop-group nh1
ubuntu2204(config-nh-group)# allow-recursion 
ubuntu2204(config-nh-group)# nexthop 172.31.0.103
ubuntu2204(config-nh-group)# do show nexthop-group rib sharp 
ID: 1666 (sharp)
     RefCnt: 2
     Uptime: 00:00:10
     VRF: default
     Valid
     Depends: (1667)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (181818168)
ID: 1667 (sharp)
     RefCnt: 3
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Interface Index: 3056
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (1666)
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Depends: (1666)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1

@pguibert6WIND pguibert6WIND changed the title next hop group allow recursion protocol next hop group allow recursion Dec 8, 2023
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation is missing.

{
VTY_DECLVAR_CONTEXT(nexthop_group_cmd, nhgc);

if (!!no == !CHECK_FLAG(nhgc->nhg.flags, NEXTHOP_GROUP_ALLOW_RECURSION))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this check supposed to do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing configuration or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of adding a new cli here. Sharpd already refuses to install a nexhtop that doesn't have an interface attached to it. Just make it auto turn on the flag if the nexhtop doesn't have an interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to mimic BGP behaviour, I have to do this command. Or we limit sharpd for scalability testing, and in that case, the changes I do are not necessary from testing perspective with sharpd.

@@ -46,6 +46,10 @@ ip route 4.5.6.16/32 192.168.0.4 10
ip route 4.5.6.17/32 192.168.0.2 tag 9000
ip route 4.5.6.17/32 192.168.0.2 tag 10000

# Create a static route to test recursive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments use !, not #.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -1895,6 +1895,8 @@ static int zapi_nhg_decode(struct stream *s, int cmd, struct zapi_nhg *api_nhg)
STREAM_GETL(s, api_nhg->resilience.idle_timer);
STREAM_GETL(s, api_nhg->resilience.unbalanced_timer);

STREAM_GETC(s, api_nhg->flags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a dumb question: is this compatible with older versions (if this new flags is not supported)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean to increment zapi version , maybe we should.
Other than that, when using the same version, I don't see the point.
do you know the procedure to do this?

@@ -2593,10 +2595,8 @@ static unsigned nexthop_active_check(struct route_node *rn,
family = AFI_IP6;
else
family = AF_UNSPEC;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop this unrelated change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

if (IS_ZEBRA_DEBUG_NHG_DETAIL)
zlog_debug("%s: re %p, nexthop %pNHv", __func__, re, nexthop);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop this unrelated change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

valid = re.search(r"Valid", output)
if valid is None:
found = False
sleep(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use functools.partial().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found out how to reuse previous code

@donaldsharp
Copy link
Member

When I originally worked on this we talked about this. It was decided, at that time, that if an upper level protocol wanted to control the nexthop group that they should be controlling the nexthop group completely. Can you give me a use case for this change in behavior so I can understand how it is going to be used?

@pguibert6WIND
Copy link
Member Author

pguibert6WIND commented Dec 10, 2023

When I originally worked on this we talked about this. It was decided, at that time, that if an upper level protocol wanted to control the nexthop group that they should be controlling the nexthop group completely. Can you give me a use case for this change in behavior so I can understand how it is going to be used?

Completely defining the nexthop at protocol level would imply that the protocol sends both the information, and the resolution to zebra.

nexthop-group A
 nexthop 172.31.0.1/24 recursive
 nexthop 192.0.2.100 loop1

The first next-hop is just here for display reasons (show ip route must be the same), whereas zebra already does the same resolution.

This is why I propose an alternative, where zebra will do the recursive resolution when needed.
BGP will still ensure that the nexthop to send is the appropriate one. For instance, a distinction per AFI/SAFI will be done, in order to take into account the label value.

I plan to act similarly, in order to support srte_color.

@pbrisset
Copy link

ubuntu2204(config)# ip route 172.31.0.0/24 192.0.2.88
ubuntu2204(config)# nexthop-group nh1
ubuntu2204(config-nh-group)# allow-recursion 
ubuntu2204(config-nh-group)# nexthop 172.31.0.103
ubuntu2204(config-nh-group)# do show nexthop-group rib sharp 
ID: 1666 (sharp)
     RefCnt: 2
     Uptime: 00:00:10
     VRF: default
     Valid
     Depends: (1667)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (181818168)
ID: 1667 (sharp)
     RefCnt: 3
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Interface Index: 3056
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (1666)
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Depends: (1666)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1

Hi,

I'm looking at the proposed configuration. It is not clear to me how that work; how the NHG and interconnect with the static IP.
ubuntu2204(config)# ip route 172.31.0.0/24 192.0.2.88
ubuntu2204(config)# nexthop-group nh1
ubuntu2204(config-nh-group)# allow-recursion
ubuntu2204(config-nh-group)# nexthop 172.31.0.103

from the show command output, they seems to be interconnected even thought the ip static do not have any reference to "nh1".

What am I missing?

Patrice

@pguibert6WIND
Copy link
Member Author

ubuntu2204(config)# ip route 172.31.0.0/24 192.0.2.88
ubuntu2204(config)# nexthop-group nh1
ubuntu2204(config-nh-group)# allow-recursion 
ubuntu2204(config-nh-group)# nexthop 172.31.0.103
ubuntu2204(config-nh-group)# do show nexthop-group rib sharp 
ID: 1666 (sharp)
     RefCnt: 2
     Uptime: 00:00:10
     VRF: default
     Valid
     Depends: (1667)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (181818168)
ID: 1667 (sharp)
     RefCnt: 3
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Interface Index: 3056
           via 192.0.2.88, loop1 (vrf default), weight 1
     Dependents: (1666)
ID: 181818168 (sharp)
     RefCnt: 1
     Uptime: 00:00:10
     VRF: default
     Valid, Installed
     Depends: (1666)
        via 172.31.0.103 (vrf default) (recursive), weight 1
           via 192.0.2.88, loop1 (vrf default), weight 1

Hi,

I'm looking at the proposed configuration. It is not clear to me how that work; how the NHG and interconnect with the static IP. ubuntu2204(config)# ip route 172.31.0.0/24 192.0.2.88 ubuntu2204(config)# nexthop-group nh1 ubuntu2204(config-nh-group)# allow-recursion ubuntu2204(config-nh-group)# nexthop 172.31.0.103

from the show command output, they seems to be interconnected even thought the ip static do not have any reference to "nh1".

What am I missing?

Patrice

Hi Patrice,

Thanks for your comment.
The new NHG is computing a resolved nexthop, and at the install time, solves it with the static route.
The resolved static is not "dependent". It is just a resolution, triggered by NHG_INSTALL.

Routes do the same. If you use a recursive static route, you will create a resolution.

ip route 192.168.0.0/24 172.31.0.100

If there is a change with the next-hop resolution, staticd will be informed via nexthop tracking, and staticd
will update the route by re-sending ROUTE_ADD.
What I do with nexthop-groups is similar what is done with routes.

@pguibert6WIND pguibert6WIND force-pushed the nhg_allow_recursion branch 3 times, most recently from 1eae19a to 3e30cf2 Compare January 19, 2024 13:35
This small rework prepares next commit.

Signed-off-by: Philippe Guibert <[email protected]>
In sharpd, configuring a nexthop-group with an IP nexthop that is not
directly connected does not create a NHG context in zebra:

> ubuntu2204(config)# interface loop1
> ubuntu2204(config-if)# ip address 192.0.2.1/24
> ubuntu2204(config-if)# exi
> ubuntu2204(config)# ip route 192.168.0.0/24 192.0.2.100
> ubuntu2204(config)# nexthop-group ABCD
> ubuntu2204(config-nh-group)# nexthop 192.168.0.100
> 2024/01/17 16:36:23 SHARP: [Y2G2F-ZTW6M] nhg_add: nhg 181818169 not sent: no valid nexthops
> ubuntu2204(config-nh-group)# do show nexthop-group rib 181818169
> Nexthop Group ID: 181818169 does not exist

Nexthops with an undefined interface index are neither handled
in ZEBRA, nor in SHARPD.

On the other hand, if we had created a route pointing to the same
nexthop (by using ZEBRA_ROUTE_ADD zapi), the next-hop would have been
installed, thanks to the ALLOW_RECURSION flag embedded in the zapi_route
structure.

Add the support for recursivity in nexthop groups by introducing
a flags attribute in the nexthop group structure. Define the
NEXTHOP_GROUP_ALLOW_RECURSION value, which will be used by ZEBRA to
check for the next-hop group recursivity. This flag will be used by
default, unless the 'no allow-recursion' command undoes it.

> ubuntu2204(config)# nexthop-group ABCD
> ubuntu2204(config-nh-group)# allow-recursion
> ubuntu2204(config-nh-group)# nexthop 192.168.0.100
> 2024/01/17 16:57:44 SHARP: [JWRCN-N9K90] Installed nhg 181818168
> ubuntu2204(config-nh-group)# do show nexthop-group rib 181818168
> ID: 181818168 (sharp)
>      RefCnt: 1
>      Uptime: 00:00:10
>      VRF: default
>      Valid, Installed
>      Depends: (73)
>         via 192.168.0.100 (vrf default) (recursive), weight 1
>            via 192.0.2.100, loop1 (vrf default), weight 1

This flag control is mandatory, as the recursivity may not be allowed
by protocols like eBGP single-hop.

Signed-off-by: Philippe Guibert <[email protected]>
Add a nexthop group test that ensures that a recursive
next-hop is resolved in zebra.

Signed-off-by: Philippe Guibert <[email protected]>
@pguibert6WIND
Copy link
Member Author

see #15212

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants