[chassis][voq] Fix to ignore duplicate nexthop in zebra #16275

Merged (3 commits) on Aug 31, 2023


arlakshm
Contributor

@arlakshm arlakshm commented Aug 24, 2023

Why I did it

Fixes #15803

In a SONiC chassis, routes learned from a remote linecard undergo recursive nexthop resolution.
In some cases, after recursive nexthop resolution, the number of nexthops for a route can reach 256.
Zebra then runs out of space while filling in the 256 nexthops, which causes zebra to crash.

Work item tracking
  • Microsoft ADO (24997365):

How I did it

Created a patch that ports FRRouting/frr#14096, which changes zebra to ignore duplicate nexthops when filling in the FPM message.
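
The idea behind the fix can be illustrated with a minimal sketch. This is a simplified Python model, not the FRR C code from FRRouting/frr#14096: the `Nexthop` fields, the `fill_nexthops` helper, and the `MAX_NEXTHOPS` capacity are hypothetical names chosen for illustration. It assumes that recursive resolution can yield the same nexthop more than once and that the message has a fixed number of nexthop slots.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical slot count for the outgoing message; the real limit in zebra
# is a build-time setting and is not taken from the PR.
MAX_NEXTHOPS = 256


@dataclass(frozen=True)
class Nexthop:
    """Illustrative nexthop identity: gateway address plus egress interface."""
    gateway: str
    ifindex: int


def fill_nexthops(resolved: Iterable[Nexthop],
                  capacity: int = MAX_NEXTHOPS) -> List[Nexthop]:
    """Copy resolved nexthops into a bounded list, skipping duplicates.

    Order of first appearance is preserved. Filling stops once the
    capacity is reached, so the buffer is never overrun.
    """
    seen = set()
    out: List[Nexthop] = []
    for nh in resolved:
        if nh in seen:
            # Duplicate produced by recursive resolution: ignore it.
            continue
        if len(out) >= capacity:
            # No free slot left for another distinct nexthop.
            break
        seen.add(nh)
        out.append(nh)
    return out
```

In this model, five resolved entries that contain only two distinct nexthops occupy two slots instead of five, so duplicates no longer consume space in the fixed-size buffer.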

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <[email protected]>
@arlakshm
Contributor Author

@saksarav-nokia @rlhui for visibility.

@abdosi
Contributor

abdosi commented Aug 29, 2023

@arlakshm did we confirm that, after this patch, the issue is not seen in the config reload regression?

@arlakshm arlakshm requested a review from judyjoseph August 29, 2023 23:37
Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <[email protected]>
@arlakshm
Contributor Author

> @arlakshm did we confirm that, after this patch, the issue is not seen in the config reload regression?

@abdosi, we have not seen any issues with config reload so far in testing. @saksarav-nokia, were you able to test with this change? Any issues found?

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <[email protected]>
@saksarav-nokia
Contributor

> > @arlakshm did we confirm that, after this patch, the issue is not seen in the config reload regression?
>
> @abdosi, we have not seen any issues with config reload so far in testing. @saksarav-nokia, were you able to test with this change? Any issues found?

Yes. We tested the fix a while ago, when the FRR team suggested this patch, and we didn't see the crash.

@gechiang
Collaborator

@lguohan, all comments have been addressed by @arlakshm. Please help review/approve. Thanks!

@mssonicbld
Collaborator

@arlakshm PR conflicts with 202205 branch

@mssonicbld
Collaborator

@arlakshm PR conflicts with 202211 branch

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Sep 3, 2023
@mssonicbld
Collaborator

Cherry-pick PR to 202305: #16420

sonic-otn pushed a commit to sonic-otn/sonic-buildimage that referenced this pull request Sep 20, 2023

Successfully merging this pull request may close these issues.

Zebra process crashes intermittently during 'config reload' on the DUT line cards
9 participants