Zebra process crashes intermittently during 'config reload' on the DUT line cards #15803

Closed
sanjair-git opened this issue Jul 12, 2023 · 9 comments · Fixed by #16275
Labels: Chassis 🤖 (Modular chassis support), NOKIA, P0 (Priority of the issue), Triaged (this issue has been triaged)

Comments

@sanjair-git

Description

On a T2 chassis line card, running 'sudo config reload -y' intermittently crashes the 'zebra' process and generates a core file. The crash occurs roughly once in 30 attempts.

sonic-buildimage-msft commit:
Azure/sonic-buildimage-msft@6f19e12

The following logs are seen in the bgp docker when the crash happens.

2023-07-09 13:59:40,064 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)
2023-07-11 19:39:22,156 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)

Crash logs:
image

Attached the zebra core generated and the frr logs for reference.
frr.zip
zebra.1689104360.44.0.core.gz

Steps to reproduce the issue:

  1. On any T2 chassis line card, run 'sudo config reload -y' repeatedly.
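
The reproduction step above could be automated along these lines. This is only a sketch: the core directory is an assumption that must be supplied by the caller, the `zebra.*.core.gz` pattern is inferred from the attached core file name, and `reproduce`, `zebra_cores`, and `config_reload` are hypothetical helper names, not part of SONiC.

```python
import glob
import os
import subprocess
import time


def zebra_cores(core_dir):
    """Return the set of zebra core files currently present in core_dir."""
    # Pattern inferred from the attached file zebra.1689104360.44.0.core.gz.
    return set(glob.glob(os.path.join(core_dir, "zebra.*.core.gz")))


def reproduce(reload_fn, core_dir, attempts=30, settle_seconds=0):
    """Call reload_fn up to `attempts` times and stop at the first new zebra core.

    Returns (attempt_number, [new core paths]) or (None, []) if no core appeared.
    """
    before = zebra_cores(core_dir)
    for attempt in range(1, attempts + 1):
        reload_fn()
        # Give the services time to come back up before checking for a core.
        time.sleep(settle_seconds)
        new = zebra_cores(core_dir) - before
        if new:
            return attempt, sorted(new)
    return None, []


def config_reload():
    """Run the reload command quoted in this issue (needs sudo on the DUT)."""
    subprocess.run(["sudo", "config", "reload", "-y"], check=True)


# Example call on the line card. Both the core directory and the settle time
# are assumptions to adjust for the system under test:
#   reproduce(config_reload, "/var/core", attempts=30, settle_seconds=300)
```

Passing the reload action in as a callable keeps the loop testable off the device. On the device, `config_reload` supplies the real command.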

Describe the results you received:

  • The zebra process in the bgp docker crashes.
  • A core file is generated.

Describe the results you expected:

Output of show version:

admin@ixre-egl-board1:~$ show version

SONiC Software Version: SONiC.HEAD.489499-msft-2205-ndk-d963ac161
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-18-2-amd64
Build commit: d963ac161
Build date: Fri Jul  7 18:18:51 UTC 2023
Built by: gitlab-runner@sonic-bld2

Platform: x86_64-nokia_ixr7250e_36x400g-r0
HwSKU: Nokia-IXR7250E-36x100G
ASIC: broadcom
ASIC Count: 2
Serial Number: EAG2-04-210
Model Number: N/A
Hardware Revision: 56
Uptime: 15:45:52 up 1 day, 12:15,  3 users,  load average: 1.56, 1.54, 1.59
Date: Wed 12 Jul 2023 15:45:52

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

@rlhui
Contributor

rlhui commented Jul 12, 2023

@mlok-nokia please check whether a similar issue or fix exists in the FRR GitHub repository.

vmittal-msft added the Chassis 🤖 Modular chassis support label Jul 19, 2023
@vmittal-msft
Contributor

vmittal-msft commented Jul 19, 2023

@mlok-nokia will help collect a stack trace using the debug image, and we can follow up with the FRR team afterwards. Please open an issue with the FRR team once all the information is available.

vmittal-msft added the Triaged this issue has been triaged label Jul 19, 2023
@saksarav-nokia
Contributor

@vmittal-msft
Here is the backtrace. Do you need more information?

Jul 11 19:39:20.466949 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: Received signal 11 at 1689104360 (si_addr 0x4, PC 0x7f1489871646); aborting...
Jul 11 19:39:20.467281 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: zlog_signal+0xf5 7f1489eaf215 7ffcca468db0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f1489e13000)
Jul 11 19:39:20.467505 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: PBKDF2_SHA256+0x4e1 7f1489edb851 7ffcca468ef0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f1489e13000)
Jul 11 19:39:20.467734 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: funlockfile+0x50 7f1489ded140 7ffcca469040 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7f1489dda000)
Jul 11 19:39:20.467892 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: ---- signal ----
Jul 11 19:39:20.467892 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: zfpm_netlink_encode_route+0x616 7f1489871646 7ffcca4695f0 /usr/lib/x86_64-linux-gnu/frr/modules/zebra_fpm.so (mapped at 0x7f1489869000)
Jul 11 19:39:20.468039 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: zfpm_route_for_update+0x4cc 7f1489870d5c 7ffcca46bf20 /usr/lib/x86_64-linux-gnu/frr/modules/zebra_fpm.so (mapped at 0x7f1489869000)
Jul 11 19:39:20.468247 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: thread_call+0x7d 7f1489eed48d 7ffcca46bf90 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f1489e13000)
Jul 11 19:39:20.468472 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: frr_run+0xe8 7f1489ea74a8 7ffcca46c030 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f1489e13000)
Jul 11 19:39:20.468559 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: main+0x346 55973ca42fc6 7ffcca46c250 /usr/lib/frr/zebra (mapped at 0x55973c9c9000)
Jul 11 19:39:20.468781 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: __libc_start_main+0xea 7f1489c29d0a 7ffcca46c350 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7f1489c06000)
Jul 11 19:39:20.468867 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: _start+0x2a 55973ca43a2a 7ffcca46c420 /usr/lib/frr/zebra (mapped at 0x55973c9c9000)
Jul 11 19:39:20.468867 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: in thread zfpm_write_cb scheduled from zebra/zebra_fpm.c:491 zfpm_write_on()
Jul 11 19:39:20.469087 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group libfrr
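
To map a frame from the backtrace above to a source line, it helps to convert the absolute PC into an offset within the module. The sketch below parses one frame line and subtracts the "mapped at" base from the PC. The regular expression is an assumption fitted to the lines pasted here, and `parse_frame` is a hypothetical helper. The reported si_addr of 0x4 is consistent with a read through a NULL or near-NULL pointer, though that is only an inference from the log.

```python
import re

# Matches lines such as:
#   ... CRIT bgp0#ZEBRA[44]: symbol+0xNN <pc> <hex> <module path> (mapped at 0x<base>)
# The first hex value equals the PC printed in the "Received signal" line.
FRAME_RE = re.compile(
    r"(?P<container>\w+)#ZEBRA\[\d+\]: "
    r"(?P<symbol>\w+)\+0x(?P<sym_off>[0-9a-f]+) "
    r"(?P<pc>[0-9a-f]+) [0-9a-f]+ "
    r"(?P<module>\S+) \(mapped at 0x(?P<base>[0-9a-f]+)\)"
)


def parse_frame(line):
    """Return container, symbol, module and module-relative offset, or None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    pc = int(m["pc"], 16)
    base = int(m["base"], 16)
    return {
        "container": m["container"],
        "symbol": m["symbol"],
        "module": m["module"],
        "module_offset": pc - base,
    }


# The faulting frame from the Jul 11 backtrace, copied from the log above.
line = (
    "Jul 11 19:39:20.467892 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: "
    "zfpm_netlink_encode_route+0x616 7f1489871646 7ffcca4695f0 "
    "/usr/lib/x86_64-linux-gnu/frr/modules/zebra_fpm.so (mapped at 0x7f1489869000)"
)
frame = parse_frame(line)
print(frame["symbol"], hex(frame["module_offset"]))
# prints: zfpm_netlink_encode_route 0x8646
```

With matching debug symbols installed, an offset like this can usually be given to `addr2line -e <module> <offset>` to find the source line. Whether the offset maps directly depends on how the module was linked.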

@saksarav-nokia
Contributor

@vmittal-msft @arlakshm, we have the core and docker-fpm-frr-dbg.gz files, but I am unable to attach them here because they are too large. Can we put them in our teams shared link?

@saksarav-nokia
Contributor

We have also seen zebra crash with this backtrace:
Jul 23 20:43:08.445300 ixre-egl-board1 NOTICE swss1#portsyncd: :- onMsg: nlmsg type:16 key:eth1 admin:1 oper:1 addr:6a:f4:65:9b:7b:e0 ifindex:14 master:0 type:macvlan
Jul 23 20:43:08.445596 ixre-egl-board1 NOTICE swss0#portsyncd: :- onMsg: nlmsg type:16 key:eth1 admin:1 oper:1 addr:46:c8:d2:8c:b2:43 ifindex:15 master:0 type:macvlan
Jul 23 20:43:08.445978 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: Received signal 11 at 1690144988 (si_addr 0xd0, PC 0x55a69a48dc14); aborting...
Jul 23 20:43:08.446053 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: Received signal 11 at 1690144988 (si_addr 0xd0, PC 0x55d280ed7c14); aborting...
Jul 23 20:43:08.446765 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: zlog_signal+0xf5 7fb88f60d215 7fff58cfaa30 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fb88f571000)
Jul 23 20:43:08.446816 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: zlog_signal+0xf5 7f2e5d741215 7fffa37bfb70 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f2e5d6a5000)
Jul 23 20:43:08.447304 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: PBKDF2_SHA256+0x4e1 7fb88f639851 7fff58cfab70 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fb88f571000)
Jul 23 20:43:08.447348 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: PBKDF2_SHA256+0x4e1 7f2e5d76d851 7fffa37bfcb0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f2e5d6a5000)
Jul 23 20:43:08.447829 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: funlockfile+0x50 7fb88f54b140 7fff58cfacc0 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7fb88f538000)
Jul 23 20:43:08.447829 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: funlockfile+0x50 7f2e5d67f140 7fffa37bfe00 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7f2e5d66c000)
Jul 23 20:43:08.448086 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: ---- signal ----
Jul 23 20:43:08.448086 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: ---- signal ----
Jul 23 20:43:08.448086 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: zebra_vxlan_macvlan_up+0x24 55a69a48dc14 7fff58cfb250 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.448086 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: zebra_vxlan_macvlan_up+0x24 55d280ed7c14 7fffa37c0390 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.448311 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: if_up+0x278 55d280e5f508 7fffa37c03c0 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.448311 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: if_up+0x278 55a69a415508 7fff58cfb280 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.448530 ixre-egl-board1 INFO kernel: [25681.832068] amd-xgbe 0000:0a:00.4 qfpgap0: Link is Up - 2.5Gbps/Full - flow control rx/tx
Jul 23 20:43:08.448545 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: netlink_link_change+0xb69 55d280e59fa9 7fffa37c0410 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.448545 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: netlink_link_change+0xb69 55a69a40ffa9 7fff58cfb2d0 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.448758 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: netlink_parse_info+0x15d 55a69a41aeed 7fff58cfb860 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.448849 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: netlink_parse_info+0x15d 55d280e64eed 7fffa37c09a0 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.448979 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: kernel_dplane_read+0xb5 55a69a41b225 7fff58d03940 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.449025 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: kernel_dplane_read+0xb5 55d280e65225 7fffa37c8a80 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.449499 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: thread_call+0x7d 7fb88f64b48d 7fff58d039d0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fb88f571000)
Jul 23 20:43:08.449560 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: thread_call+0x7d 7f2e5d77f48d 7fffa37c8b10 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f2e5d6a5000)
Jul 23 20:43:08.450004 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: frr_run+0xe8 7fb88f6054a8 7fff58d03a70 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fb88f571000)
Jul 23 20:43:08.450089 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: frr_run+0xe8 7f2e5d7394a8 7fffa37c8bb0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7f2e5d6a5000)
Jul 23 20:43:08.450203 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: main+0x346 55a69a408fc6 7fff58d03c90 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.450310 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: main+0x346 55d280e52fc6 7fffa37c8dd0 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.450686 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: __libc_start_main+0xea 7fb88f387d0a 7fff58d03d90 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fb88f364000)
Jul 23 20:43:08.450805 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: __libc_start_main+0xea 7f2e5d4bbd0a 7fffa37c8ed0 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7f2e5d498000)
Jul 23 20:43:08.450907 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: _start+0x2a 55a69a409a2a 7fff58d03e60 /usr/lib/frr/zebra (mapped at 0x55a69a38f000)
Jul 23 20:43:08.450907 ixre-egl-board1 CRIT bgp0#ZEBRA[44]: in thread kernel_read scheduled from zebra/kernel_netlink.c:419 kernel_read()
Jul 23 20:43:08.451032 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: _start+0x2a 55d280e53a2a 7fffa37c8fa0 /usr/lib/frr/zebra (mapped at 0x55d280dd9000)
Jul 23 20:43:08.451050 ixre-egl-board1 CRIT bgp1#ZEBRA[44]: in thread kernel_read scheduled from zebra/kernel_netlink.c:419 kernel_read()
Jul 23 20:43:08.451284 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group libfrr
Jul 23 20:43:08.451338 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Buffer : 3 * 24
Jul 23 20:43:08.451338 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Host config : 5 * (variably sized)
Jul 23 20:43:08.451363 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Command Tokens : 4554 * 72
Jul 23 20:43:08.451446 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Command Token Text : 3269 * (variably sized)
Jul 23 20:43:08.451446 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Command Token Help : 3269 * (variably sized)
Jul 23 20:43:08.451468 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Command Argument Name : 1068 * (variably sized)
Jul 23 20:43:08.451483 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: RCU thread : 5 * 128
Jul 23 20:43:08.451530 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: RCU sequence barrier : 1 * 32
Jul 23 20:43:08.451530 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: FRR POSIX Thread : 10 * (variably sized)
Jul 23 20:43:08.451530 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group libfrr
Jul 23 20:43:08.451630 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Buffer : 3 * 24
Jul 23 20:43:08.451630 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Host config : 5 * (variably sized)
Jul 23 20:43:08.451630 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Command Tokens : 4554 * 72
Jul 23 20:43:08.451630 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Command Token Text : 3269 * (variably sized)
Jul 23 20:43:08.451719 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: POSIX sync primitives : 10 * (variably sized)
Jul 23 20:43:08.451719 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Graph : 31 * 8
Jul 23 20:43:08.451719 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Command Token Help : 3269 * (variably sized)
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Graph Node : 5333 * 32
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Command Argument Name : 1068 * (variably sized)
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: RCU thread : 5 * 128
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Hash : 740 * (variably sized)
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: RCU sequence barrier : 1 * 32
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Hash Bucket : 1162 * 32
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: FRR POSIX Thread : 10 * (variably sized)
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Hash Index : 370 * (variably sized)
Jul 23 20:43:08.451759 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: POSIX sync primitives : 10 * (variably sized)
Jul 23 20:43:08.451816 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Interface : 32 * 280
Jul 23 20:43:08.451816 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Connected : 64 * 48
Jul 23 20:43:08.451816 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Link List : 297 * 40
Jul 23 20:43:08.451841 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Link Node : 611 * 24
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Graph : 31 * 8
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Temporary memory : 5 * (variably sized)
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Graph Node : 5333 * 32
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Bitfield memory : 1 * 2052
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Hash : 740 * (variably sized)
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Module loading name : 2 * (variably sized)
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Hash Bucket : 1160 * 32
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Nexthop : 110 * 152
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Hash Index : 370 * (variably sized)
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Interface : 32 * 280
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: NetNS Context : 2 * (variably sized)
Jul 23 20:43:08.451884 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Connected : 64 * 48
Jul 23 20:43:08.451953 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Link List : 297 * 40
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Link Node : 611 * 24
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Temporary memory : 5 * (variably sized)
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: NetNS Name : 1 * 18
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Bitfield memory : 1 * 2052
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Module loading name : 2 * (variably sized)
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Northbound Node : 655 * 1192
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Nexthop : 110 * 152
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Northbound Configuration : 2 * 16
Jul 23 20:43:08.452025 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: NetNS Context : 2 * (variably sized)
Jul 23 20:43:08.452093 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Northbound Configuration Entry: 45 * 1032
Jul 23 20:43:08.452344 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route map rule str : 2 * (variably sized)
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route map compiled : 2 * 16
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Prefix : 64 * 48
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route map dependency : 2 * 24
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Privilege information : 4 * (variably sized)
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route map dependency data : 2 * 16
Jul 23 20:43:08.452390 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map : 12 * 120
Jul 23 20:43:08.452430 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Stream : 8 * (variably sized)
Jul 23 20:43:08.452430 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Stream FIFO : 7 * 64
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route table : 67 * 56
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map name : 20 * (variably sized)
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map index : 18 * 152
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route node : 51982 * (variably sized)
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map rule : 2 * 40
Jul 23 20:43:08.452458 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Thread : 30 * 160
Jul 23 20:43:08.452519 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map rule str : 2 * (variably sized)
Jul 23 20:43:08.452519 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Thread master : 24 * (variably sized)
Jul 23 20:43:08.452519 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Thread Poll Info : 12 * 8388608
Jul 23 20:43:08.452539 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Thread stats : 37 * 96
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Typed-hash bucket : 51 * (variably sized)
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map compiled : 2 * 16
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map dependency : 2 * 24
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Typed-heap array : 1 * 576
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route map dependency data : 2 * 16
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Vector : 10729 * 24
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Stream : 8 * (variably sized)
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Vector index : 10729 * (variably sized)
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: VRF : 1 * 216
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: VRF bit-map : 4 * 8
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: VTY server : 2 * 32
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Stream FIFO : 7 * 64
Jul 23 20:43:08.452585 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Work queue : 2 * (variably sized)
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route table : 67 * 56
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route node : 51982 * (variably sized)
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Thread : 30 * 160
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Thread master : 24 * (variably sized)
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Thread Poll Info : 12 * 8388608
Jul 23 20:43:08.452716 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Thread stats : 37 * 96
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Typed-hash bucket : 51 * (variably sized)
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Typed-heap array : 1 * 576
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Vector : 10729 * 24
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Vector index : 10729 * (variably sized)
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: VRF : 1 * 216
Jul 23 20:43:08.452855 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Work queue item : 1 * 24
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: VRF bit-map : 4 * 8
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Work queue name string : 1 * 22
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: VTY server : 2 * 32
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: YANG module : 7 * 48
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Work queue : 2 * (variably sized)
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Work queue item : 1 * 24
Jul 23 20:43:08.452900 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: log thread-local buffer : 6 * 24608
Jul 23 20:43:08.452949 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Work queue name string : 1 * 22
Jul 23 20:43:08.452949 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group logging subsystem
Jul 23 20:43:08.452949 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: syslog target : 1 * 56
Jul 23 20:43:08.452949 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group Label Manager
Jul 23 20:43:08.452985 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group Table Manager
Jul 23 20:43:08.452985 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Table Manager Context : 1 * 16
Jul 23 20:43:08.453016 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group SRv6 Manager
Jul 23 20:43:08.453016 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: showing active allocations in memory group zebra
Jul 23 20:43:08.453016 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: YANG module : 7 * 48
Jul 23 20:43:08.453016 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Zebra Interface Information : 32 * 472
Jul 23 20:43:08.453016 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: log thread-local buffer : 6 * 24608
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group logging subsystem
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Zebra Netlink buffers : 1 * 131072
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: syslog target : 1 * 56
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Router Advertisement Prefix : 17 * 48
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group Label Manager
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Zebra DPlane Provider : 1 * 232
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group Table Manager
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: DPlane NSes : 1 * 120
Jul 23 20:43:08.453064 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Table Manager Context : 1 * 16
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Nexthop Group Entry : 98 * 112
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group SRv6 Manager
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: showing active allocations in memory group zebra
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Nexthop Group Connected : 74 * 40
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Zebra Interface Information : 32 * 472
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Zebra Name Space : 1 * 400
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: PTM BFD process registration table.: 1 * 32
Jul 23 20:43:08.453133 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Route Entry : 334 * 104
Jul 23 20:43:08.453180 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: RIB destination : 245 * 88
Jul 23 20:43:08.453180 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: RIB table info : 4 * 24
Jul 23 20:43:08.453202 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Zebra VRF table : 4 * 56
Jul 23 20:43:08.453216 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: Nexthop tracking object : 36 * 240
Jul 23 20:43:08.453216 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Zebra Netlink buffers : 1 * 131072
Jul 23 20:43:08.453216 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: ZEBRA VRF : 1 * 5056
Jul 23 20:43:08.453249 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Router Advertisement Prefix : 17 * 48
Jul 23 20:43:08.453249 ixre-egl-board1 INFO bgp0#supervisord: zebra core_handler: memstats: MH global info : 1 * 128
Jul 23 20:43:08.453249 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Zebra DPlane Provider : 1 * 232
Jul 23 20:43:08.453249 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: DPlane NSes : 1 * 120
Jul 23 20:43:08.453249 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Nexthop Group Entry : 98 * 112
Jul 23 20:43:08.453282 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Nexthop Group Connected : 74 * 40
Jul 23 20:43:08.453282 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Zebra Name Space : 1 * 400
Jul 23 20:43:08.453327 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: PTM BFD process registration table.: 1 * 32
Jul 23 20:43:08.453327 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Route Entry : 334 * 104
Jul 23 20:43:08.453327 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: RIB destination : 245 * 88
Jul 23 20:43:08.453327 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: RIB table info : 4 * 24
Jul 23 20:43:08.453327 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Zebra VRF table : 4 * 56
Jul 23 20:43:08.453366 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: Nexthop tracking object : 36 * 240
Jul 23 20:43:08.453401 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: ZEBRA VRF : 1 * 5056
Jul 23 20:43:08.453401 ixre-egl-board1 INFO bgp1#supervisord: zebra core_handler: memstats: MH global info : 1 * 128
Jul 23 20:43:10.004821 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:10,004 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)
Jul 23 20:43:10.004821 ixre-egl-board1 INFO bgp1#supervisord: fpmsyncd Connection lost, reconnecting...
Jul 23 20:43:10.004821 ixre-egl-board1 INFO bgp1#supervisord: fpmsyncd Waiting for fpm-client connection...
Jul 23 20:43:10.009939 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:10,009 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)
Jul 23 20:43:10.009939 ixre-egl-board1 INFO bgp0#supervisord: fpmsyncd Connection lost, reconnecting...
Jul 23 20:43:10.009968 ixre-egl-board1 INFO bgp0#supervisord: fpmsyncd Waiting for fpm-client connection...
Jul 23 20:43:10.013537 ixre-egl-board1 INFO bgp1#supervisor-proc-exit-listener: Process 'zebra' exited unexpectedly. Terminating supervisor 'bgp'
Jul 23 20:43:10.014439 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:10,013 WARN received SIGTERM indicating exit request
Jul 23 20:43:10.014640 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:10,014 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd, bgpmon, fpmsyncd, staticroutebfd to die
Jul 23 20:43:10.018463 ixre-egl-board1 INFO bgp0#supervisor-proc-exit-listener: Process 'zebra' exited unexpectedly. Terminating supervisor 'bgp'
Jul 23 20:43:10.019080 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:10,018 WARN received SIGTERM indicating exit request
Jul 23 20:43:10.019228 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:10,018 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd, bgpmon, fpmsyncd, staticroutebfd to die
Jul 23 20:43:10.612669 ixre-egl-board1 NOTICE coredump_gen_handler.py[1136105]: Another instance of techsupport running, aborting this. stderr: Accquiring lock failed, PID 1136228 is active
Jul 23 20:43:11.016600 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:11,016 INFO stopped: staticroutebfd (exit status 0)
Jul 23 20:43:11.017600 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:11,016 INFO stopped: fpmsyncd (terminated by SIGTERM)
Jul 23 20:43:11.018342 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:11,018 INFO stopped: bgpmon (terminated by SIGTERM)
Jul 23 20:43:11.021082 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:11,020 INFO stopped: staticroutebfd (exit status 0)
Jul 23 20:43:11.021561 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:11,021 INFO stopped: fpmsyncd (terminated by SIGTERM)
Jul 23 20:43:11.023169 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:11,022 INFO stopped: bgpmon (terminated by SIGTERM)
Jul 23 20:43:12.288224 ixre-egl-board1 INFO nokia-ndk-qfpga-mgr.sh[1135980]: ndk_qfpga_mgr is up and running
Jul 23 20:43:12.288684 ixre-egl-board1 INFO systemd[1]: Started Nokia IXR-7250 QFpga Manager Service.
Jul 23 20:43:13.021099 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:13,020 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:13.024703 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:13,024 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:16.025136 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:16,024 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:16.028473 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:16,028 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:18.790211 ixre-egl-board1 INFO dbus-daemon[820]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.5' (uid=0 pid=1137750 comm="systemd-analyze plot ")
Jul 23 20:43:18.795293 ixre-egl-board1 INFO systemd[1]: Starting Hostname Service...
Jul 23 20:43:18.910901 ixre-egl-board1 INFO dbus-daemon[820]: [system] Successfully activated service 'org.freedesktop.hostname1'
Jul 23 20:43:18.911106 ixre-egl-board1 INFO systemd[1]: Started Hostname Service.
Jul 23 20:43:19.029171 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:19,028 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:19.031514 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:19,031 INFO waiting for supervisor-proc-exit-listener, rsyslogd, staticd, bgpd, bgpcfgd to die
Jul 23 20:43:21.031846 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:21,031 WARN killing 'bgpcfgd' (66) with SIGKILL
Jul 23 20:43:21.033187 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:21,032 WARN killing 'bgpcfgd' (66) with SIGKILL
Jul 23 20:43:21.034752 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:21,034 INFO stopped: bgpcfgd (terminated by SIGKILL)
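
When scanning a long syslog capture like the one above, it can help to pull out only the supervisord lines that report an unexpected process exit. The following is a minimal sketch, not part of SONiC: the regular expression, the function name `find_unexpected_exits`, and the `sample` list are assumptions based solely on the line format seen in this log.

```python
import re
from typing import Iterable, List, Tuple

# Assumed line shape, taken from the log above:
#   "... INFO bgp0#supervisord <date> <time> INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)"
EXIT_RE = re.compile(
    r"(?P<container>\S+)#supervisord \S+ \S+ INFO exited: "
    r"(?P<proc>\S+) \(terminated by (?P<signal>SIG[A-Z0-9]+)"
)


def find_unexpected_exits(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (container, process, signal) for each unexpected exit line."""
    hits = []
    for line in lines:
        match = EXIT_RE.search(line)
        # Only keep exits that supervisord itself flags as "not expected".
        if match and "not expected" in line:
            hits.append((match["container"], match["proc"], match["signal"]))
    return hits


# Two lines copied from the log above: one crash, one orderly stop.
sample = [
    "Jul 23 20:43:10.009939 ixre-egl-board1 INFO bgp0#supervisord 2023-07-23 20:43:10,009 "
    "INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)",
    "Jul 23 20:43:11.017600 ixre-egl-board1 INFO bgp1#supervisord 2023-07-23 20:43:11,016 "
    "INFO stopped: fpmsyncd (terminated by SIGTERM)",
]
exits = find_unexpected_exits(sample)
```

Processes that supervisord stops on purpose during shutdown are logged as `stopped:` rather than `exited:`, so they are skipped.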

@mlok-nokia
Contributor

@mlok-nokia please check if a similar issue/fix exists in the FRR GitHub

No. We did not find any related fix.

@mlok-nokia
Contributor

An issue has been raised against the FRRouting submodule: FRRouting/frr#14092

@saksarav-nokia
Contributor

show_ip_route.txt

@saksarav-nokia
Contributor

bgp_asic0.zip

@rlhui added the P0 label on Aug 9, 2023
lguohan pushed a commit that referenced this issue Aug 31, 2023
Why I did it
Fixes #15803

On a SONiC chassis, routes learned from a remote linecard require recursive nexthop resolution.
In some cases, after recursive resolution, the number of nexthops for a route can reach 256.
Zebra then runs out of space while filling in the 256 nexthops, which causes it to crash.

Work item tracking
Microsoft ADO (24997365):

How I did it
Created a patch that ports FRRouting/frr#14096, which ignores duplicate nexthops when filling in the FPM message.

Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <[email protected]>
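
The failure mode and the fix described in this commit message can be illustrated with a minimal sketch. This is not the FRR code: `MAX_NEXTHOPS`, `fill_nexthops_naive`, and `fill_nexthops_dedup` are hypothetical names, the capacity is an arbitrary small value, and nexthops are modeled as plain (gateway, interface) tuples. In C, writing past a fixed-size buffer can corrupt memory; here an exception stands in for that.

```python
from typing import Iterable, List, Tuple

# Hypothetical model of a nexthop: (gateway address, outgoing interface).
Nexthop = Tuple[str, str]

# Assumed capacity for illustration only; not the real limit in zebra.
MAX_NEXTHOPS = 4


def fill_nexthops_naive(resolved: Iterable[Nexthop]) -> List[Nexthop]:
    """Copy every resolved nexthop into a bounded buffer, duplicates included."""
    buf: List[Nexthop] = []
    for nh in resolved:
        if len(buf) >= MAX_NEXTHOPS:
            # Stand-in for running out of space in a fixed-size buffer.
            raise OverflowError("nexthop buffer full")
        buf.append(nh)
    return buf


def fill_nexthops_dedup(resolved: Iterable[Nexthop]) -> List[Nexthop]:
    """Same as above, but skip nexthops already present in the buffer."""
    buf: List[Nexthop] = []
    seen = set()
    for nh in resolved:
        if nh in seen:
            continue  # duplicate produced by recursive resolution; ignore it
        if len(buf) >= MAX_NEXTHOPS:
            raise OverflowError("nexthop buffer full")
        seen.add(nh)
        buf.append(nh)
    return buf


# Recursive resolution can yield the same nexthop several times.
resolved = [("10.0.0.1", "Ethernet0"), ("10.0.0.2", "Ethernet4")] * 3
deduped = fill_nexthops_dedup(resolved)
```

With six resolved entries but only two distinct nexthops, the deduplicating version stays within the buffer, while the naive version overflows it.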
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Sep 3, 2023
sonic-otn pushed a commit to sonic-otn/sonic-buildimage that referenced this issue Sep 20, 2023