forked from sonic-net/sonic-buildimage
ARS POC #7
Open: VladimirKuk wants to merge 247 commits into master from Mrvl-ARS
Signed-off-by: Vladimir Kuk <[email protected]>
Signed-off-by: Vladimir Kuk <[email protected]>
Signed-off-by: Vladimir Kuk <[email protected]>
Signed-off-by: Vladimir Kuk <[email protected]>
shiraez pushed a commit that referenced this pull request on Dec 11, 2024
…et#21095)

Adding the below fix from FRR: FRRouting/frr#17297

This is to fix the following crash, which is a statistical (intermittent) issue:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
Signed-off-by: Vladimir Kuk <[email protected]>
Signed-off-by: Vladimir Kuk <[email protected]>
- Why I did it
After pull request sonic-net#19190, pmon was added to the start list in fast/warm reboot scenarios. However, certain non-critical pmon daemons can be delayed, which saves approximately 1 second in the reboot process. For performance reasons, especially as the current fast reboot time is close to the 30-second limit, this change eases the pressure.
- How I did it
Add a script that acts as a fast/warm reboot monitor, together with the related supervisord rules. Once the script exits, meaning the reboot process has ended, the delayed daemons are initialized.
- How to verify it
Check the fast/warm reboot time usage.
Signed-off-by: Yuanzhe, Liu <[email protected]>
* [Micas/Platform] platform support M2-W6920-32QC2X
  Signed-off-by: philo <[email protected]>
* update device files
  Signed-off-by: philo <[email protected]>
* trigger rebuild
* rebuild
* rebuild
* rebuild
* trigger rebuild
* trigger rebuild
* trigger rebuild
---------
Signed-off-by: philo <[email protected]>
…onic-net#20726)

== Why I did it ==
Commit 06c469e added an extra redis instance. This resulted in a two-item string without linefeeds in /etc/supervisor/critical_processes:
    program:redisprogram:redis_bmp
That caused an error in syslog and made docker-database fail.

== Work item tracking ==
ERR database#supervisor-proc-exit-listener: Syntax of the line program:redisprogram:redis_bmp#012 in processes file is incorrect. Exiting... (sonic-net#20636)
ossobv#17

== How I did it ==
Move the jinja2 whitespace-eating hyphens from the beginning of the line (BOL) to the end of the line (EOL).
Note that j2 and the jinja2 parser in sonic-cfggen do not behave the same. The sonic-cfggen one is the relevant one.

Before:
    $ j2 ./dockers/docker-database/critical_processes.j2.old -f json \
        <<< '{"INSTANCES":{"foo":"bar","baz":"..."}}'
    |
    |
    program:foo
    program:baz
    # docker exec database sonic-cfggen \
        -j /var/run/redis/sonic-db/database_config.json \
        -t /usr/share/sonic/templates/critical_processes.j2.old
    program:redisprogram:redis_bmp

After:
    $ j2 ./dockers/docker-database/critical_processes.j2 -f json \
        <<< '{"INSTANCES":{"foo":"bar","baz":"..."}}'
    program:foo
    program:baz
    # docker exec database sonic-cfggen \
        -j /var/run/redis/sonic-db/database_config.json \
        -t /usr/share/sonic/templates/critical_processes.j2
    program:redis
    program:redis_bmp
    |

After this fix, the output in /etc/supervisor/critical_processes is correct and the error from docker-database is gone.
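The failure mode described in this commit can be illustrated with a small sketch. It assumes that every non-empty line of critical_processes must have the form `program:<name>`; the regular expression and the function name `check_critical_processes` are hypothetical and are not taken from supervisor-proc-exit-listener itself.

```python
import re

# Assumed line format: "program:<name>". The real listener may accept more forms.
LINE_RE = re.compile(r"^program:[A-Za-z0-9_.-]+$")


def check_critical_processes(text):
    """Return the non-empty lines that do not match the assumed format."""
    return [line for line in text.splitlines() if line and not LINE_RE.match(line)]


# Output of the old template: two entries fused onto one line.
broken = "program:redisprogram:redis_bmp\n"
# Output of the fixed template: one entry per line.
fixed = "program:redis\nprogram:redis_bmp\n"

print(check_critical_processes(broken))
print(check_critical_processes(fixed))
```

Under that assumption, the fused line is reported as malformed, while the two separate lines pass.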
Why I did it
Bugfix for the YANG model of BGP Allowed Prefix.
* Support optional NEIGHBOR_TYPE in key.
* Support optional le and ge in the prefixes_v4/prefixes_v6 list (e.g., 10.20.30.0/24 le 30).
Work item tracking
Microsoft ADO (number only): 30001113
How I did it
Updated sonic-bgp-allowed-prefix.yang.
* Define optional value NEIGHBOR_TYPE in key.
* Define types bgp-allowed-ipv4-prefix and bgp-allowed-ipv6-prefix to support the optional suffix in the prefixes_v4/prefixes_v6 list.
How to verify it
Verified by UT:
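As a rough illustration of the optional `ge`/`le` suffix, here is a minimal sketch of a parser for entries such as `10.20.30.0/24 le 30`. The accepted grammar (`<prefix>[ ge <len>][ le <len>]`), the range checks, and the name `parse_allowed_prefix` are assumptions made for the example; the actual pattern in sonic-bgp-allowed-prefix.yang is not shown here and may differ.

```python
import ipaddress
import re

# Assumed entry grammar: "<prefix>[ ge <len>][ le <len>]"
ENTRY_RE = re.compile(r"^(\S+)(?: ge (\d+))?(?: le (\d+))?$")


def parse_allowed_prefix(entry, version=4):
    """Parse one prefix entry and return (network, ge, le); ge/le may be None."""
    match = ENTRY_RE.match(entry)
    if not match:
        raise ValueError(f"malformed entry: {entry!r}")
    network = ipaddress.ip_network(match.group(1))
    if network.version != version:
        raise ValueError(f"expected an IPv{version} prefix: {entry!r}")
    ge = int(match.group(2)) if match.group(2) else None
    le = int(match.group(3)) if match.group(3) else None
    # Assumed sanity check: bounds lie between the prefix length and the address width.
    for bound in (ge, le):
        if bound is not None and not network.prefixlen <= bound <= network.max_prefixlen:
            raise ValueError(f"bound {bound} out of range for {network}")
    if ge is not None and le is not None and ge > le:
        raise ValueError("ge must not exceed le")
    return network, ge, le


print(parse_allowed_prefix("10.20.30.0/24 le 30"))
```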
…D automatically (sonic-net#20878)

#### Why I did it
src/sonic-platform-daemons
```
* b276e41 - (HEAD -> master, origin/master, origin/HEAD) [SmartSwitch] Extend implementation of the DPU chassis daemon. (sonic-net#563) (9 hours ago) [Oleksandr Ivantsiv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Update EZB files to version 1.09 to support SAI 1.14.0.2 for ac3x(armhf)
Update EZB files to version 1.09 to support SAI 1.14.0.3 for ac5x(arm64)
* Add stp submodule
* Changing the sonic-stp repo url
* [Marvell] Falcon 3.2T HwSku support Signed-off-by: Rajkumar P R <[email protected]>
…ffic script (sonic-net#20635) [SmartSwitch] Added inbound traffic capability for DPU management traffic script
Fix nvidia smartswitch build pipeline Signed-off-by: Prabhat Aravind <[email protected]>
Why I did it
The BMP instance should not be launched on the DPU database.
Work item tracking
Microsoft ADO (number only):
How I did it
Added a condition check to prevent the BMP instance from being instantiated on DPU database instances.
How to verify it
Verified locally on the KVM NPU platform.
…SIS_APP_DB (sonic-net#20369)
* Modify database.sh to create an initial SYSTEM_LAG_IDS_FREE_LIST in the CHASSIS_APP_DB on the SUP during database-chassis startup.
* Modify the database consistency check in swss.sh to append the lagid to the end of SYSTEM_LAG_IDS_FREE_LIST when the lagid is released.
* Set lag_id_end=1023 (not 1024) in chassisdb.conf, since the largest lagid that BCM supports is 1023.
Signed-off-by: mlok <[email protected]>
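The free-list behaviour described above can be sketched in a few lines. This is only an in-memory model: the class name `LagIdFreeList`, the default `lag_id_start=1`, and allocating from the front of the list are assumptions for illustration. Only the upper bound of 1023 and the append-on-release rule come from the commit message.

```python
from collections import deque


class LagIdFreeList:
    """In-memory model of SYSTEM_LAG_IDS_FREE_LIST (hypothetical helper)."""

    def __init__(self, lag_id_start=1, lag_id_end=1023):
        # Initial list holds every ID in [lag_id_start, lag_id_end].
        self._free = deque(range(lag_id_start, lag_id_end + 1))

    def allocate(self):
        # Assumption: IDs are handed out from the front of the list.
        if not self._free:
            raise RuntimeError("no free LAG IDs left")
        return self._free.popleft()

    def release(self, lag_id):
        # A released ID goes to the END of the list, as the commit describes.
        if lag_id in self._free:
            raise ValueError(f"LAG ID {lag_id} is already free")
        self._free.append(lag_id)

    def free_ids(self):
        return list(self._free)


free_list = LagIdFreeList(lag_id_start=1, lag_id_end=4)
first = free_list.allocate()
second = free_list.allocate()
free_list.release(first)
print(free_list.free_ids())
```

Appending released IDs to the end means a freshly released ID is reused last, which is one plausible reason for the ordering, although the commit message does not state the motivation.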
Enable Multi DB
Why I did it
Build the bmp container into sonic-buildimage, and add the relevant daemon/file handling.
Work item tracking
Microsoft ADO (number only): 27588893
How I did it
Built the bmp container into sonic-buildimage, and added the relevant daemon/file handling.
How to verify it
Built successfully locally and verified on a lab DUT.
…ly (sonic-net#20892)

#### Why I did it
src/sonic-bmp
```
* a2d576b - (HEAD -> master, origin/master, origin/HEAD) Update README.md to add azure pipeline status link (12 hours ago) [Feng-msft]
```
#### How I did it
#### How to verify it
#### Description for the changelog
…lly (sonic-net#20894)

#### Why I did it
src/sonic-swss
```
* eda63a9b - (HEAD -> master, origin/master, origin/HEAD) Vlanmgrd handling of portchannel does not exist more gracefully. (sonic-net#3367) (6 hours ago) [abdosi]
```
#### How I did it
#### How to verify it
#### Description for the changelog
…omatically (sonic-net#20854)

#### Why I did it
src/sonic-swss-common
```
* ebd2afb - (HEAD -> master, origin/master, origin/HEAD) Supports FRR-VRRP configuration (sonic-net#813) (25 hours ago) [Philo]
* fe30ccd - [DASH] Add DASH Meter Policy , Rule , Counter table definitions (sonic-net#949) (2 days ago) [Sundara Gurunathan]
* 901f3b4 - [common] enable redispipeline to only publish after flush (sonic-net#895) (3 days ago) [Yijiao Qin]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Run all pipeline mgmt tests with this SAI version on a T2 testbed.
Why I did it
Keep using syslog for bmp, since it produces no large output.
How I did it
Reverted the previous bmp log change.
How to verify it
Reverted the change; verification pass is pending.
Adding FRR CLI to support SRv6 static. The HLD for the feature is available at sonic-net/SONiC#1860 Signed-off-by: Carmine Scarpitta <[email protected]>
Why I did it To support the addition of two new tables in CONFIG_DB, i.e. SRV6_MY_SIDS and SRV6_MY_LOCATORS, in order to allow configuration for SRv6 in SONiC. Work item tracking Microsoft ADO (number only): 30513277 How I did it I define the YANG model based on SRv6 HLD. How to verify it Run the unit tests and build image.
Why I did it
If a critical process crashes or is killed, the bmp docker container is not auto-restarted.
How I did it
/usr/bin/supervisor-proc-exit-listener is in charge of critical process monitoring and event publishing, so it should be autorestarted in any case. Otherwise there may be issues if supervisor-proc-exit-listener crashes, or in test cases such as "docker exec bmp kill -SIGKILL -1", where critical process handling may not work correctly under a race condition (depending on whether supervisor-proc-exit-listener is the last one to be killed).
When a container receives the SIGKILL signal to terminate its processes, the order in which the processes are actually terminated can depend on scheduling and resource availability within the container. If supervisor-proc-exit-listener is killed before the critical process, the container auto-restart is not triggered as expected.
…t#21366)
Why I did it
Use a Debian mirror snapshot instead of Debian version pinning, because version pinning can't handle the package uninstallation scenario.
…atically (sonic-net#21420)

#### Why I did it
src/sonic-snmpagent
```
* 9e2c50a - (HEAD -> master, origin/master, origin/HEAD) Fix snmp agent not-responding issue when high CPU utilization (sonic-net#345) (2 hours ago) [Jianquan Ye]
```
#### How I did it
#### How to verify it
#### Description for the changelog
…tomatically (sonic-net#21416)

#### Why I did it
src/sonic-linux-kernel
```
* 416e7a4 - (HEAD -> master, origin/master, origin/HEAD) Fix optoe's write_max when using native i2c driver (sonic-net#407) (6 hours ago) [Prince George]
```
#### How I did it
#### How to verify it
#### Description for the changelog
…lly (sonic-net#21422)

#### Why I did it
src/sonic-swss
```
* 4eb74f00 - (HEAD -> master, origin/master, origin/HEAD) [orchagent] Fix: ERR swss#orchagent: :- setPortPvid: pvid setting for tunnel Port_EVPN_XXX is not allowed (sonic-net#3402) (9 hours ago) [Brad House]
```
#### How I did it
#### How to verify it
#### Description for the changelog
sonic-net#21355)
Why I did it
This is one part of the fix for sonic-net#21314: SNMP walker requests always time out at 100% CPU utilization.
Work item tracking
Microsoft ADO 30112399:
How I did it
Enable SNMP dynamic frequency on packet chassis.
How to verify it
snmp/test_snmp_cpu.py (https://github.com/sonic-net/sonic-mgmt/blob/master/tests/snmp/test_snmp_cpu.py) tests the scenario.
Why I did it
After docker-syncd-brcm-dnx-rpc was moved to bookworm in master, libthrift*.so is no longer installed inside the syncd docker, and the syncd process fails to come up.
Work item tracking
Microsoft ADO (number only):
How I did it
Installed libthrift-0.17.0.
How to verify it
Verified that the syncd and swss dockers stay up and are able to run QoS tests.
Why I did it Improve the t1 config to align with YANG validation How I did it Add missing leafref and mandatory field to the config How to verify it YANG validation check on generated config
libthrift did not get installed in the Broadcom syncd RPC container. However, syncd-rpc requires it.
…utomatically (sonic-net#21437)

#### Why I did it
src/sonic-host-services
```
* 0430ada - (HEAD -> master, origin/master, origin/HEAD) Add implementation for DockerService.List (sonic-net#199) (16 hours ago) [Dawei Huang]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Why I did it
Add an additional platform to the SONiC support list.
Work item tracking
Microsoft ADO (number only):
How I did it
Added the necessary platform configurations and identification logic. Some iterations are still necessary on those.
How to verify it
An image containing this PR and the necessary driver changes should end up with links up.
Which release branch to backport (provide reason below if selected)
202411
msft-202412
Description for the changelog
Add initial support for Moby platform
Why I did it
* Fix front panel LEDs for Quicksilver
* Fix fan LEDs for Quicksilver
* Add Moby platform
Work item tracking
Microsoft ADO (number only):
How I did it
Updated Arista platform submodules
- Why I did it Update SAI Version SAIBuild245.3..13 - How I did it Upload SAI artifact and update mlnx-sai.mk file - How to verify it Run sonic-mgmt tests
…omatically (sonic-net#21449)

#### Why I did it
src/sonic-swss-common
```
* 5a4b4a5 - (HEAD -> master, origin/master, origin/HEAD) C API Exceptions (sonic-net#967) (2 hours ago) [erer1243]
* b58a501 - Add swss::Table to c api (sonic-net#964) (12 hours ago) [erer1243]
```
#### How I did it
#### How to verify it
#### Description for the changelog
…lly (sonic-net#21375)

#### Why I did it
src/sonic-gnmi
```
* b017531 - (HEAD -> master, origin/master, origin/HEAD) Fix roles checking if cname not exist (sonic-net#339) (20 hours ago) [ganglv]
* 3afc927 - Better modularization for gnoi_client.go (5 days ago) [Dawei Huang]
|\
| * 6059b18 - finish package sonic module. (6 days ago) [Dawei Huang]
| * 6867664 - finish packaging system module. (6 days ago) [Dawei Huang]
| * 1f77986 - seperate util and system module. (6 days ago) [Dawei Huang]
| * 35df113 - seperate flags into a package config. (6 days ago) [Dawei Huang]
| * 1405cd2 - format and naming clean up for gnoi_client.go. (7 days ago) [Dawei Huang]
* aa547ad - Improve GNMI service to limit API access by role (sonic-net#335) (6 days ago) [ganglv]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Fixed issue: sonic-net#20575
Why I did it
"config-reload" in dualtor topologies was failing due to the absence of a TC_TO_DSCP YANG model. The failure was seen after PR sonic-net/sonic-utilities#3102.
How to verify it
Step-1: On the DUT, add the YANG file at "/usr/local/yang-models/sonic-tc-dscp-map.yang".
Step-2: config reload -y
Tested branch (Please provide the tested image version)
202411
Description for the changelog
Adding YANG model for TC_TO_DSCP_MAP along with test files.
- Why I did it Add SAI_KEY_SPC5_LOSSY_SCHEDULING=1 and SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1 to SN5640-simx sai.profile, to support those features in the SKU - How I did it Added parameters to file - How to verify it Deploy SKU and check SAI logs
…ers and DSCP mapping (sonic-net#21427) - Why I did it To update buffers for Mellanox-SN5600-C256S1 and Mellanox-SN5600-C224O8 - How I did it Update proper files - qos.json.j2, buffer_defaults_objects.j2, buffers_defaults_t0/t1.j2 - How to verify it Run test and check SDK dumps values
Why I did it
The sonic-buildimage repo integrates submodule code into the image through submodule advance PRs. However, these PRs often fail, and as time passes or the number of submodule PRs increases, identifying which PR caused a test failure becomes increasingly challenging. To address this, we need a pipeline capable of quickly building a VS image based on specific submodule commit IDs and running exactly the failed tests. This approach helps us efficiently pinpoint the submodule PR responsible for the failure.
How I did it
Created a pipeline that builds a VS image and runs tests. Users can build a VS image and run tests based on a specific submodule name and commit ID, branch, topology, test scripts, and features.
… automatically (sonic-net#21460)

#### Why I did it
src/sonic-platform-common
```
* 75c320d - (HEAD -> master, origin/master, origin/HEAD) Change Virtium SSD which doesn't support SmartCMD, to use only smartctl (sonic-net#522) (4 hours ago) [Noa Or]
```
#### How I did it
#### How to verify it
#### Description for the changelog
)
Why I did it
This PR is a temporary change; once the rshim interface is replaced, it will no longer be required.
It mounts the dbus socket in the pmon container, because the systemctl command has to be executed to start/stop a service from the pmon container during admin state / reboot command execution.
* dockers/docker-platform-monitor/Dockerfile.j2: add the dbus package for Mellanox-specific platforms in order to use the dbus-send command
* files/build_templates/docker_image_ctl.j2: mount the socket, since the systemctl command is needed to start/stop the service from the pmon container
How I did it
How to verify it
dbus-send commands can be run in the pmon container to start / stop the [email protected] service, which is relevant for starting or stopping the rshim service.
To have the latest DASH bmv2 pipeline and DASH libsai library for DPU KVM image. Work item tracking Microsoft ADO 30793749:
…e starting dash-engine (sonic-net#21452)
Why I did it
If the attached ports of dash-engine are not UP, dash-engine cannot receive any packets. Below is the log when starting dash-engine without the attached ports being UP:
    Calling target program-options parser
    Adding interface eth1 as port 0
    [09:18:14.056] [bmv2] [D] [thread 7] Adding interface eth1 as port 0
    [09:18:14.102] [bmv2] [E] [thread 7] Add port operation failed
    Adding interface eth2 as port 1
    [09:18:14.102] [bmv2] [D] [thread 7] Adding interface eth2 as port 1
    [09:18:14.150] [bmv2] [E] [thread 7] Add port operation failed
Work item tracking
Microsoft ADO 30887888:
How I did it
If the attached ports of dash-engine are not UP, bring them UP before starting dash-engine.
How to verify it
dash-engine runs with the ports added successfully:
    Adding interface eth1 as port 0
    [09:40:16.810] [bmv2] [D] [thread 11] Adding interface eth1 as port 0
    Adding interface eth2 as port 1
    [09:40:16.863] [bmv2] [D] [thread 11] Adding interface eth2 as port 1
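The "bring the ports UP first" step could look roughly like the following sketch. It is not the code from this PR: the helper names, the retry parameters, and the choice to check the administrative IFF_UP bit in `/sys/class/net/<ifname>/flags` are all assumptions, and the actual change may check link state differently.

```python
import subprocess
import time

IFF_UP = 0x1  # administrative "up" bit in the Linux interface flags


def is_admin_up(ifname):
    """Read the interface flags from sysfs and test the IFF_UP bit."""
    with open(f"/sys/class/net/{ifname}/flags") as flags_file:
        return bool(int(flags_file.read().strip(), 16) & IFF_UP)


def set_admin_up(ifname):
    """Bring the interface up with iproute2."""
    subprocess.run(["ip", "link", "set", "dev", ifname, "up"], check=True)


def ensure_ports_up(ifnames, is_up=is_admin_up, bring_up=set_admin_up,
                    retries=10, delay=0.5):
    """Make sure every interface is up before the caller starts dash-engine."""
    for ifname in ifnames:
        for _ in range(retries):
            if is_up(ifname):
                break
            bring_up(ifname)
            time.sleep(delay)
        else:
            # All retries used; check one last time before giving up.
            if not is_up(ifname):
                raise RuntimeError(f"{ifname} did not come up")
```

A startup wrapper would call `ensure_ports_up(["eth1", "eth2"])` before launching dash-engine; the interface names are taken from the log above. The `is_up` and `bring_up` parameters exist so the logic can be exercised without touching real interfaces.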
Fix sonic-net#20284
In 202405 and above, two extra steps were added before the start of every container; they check NUM_DPU and IS_DPU_DEVICE by parsing the platform.json file with the jq tool. This is only relevant for Smartswitch. However, this is adding some delay during the reconciliation phase of WR/FR resulting
How I did it
Set the environment variables for systemd via systemd-sonic-generator.
Signed-off-by: Ze Gan <[email protected]>
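The idea of computing the two values once and handing them to systemd, instead of parsing platform.json before every container start, can be sketched as follows. The real systemd-sonic-generator is not shown here, so this is purely illustrative: the JSON keys `DPUS` and `DPU`, the `true`/`false` encoding, and both function names are assumptions.

```python
import json


def dpu_environment(platform_json_text):
    """Derive NUM_DPU and IS_DPU_DEVICE from platform.json content.

    Assumes a "DPUS" object listing the DPUs on a Smartswitch and a "DPU"
    key marking a DPU device; both key names are assumptions.
    """
    platform = json.loads(platform_json_text)
    num_dpu = len(platform.get("DPUS", {}))
    is_dpu_device = "DPU" in platform
    return {
        "NUM_DPU": str(num_dpu),
        "IS_DPU_DEVICE": "true" if is_dpu_device else "false",
    }


def render_environment_lines(env):
    """Render systemd unit 'Environment=' directives for the given variables."""
    return [f'Environment="{key}={value}"' for key, value in sorted(env.items())]


sample = '{"DPUS": {"dpu0": {}, "dpu1": {}}}'
for line in render_environment_lines(dpu_environment(sample)):
    print(line)
```

With the variables set once in the unit environment, the per-container start path no longer needs to invoke jq.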
…iple ptf nn agents connection (sonic-net#21070)
When testing SONiC with the ptf dataplane connected to multiple ptf nn agents, some cases fail because the packet queues in ptf were not polled thoroughly. This is a bug or missing feature in ptf: p4lang/ptf#207. As a short-term quick fix, this PR patches the ptf-py3 package to unblock our qualification process.
- Why I did it To have the right sensors.conf file for SN5640 SIMX - How I did it Updated sensors.conf file under SN5640-SIMX platform folder - How to verify it run 'sensors' command on Mellanox SN5640 SIMX, and make sure no errors in output
flashrom recently started failing to build with the below error:
```
Cloning into 'flashrom-0.9.7'...
/sonic/src/flashrom/flashrom-0.9.7 /sonic/src/flashrom
fatal: 'tags/0.9.7' is not a commit and a branch 'flashrom-src' cannot be created from it
```
Nothing in sonic-buildimage has changed in relation to this, so presumably flashrom upstream renamed their tags. This commit just fixes the formatting of the tag name to use the new format.
Signed-off-by: Brad House (@bradh352)