BGP Benchmark Performance #5

Open
wants to merge 70 commits into
base: master
664aef8
Merge pull request #2 from Azure/master
selldinesh Mar 15, 2021
e527d1f
Merge branch 'Azure:master' into master
selldinesh Jun 3, 2021
c773979
Merge branch 'Azure:master' into master
selldinesh Jun 16, 2021
8e73e0f
Adding BGP Benchmark testplan and test scripts
selldinesh Jun 25, 2021
265a81e
Merge branch 'Azure:master' into master
selldinesh Jun 25, 2021
914cc50
snappi 4.12 changes
selldinesh Jun 25, 2021
a7659d0
Test steps
selldinesh Jun 25, 2021
3b066da
Merge branch 'Azure:master' into master
selldinesh Jun 29, 2021
33a7d0b
snappi-0.4.14 changes
selldinesh Jun 29, 2021
8629730
Adding snappi fixtures
selldinesh Jun 29, 2021
689d51e
Resolving unused imports
selldinesh Jun 29, 2021
693a8fc
using snappi_api_serv_* fixtures
selldinesh Jun 29, 2021
b05bf49
v6 routes support
selldinesh Jul 2, 2021
bcd03b7
v6 support
selldinesh Jul 2, 2021
489fff1
try except block for interface configuration
selldinesh Jul 6, 2021
ec60200
linter and snappi-0.4.16 changes
selldinesh Jul 12, 2021
9f70680
removing unused variables
selldinesh Jul 12, 2021
39da3fc
Adding RIB-IN capacity test
selldinesh Jul 13, 2021
96f4c7c
final changes
selldinesh Jul 13, 2021
20e20be
Doc changes
selldinesh Jul 13, 2021
b1bcae5
thresholdixia/bgp/test_bgp_local_link_failover.py
selldinesh Jul 14, 2021
697d11b
comment removed
selldinesh Jul 14, 2021
38f57b0
testcase title change
selldinesh Jul 14, 2021
9c56ec5
resolving alert
selldinesh Jul 14, 2021
3f67368
Merge branch 'Azure:master' into master
selldinesh Jul 27, 2021
067ee51
Merge branch 'Azure:master' into master
selldinesh Jul 27, 2021
e7390c7
resolving merge conflict
selldinesh Jul 27, 2021
f1b9d39
Merge branch 'Azure:master' into bgp_convergence
selldinesh Jul 27, 2021
cef341b
updating merge conflict files
selldinesh Jul 27, 2021
acb1929
Merge branch 'Azure:master' into master
selldinesh Aug 4, 2021
7f34351
imports added
selldinesh Aug 5, 2021
74b541b
final changes
selldinesh Aug 5, 2021
c85d375
lag changes
selldinesh Sep 9, 2021
8f86a50
resolving alert
selldinesh Sep 16, 2021
9dbb56d
Adding reboot testcases
selldinesh Sep 17, 2021
5cf3c82
Merge branch 'Azure:master' into master
selldinesh Sep 17, 2021
a552fe3
Merge branch 'master' of https://github.com/selldinesh/sonic-mgmt int…
selldinesh Sep 17, 2021
4f818af
adding BGP convergence cases to tests/snappi folder
selldinesh Sep 17, 2021
6bbfa44
adding Reboot cases to tests/snappi folder
selldinesh Sep 17, 2021
e29f3b4
removing BGP files from test/ixia
selldinesh Sep 17, 2021
7d7af08
removing Reboot files from test/ixia
selldinesh Sep 17, 2021
7612f74
duthost config changes
selldinesh Sep 17, 2021
9345fd6
adding soft reboot case
selldinesh Sep 17, 2021
0df1594
Added try finally block
selldinesh Sep 17, 2021
adac26f
adding wait_group
selldinesh Oct 19, 2021
ce2791e
modified reboot_helper
selldinesh Oct 19, 2021
15f4ad8
resolving syntax err
selldinesh Nov 1, 2021
341aefa
resolving alerts
selldinesh Nov 2, 2021
aab8e15
resolving alert
selldinesh Nov 2, 2021
a53e4c8
deleting reboot scripts
selldinesh Nov 3, 2021
4b820f8
deleting init
selldinesh Nov 3, 2021
2a2e4ee
portname fix
selldinesh Nov 10, 2021
da9a10f
removed pdb
selldinesh Nov 10, 2021
5bda7c3
final changes
selldinesh Nov 12, 2021
970b65e
python exception block added
selldinesh Nov 22, 2021
14b7cf6
bgp config
selldinesh Nov 24, 2021
5de8993
bgp config and port speed change
selldinesh Nov 29, 2021
6b5ceec
resolving alerts
selldinesh Dec 6, 2021
bcf739d
resolving alert
selldinesh Dec 8, 2021
1894960
adding __init__.py
selldinesh Dec 8, 2021
75fa26d
modified rib-in capacity case
selldinesh Dec 9, 2021
78e0c7e
adding delta frames
selldinesh Dec 15, 2021
d86b7eb
added test comment
selldinesh Jan 5, 2022
bb2fe32
removed comment
selldinesh Jan 5, 2022
89ece91
modified for snappi 0.7.12 version
selldinesh Feb 8, 2022
c3ac9de
As-Path additions
selldinesh Feb 11, 2022
326464e
snappi fixture modification
selldinesh Feb 11, 2022
a4ec6a5
changing wait_until arguments
selldinesh Mar 2, 2022
2a47fcf
changing the arguments for wait_until function call
selldinesh Mar 16, 2022
a2c4540
Update bgp_convergence_helper.py
selldinesh Mar 22, 2022
190 changes: 190 additions & 0 deletions docs/testplan/BGP-Convergence-Testplan-for-Benchmark-Performance.md
@@ -0,0 +1,190 @@
# BGP convergence test plan for benchmark performance

- [BGP convergence test plan for benchmark performance](#bgp-convergence-test-plan-for-benchmark-performance)
  - [Overview](#Overview)
    - [Scope](#Scope)
    - [Keysight Testbed](#Keysight-Testbed)
  - [Topology](#Topology)
    - [SONiC switch as ToR](#SONiC-switch-as-ToR)
    - [SONiC switch as Leaf](#SONiC-switch-as-Leaf)
  - [Setup configuration](#Setup-configuration)
  - [Test methodology](#Test-methodology)
  - [Test cases](#Test-cases)
    - [Test case # 1 – BGP Remote Link Failover Convergence (route withdraw)](#test-case--1--convergence-performance-when-remote-link-fails-route-withdraw)
      - [Test objective](#Test-objective)
      - [Test steps](#Test-steps)
      - [Test results](#Test-results)
      - [Test case](#Test-case)
    - [Test case # 2 – BGP RIB-IN Convergence](#Test-case--2--RIB-IN-Convergence)
      - [Test objective](#Test-objective-1)
      - [Test steps](#Test-steps-1)
      - [Test results](#Test-results-1)
      - [Test case](#Test-case-1)
    - [Test case # 3 - BGP Local Link Failover Convergence](#Test-case--3--Failover-convergence-with-local-link-failure)
      - [Test objective](#Test-objective-2)
      - [Test steps](#Test-steps-2)
      - [Test results](#Test-results-2)
      - [Test case](#Test-case-2)
  - [Call for action](#Call-for-action)

## Overview
The purpose of these tests is to measure the overall convergence of a data center network by simulating multiple network devices, such as ToRs and Leafs, with a SONiC switch DUT acting as one of the ToR/Leaf devices, closely resembling a production environment.

### Scope
These tests target a fully functioning SONiC system. Their purpose is to measure convergence when unexpected failures occur, such as a remote link failure, local link failure, node failure, or link fault, and when expected events occur, such as maintenance or upgrade of devices in the SONiC system.

### Keysight Testbed
The tests will run on the following testbeds:
* t0

![Single DUT Topology ](Img/Single_DUT_Topology.png)

## Topology
### SONiC switch as ToR

![SONiC DUT as ToR ](Img/Switch_as_ToR.png)

### SONiC switch as Leaf

![SONiC DUT as Leaf](Img/Switch_acting_as_leaf.png)

## Setup configuration
IPv4 EBGP neighborship will be configured between the SONiC DUT and the directly connected test ports. The test ports in turn will simulate the ToRs and Leafs by advertising IPv4, IPv6, and dual-stack routes.

## Test Methodology
The following test methodologies will be used for measuring convergence.
* A traffic generator will be used to configure EBGP peering between the chassis ports and the SONiC DUT and to advertise IPv4, IPv6, and dual-stack routes.
* The receiving ports will advertise the same VIP (virtual IP) addresses.
* Data traffic will be sent from the server to these VIP addresses.
* Faults will be generated depending on the test case. Local link failures can be simulated on the port by triggering a "simulate link down" event.
* Remote link failures can be simulated by withdrawing the routes.
* Control-plane to data-plane convergence will be measured by recording the precise time of the control plane event and of the data plane event and taking the difference between the two. The traffic generator creates those events and reports the control-plane to data-plane convergence value in its statistics.
* RIB-IN convergence is the time the DUT takes to install the routes in its RIB and then in its FIB so that it forwards traffic without any loss. To measure it, the IPv4/IPv6 routes are initially not advertised. Once traffic is being sent, the routes are advertised and the timestamp is recorded. When the received traffic rate rises above the configured threshold, the traffic generator records the data plane above-threshold timestamp. The difference between these two timestamps is the RIB-IN convergence value.
* Route capacity can be measured by advertising routes in a linear search fashion. This identifies the maximum number of routes a switch can learn and install in its RIB and then in its FIB while forwarding traffic without any loss.
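
The RIB-IN measurement described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the traffic generator's implementation: the function name, the `(timestamp_ms, rx_rate)` sample format, and the numbers in the example are all assumptions made for this sketch.

```python
def rib_in_convergence_ms(advertise_ts_ms, rx_samples, expected_rate,
                          threshold_pct=90.0):
    """Return the time in ms between the route advertisement and the first
    Rx sample at or above the threshold, or None if it is never reached.

    rx_samples is an iterable of (timestamp_ms, rx_rate) tuples in time
    order (a hypothetical format chosen for this sketch).
    """
    threshold_rate = expected_rate * threshold_pct / 100.0
    for ts_ms, rx_rate in rx_samples:
        # Only samples taken after the control plane event count.
        if ts_ms >= advertise_ts_ms and rx_rate >= threshold_rate:
            return ts_ms - advertise_ts_ms
    return None


# Made-up samples: routes advertised at t=100 ms, expected rate 1000 fps.
samples = [(0, 0.0), (100, 0.0), (200, 400.0),
           (300, 850.0), (400, 920.0), (500, 1000.0)]
print(rib_in_convergence_ms(100, samples, expected_rate=1000.0))  # 300
```

With a 90% threshold the first qualifying sample is the one at 400 ms, so the reported convergence is 300 ms.
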

## Test cases
### Test case # 1 – BGP Remote Link Failover Convergence (route withdraw)
#### Test objective
Measure the convergence time when a remote link failure event happens within the network.

<p float="left">
<img src="Img/Single_Link_Failure.png" width="500" hspace="50"/>
<img src="Img/Failover_convergence.png" width="380" />
</p>


#### Test steps
* Configure IPv4 EBGP sessions between Keysight ports and the SONiC switch.
* Advertise IPv4 routes along with the AS number via the configured IPv4 BGP sessions.
* Configure and advertise the same IPv4 routes from both test ports.
* Configure another IPv4 session to send the traffic. This is the server port from which traffic will be sent to the VIP addresses.
* Start all protocols and verify that the IPv4 BGP neighborship is established.
* Create data traffic between the server port and the receiver ports where the same VIP addresses are configured, and enable tracking by "Destination Endpoint" and by "Destination session description".
* Set the desired threshold value for received traffic. By default it will be set to 90% of the expected receive rate.
* Apply and start the data traffic.
* Verify that traffic is equally distributed between the receiving ports without any loss.
* Simulate a remote link failure by withdrawing the routes from one receiving port.
* Verify that the traffic is re-balanced and uses the other available path.
* Drill down by "Destination Endpoint" under traffic statistics to get the control plane to data plane convergence value.
* The convergence value generally falls within a certain range. To obtain representative results, run the test multiple times and average the results.
* Restore the default configuration.
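
The two verification steps above (equal distribution before the failure, re-balancing after it) could be checked along the following lines. This is a sketch under stated assumptions: the function names, the per-port rate dictionary, and the tolerance values are hypothetical and are not taken from the test scripts.

```python
def is_evenly_balanced(rx_rates, tolerance_pct=5.0):
    """True if every port's Rx rate is within tolerance_pct of the mean."""
    if not rx_rates:
        return False
    mean_rate = sum(rx_rates.values()) / float(len(rx_rates))
    if mean_rate == 0:
        return False
    return all(abs(rate - mean_rate) / mean_rate * 100.0 <= tolerance_pct
               for rate in rx_rates.values())


def is_rebalanced(tx_rate, rx_rates, failed_port, tolerance_pct=1.0):
    """True if the failed port receives nothing and the surviving ports
    together receive the full Tx rate (tx_rate must be greater than 0)."""
    surviving = sum(rate for port, rate in rx_rates.items()
                    if port != failed_port)
    lossless = abs(tx_rate - surviving) / float(tx_rate) * 100.0 <= tolerance_pct
    return rx_rates.get(failed_port, 0) == 0 and lossless


# Made-up rates in frames per second for two receiving ports.
before = {'rx_port_1': 500.0, 'rx_port_2': 500.0}
after = {'rx_port_1': 0.0, 'rx_port_2': 1000.0}
print(is_evenly_balanced(before))                 # True
print(is_rebalanced(1000.0, after, 'rx_port_1'))  # True
```
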
#### Test results
| Event | Number Of IPv4 Routes | Convergence (ms) |
| :---: | :-: | :-: |
| Withdraw Routes | 1K | 388 |
| Withdraw Routes | 8K | 2870 |
| Withdraw Routes | 16K | 6188 |

For the above test case, below are the test results when multiple remote links fail.

![Multiple link failure](Img/Multi_link_failure.png)

| Event | Number Of IPv4 Routes | Convergence (ms) |
| :---: | :-: | :-: |
| Withdraw Routes | 1K | 438 |
| Withdraw Routes | 8K | 2800 |
| Withdraw Routes | 16K | 7176 |

#### Test case
* sonic-mgmt/tests/ixia/bgp/test_bgp_remote_link_failover.py
### Test case # 2 – BGP RIB-IN Convergence
#### Test objective
Measure the time the DUT takes to install the routes in its RIB and then in its FIB and start forwarding packets after the routes are advertised.

<p float="left">
<img src="Img/RIB-IN-Convergence_Topology.png" width="500" hspace="50"/>
<img src="Img/RIB-IN_Convergence_graph.png" width="380" />
</p>

#### Test steps
* Configure IPv4 EBGP sessions between Keysight ports and the SONiC switch.
* Configure IPv4 routes via the configured IPv4 BGP sessions. Initially disable the routes so that they are not advertised when the protocols start.
* Configure the same IPv4 routes on both receiving test ports.
* Configure another IPv4 session to send the traffic. This is the server port from which traffic will be sent to the VIP addresses.
* Start all protocols and verify that the IPv4 BGP neighborship is established.
* Create data traffic between the server port and the receiver ports where the same VIP addresses are configured, and enable tracking by "Destination Endpoint" and by "Destination session description".
* Set the desired threshold value for received traffic. By default it will be set to 90% of the expected receive rate.
* Apply and start the data traffic.
* Verify that no traffic is being forwarded.
* Enable/advertise the routes that are already configured.
* The control plane event timestamp is recorded, and once the received traffic rate rises above the configured threshold, the data plane threshold timestamp is recorded.
* The difference between these two event timestamps is the RIB-IN convergence time.
* The convergence value generally falls within a certain range. To obtain representative results, run the test multiple times and average the results.
* Restore the default configuration.
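
Since the steps ask for several runs to be averaged, a small summary helper can make the spread between runs visible alongside the mean. The helper name and the three sample values below are made up for illustration; they are not measurements from this test plan.

```python
def summarize_runs(convergence_ms):
    """Summarize per-iteration convergence values (in ms)."""
    if not convergence_ms:
        raise ValueError("at least one iteration is required")
    count = len(convergence_ms)
    return {
        'runs': count,
        'mean_ms': sum(convergence_ms) / float(count),
        'min_ms': min(convergence_ms),
        'max_ms': max(convergence_ms),
    }


# Hypothetical results of three iterations of the same test.
summary = summarize_runs([480, 500, 499])
print(summary['mean_ms'])  # 493.0
```

Reporting the minimum and maximum next to the mean shows whether a single outlier iteration is skewing the average.
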
#### Test results
| Event | Number Of IPv4 Routes | Convergence (ms) |
| :---: | :-: | :-: |
| Advertise Routes | 1K | 493 |
| Advertise Routes | 8K | 2953 |
| Advertise Routes | 64K | 28301 |
| Advertise Routes | 98K | 43109 |
| Advertise Routes | 196K | 90615 |

To measure the RIB-IN capacity of the switch, we can follow the same test methodology as for the RIB-IN convergence test. Below are the results of the RIB-IN capacity test.

| Event | Number Of IPv4 Routes | Convergence (ms) | Loss % |
| :---: | :-: | :-: | :-: |
| Advertise Routes | 256K | - | 25 |
| Advertise Routes | 198K | - | 0.50 |
| Advertise Routes | 196K | 85079 | 0 |
| Advertise Routes | 195K | 84487 | 0 |
| Advertise Routes | 194K | 83285 | 0 |
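
The linear search used for the capacity test can be sketched as a loop that steps the route count down until a run completes without loss. This is an assumption-laden illustration: `find_route_capacity`, its parameters, and the `fake_dut_loss_pct` stand-in (which pretends the switch holds exactly 196000 routes) are hypothetical; a real run would advertise the routes and read the loss from traffic statistics.

```python
def find_route_capacity(measure_loss_pct, start_routes, step, min_routes=0):
    """Step the route count down from start_routes until measure_loss_pct
    reports 0% loss. Returns that route count, or None if none is found."""
    routes = start_routes
    while routes > min_routes:
        if measure_loss_pct(routes) == 0:
            return routes
        routes -= step
    return None


def fake_dut_loss_pct(routes, capacity=196000):
    """Stand-in for a real test run: loss grows once capacity is exceeded."""
    if routes <= capacity:
        return 0
    return 100.0 * (routes - capacity) / routes


print(find_route_capacity(fake_dut_loss_pct, start_routes=200000, step=1000))
# 196000
```

A smaller step gives a finer capacity estimate at the cost of more test iterations.
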

#### Test case
* sonic-mgmt/tests/ixia/bgp/test_bgp_rib_in_convergence.py
* sonic-mgmt/tests/ixia/bgp/test_bgp_rib_in_capacity.py
### Test case # 3 - BGP Local Link Failover Convergence
#### Test objective
Measure the convergence time when a local link failure event happens within the network.
<p float="left">
<img src="Img/Local_Link_Failure.png" width="500" hspace="50"/>
</p>

#### Test steps
* Configure IPv4 EBGP sessions between Keysight ports and the SONiC switch.
* Advertise IPv4 routes along with the AS number via the configured IPv4 BGP sessions.
* Configure and advertise the same IPv4 routes from both test ports.
* Configure another IPv4 session to send the traffic. This is the server port from which traffic will be sent to the VIP addresses.
* Start all protocols and verify that the IPv4 BGP neighborship is established.
* Create data traffic between the server port and the receiver ports where the same VIP addresses are configured, and enable tracking by "Destination Endpoint" and by "Destination session description".
* Set the desired threshold value for received traffic. By default it will be set to 95% of the expected receive rate.
* Apply and start the data traffic.
* Verify that traffic is equally distributed between the receiving ports without any loss.
* Simulate a local link failure by bringing the port down on the test tool.
* Verify that the traffic is re-balanced and uses the other available path.
* Compute the failover convergence using the formula below:
  * Data Convergence Time (seconds) = (Tx Frames - Rx Frames) / Tx Frame Rate
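
As a worked example of the formula above, the sketch below converts the frame delta into milliseconds, the unit used in the results table. The function name and the frame counts are illustrative assumptions, not values from a test run.

```python
def data_convergence_time_ms(tx_frames, rx_frames, tx_frame_rate):
    """(Tx Frames - Rx Frames) / Tx Frame Rate, expressed in milliseconds.

    tx_frame_rate is in frames per second.
    """
    if tx_frame_rate <= 0:
        raise ValueError("tx_frame_rate must be positive")
    return (tx_frames - rx_frames) * 1000.0 / tx_frame_rate


# Made-up counters: 1400 frames lost while sending 100000 frames per second.
print(data_convergence_time_ms(1000000, 998600, 100000))  # 14.0
```

The lost frames divided by the send rate give the time during which traffic was not being delivered, here 0.014 s or 14 ms.
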

#### Test results
The table below shows the results of 3-way ECMP for 4 link flap iterations.

| Event Name | No. of IPv4 Routes | Iterations | Avg Calculated Data Convergence Time(ms) |
| :---: | :-: | :-: | :-: |
| Test_Port_2 Link Failure | 1000 | 4 | 14.112 |
| Test_Port_3 Link Failure | 1000 | 4 | 14.336 |
| Test_Port_4 Link Failure | 1000 | 4 | 14.219 |

#### Test case
* sonic-mgmt/tests/ixia/bgp/test_bgp_local_link_failover.py
## Call for action
* Solicit experience in multi-DUT system test scenarios.

Binary file added docs/testplan/Img/Local_Link_Failure.png
25 changes: 24 additions & 1 deletion tests/common/snappi/common_helpers.py
@@ -14,7 +14,8 @@
import ipaddr
from netaddr import IPNetwork
from tests.common.mellanox_data import is_mellanox_device as isMellanoxDevice

from ipaddress import IPv6Network, IPv6Address
from random import getrandbits

def increment_ip_address(ip, incr=1):
"""
@@ -654,3 +655,25 @@ def enable_packet_aging(duthost):
    duthost.command("docker cp /tmp/packets_aging.py syncd:/")
    duthost.command("docker exec syncd python /packets_aging.py enable")
    duthost.command("docker exec syncd rm -rf /packets_aging.py")

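def get_ipv6_addrs_in_subnet(subnet, number_of_ip):
    """
    Get N random IPv6 addresses in a subnet.
    Args:
        subnet (str): IPv6 subnet, e.g., '2001::1/64'
        number_of_ip (int): Number of IP addresses to get
    Return:
        List of N IPv6 address strings in this subnet. Addresses are drawn
        at random, so duplicates are possible.
    """

    # Normalize e.g. '2001::1/64' to its network form '2001::/64'.
    subnet = str(IPNetwork(subnet).network) + "/" + str(subnet.split("/")[1])
    # ipaddress needs text input: decode byte strings (the Python 2 str type).
    if isinstance(subnet, bytes):
        subnet = subnet.decode("utf-8")
    # Build the network once instead of on every loop iteration.
    network = IPv6Network(subnet)
    host_bits = network.max_prefixlen - network.prefixlen
    ipv6_list = []
    for _ in range(number_of_ip):
        address = IPv6Address(network.network_address + getrandbits(host_bits))
        ipv6_list.append(str(address))

    return ipv6_list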
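
Review note: because the host bits are drawn at random, the helper above can in principle return the same address twice, the network address itself, or the address already configured on the DUT. A possible variant that rules these out is sketched below. It is not part of this PR: the function name and the `exclude` parameter are hypothetical, and the sketch assumes Python 3 text strings.

```python
from ipaddress import IPv6Address, IPv6Network
from random import getrandbits


def get_unique_ipv6_addrs_in_subnet(subnet, number_of_ip, exclude=()):
    """Return number_of_ip distinct addresses in subnet, skipping the
    network address and every address listed in exclude."""
    network = IPv6Network(subnet, strict=False)
    base = int(network.network_address)
    host_bits = network.max_prefixlen - network.prefixlen
    size = 1 << host_bits

    # Offset 0 is the network address; excluded addresses are blocked too.
    blocked = {0}
    for addr in exclude:
        offset = int(IPv6Address(addr)) - base
        if 0 <= offset < size:
            blocked.add(offset)

    if number_of_ip > size - len(blocked):
        raise ValueError("subnet {} cannot supply {} addresses".format(
            network, number_of_ip))

    offsets = set()
    while len(offsets) < number_of_ip:
        offset = getrandbits(host_bits)
        if offset not in blocked:
            offsets.add(offset)
    return [str(network.network_address + offset) for offset in sorted(offsets)]


# The DUT side already uses 2001::1, so keep the tester off that address.
print(get_unique_ipv6_addrs_in_subnet('2001::1/64', 2, exclude=['2001::1']))
```
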
99 changes: 96 additions & 3 deletions tests/common/snappi/snappi_fixtures.py
@@ -3,16 +3,16 @@
"""
import pytest
import snappi
import snappi_convergence
from ipaddress import ip_address, IPv4Address
from tests.common.fixtures.conn_graph_facts import conn_graph_facts,\
fanout_graph_facts
from tests.common.snappi.common_helpers import get_addrs_in_subnet,\
    get_ipv6_addrs_in_subnet, get_peer_snappi_chassis
from tests.common.snappi.snappi_helpers import SnappiFanoutManager, get_snappi_port_location
from tests.common.snappi.port import SnappiPortConfig, SnappiPortType
from tests.common.helpers.assertions import pytest_assert


@pytest.fixture(scope="module")
def snappi_api_serv_ip(tbinfo):
"""
@@ -410,4 +410,97 @@ def snappi_testbed_config(conn_graph_facts,
        snappi_ports=snappi_ports)
    pytest_assert(config_result is True, 'Fail to configure L3 interfaces')

    return config, port_config_list

@pytest.fixture(scope="module")
def tgen_ports(duthost,
               conn_graph_facts,
               fanout_graph_facts):

    """
    Populate tgen ports info of T0 testbed and return it as a list
    Args:
        duthost (pytest fixture): duthost fixture
        conn_graph_facts (pytest fixture): connection graph
        fanout_graph_facts (pytest fixture): fanout graph
    Return:
        [{'card_id': '1',
          'ip': '22.1.1.2',
          'ipv6': '3001::2',
          'ipv6_prefix': u'64',
          'location': '10.36.78.238;1;2',
          'peer_device': 'sonic-s6100-dut',
          'peer_ip': u'22.1.1.1',
          'peer_ipv6': u'3001::1',
          'peer_port': 'Ethernet8',
          'port_id': '2',
          'prefix': u'24',
          'speed': 'speed_400_gbps'},
         {'card_id': '1',
          'ip': '21.1.1.2',
          'ipv6': '2001::2',
          'ipv6_prefix': u'64',
          'location': '10.36.78.238;1;1',
          'peer_device': 'sonic-s6100-dut',
          'peer_ip': u'21.1.1.1',
          'peer_ipv6': u'2001::1',
          'peer_port': 'Ethernet0',
          'port_id': '1',
          'prefix': u'24',
          'speed': 'speed_400_gbps'}]
    """

    speed_type = {'50000': 'speed_50_gbps',
                  '100000': 'speed_100_gbps',
                  '200000': 'speed_200_gbps',
                  '400000': 'speed_400_gbps'}

    snappi_fanout = get_peer_snappi_chassis(conn_data=conn_graph_facts,
                                            dut_hostname=duthost.hostname)
    snappi_fanout_id = list(fanout_graph_facts.keys()).index(snappi_fanout)
    snappi_fanout_list = SnappiFanoutManager(fanout_graph_facts)
    snappi_fanout_list.get_fanout_device_details(device_number=snappi_fanout_id)
    snappi_ports = snappi_fanout_list.get_ports(peer_device=duthost.hostname)
    port_speed = None

    for i in range(len(snappi_ports)):
        if port_speed is None:
            port_speed = int(snappi_ports[i]['speed'])

        elif port_speed != int(snappi_ports[i]['speed']):
            # All the ports should have the same bandwidth
            return None

    config_facts = duthost.config_facts(host=duthost.hostname,
                                        source="running")['ansible_facts']
    for port in snappi_ports:
        port['location'] = get_snappi_port_location(port)
        port['speed'] = speed_type[port['speed']]
    try:
        for port in snappi_ports:
            peer_port = port['peer_port']
            int_addrs = config_facts['INTERFACE'][peer_port].keys()
            # next(..., None) lets the "not configured" checks below fire;
            # indexing an empty list with [0] would raise IndexError first.
            ipv4_subnet = next((ele for ele in int_addrs if "." in ele), None)
            if not ipv4_subnet:
                raise Exception("IPv4 is not configured on the interface {}".format(peer_port))
            port['peer_ip'], port['prefix'] = ipv4_subnet.split("/")
            port['ip'] = get_addrs_in_subnet(ipv4_subnet, 1)[0]
            ipv6_subnet = next((ele for ele in int_addrs if ":" in ele), None)
            if not ipv6_subnet:
                raise Exception("IPv6 is not configured on the interface {}".format(peer_port))
            port['peer_ipv6'], port['ipv6_prefix'] = ipv6_subnet.split("/")
            port['ipv6'] = get_ipv6_addrs_in_subnet(ipv6_subnet, 1)[0]
    except KeyError:
        # The interface has no entry under INTERFACE in the running config.
        raise Exception('Configure IPv4 and IPv6 on DUT interfaces')

    return snappi_ports
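
Review note: when the port speeds differ, the fixture returns None, which a test consuming `tgen_ports` would only notice later as an unrelated-looking failure. One option is to fail early with an explicit message, as in the sketch below. The helper name and the `SPEED_TYPE` constant are hypothetical; the mapping values are copied from the fixture above.

```python
# DUT port speed in Mbps (as a string) mapped to the snappi speed name.
SPEED_TYPE = {'50000': 'speed_50_gbps',
              '100000': 'speed_100_gbps',
              '200000': 'speed_200_gbps',
              '400000': 'speed_400_gbps'}


def uniform_snappi_speed(ports):
    """Return the snappi speed name shared by all ports.

    Raises ValueError if the list is empty, the ports have different
    speeds, or the speed has no snappi mapping.
    """
    speeds = {str(port['speed']) for port in ports}
    if len(speeds) != 1:
        raise ValueError("expected one common port speed, got: {}".format(
            sorted(speeds)))
    speed = speeds.pop()
    if speed not in SPEED_TYPE:
        raise ValueError("unsupported port speed: {}".format(speed))
    return SPEED_TYPE[speed]


ports = [{'peer_port': 'Ethernet0', 'speed': '400000'},
         {'peer_port': 'Ethernet8', 'speed': '400000'}]
print(uniform_snappi_speed(ports))  # speed_400_gbps
```
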


@pytest.fixture(scope='module')
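def cvg_api(snappi_api_serv_ip,
            snappi_api_serv_port):
    """Yield a snappi_convergence API handle and remove its session on teardown."""
    api = snappi_convergence.api(location=snappi_api_serv_ip + ':' + str(snappi_api_serv_port),
                                 ext='ixnetwork')
    yield api
    # Teardown: remove the session if the API object created one.
    if getattr(api, 'assistant', None) is not None:
        api.assistant.Session.remove()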

Empty file added tests/snappi/bgp/__init__.py