
Work hero doc #8

Merged
merged 55 commits into from
Oct 7, 2024
Conversation

mgheorghe
Owner

No description provided.

https://www.keysight.com/us/en/product/944-1188/uhd400t.html
https://www.keysight.com/us/en/products/network-test/network-test-hardware/xgs12-chassis-platform.html

The amount of hardware needed varies with device performance. The current DASH requirement specifies 24M CPS as the minimum, but each vendor wants to showcase how much more they can do; based on that target, plus 10%-20% headroom, we can calculate the amount of hardware needed.
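The sizing calculation can be sketched as follows. This is a hypothetical illustration: the per-generator CPS figure is an assumption, not a real spec; take the actual number from the vendor data sheet.

```python
# Hypothetical sizing sketch -- CPS_PER_GENERATOR is an assumed figure for
# illustration; use the real capability from the vendor data sheet.
import math

TARGET_CPS = 24_000_000          # minimum CPS from the current DASH requirement
HEADROOM = 0.20                  # 10%-20% headroom; 20% assumed here
CPS_PER_GENERATOR = 10_000_000   # assumed capability of one traffic generator

required_cps = round(TARGET_CPS * (1 + HEADROOM))
generators = math.ceil(required_cps / CPS_PER_GENERATOR)

print(f"target with headroom: {required_cps:,} CPS")   # 28,800,000 CPS
print(f"traffic generators needed: {generators}")      # 3
```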
@chrispsommers chrispsommers Aug 23, 2024


Do you provide the BOM for what is needed? Why not spell it out here along with the calculations?


#### Test tools packet generator (Keysight version)

One solution for testing the smart switch is to use Keysight (Ixia) packet generators: for TCP traffic we use CloudStorm & IxLoad, for UDP traffic we use Novus & IxNetwork, and it is all mixed together by the UHD Connect.


Add a sentence explaining what UHD Connect is. The UHD400T data sheet is for a packet blaster. Do we have a data sheet for UHD-C?

- DASH device port speeds are 100G, 200G, or 400G, PAM4 or NRZ; UHD400C device port speeds are also 100G, 200G, or 400G, PAM4 or NRZ, so the two should interface with no issues.
- IEEE default autoneg is preferable, but at a minimum, if AN is disabled, please ensure FEC is enabled. With FEC disabled we observed a few packet drops on the DACs, and that can create a lot of hassle hunting down a lost packet that has nothing to do with DASH performance.

#### Testbed examples


It would be nice to have some bullets summarizing the key features of each testbed. How does someone determine which one is appropriate for their needs?

Owner Author


added some bullets but it may need more work


##### validate the hardware and software.

It ensures we can program the DPU via private API, SAI, or DASH, and that we can pass 1 packet end to end from the traffic generator through the device under test and back.


It might be helpful to expand a bit on these various APIs and under what conditions we'd use one or the other.

Owner Author


I wiped the sentence and added a whole new text/paragraph about loading the DASH configuration.


##### can also provide best case scenario performance numbers

It's a maybe because 1 packet replicated millions of times may not necessarily work best for all hardware implementations.


This is very ambiguous. Do you mean, we could take the 1P test case but somehow send more packets to simulate worst-case?

Owner Author


rephrased "Not always, but occasionally, this test also yields the best case scenario values because the best case scenario is frequently reached at the lowest scale."


### In between scale

If the Hero test scale numbers cannot be met, we can add another checkpoint to gather additional data before the final implementation is ready.


Do you have such a configuration?

Owner Author


no such configuration can be shared; it is just an intermediary phase in the development cycle, explained better

Before the final solution is finished, we can add another checkpoint to collect further data if the Hero test scale numbers are not fulfilled.
This checkpoint will use custom scale values agreed upfront by all parties and constitutes an intermediate point in DASH development.
It usually becomes irrelevant as soon as the Hero test scale is achieved.


### Best case scenario

If any of the previous tests have not shown the best case scenario, we can run a test with the best case scenario in mind.


What does this mean?

Owner Author


See the results section; it explains the best case scenario.
What I wanted to say here is that the 1ip test, baby hero test, or hero test may not show the best case scenario, and in that case we can add one more datapoint showcasing the best case scenario.

i rephrased the sentence


### Worst case scenario

If any of the previous tests have not shown the worst case scenario, we add this test as well (without exceeding hero test scale).


I don't know what this means.

Owner Author


rephrased "If we can find a scenario where we obtain lower performance numbers than the numbers previously obtained during the earlier tests (1ip, baby hero, hero test ...), this will be added as a new data point to the results."


The latency value is most accurate when we have the highest PPS, the smallest packet, and zero packet loss; it is measured using IxNetwork and a Novus card.

The aim for the DPU is 2us; for the smart switch we have to consider that the packet also travels twice through the NPU.


Latency through a switch is very dependent upon the queuing and congestion. Are you trying to find minimum latency through the switch? Do you have a way to measure just the switch latency w/o DPUs?

Owner Author


Yes, we can find the NPU/switch latency; it is considered a known variable. Since the NPU is usually a 32x400G ASIC and we use only 8x400G for the test, the NPU is usually not a bottleneck or a point of issues.

rephrased

"When testing the smart switch we have to run a test to get the switch latency without running traffic through the DPU, and then get the total system latency, with the understanding that each packet travels once through the NPU to reach the DPU, then it travels through the DPU, and it will travel through the NPU once more after it leaves the DPU.

smart switch latency = 2 x NPU latency + DPU latency"
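The formula above can be rearranged to extract the DPU contribution once the NPU-only latency is measured. A minimal sketch, where the microsecond values are illustrative assumptions rather than real measurements:

```python
def dpu_latency_us(system_latency_us: float, npu_latency_us: float) -> float:
    """Solve smart switch latency = 2 x NPU latency + DPU latency for the DPU."""
    return system_latency_us - 2 * npu_latency_us

# Assumed measurements: 3.2us end to end through the smart switch,
# 0.6us through the NPU alone (traffic not steered through the DPU).
print(round(dpu_latency_us(3.2, 0.6), 3))  # 2.0 -- meets the 2us DPU aim
```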


The clock PPM may need to be adjusted between the test gear and the device under test to hit that perfect 100G, 200G, or 400G number.

Consider looking at the UHD400C stats: the IxNetwork/IxLoad stats will show less because the VXLAN header is added later by the UHD, and we are interested in the packet size as it enters the DPU multiplied by PPS to get the throughput.
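As a sanity check on those stats, the DPU-side throughput can be computed from the inner frame size plus the VXLAN encapsulation overhead. A sketch assuming a 50-byte overhead (outer Ethernet + IPv4 + UDP + VXLAN); the frame size and rate are illustrative:

```python
VXLAN_OVERHEAD_BYTES = 50  # assumed: outer Ethernet 14 + IPv4 20 + UDP 8 + VXLAN 8

def dpu_throughput_gbps(inner_frame_bytes: int, pps: float) -> float:
    """Throughput as the packet enters the DPU: (frame + VXLAN overhead) x PPS."""
    return (inner_frame_bytes + VXLAN_OVERHEAD_BYTES) * 8 * pps / 1e9

# 64B frames generated at 100 Mpps: the generator-side stats count 51.2 Gbps,
# but the DPU sees the encapsulated packets:
print(dpu_throughput_gbps(64, 100e6))  # 91.2
```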


This is a lot to unpack, perhaps explain a little better.


For TCP we use IxLoad, since it has a full TCP stack that is very configurable and can simulate a lot of different scenarios.

While the hero test calls for 6 TCP packets (SYN/SYNACK/ACK/FIN/FINACK/ACK), we use HTTP as the application that runs over TCP, and on the wire we end up with 7 packets for every connection.


Maybe mention we do a 1-byte GET?


The PPS used for the CPS test can be seen in the L23 stats in IxLoad.
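As a rough cross-check of those stats, the expected PPS follows directly from the 7 packets per connection mentioned earlier:

```python
PACKETS_PER_CONNECTION = 7  # TCP handshake/teardown plus the HTTP exchange

def expected_pps(cps: float) -> float:
    """Wire PPS to expect for a given connections-per-second rate."""
    return cps * PACKETS_PER_CONNECTION

print(f"{expected_pps(24e6):,.0f} PPS at the 24M CPS minimum")  # 168,000,000 PPS ...
```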

Keep an eye on TCP failures on both client and server. A retransmit is bad: it signals a packet drop that was detected, and the TCP stack had to retransmit. A connection drop is even worse: it means that even after 3-5 retries the packet did not make it.
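That triage rule can be sketched as a simple check. This is a hypothetical helper; the counter names are assumptions, to be fed from whatever the IxLoad client/server stats actually report:

```python
# Hypothetical triage of TCP stats per the advice above; the two counters are
# assumed to come from the IxLoad client/server statistics views.
def triage_tcp_stats(retransmits: int, connection_failures: int) -> str:
    if connection_failures > 0:
        return "FAIL: connections dropped -- packets lost even after retries"
    if retransmits > 0:
        return "WARN: retransmits seen -- packet drops detected by the TCP stack"
    return "OK: no retransmits, no connection failures"

print(triage_tcp_stats(0, 0))   # OK: ...
print(triage_tcp_stats(12, 0))  # WARN: ...
```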


Maybe put these kinds of tips and techniques in a quote > to stand out.

@chrispsommers

My overall impression is that there is a lot of expertise, experience and practical advice and rationale in here, kind of jotted down quickly to get the big picture w/o worrying too much about making it clear and readable. Besides just spelling and grammar, I think it needs to be more readable in general and explain some more along the way. I think it could be a very valuable document.


@chrispsommers chrispsommers left a comment


Other than minor spelling/grammar, LGTM.

@mgheorghe mgheorghe merged commit 79ae65e into pr-hero-doc Oct 7, 2024
3 checks passed