Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

topotests: fix snmptrap log OID parsing #15234

Merged
merged 2 commits into from
Feb 13, 2024

Conversation

mwinter-osr
Copy link
Member

Fixes failures on the SNMP parsing in the traps on bgp_snmp_bgp4v2mib topotest. The original code skipped every 2nd request which will cause intermittent failures based on timing and how many SNMP traps are produced.

@frrbot frrbot bot added bugfix tests Topotests, make check, etc labels Jan 25, 2024
@@ -308,7 +308,7 @@ def __get_notif_bgp4v2_in_trap_file(router):
r2.vtysh_cmd("conf\nbgp snmp traps bgp4-mibv2")
r2.vtysh_cmd("conf\nno bgp snmp traps rfc4273")
rr.vtysh_cmd("clear bgp *")
sleep(30)
sleep(5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not a run_and_expect a check to see peering is back after the clear?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not a run_and_expect a check to see peering is back after the clear?

Actually there is a run and expect afterwards. I've only reduced the time to speed things up. But without the sleep the first run_and_expect will fail anyway. Just a short delay seems to be more optimal before starting the run_and_expect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAK -> This is just moving things around and praying that the sleep of 5 seconds is enough for reconvergence, especially on loaded systems.. The next run_and_expect is running snmp commands, not whether or not bgp has converged. That was what the original sleep was for an attempt to give the tests enough time to let convergence happen. WE HAVE FIXED too many cases where the tests make assumptions about state and make it incredibly hard to figure out what has gone wrong. WHY are we perpetuating it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, the test based on the peer's feedback is not sufficient. Indeed, snmpd adds its processing and transmission times

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, removed it. No idea why the original author added the sleep before the run_and_expect. Just wanted to lower the delay a bit. Hope this still works, but the initial check WILL go wrong, so now we have a sleep 10 instead of the sleep 5 (because that's the delay specified in the run_and_expect here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force pushed new changes to address the wait intervals. Should hopefully address voiced concerns by involved parties so we can get this merged.

@github-actions github-actions bot added the rebase PR needs rebase label Feb 6, 2024
Replace OID string parsing of snmptrap log files based on odd/even line
numbers with regex string search to prevent test failures in cases where
log entries don't match assumed order.

Signed-off-by: David Schweizer <[email protected]>
Remove sleep time in test_bgp_snmp_bgp4v2mib before run_and_expect and
reduce wait intervals for faster test convergence.

Signed-off-by: Martin Winter <[email protected]>
Signed-off-by: David Schweizer <[email protected]>
@davischw davischw force-pushed the snmp-topotest-fixes branch from 37642cf to 1bd9636 Compare February 7, 2024 12:57
@davischw davischw self-assigned this Feb 7, 2024
@davischw
Copy link
Contributor

davischw commented Feb 8, 2024

ci:rerun Unrelated tests failing

@davischw davischw removed their assignment Feb 9, 2024
@donaldsharp donaldsharp merged commit 9fd7bf4 into FRRouting:master Feb 13, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix master rebase PR needs rebase size/S tests Topotests, make check, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants