
vtysh, lib: preprocess CLI graphs #16427

Closed

Conversation


@eqvinox eqvinox commented Jul 21, 2024

Store a parsed and built graph of the CLI nodes in vtysh, rather than parsing and building that graph every time vtysh starts up.

This provides a 3x to 5x reduction in vtysh startup overhead:

`vtysh -c 'configure' -c 'interface lo' -c 'do show version'`

  • before: 92.9M cycles, 1114 samples
  • after: 16.5M cycles, 330 samples

This improvement is particularly visible for users scripting vtysh -c calls, which notably includes topotests.

eqvinox added 2 commits July 21, 2024 09:46
Not used anywhere in FRR, kill it.

Signed-off-by: David Lamparter <[email protected]>
Use alloced=0 to indicate that the array used in a vector is not in fact
dynamically allocated memory (yet).

Signed-off-by: David Lamparter <[email protected]>

eqvinox commented Jul 21, 2024

I should make note of two things here:

  • the speedup is even larger if you build vtysh with -no-pie. This is because the CLI graph serialized into vtysh contains a huge number of pointers, which turn into ELF relocations (300k of them, to be exact) that the dynamic linker has to adjust at startup. That relocation work clearly shows up in perf data, but it's still much faster than parsing and creating the graphs dynamically at runtime.

  • the few remaining DEFUN statements in vtysh (i.e. custom handlers) are still visible as significant overhead in perf data. Integrating those into the preprocessing as well would yield another bit of gain, but it's much harder to do, and this is where the cutoff is for now.

eqvinox added 11 commits July 21, 2024 12:20
Allow using "+1" when meaningful (i.e. cmd_graph_merge wants -1 or +1)

Signed-off-by: David Lamparter <[email protected]>
The number of nodes in a graph will change as soon as cmd_graph_merge is
supported as an operation, therefore size this dynamically.

Signed-off-by: David Lamparter <[email protected]>
When merging graphs, it makes sense to allow starting with an empty one.

Signed-off-by: David Lamparter <[email protected]>
Export cmd_graph_merge() to python code via graph1.merge(graph2).

Signed-off-by: David Lamparter <[email protected]>
Make it a little easier to work on python code using this wrapper.

Signed-off-by: David Lamparter <[email protected]>
Add len(graph) and graph[i] wrappers to access arbitrary nodes in a
graph.

Signed-off-by: David Lamparter <[email protected]>
There's a wrapper for nodes' outgoing pointers, but not incoming yet.

Signed-off-by: David Lamparter <[email protected]>
FORK_TKN's join node is already exposed, mirror to expose JOIN_TKN's
fork node.

(contains minor cleanup to make checkpatch.pl shut up)

Signed-off-by: David Lamparter <[email protected]>
Expose all of the struct members of cmd_token, and retrieve them
dynamically rather than copying them around.  The problem with copying
them is that they can change as a result of merge(), and if there is an
existing wrapper object around it will not have its copy updated to
match.

Signed-off-by: David Lamparter <[email protected]>
The command graph has its tail end nodes pointing at the
`struct cmd_element` rather than a `struct cmd_token`.  This is a bit
weird to begin with, but becomes very annoying for the python bindings
where there is just no `struct cmd_element`.

Create a `CMD_ELEMENT_TKN` type for `cmd_token` instead, and replace the
tail end token in the python bindings with an instance of that.

Signed-off-by: David Lamparter <[email protected]>
There is entirely no point in these being conditional. Also, pull them up so the upcoming pre-parse code can work on a clean slate.

Signed-off-by: David Lamparter <[email protected]>
@eqvinox eqvinox force-pushed the vtysh-precompile-graph branch from 9df460c to 3306a7f Compare July 21, 2024 19:20

eqvinox commented Jul 21, 2024

```
Report for command.h | 2 issues
===============================================
< ERROR: Macros with complex values should be enclosed in parentheses
< #362: FILE: /tmp/f1-4095093/command.h:362:

Report for vtysh.c | 2 issues
===============================================
< WARNING: externs should be avoided in .c files
< #1666: FILE: /tmp/f1-4095093/vtysh.c:1666:
```

These are junk warnings and will need to be ignored, sigh.

@eqvinox eqvinox force-pushed the vtysh-precompile-graph branch 3 times, most recently from 963da3d to 890bff9 Compare July 21, 2024 21:07
@eqvinox eqvinox force-pushed the vtysh-precompile-graph branch from 890bff9 to d0de866 Compare July 21, 2024 21:57

eqvinox commented Jul 22, 2024

Looks like the remaining test failures are flaky timing. Faster vtysh will certainly highlight some problems with poorly written tests 😞

@donaldsharp (Member)

ci:rerun

@riw777 riw777 self-requested a review July 30, 2024 15:35
@donaldsharp donaldsharp mentioned this pull request Jul 31, 2024
@donaldsharp (Member)

I had this pushed in under #16501; closing this one.

@donaldsharp donaldsharp closed this Aug 2, 2024