
Initial plugin #1

Merged 37 commits from initial-plugin into main on Nov 11, 2024

Conversation

dustinblack (Member) commented Oct 30, 2024

Changes introduced with this PR

Adding a new plugin for RTLA timerlat. Please review all the code, not just the changes.


By contributing to this repository, I agree to the contribution guidelines.

webbnh

This comment was marked as resolved.

@dustinblack dustinblack marked this pull request as ready for review October 31, 2024 12:50
@dustinblack dustinblack marked this pull request as draft October 31, 2024 12:51
@dustinblack dustinblack marked this pull request as ready for review November 5, 2024 12:57
@dustinblack dustinblack requested review from webbnh and a team November 5, 2024 12:57
@dustinblack dustinblack requested a review from dbutenhof November 5, 2024 17:52
Comment on lines 32 to 33
exit = Event()
finished_early = False


Given my recent experiences with plugins in odd scenarios, I think we should move these from class variables to instance variables. To do that, just move them into a constructor.
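
For example, a minimal sketch of that change (assuming the variables live on the step class, StartTimerlatStep, as the traceback later in this thread suggests):

from threading import Event

class StartTimerlatStep:
    def __init__(self):
        # Instance variables: each instance gets its own Event and flag,
        # rather than sharing one copy across all instances of the class.
        self.exit = Event()
        self.finished_early = False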


That's interesting. Definitely matches my preference ... use class variables only for things that must be shared across instances, which presumably these would not be. But, on the other hand, is it realistic for a single container to have multiple simultaneous instances of the plugin class?

        latency_hist.append(row_obj)
    else:
        stats_per_col.append(row_obj)
    if re.match(r"^ALL", line) and not found_all:


Reordering this to put not found_all first could make it more efficient by skipping the pattern matching after it's found.
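
For example, a sketch of the reordered check (Python's and short-circuits, so the regex is skipped once the flag is set; the body shown follows the surrounding code):

if not found_all and re.match(r"^ALL", line):
    found_all = True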

stats_per_col = []
found_all = False

for line in output.splitlines():


This section could use some comments. I have little idea of what it's doing, since I'm not familiar with the output format.

Comment on lines 131 to 145
if params.user_threads:
    return "success", TimerlatOutput(
        latency_hist,
        stats_per_col,
        latency_stats_schema.unserialize(total_irq_latency),
        latency_stats_schema.unserialize(total_thr_latency),
        latency_stats_schema.unserialize(total_usr_latency),
    )

return "success", TimerlatOutput(
    latency_hist,
    stats_per_col,
    latency_stats_schema.unserialize(total_irq_latency),
    latency_stats_schema.unserialize(total_thr_latency),
)


It may make sense to combine these into the same output and just use a ternary (conditional) expression on the value for total_usr_latency.
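
A sketch of the combined return using a conditional expression (assuming the schema accepts None for the user-thread statistics when user_threads is off):

return "success", TimerlatOutput(
    latency_hist,
    stats_per_col,
    latency_stats_schema.unserialize(total_irq_latency),
    latency_stats_schema.unserialize(total_thr_latency),
    # Only unserialize the user-thread stats when they were requested.
    latency_stats_schema.unserialize(total_usr_latency)
    if params.user_threads
    else None,
)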

Comment on lines +94 to +108
min: typing.Annotated[
    int,
    schema.name("minimum latency"),
    schema.description("Minimum latency value"),
] = None
avg: typing.Annotated[
    int,
    schema.name("average latency"),
    schema.description("Average latency value"),
] = None
max: typing.Annotated[
    int,
    schema.name("maximum latency"),
    schema.description("Maximum latency value"),
] = None


Should this mention the units?

dustinblack (Member, Author)


This made me realize that the inputs allow us to change the unit, so I added that as a separate object in the output. Since this can be either ms or us based on the input, I can't state the unit directly in the description except maybe as an either/or.
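
A sketch of such a field (name and wording hypothetical), following the same Annotated pattern as the other outputs:

time_unit: typing.Annotated[
    str,
    schema.name("time unit"),
    schema.description("Unit for the latency values (us or ms), per the input parameters"),
]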

@dataclass
class TimerlatOutput:
    latency_hist: typing.Annotated[
        typing.List[typing.Any],


This uses an Any type. What type is it outputting?

dustinblack (Member, Author)


The output is a list of key:value pairs where the values are always integers, but the number of pairs and the names of the keys are determined by the input parameters and/or the particular CPU architecture of the system. I'm not sure how to define an output object that can account for this.

dustinblack (Member, Author)


I've tried this and variations, but I just get exceptions:

typing.List[typing.Dict[str, int]],

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6794, in _resolve_list_annotation
    cls._resolve_abstract_type(
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6281, in _resolve_abstract_type
    result = cls._resolve(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6342, in _resolve
    return cls._resolve_dict_annotation(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6875, in _resolve_dict_annotation
    args[1], arg_hints[1], tuple(values_path), scope
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/arcaflow_plugin_rtla/tests/test_arcaflow_plugin_rtla.py", line 3, in <module>
    import rtla_plugin
  File "/app/arcaflow_plugin_rtla/rtla_plugin.py", line 31, in <module>
    class StartTimerlatStep:
  File "/app/arcaflow_plugin_rtla/rtla_plugin.py", line 60, in StartTimerlatStep
    def run_timerlat(
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/plugin.py", line 143, in step_decorator
    scope = build_object_schema(outputs[response_id])
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 7013, in build_object_schema
    r = _SchemaBuilder.resolve(t, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6271, in resolve
    return cls._resolve_abstract_type(t, t, tuple(path), scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6281, in _resolve_abstract_type
    result = cls._resolve(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6324, in _resolve
    return cls._resolve_type(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6377, in _resolve_type
    return cls._resolve_class(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6569, in _resolve_class
    name, final_field = cls._resolve_dataclass_field(
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6409, in _resolve_dataclass_field
    underlying_type = cls._resolve_field(t.type, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6308, in _resolve_field
    result = cls._resolve(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6346, in _resolve
    return cls._resolve_annotated(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6728, in _resolve_annotated
    underlying_t = cls._resolve(args[0], args_hints[0], path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6340, in _resolve
    return cls._resolve_list_annotation(t, type_hints, path, scope)
  File "/usr/local/lib/python3.9/site-packages/arcaflow_plugin_sdk/schema.py", line 6799, in _resolve_list_annotation
    raise SchemaBuildException(
arcaflow_plugin_sdk.schema.SchemaBuildException: Invalid schema definition for TimerlatOutput -> latency_hist -> typing.Annotated: Failed to create list type
Error: building at STEP "RUN python -m coverage run tests/test_${package}.py  && python -m coverage html -d /htmlcov --omit=/usr/local/*": while running runtime: exit status 1

dustinblack (Member, Author)


Here is a sample of what is currently returned. Note that the number of keys and their names (-001, -002, -003, etc.) are determined by the user inputs to cpus and user-threads and by the number of CPUs on the target system.

  latency_hist:
  - index: '0'
    irq-001: '1380'
    thr-001: '0'
    usr-001: '0'
    irq-002: '1849'
    thr-002: '0'
    usr-002: '0'
  - index: '1'
    irq-001: '1436'
    thr-001: '0'
    usr-001: '0'
    irq-002: '1119'
    thr-002: '2'
    usr-002: '0'

dustinblack (Member, Author)


With the SDK fix from arcalot/arcaflow-plugin-sdk-python#141, these are being redefined as lists of dicts (though some related issues with that are still being addressed).
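
That is, the annotation attempted above should now resolve (illustrative only; the exact field metadata may differ):

latency_hist: typing.Annotated[
    typing.List[typing.Dict[str, int]],
    schema.name("latency histogram"),
]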

webbnh

This comment was marked as resolved.

dustinblack and others added 6 commits November 8, 2024 12:51
Co-authored-by: Webb Scales <[email protected]>
Signed-off-by: Dustin Black <[email protected]>
Co-authored-by: Webb Scales <[email protected]>
Signed-off-by: Dustin Black <[email protected]>
Co-authored-by: Webb Scales <[email protected]>
Signed-off-by: Dustin Black <[email protected]>
Co-authored-by: Webb Scales <[email protected]>
Signed-off-by: Dustin Black <[email protected]>
@dustinblack dustinblack requested a review from webbnh November 8, 2024 12:46
dbutenhof left a comment


The sample output helps a lot, although the code indicates at least one "special case" that's not covered -- possibly by accident? (See detailed comments.)

Basically, I think the parsing is disorganized, hard to follow, and likely to be hard to maintain. But we'll probably never need to look at it again (and if someone ever does need to maintain it, it'll probably be you, and definitely won't be me 🤣) ... so if you're happy that this works, go for it.

webbnh (Contributor) left a comment


🎉

webbnh (Contributor) left a comment


See below for a coding suggestion.

Note that the Renovate bot should be stopping by about a half hour from now with the arcaflow-plugin-sdk update, if everything is working as hoped.

dbutenhof left a comment


I think that looks a lot cleaner, although getting rid of the output_lines indexing (e.g., with output_lines.pop(0)) might streamline it a bit by reducing extraneous code.

is_summary = re.compile(r"ALL")

output_lines = output.splitlines()
line_num = 0


Instead of dealing with line numbers as list offsets, you could consider using output_lines.pop(0) to remove the first entry on each iteration.

You'd need to handle an IndexError exception if you don't actually want to check len(output_lines) > 0 first... but unless you get output radically different from what you expect (in which case an unhandled exception is probably not the worst outcome), that's only an issue in "phase 3".
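
A sketch of the pop-based loop with that IndexError handling (the per-line handling is hypothetical):

while True:
    try:
        line = output_lines.pop(0)  # consume and discard the first remaining line
    except IndexError:
        break  # ran out of output lines
    handle(line)  # hypothetical per-line parsing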

dustinblack (Member, Author)


I didn't think about using .pop() and just discarding as I go, but that's probably a better solution than tracking the line number.

webbnh (Contributor)


pop() would be good, but I've recommended using an iterator instead, below.

dustinblack (Member, Author)


Alright, the idea of the .pop() made sense to me, but I'm getting really strange results every way that I try it. For what is gained, it's not worth me putting more time into it.

webbnh (Contributor) left a comment


Dustin, this looks great! I really like the phased parsing. Of course, I do have a couple of suggestions.

Comment on lines 152 to 153
      # Capture the column headers
-     elif re_isindex.match(line):
+     elif is_header.match(line):
webbnh (Contributor)


Perhaps efficiency is not paramount (and I don't even know how expensive regex parsing is), but given how simple this match is, you might consider using str.startswith() instead of match(). (Ditto for is_digit and is_summary.)
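
For example (a sketch; for the digit test, str.isdigit() on the token fits better than startswith()):

if line_list[0].startswith("ALL"):  # instead of is_summary.match(line_list[0])
    break
if line_list[0].isdigit():          # instead of is_digit.match(line_list[0])
    row_obj[col_headers[0]] = int(line_list[0])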

Comment on lines 144 to 185
+ output_lines = output.splitlines()
+ line_num = 0

- for line in output.splitlines():
-     if re_isunit.match(line):
-         time_unit = re_isunit.match(line).group(1)
+ # Phase 1: Get the headers
+ for line_num, line in enumerate(output_lines):
+     # Get the time unit (user-selectable)
+     if is_time_unit.match(line):
+         time_unit = is_time_unit.match(line).group(1)
      # Capture the column headers
-     elif re_isindex.match(line):
+     elif is_header.match(line):
          col_headers = line.lower().split()
-     # Stats names repeat, so flag when have passed ^ALL
-     elif re_isall.match(line):
-         found_all = True
-     # Either this is a histogram bucket row, or the first time we have seen
-     # a row beginning with a stat name
-     elif (re_isdigit.match(line)) or (
-         line.split()[0] in stats_names and not found_all
-     ):
+         line_num += 1
+         break

+ # Phase 2: Get the columnar data
+ for i in range(line_num, len(output_lines)):
+     line_list = output_lines[i].split()
+     row_obj = {}
+     # Collect histogram buckets and column latency statistics
+     if not is_summary.match(line_list[0]):
-         # Capture the columnar data
-         line_list = []
-         for element in line.split():
-             try:
-                 line_list.append(int(element))
-             except ValueError:
-                 line_list.append(element)
-         row_obj = dict(zip(col_headers, line_list))
-         if re_isdigit.match(line):
-             latency_hist.append(row_obj)
+         if not is_digit.match(line_list[0]):
+             # Stats index values are strings
+             row_obj[col_headers[0]] = line_list[0][:-1]
+             accumulator = stats_per_col
          else:
-             stats_per_col.append(row_obj)
-     # Since we've encountered the summary statistics (marked by the line
-     # starting with "ALL"), generate key:value pairs instead of columnar data.
-     elif found_all and line.split()[0] in stats_names:
-         label = line.split()[0][:-1]
-         if label != "over":
-             total_irq_latency[label] = line.split()[1]
-             total_thr_latency[label] = line.split()[2]
-             if params.user_threads:
-                 total_usr_latency[label] = line.split()[3]
+             # Histogram index values are integers
+             row_obj[col_headers[0]] = int(line_list[0])
+         row_obj = row_obj | dict(zip(col_headers[1:], map(int, line_list[1:])))
+         accumulator.append(row_obj)
+     else:
+         line_num = i + 1
+         break

+ # Phase 3: Get the stats summary as key:value pairs
+ for i in range(line_num, len(output_lines)):
+     line_list = output_lines[i].split()
+     label = line_list[0][:-1]
+     total_irq_latency[label] = line_list[1]
+     total_thr_latency[label] = line_list[2]
+     if params.user_threads:
+         total_usr_latency[label] = line_list[3]
webbnh (Contributor)


Since you only want to visit each line once, I suggest using an iterator instead of enumeration:

Suggested change
output_lines = iter(output.splitlines())

# Phase 1: Get the headers
for line in output_lines:
    # Get the time unit (user-selectable)
    if is_time_unit.match(line):
        time_unit = is_time_unit.match(line).group(1)
    # Capture the column headers
    elif is_header.match(line):
        col_headers = line.lower().split()
        break

# Phase 2: Collect histogram buckets and column latency statistics
for line in output_lines:
    line_list = line.split()
    # Collect statistics up until the summary section
    # (We don't process the summary header line itself, we just skip it here.)
    if is_summary.match(line_list[0]):
        break
    row_obj = dict(zip(col_headers[1:], map(int, line_list[1:])))
    if not is_digit.match(line_list[0]):
        # Stats index values are strings ending in a colon
        row_obj[col_headers[0]] = line_list[0][:-1]
        # When we hit the stats, switch to the other accumulator
        accumulator = stats_per_col
    else:
        # Histogram index values are integers
        row_obj[col_headers[0]] = int(line_list[0])
    accumulator.append(row_obj)

# Phase 3: Get the stats summary as key:value pairs
for line in output_lines:
    line_list = line.split()
    label = line_list[0][:-1]
    total_irq_latency[label] = line_list[1]
    total_thr_latency[label] = line_list[2]
    if params.user_threads:
        total_usr_latency[label] = line_list[3]

That is, using the iterator allows us to avoid enumerating the output lines and having to keep track of where we are; we process the lines in sequence, moving to the next phase like a state machine.

Also, I tweaked Phase 2. Moving the break up allows us to avoid the extra level of indentation/complexity. And using the result of the zip() call as the initializer for row_obj saves us from having to use the "update" operation to add it to row_obj later. And I added/reworked the comments slightly.

[Disclaimer: I didn't actually try running this code....]

dustinblack (Member, Author)


I'll play with this as an alternative to the .pop() suggested above by Dave, which was giving me fits.

@dustinblack dustinblack merged commit fc7c664 into main Nov 11, 2024
3 checks passed
@dustinblack dustinblack deleted the initial-plugin branch November 11, 2024 17:46