
Add manual benchmark workflow and S3 result persistence #429

Open
fabianliebig wants to merge 19 commits into main from feature/persisting-benchmarking-results

Conversation


@fabianliebig fabianliebig commented Nov 14, 2024

Hi everyone,

This pull request introduces a class and a workflow designed to store the results of a benchmark run in an S3 bucket.

The key used for storage is composed of the benchmark identifier, the branch, the release version, the current date, and the commit hash, in that order. Furthermore, the boto3 package is added to interact with AWS components.
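A minimal sketch of how such a key could be assembled. This is illustrative only: the function name, parameter names, separator, and file extension below are hypothetical, not the actual class from this PR.

```python
from datetime import datetime, timezone


def build_s3_key(
    benchmark_id: str, branch: str, version: str, commit_hash: str
) -> str:
    """Compose an S3 object key in the order described above.

    Hypothetical sketch: the PR's actual persistence class may use
    different names, separators, and extensions.
    """
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return "/".join([benchmark_id, branch, version, date, commit_hash]) + ".json"
```

Because S3 treats `/` in keys as folder separators, this ordering also groups results by benchmark, then branch, then version in the bucket listing.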

I look forward to your feedback.


AVHopp commented Nov 18, 2024

@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do; you might want to check that out again ;)

@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from f1cdf02 to 87dce85 on November 18, 2024 09:10
pyproject.toml (resolved)
@AVHopp AVHopp left a comment

Code looks quite clean and nice. Some things need minor adjustments :)

benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/__main__.py (resolved)
.github/workflows/manual_benchmark.yml (outdated, resolved)
id: generate-token
uses: actions/create-github-app-token@v1
with:
app-id: ${{ vars.APP_ID }}
Collaborator

This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?

Collaborator Author

Sure. I think at some point a README will be necessary :D. The following information will need to be added:

  • APP_ID: The autogenerated number of the GitHub app used to generate a token for fetching temporary credentials for the runner.
  • APP_PRIVATE_KEY: The key from the GitHub app to assume its role.
  • AWS_ROLE_TO_ASSUME: The role to assume within the AWS factory (must be registered as a role to assume for the OIDC authentication).
  • TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.

Collaborator

Ok, then let's keep this comment open until all of these things have been added.

@Scienfitz Scienfitz Nov 25, 2024

you can add a /benchmarks/README.md, just like the one we have in /tests/README.md. It would definitely be good to have your comprehensive instructions in written form.

Collaborator Author

I will add one.

@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, here are a first few thoughts

.github/workflows/manual_benchmark.yml (outdated, resolved)
pyproject.toml (outdated, resolved)
benchmarks/__main__.py (resolved)
benchmarks/__main__.py (resolved)
benchmarks/persistence/s3_persistence.py (outdated, resolved)
@AVHopp AVHopp left a comment

Some minor comments and one rather major one where we might need to discuss a little bit.

benchmarks/definition/config.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch 2 times, most recently from 6ef448e to 9b57f37 on November 20, 2024 16:15
@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, thx for the draft. Here are a few initial comments

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
Returns:
The object to be persisted.
"""
return self._result.to_dict() | self._benchmark.to_dict()
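For context on the quoted return statement: it uses Python's dict union operator (PEP 584), where values from the right-hand operand win on duplicate keys. A toy example with hypothetical keys:

```python
# Dict union (PEP 584): on duplicate keys, the right-hand operand wins.
result_dict = {"name": "run-1", "runtime_s": 42.0}  # hypothetical result fields
benchmark_dict = {"name": "bench-1", "best_possible_result": 0.0}

merged = result_dict | benchmark_dict
# All keys from both dicts are kept; "name" now holds the benchmark's value.
```

So if the result and benchmark serializations ever share a key, the benchmark's value silently overrides the result's, which is worth keeping in mind for the redundancy question.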
Collaborator

What is the rationale for storing the entire benchmark and not just its ID?

Collaborator Author

I think it is a matter of taste and depends on what one wants documented and displayed in the dashboard. If I remember correctly, I had a chat with @AVHopp about that; maybe he can share his opinion. I would want to avoid loading data in a strange way, so the question for me is rather whether this data duplication (description, name, and so on) for every result disturbs you and should be removed or separated from the varying result.

Collaborator

I think the reason is that some details, like the best_possible_result, are currently part of the Benchmark definition. So if you want to display them based only on the persisted data, you need to store them, right?

Collaborator Author

Yes, that is what I mean. There will be redundancy following the current approach. But we might need further components to avoid that or store the benchmark definition once with a dedicated key just for the description.

Collaborator

In my opinion, this is fine for now. It seems to be necessary due to the way responsibilities are handled in the Benchmark, which will change anyway. So if that is fine for both of you, my vote would be: let's keep it as-is, but add a comment to the code here that this part needs to be changed later on.

Collaborator Author

I think it would be good to implement it directly. Without further API abstraction, the storage and its handling is the direct interface for all applications. Inconsistency in the way things are stored can be a pain later on.

Collaborator Author

The reason I did it that way was that things like the metadata may change, so there is no clear schema that can be defined beforehand.

Collaborator Author

A tradeoff might be to only include relevant data from the benchmark (such as the best possible value) and extend the JSON file if necessary. Also, the metadata dict from the result is fully extendable, since the dashboard currently adds all of its content in a loop. @AdrianSosic @AVHopp Please let me know what you prefer :)
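The "adds all of its content in a loop" idea can be illustrated with a toy rendering step. The keys below are hypothetical, and this is not the actual dashboard code:

```python
# Hypothetical result metadata: any key added later is picked up
# automatically, because the rendering iterates over all items.
metadata = {"branch": "main", "commit_hash": "abc123", "latest_baybe_tag": "0.11.0"}

rendered = [f"{key}: {value}" for key, value in metadata.items()]
```

This is why the metadata dict needs no fixed schema: adding a field extends the stored JSON and the rendered output without any dashboard change.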

Collaborator Author

For a large-scale system it would probably make sense to combine a filesystem-like store such as S3 with a database, but I think that is overkill given the number of benchmarks and forecasted executions.

Collaborator

As already said, my vote is for "keep as-is". What about you, @AdrianSosic? And is this thread still relevant?

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@AVHopp AVHopp left a comment

Few comments from my side.

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@Scienfitz Scienfitz marked this pull request as draft November 22, 2024 10:26
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch 4 times, most recently from aa83e58 to 22caf8e on November 23, 2024 08:28
@fabianliebig fabianliebig added the new feature New functionality label Nov 23, 2024
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from e99f861 to 1de1222 on November 25, 2024 07:50
@fabianliebig fabianliebig marked this pull request as ready for review November 25, 2024 08:05
@AVHopp AVHopp left a comment

Here is the hopefully final round of comments from my side :)

CHANGELOG.md (outdated, resolved)
benchmarks/definition/config.py (resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, thanks so much. I think our discussion was definitely worth it, I like the path-based solution a lot more, and I also saw that you agree 👍🏼 very nice.

Here are a few suggestions plus problem catches. But no changes required for the architecture! I will have to jump now but will look at the revised version later this evening.

benchmarks/definition/config.py (resolved)
benchmarks/serialization.py (outdated, resolved)
CHANGELOG.md (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from 2afbc62 to c68866a on November 25, 2024 15:50
…g folder and filename construction.

This commit drops the workflow id to allow overwriting and makes the path constructible in a homogeneous way.
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from c68866a to 31cd5b2 on November 25, 2024 16:11
@AVHopp AVHopp left a comment

Next round

ALLOWED_CHARS = (
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
)
return "".join([char if char in ALLOWED_CHARS else "-" for char in string])
Collaborator

No, what I mean is: if you have, e.g., a "/" (which is not allowed) and a "-" in your string, then both will end up as "-" in the sanitized string, and you do not know which of the "-" characters were originally "/". And my question is whether we can ignore this or whether it could cause problems.
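The collision described here can be reproduced directly. The standalone function below mirrors the quoted snippet; its name and the example branch strings are hypothetical:

```python
def sanitize(string: str) -> str:
    # Mirrors the quoted snippet: any character outside the allowlist becomes "-".
    allowed = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
    return "".join(char if char in allowed else "-" for char in string)


# Two different inputs collapse to the same sanitized string, so the
# mapping is not invertible:
a = sanitize("feature/persisting-results")
b = sanitize("feature-persisting/results")
```

Whether this matters depends on usage: as long as the sanitized string is only a display-friendly key segment and the original value is also stored, the ambiguity is harmless; if the key must be mapped back to the branch, it is not.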


benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
fabianliebig and others added 4 commits November 25, 2024 17:30
…from manual pipeline run

Fix typo in PathConstructor class documentation regarding S3 bucket path handling
…from manual pipeline run

Refactor path construction to return Path objects and update S3 upload key handling.

.github/workflows/manual_benchmark.yml (outdated, resolved)
benchmarks/serialization.py (outdated, resolved)


@define
class LocalFileSystemObjectWriter(ObjectWriter):
Collaborator

Why was the local persistence implemented?
Apart from that, looking at make_object_writer: if something goes wrong in CI and the env vars are wrong, it would just write locally and we would not immediately notice. In such cases, a simple failure would be better.

Collaborator Author

It came up in this thread. However, I see your point. Do you think it would be sufficient to change the logic by making the S3 writer the default and only using local persistence when we are not running in the pipeline? There are standard GitHub environment variables that could be checked.

Collaborator

There is also the CI env variable that is automatically set on GitHub, which we could leverage. That is, we could hardwire the logic so that whenever CI is present, stuff automatically goes to S3. That would make it fail-safe, plus we could still execute locally.
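GitHub Actions sets the `CI` environment variable to `"true"` in every workflow run, so the check can be hardwired along these lines. The helper name is hypothetical, and the commented-out wiring uses the writer class names discussed in this thread only as a sketch:

```python
import os


def running_in_ci() -> bool:
    # GitHub Actions sets CI="true" for every workflow run.
    return os.environ.get("CI", "").lower() == "true"


# Sketch of the hardwired choice (not the PR's actual wiring):
# writer = S3ObjectWriter() if running_in_ci() else LocalFileSystemObjectWriter()
```

With this shape, a misconfigured S3 setup inside CI raises instead of silently falling back to local files, which addresses the fail-safe concern above.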

Collaborator Author

I've added that, please let me know if that is sufficient.

Collaborator

To me, this seems quite solid for the current intended use 👍🏼 But I'll let @Scienfitz either resolve or add more comments

json.dump(object, file)


def make_object_writer() -> ObjectWriter:
Collaborator

I dislike the make_ approach, and we also often follow different patterns that are equally valid choices here (the make_ approach does not seem very needed here either, imo).

Not sure why make_object_writer is even needed, given that we don't need local storage persistence, and I'm not sure why that was even implemented (see other comment).

make_path_constructor would become obsolete if PathConstructor just held a benchmark and result as attributes and the current constructor attributes were properties.

Collaborator Author

That separation comes from a former code version; I think it would be fine to drop it. The PathConstructor class may also get a constructor that performs the work of make_path_constructor. The make_object_writer was my way of hiding the choice of which class is used for writing; that can also be changed, but I think local writing is good for testing purposes. @AdrianSosic What do you think?

Collaborator

  1. Regarding make_path_constructor: removing it is fine for me 👍🏼 As you correctly pointed out, you can take the code exactly as-is and turn it into a corresponding from_... classmethod with an appropriate name in the PathConstructor class as an alternative constructor, basically just attaching the logic to the class instead of keeping a standalone function. While we could theoretically also change the class so that it directly takes the method's inputs as attributes, I vote against it, since the attrs mantra "think about what the class needs, not how it is going to be instantiated" has proven incredibly useful in the past 🙃
  2. I also consider the possibility of running things locally very useful, so somewhere this distinction needs to be made. But you could also just replace the function with a simple ternary if-else expression in the __main__.py, which might actually be more elegant given the current simplistic logic.
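The from_... classmethod pattern from point 1 can be sketched as follows. This uses stdlib dataclasses to stay self-contained, whereas the codebase uses attrs, and all attribute and method names are illustrative, not the PR's actual ones:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathConstructor:
    # Attributes the class *needs*, independent of how it is instantiated.
    benchmark_name: str
    commit_hash: str

    @classmethod
    def from_benchmark_result(cls, benchmark: dict, result: dict) -> "PathConstructor":
        # Former make_path_constructor logic, attached as an alternative constructor.
        return cls(
            benchmark_name=benchmark["name"],
            commit_hash=result["commit_hash"],
        )
```

The class itself stays a plain value holder; the alternative constructor carries the extraction logic, which is exactly the split the review comment recommends.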

Collaborator Author

Okay, changed it.

Collaborator

Looking good to me 👍🏼 But will let @Scienfitz resolve since he opened the thread

AdrianSosic and others added 7 commits November 25, 2024 20:08
…from manual pipeline run

Refactor serialization classes to use BenchmarkSerialization and update persistence handling for GitHub CI
…from manual pipeline run

Refactor persistence handling to use dedicated storage classes and improve path construction for benchmark results
benchmarks/persistence/persistence.py (outdated, resolved)
Comment on lines 53 to 62
branch: str = field(validator=instance_of(str), init=False)
"""The branch currently checked out."""

latest_baybe_tag: str = field(validator=instance_of(str))
"""The latest BayBE version tag."""

execution_date_time: datetime = field(validator=instance_of(datetime))
"""The date and time when the benchmark was executed."""

commit_hash: str = field(validator=instance_of(str))
Collaborator

Re branch and hash, I just realized: this should not be the branch/hash that is currently checked out, but rather the one at the time of result creation, right? So it makes sense to me that the commit is populated based on the result, but why is this not the case for the branch? I mean, I see that the branch info is not stored in the result, but why does it then appear here? Is it relevant or not?

Collaborator Author

You are correct: what applies to the commit hash should also apply to the branch. I think it is relevant information to filter on and should conceptually be treated the same as the commit hash. Therefore, I will move it to the result metadata.

Collaborator

Ok 👍🏼 Can you then please also rephrase the docstrings accordingly, i.e. "... checked out during benchmark execution"? Also, while "the latest BayBE version tag" works in the MetaData class, it doesn't work here, since the creation times of the objects differ, i.e. you can't copy-paste the docstring. That is, there could theoretically be a more recent version tag already, so you need to explicitly say "The latest BayBE version tag at benchmark execution time".

Collaborator Author

Okay, done.

…from manual pipeline run

Refactor PathConstructor and ResultMetadata to include branch handling and validation
…from manual pipeline run

Add docstring to _default_branch method