
Add manual benchmark workflow and S3 result persistence #429

Open
fabianliebig wants to merge 19 commits into main from feature/persisting-benchmarking-results

Conversation


@fabianliebig fabianliebig commented Nov 14, 2024

Hi everyone,

This pull request introduces a class and a workflow designed to store the results of a benchmark run in an S3 bucket.

The key used for storage is composed of the benchmark identifier, the branch, the release version, the current date, and the commit hash, in that order. Furthermore, the boto3 package is added to interact with AWS components.
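A minimal sketch of how such a key could be assembled. This is illustrative only: the function name, parameter names, separator, and file extension below are hypothetical, not the actual class from this PR.

```python
from datetime import datetime, timezone


def build_s3_key(
    benchmark_id: str, branch: str, version: str, commit_hash: str
) -> str:
    """Compose an S3 object key in the order described above.

    Hypothetical sketch: the PR's actual persistence class may use
    different names, separators, and extensions.
    """
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return "/".join([benchmark_id, branch, version, date, commit_hash]) + ".json"
```

Because S3 treats `/` in keys as folder separators, this ordering also groups results by benchmark, then branch, then version in the bucket listing.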

I look forward to your feedback.


AVHopp commented Nov 18, 2024

@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do; you might want to check that out again ;)

@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from f1cdf02 to 87dce85 on November 18, 2024 09:10
pyproject.toml (resolved)
@AVHopp AVHopp left a comment

Code looks quite clean and nice. Some things need minor adjustments :)

benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/persistance/s3_persistance.py (outdated, resolved)
benchmarks/__main__.py (resolved)
.github/workflows/manual_benchmark.yml (outdated, resolved)
id: generate-token
uses: actions/create-github-app-token@v1
with:
app-id: ${{ vars.APP_ID }}
Collaborator

This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?

Collaborator Author

Sure. I think at some point a README will be necessary :D. The following information will need to be added:

  • APP_ID: The autogenerated number of the GitHub app used to generate a token for fetching temporary credentials for the runner.
  • APP_PRIVATE_KEY: The key from the GitHub app to assume its role.
  • AWS_ROLE_TO_ASSUME: The role to assume within the AWS factory (must be registered as a role to assume for the OIDC authentication).
  • TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.

Collaborator

Ok, then let's keep this comment open until all of these things have been added.

@Scienfitz Scienfitz Nov 25, 2024

you can add a /benchmarks/README.md, just like the one we have in /tests/README.md. It would definitely be good to have your comprehensive instructions in written form.

Collaborator Author

I will add one.

@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, here are a first few thoughts

.github/workflows/manual_benchmark.yml (outdated, resolved)
pyproject.toml (outdated, resolved)
benchmarks/__main__.py (resolved)
benchmarks/__main__.py (resolved)
benchmarks/persistence/s3_persistence.py (outdated, resolved)
@AVHopp AVHopp left a comment

Some minor comments and one rather major one where we might need to discuss a little bit.

benchmarks/definition/config.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch 2 times, most recently from 6ef448e to 9b57f37 on November 20, 2024 16:15
@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, thx for the draft. Here are a few initial comments

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
Returns:
The object to be persisted.
"""
return self._result.to_dict() | self._benchmark.to_dict()
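For context on the quoted return statement: it uses Python's dict union operator (PEP 584), where values from the right-hand operand win on duplicate keys. A toy example with hypothetical keys:

```python
# Dict union (PEP 584): on duplicate keys, the right-hand operand wins.
result_dict = {"name": "run-1", "runtime_s": 42.0}  # hypothetical result fields
benchmark_dict = {"name": "bench-1", "best_possible_result": 0.0}

merged = result_dict | benchmark_dict
# All keys from both dicts are kept; "name" now holds the benchmark's value.
```

So if the result and benchmark serializations ever share a key, the benchmark's value silently overrides the result's, which is worth keeping in mind for the redundancy question.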
Collaborator

What is the rationale for storing the entire benchmark and not just its ID?

Collaborator Author

I think it is a matter of taste and depends on what one wants documented and displayed in the dashboard. If I remember correctly, I had a chat with @AVHopp about that; maybe he can share his opinion. I would want to avoid loading data in a strange way, so the question for me is rather whether this data duplication (description, name, and so on) for every result disturbs you and should be removed or separated from the varying result.

Collaborator

I think the reason is that some details, like the best_possible_result, are currently part of the Benchmark definition. So if you want to display them based only on the persisted data, you need to store them, right?

Collaborator Author

Yes, that is what I mean. There will be redundancy following the current approach. But we might need further components to avoid that or store the benchmark definition once with a dedicated key just for the description.

Collaborator

In my opinion, this is fine for now. It seems to be necessary due to the way responsibilities are handled in the Benchmark, which will change anyway. So if that is fine for both of you, my vote would be: let's keep it as-is, but add a comment to the code here that this part needs to be changed later on.

Collaborator Author

I think it would be good to implement it directly. Without further API abstraction, the storage and its handling is the direct interface for all applications. Inconsistency in the way things are stored can be a pain later on.

Collaborator Author

The reason I did it that way was that things like the metadata may change, so there is no clear schema that can be defined beforehand.

Collaborator Author

A tradeoff might be to only include relevant data from the benchmark (such as the best possible value) and extend the JSON file if necessary. Also, the metadata dict from the result is fully extendable, since the dashboard currently adds all of its content in a loop. @AdrianSosic @AVHopp Please let me know what you prefer :)
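The "adds all of its content in a loop" idea can be illustrated with a toy rendering step. The keys below are hypothetical, and this is not the actual dashboard code:

```python
# Hypothetical result metadata: any key added later is picked up
# automatically, because the rendering iterates over all items.
metadata = {"branch": "main", "commit_hash": "abc123", "latest_baybe_tag": "0.11.0"}

rendered = [f"{key}: {value}" for key, value in metadata.items()]
```

This is why the metadata dict needs no fixed schema: adding a field extends the stored JSON and the rendered output without any dashboard change.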

Collaborator Author

For a large-scale system it would probably make sense to combine a filesystem-like store such as S3 with a database, but I think that is overkill given the number of benchmarks and forecasted executions.

Collaborator

As already said, my vote is for "keep as-is". What about you, @AdrianSosic? And is this thread still relevant?

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@AVHopp AVHopp left a comment

Few comments from my side.

benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@Scienfitz Scienfitz marked this pull request as draft November 22, 2024 10:26
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch 4 times, most recently from aa83e58 to 22caf8e on November 23, 2024 08:28
@fabianliebig fabianliebig added the new feature New functionality label Nov 23, 2024
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from e99f861 to 1de1222 on November 25, 2024 07:50
@fabianliebig fabianliebig marked this pull request as ready for review November 25, 2024 08:05
@AVHopp AVHopp left a comment

Here is the hopefully final round of comments from my side :)

CHANGELOG.md (outdated, resolved)
benchmarks/definition/config.py (resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@AdrianSosic AdrianSosic left a comment

Hi @fabianliebig, thanks so much. I think our discussion was definitely worth it, I like the path-based solution a lot more, and I also saw that you agree 👍🏼 very nice.

Here are a few suggestions plus problem catches. But no changes required for the architecture! I will have to jump now but will look at the revised version later this evening.

benchmarks/definition/config.py (resolved)
benchmarks/serialization.py (outdated, resolved)
CHANGELOG.md (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from 2afbc62 to c68866a on November 25, 2024 15:50
…g folder and filename construction.

This commit drops the workflow id to allow overwriting and makes the path constructible in a homogeneous way.
@fabianliebig fabianliebig force-pushed the feature/persisting-benchmarking-results branch from c68866a to 31cd5b2 on November 25, 2024 16:11
@AVHopp AVHopp left a comment

Next round

ALLOWED_CHARS = (
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
)
return "".join([char if char in ALLOWED_CHARS else "-" for char in string])
Collaborator

No, what I mean is: if you have, e.g., a "/" (which is not allowed) and a "-" in your string, then both will end up as "-" in the sanitized string, and you do not know which of the "-" characters were originally "/". And my question is whether we can ignore this or whether it could cause problems.
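The collision described here can be reproduced directly. The standalone function below mirrors the quoted snippet; its name and the example branch strings are hypothetical:

```python
def sanitize(string: str) -> str:
    # Mirrors the quoted snippet: any character outside the allowlist becomes "-".
    allowed = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
    return "".join(char if char in allowed else "-" for char in string)


# Two different inputs collapse to the same sanitized string, so the
# mapping is not invertible:
a = sanitize("feature/persisting-results")
b = sanitize("feature-persisting/results")
```

Whether this matters depends on usage: as long as the sanitized string is only a display-friendly key segment and the original value is also stored, the ambiguity is harmless; if the key must be mapped back to the branch, it is not.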


benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
benchmarks/persistence/persistence.py (outdated, resolved)
fabianliebig and others added 4 commits November 25, 2024 17:30
…from manual pipeline run

Fix typo in PathConstructor class documentation regarding S3 bucket path handling
…from manual pipeline run

Refactor path construction to return Path objects and update S3 upload key handling.

.github/workflows/manual_benchmark.yml (outdated, resolved)
benchmarks/serialization.py (outdated, resolved)


@define
class LocalFileSystemObjectWriter(ObjectWriter):
Collaborator

Why was the local persistence implemented?
Apart from that, looking at make_object_writer: if something goes wrong in CI and the env vars are wrong, it would just write locally and we would not immediately notice. In such cases, a simple failure would be better.

Collaborator Author

It came up in this thread. However, I see your point. Do you think it would be sufficient to change the logic by making the S3 writer the default and only using local persistence when we are not running in the pipeline? There are standard GitHub environment variables that could be checked.

Collaborator

There is also the CI env variable that is automatically set on GitHub, which we could leverage. That is, we could hardwire the logic so that whenever CI is present, stuff automatically goes to S3. That would make it fail-safe, plus we could still execute locally.
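GitHub Actions sets the `CI` environment variable to `"true"` in every workflow run, so the check can be hardwired along these lines. The helper name is hypothetical, and the commented-out wiring uses the writer class names discussed in this thread only as a sketch:

```python
import os


def running_in_ci() -> bool:
    # GitHub Actions sets CI="true" for every workflow run.
    return os.environ.get("CI", "").lower() == "true"


# Sketch of the hardwired choice (not the PR's actual wiring):
# writer = S3ObjectWriter() if running_in_ci() else LocalFileSystemObjectWriter()
```

With this shape, a misconfigured S3 setup inside CI raises instead of silently falling back to local files, which addresses the fail-safe concern above.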

Collaborator Author

I've added that, please let me know if that is sufficient.

Collaborator

To me, this seems quite solid for the current intended use 👍🏼 But I'll let @Scienfitz either resolve or add more comments

json.dump(object, file)


def make_object_writer() -> ObjectWriter:
Collaborator

I dislike the make_ approach, and we also often follow different patterns that are equally valid choices here (the make_ approach does not seem very needed here either, imo).

Not sure why make_object_writer is even needed, given that we don't need local storage persistence, and I'm not sure why that was even implemented (see other comment).

make_path_constructor would become obsolete if PathConstructor just held a benchmark and result as attributes and the current constructor attributes were properties.

Collaborator Author

That separation comes from a former code version; I think it would be fine to drop it. The PathConstructor class may also get a constructor that performs the work of make_path_constructor. The make_object_writer was my way of hiding the choice of which class is used for writing; that can also be changed, but I think local writing is good for testing purposes. @AdrianSosic What do you think?

Collaborator

  1. Regarding make_path_constructor: removing it is fine for me 👍🏼 As you correctly pointed out, you can take the code exactly as-is and turn it into a corresponding from_... classmethod with an appropriate name in the PathConstructor class as an alternative constructor, basically just attaching the logic to the class instead of keeping a standalone function. While we could theoretically also change the class so that it directly takes the method's inputs as attributes, I vote against it, since the attrs mantra "think about what the class needs, not how it is going to be instantiated" has proven incredibly useful in the past 🙃
  2. I also consider the possibility of running things locally very useful, so somewhere this distinction needs to be made. But you could also just replace the function with a simple ternary if-else expression in the __main__.py, which might actually be more elegant given the current simplistic logic.
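The from_... classmethod pattern from point 1 can be sketched as follows. This uses stdlib dataclasses to stay self-contained, whereas the codebase uses attrs, and all attribute and method names are illustrative, not the PR's actual ones:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathConstructor:
    # Attributes the class *needs*, independent of how it is instantiated.
    benchmark_name: str
    commit_hash: str

    @classmethod
    def from_benchmark_result(cls, benchmark: dict, result: dict) -> "PathConstructor":
        # Former make_path_constructor logic, attached as an alternative constructor.
        return cls(
            benchmark_name=benchmark["name"],
            commit_hash=result["commit_hash"],
        )
```

The class itself stays a plain value holder; the alternative constructor carries the extraction logic, which is exactly the split the review comment recommends.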

Collaborator Author

Okay, changed it.

Collaborator

Looking good to me 👍🏼 But will let @Scienfitz resolve since he opened the thread

AdrianSosic and others added 7 commits November 25, 2024 20:08
…from manual pipeline run

Refactor serialization classes to use BenchmarkSerialization and update persistence handling for GitHub CI
…from manual pipeline run

Refactor persistence handling to use dedicated storage classes and improve path construction for benchmark results
benchmarks/persistence/persistence.py (outdated, resolved)
Comment on lines 53 to 62
branch: str = field(validator=instance_of(str), init=False)
"""The branch currently checked out."""

latest_baybe_tag: str = field(validator=instance_of(str))
"""The latest BayBE version tag."""

execution_date_time: datetime = field(validator=instance_of(datetime))
"""The date and time when the benchmark was executed."""

commit_hash: str = field(validator=instance_of(str))
Collaborator

Re branch and hash, I just realized: this should not be the branch/hash that is currently checked out, but rather the one at the time of result creation, right? So it makes sense to me that the commit is populated based on the result, but why is this not the case for the branch? I mean, I see that the branch info is not stored in the result, but why does it then appear here? Is it relevant or not?

Collaborator Author

You are correct: what applies to the commit hash should also apply to the branch. I think it is relevant information to filter on and should conceptually be treated the same as the commit hash. Therefore, I will move it to the result metadata.

Collaborator

Ok 👍🏼 Can you then please also rephrase the docstrings accordingly, i.e. "... checked out during benchmark execution"? Also, while "the latest BayBE version tag" works in the MetaData class, it doesn't work here, since the creation times of the objects differ, i.e. you can't copy-paste the docstring. That is, there could theoretically be a more recent version tag already, so you need to explicitly say "The latest BayBE version tag at benchmark execution time".

Collaborator Author

Okay, done.

…from manual pipeline run

Refactor PathConstructor and ResultMetadata to include branch handling and validation
…from manual pipeline run

Add docstring to _default_branch method