Add manual benchmark workflow and S3 result persistence #429
base: main
Conversation
@fabianliebig Looking at your commit history, I think the "fixup" did not do what you wanted it to do, you might want to check that out again ;)
Force-pushed from f1cdf02 to 87dce85
Code looks quite clean and nice. Some things need minor adjustments :)
id: generate-token
uses: actions/create-github-app-token@v1
with:
  app-id: ${{ vars.APP_ID }}
This is the part where we still need to add some information/configuration to the repository itself, right? For the sake of documentation, can you elaborate a little bit on that?
Sure. I think at some point a README will be necessary :D. The following information will need to be added:
- APP_ID: The auto-generated ID of the GitHub app used to generate a token for fetching temporary credentials for the runner.
- APP_PRIVATE_KEY: The private key of the GitHub app, used to assume its role.
- AWS_ROLE_TO_ASSUME: The role to assume within the AWS-factory (must be registered as an assumable role for the OIDC authentication).
- TEST_RESULT_S3_BUCKET: The name of the bucket where the results should be stored.
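As a rough sketch, the required configuration above could be sanity-checked before a run; the variable names come from the list, everything else here is made up for illustration:

```python
REQUIRED_CONFIG = [
    "APP_ID",
    "APP_PRIVATE_KEY",
    "AWS_ROLE_TO_ASSUME",
    "TEST_RESULT_S3_BUCKET",
]

def missing_config(env: dict) -> list[str]:
    """Return the names of required settings that are not present."""
    return [name for name in REQUIRED_CONFIG if name not in env]

print(missing_config({"APP_ID": "123456"}))
# -> ['APP_PRIVATE_KEY', 'AWS_ROLE_TO_ASSUME', 'TEST_RESULT_S3_BUCKET']
```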
Ok, then let's keep this comment open until all of these things have been added.
You can add a /benchmarks/README.md, just like we have one in the /tests/README.md. It would definitely be good to have your comprehensive instructions in written form.
I will add one.
Hi @fabianliebig, here are a few first thoughts.
Some minor comments and one rather major one that we might need to discuss a little.
Force-pushed from 6ef448e to 9b57f37
Hi @fabianliebig, thanks for the draft. Here are a few initial comments.
Returns:
    The object to be persisted.
"""
return self._result.to_dict() | self._benchmark.to_dict()
What is the rationale for storing the entire benchmark and not just its ID?
I think it is a matter of taste and depends on what one would want to have documented and displayed in the dashboard. If I remember correctly, I had a chat with @AVHopp about that; maybe he can share his opinion. I would want to avoid loading data in a strange way, so the question for me is rather whether this data duplication (description, name, and so on) for every result bothers you and should be removed or separated from the varying result.
I think the reason is that some of the details, like best_possible_result, are currently part of the Benchmark definition - so if you want to display that based only on the persisted data, you need to store it, right?
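For illustration, here is a minimal sketch of the dict merge being discussed; all field names and values are made up:

```python
# Made-up stand-ins for result.to_dict() and benchmark.to_dict():
result = {"values": [0.1, 0.2], "commit_hash": "abc123"}
benchmark = {
    "name": "example_benchmark",
    "best_possible_result": 4.2,
    "description": "An example benchmark.",
}

# Dict union (Python 3.9+): keys from the right operand win on conflicts.
record = result | benchmark

# The static benchmark fields are now duplicated in every persisted record.
print(record["best_possible_result"])  # -> 4.2
```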
Yes, that is what I mean. There will be redundancy with the current approach. But we might need further components to avoid that, or store the benchmark definition once under a dedicated key just for the description.
In my opinion, this is fine for now. It seems to be necessary due to the way responsibilities are handled in the Benchmark class, which will change anyway. So if that is fine for both of you, my vote would be: let's keep it as-is, but add a comment to the code that this part needs to be changed later on.
I think it would be good to implement it directly. Without further API abstraction, the storage and its handling is the direct interface for all applications. Inconsistency in the way things are stored can be a pain later on.
The reason why I did it that way was that things like the metadata may change, so there is no clear schema that can be defined beforehand.
A trade-off might be to only include relevant data from the benchmark (such as the best possible value) and extend the JSON file if necessary. Also, the metadata dict from the result is fully extendable, since the dashboard currently adds all of its content in a loop. @AdrianSosic @AVHopp Please let me know what you prefer :)
For a large-scale system it would probably make sense to combine a file system like S3 with a database, but I think that is overkill given the number of benchmarks and forecasted executions.
As already said, my vote is for "keep as-is". What about you, @AdrianSosic? Or is this thread still relevant at all?
A few comments from my side.
Force-pushed from aa83e58 to 22caf8e
…zable and adding a dedicated serialization module
Force-pushed from e99f861 to 1de1222
Here is the hopefully final round of comments from my side :)
Hi @fabianliebig, thanks so much. I think our discussion was definitely worth it: I like the path-based solution a lot more, and I also saw that you agree 👍🏼 Very nice.
Here are a few suggestions plus problem catches, but no changes required for the architecture! I will have to jump now, but will look at the revised version later this evening.
Force-pushed from 2afbc62 to c68866a
…g folder and filename construction. This commit drops the workflow id to allow overwriting and makes the path homogeneously constructible.
…nual pipeline run
…ure and persistence capabilities
Force-pushed from c68866a to 31cd5b2
Next round
ALLOWED_CHARS = (
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."
)
return "".join([char if char in ALLOWED_CHARS else "-" for char in string])
No, what I mean is: if you e.g. have a "/" (which is not allowed) and a "-" in your string, then both will become "-" in the sanitized string, and you do not know which of the "-" in the sanitized string were originally "/". And my question is whether we can ignore this or whether it could cause problems.
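A small sketch of the described ambiguity, reusing the sanitization logic from the snippet above (the function name is an assumption):

```python
ALLOWED_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-."

def sanitize(string: str) -> str:
    """Replace every character outside the allowed set with '-'."""
    return "".join([char if char in ALLOWED_CHARS else "-" for char in string])

# "/" is disallowed and maps to "-", so distinct inputs can collide:
print(sanitize("feature/foo"))  # -> feature-foo
print(sanitize("feature-foo"))  # -> feature-foo (same output, different input)
```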
…from manual pipeline run Fix typo in PathConstructor class documentation regarding S3 bucket path handling
…from manual pipeline run Refactor path construction to return Path objects and update S3 upload key handling.
@define
class LocalFileSystemObjectWriter(ObjectWriter):
Why was the local persistence implemented? Apart from that, looking at make_object_writer: if something goes wrong in CI and the env vars are wrong, it would just write locally and we would not immediately notice. In such cases, a simple failure would be better.
It came up in this thread. However, I see your point. Do you think it would be sufficient to change the logic by making the S3 writer the default and only using local persistence when we are not running in the pipeline? There are standard GitHub environment variables that could be checked.
There is also the CI env variable that is automatically set on GitHub, which we could leverage. That is, we could hardwire the logic so that when CI is present, everything automatically goes to S3. That would make it fail-safe, plus we could still execute locally.
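A sketch of the proposed routing; the writer class names are hypothetical stand-ins for the PR's classes, and GitHub Actions always sets CI=true:

```python
# Hypothetical stand-ins for the PR's writer classes:
class S3ObjectWriter: ...
class LocalFileSystemObjectWriter: ...

def choose_writer(env: dict):
    """Pick the S3 writer when running on CI, the local writer otherwise."""
    return S3ObjectWriter() if env.get("CI") == "true" else LocalFileSystemObjectWriter()

print(type(choose_writer({"CI": "true"})).__name__)  # -> S3ObjectWriter
print(type(choose_writer({})).__name__)              # -> LocalFileSystemObjectWriter
```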
I've added that, please let me know if that is sufficient.
To me, this seems quite solid for the current intended use 👍🏼 But I'll let @Scienfitz either resolve or add more comments
json.dump(object, file)

def make_object_writer() -> ObjectWriter:
I dislike the make_ approach; we often follow different patterns that are equally valid choices here (and the make_ approach does not seem very needed here either, imo).
I am not sure why make_object_writer is even needed, given that we don't need local storage persistence, and I am not sure why that was even implemented (see other comment).
make_path_constructor would become obsolete if PathConstructor just held a benchmark and a result as attributes and the current constructor attributes were properties.
That separation comes from a former code version. I think it would be fine to drop it. The PathConstructor class could also have a constructor that performs the work of make_path_constructor. The make_object_writer was my way to hide the choice of which class is used for writing; that can also be changed, but I think local writing is good for testing purposes. @AdrianSosic What do you think?
- Regarding make_path_constructor: removing it is fine for me 👍🏼 As you correctly pointed out, you can take the code exactly as-is and turn it into a corresponding from_... classmethod with an appropriate name in the PathConstructor class, as an alternative constructor. So basically just attaching the logic to the class instead of a standalone function. While we could theoretically also change the class such that it directly takes the method's inputs as attributes, I vote against it, since the attrs mantra "you should think about what the class needs, not how it is going to be instantiated" has proven incredibly useful in the past 🙃
- I also consider the possibility of being able to run stuff locally very useful, so somewhere this distinction needs to be made. But you can also just replace the function with a simple ternary if-else expression in the __main__.py, which might actually be more elegant given the current simplistic logic.
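The alternative-constructor idea could look roughly like this (a plain dataclass is used here instead of attrs to keep the sketch dependency-free, and all names and fields are assumptions):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class PathConstructor:
    """Holds the parts needed to build a storage path."""
    benchmark_name: str
    branch: str
    execution_date_time: datetime

    @classmethod
    def from_result(cls, benchmark: dict, result: dict) -> "PathConstructor":
        """Alternative constructor deriving the parts from benchmark/result data."""
        return cls(
            benchmark_name=benchmark["name"],
            branch=result["branch"],
            execution_date_time=result["execution_date_time"],
        )

path = PathConstructor.from_result(
    {"name": "example_benchmark"},
    {"branch": "main", "execution_date_time": datetime(2024, 11, 5)},
)
print(path.benchmark_name)  # -> example_benchmark
```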
Okay, changed it.
Looking good to me 👍🏼 But will let @Scienfitz resolve since he opened the thread
…from manual pipeline run Refactor serialization classes to use BenchmarkSerialization and update persistence handling for GitHub CI
…from manual pipeline run Refactor persistence handling to use dedicated storage classes and improve path construction for benchmark results
branch: str = field(validator=instance_of(str), init=False)
"""The branch currently checked out."""

latest_baybe_tag: str = field(validator=instance_of(str))
"""The latest BayBE version tag."""

execution_date_time: datetime = field(validator=instance_of(datetime))
"""The date and time when the benchmark was executed."""

commit_hash: str = field(validator=instance_of(str))
Re branch and hash, I just realized: this should not be the branch/hash that is currently checked out, but rather the one at the time of the result's creation, right? So it makes sense to me that the commit is populated based on the result, but why is this not the case for the branch? I mean, I see that the branch info is not stored in the result, but why does it then appear here? Is it relevant or not?
You are correct: what applies to the commit hash should also apply to the branch. I think it is relevant information for filtering and should conceptually be treated the same as the commit hash. Therefore, I will move it to the result metadata.
Ok 👍🏼 Can you then please also rephrase the docstrings accordingly? I.e., "... checked out during benchmark execution". Also, while "the latest BayBE version tag" works in the MetaData class, it doesn't work here, since the creation times of the objects are different, i.e., you can't copy-paste the docstring. That is, there could theoretically already be a more recent version tag. So you need to explicitly say "The latest BayBE version tag at benchmark execution time."
Okay, done.
…from manual pipeline run Refactor PathConstructor and ResultMetadata to include branch handling and validation
…from manual pipeline run Add docstring to _default_branch method
Hi everyone,
This pull request introduces a class and a workflow designed to store the results of a benchmark run on an S3 bucket.
The key used for storage includes the identifier of the benchmark itself, the branch used, the release version, the current date, and the commit hash, in that order. Furthermore, the boto3 package is added to interact with AWS components.
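As a rough illustration of the key layout described above (the function and all values are made up; the actual logic lives in the PR's PathConstructor):

```python
from datetime import datetime

def build_storage_key(benchmark_id: str, branch: str, version: str,
                      date: datetime, commit_hash: str) -> str:
    """Compose the storage key from its parts, in the order described above."""
    return "/".join([benchmark_id, branch, version,
                     date.strftime("%Y-%m-%d"), commit_hash]) + ".json"

key = build_storage_key("example_benchmark", "main", "1.0.0",
                        datetime(2024, 11, 5), "87dce85")
print(key)  # -> example_benchmark/main/1.0.0/2024-11-05/87dce85.json
```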
I look forward to your feedback.