-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New log_metadata
function, new oneof
filtering, additional run_metadata
filtering
#3182
Conversation
…nto feature/better-metadata
…nto feature/better-metadata
…nto feature/better-metadata
…nto feature/better-metadata
…nto feature/better-metadata
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
# and this call needs to happen during a step execution. We will use the | ||
# step context to fetch the step, run and possibly the model version and | ||
# attach the metadata accordingly. | ||
elif all( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I like the current logic: If this is being called from a step and the user provides an artifact name, don't you think we should also link it to the pipeline run, step run and model version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid of the growing data duplication in the metadata model. We checked that currently the value field is an unrestricted text, what if the user logs a fatty metadata value and then it is magically multiplied into 4 occurrences? It is not very scalable, IMO, and some solution to reuse the values shall be in place if go this route. I have verbaly shared my thoughts with @bcdurak on that already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes 100% agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am investigating this atm. Feel free to review everything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be solved in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments are addressed or rejected, so good to go from my end, as soon as tests are all green (I shipped fixes for all of them, besides mac 3.9 slow - ignore that one)
Describe changes
This PR features three main changes:
The new
log_metadata
functionNow, all the metadata-creating functions are gathered under one method called
log_metadata
. It is possible to call this method with different inputs to log run metadata for artifact versions, model versions, steps, and runs.The
oneof
filteringThis allows us to filter entities using a new operator called
oneof
. You can possibly use this with IDs, tags, and all the other attributes likePipelineRunFilter(tag='oneof:["cats", "dogs"]')
.The
run_metadata
filteringThe custom filtering logic for runs, steps, artifact-, and model versions has been improved. Now, you can filter these entities using filters like
PipelineRunFilter(run_metadata={"accuracy": "gt:0.85"})
.Further changes
The remaining TODOs
Pre-requisites
Please ensure you have done the following:
develop
and the open PR is targetingdevelop
. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes