-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a helper to get the object store ids and the datasets size for every dataset in a job #125
Add a helper to get the object store ids and the datasets size for every dataset in a job #125
Conversation
…ery dataset in a job
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks useful in a general sense, so it's a good idea to include it here I think.
Co-authored-by: Nuwan Goonasekera <[email protected]>
Co-authored-by: Nuwan Goonasekera <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. Can we also add a test for this please?
Co-authored-by: Nuwan Goonasekera <[email protected]>
Could you please point me where to add a test? Should I create a new file in A sample test would look like this: A minor update to the
Test:
|
Most helpers have been tested by exercising it via an actual yaml file. For example, the
However, since this is needlessly verbose, perhaps we could just write the unit test you wrote above in a new file named |
Yeah, sounds like a plan. I will create the file and push a new commit shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I noticed while reading this: #123
"Deferred datasets need to be excluded". Should some logic+test be added for that?
Thank you for pointing out that issue. I'm not sure what the deferred data job objects look like. However, if the input datasets are empty, we are already returning an empty dict, so this should be fine, right? It already silently handles these cases (and probably many others we are unaware of). |
I think you're right. A deferred dataset may return an empty object_store_id, but that's fine I guess, shouldn't break anything. If necessary, we can always do something in a follow up. |
This will be used in the ESG WP4 TPV meta-scheduling API for decision-making. If you think this is not useful for a larger audience, please close this PR.