-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add config model and corresponding tests #45
Conversation
I think there is still a lot of stuff we could strip out of the config model. I think we should take it down to the absolute simplest it can be. I think its better to add in things as we need them rather than port everything across just because we have it |
I wonder weather we should try to train a model first before bring in stuff from ocf_datapipes. |
Yeah I agree. I think I stripped out all mixins and properties we don't use minus sequence length properties (i think it's safe to strip out), system dropout (also at least not used in here, but correct me if I'm wrong), and this change of filename thing which I'm not sure if it's still in use. Unless you mean removing arguments inside the mixins, which is probably good to do but I wouldn't be sure what to remove (I guess I can just strip everything that tests/test_data/pvnet_test_config.yaml doesn't use?) |
Yes I agree that we want to avoid training a model on a different config to limit the searchspace for bugs, but if it's an exact copy I don't think there is any harm in bringing it over (though I guess there's always a risk of human error, especially since we're stripping stuff out... up for debate). I was thinking we can merge a copy and agree not to merge any changes to it before the model is trained, because depending on how long that takes not having a baseline for config here might block development of other stuff? |
Also, should we make |
I'd leave it for now. At the moment the |
I'm not sure about this, but I also really feel strongly about it. The current UK PVNet on prod doesn't make use of dropout (except for NWP delay which we should distinguish), though some of the models on dev do. So in that sense I think it could be optional. But feel free to change it if you think it makes more sense. However my preference of an "off" setting would be an empty list rather than We should definitely at least align those checks! We could delete the check from the function itself if we think it won't be used independently from the config, which has always been true for us. It is a very computationally cheap check though so I'd choose whichever we think is better code quality. @peterdudfield might have some thoughts on this? There is currently an additional way of turning off dropout which is to set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good now @AUdaltsova. My last comments are only nice-to-have's. You can decide what to do with them. Great work!
p.s. when you merge can you make it a squash merge? I think thats good practice so we don't make the git repo really big by storing unneeded history
I see what you mean and I would usually agree, but I think the whole concept of dropout being "off" is a bit of an illusion: there isn't really such thing as ambiguous delay, you either set it to something or it defaults to instant availability. Which is an "off" in the sense that it's not delayed, but it's still a very specific scenario that you are committing to, so I think it would be nice to think of that and consciously put 0 instead of thinking oh well, just do whatever it is you do, if that makes sense? EDIT: it occurred to me just now that since there is a separate handling of None it could be beneficial to replace it with an empty list to avoid going through some unnecessary motions. 0 will be treated as an actual dropout value so there will be some extra computation, just resulting in no changes (though probably not expensive at any rate).
Personally, I would go for a config check. This is a restriction on an input parameter and should be checked on input (same as the NWP provider check, which turns out has not been working all this time :) Obviously incorrect providers were being caught down the line when consts are suddenly not there, so it is not even strictly necessary, but I think it should stay and flag this right away instead of half-way through selection)
That is a very good point! Especially since dropout fraction defaults to "off", so can accidentally negate dropout timedeltas and not fail anywhere (I think) |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #45 +/- ##
==========================================
+ Coverage 93.05% 93.12% +0.07%
==========================================
Files 22 26 +4
Lines 691 815 +124
==========================================
+ Hits 643 759 +116
- Misses 48 56 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@peterdudfield thought it's good redundancy in case we change the config or use the fnction without it, so I put it back in but made it a leq to be consistent with config check.
Added a check and test for that! |
Pull Request
Description
Copied over the relevant parts of the config model from ocf_datapipes (pv, nwp, gsp, satellite) and corresponding tests + data for them.
How Has This Been Tested?
Ran the test pack
Checklist: