-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add industry module #340
Add industry module #340
Conversation
Co-authored-by: Meijun Chen (tud-mchen6)
Co-authored-by: Meijun Chen (github/tud-mchen6)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments. Will need @timtroendle to comment on the utils files. We have eurocalliopelib
at the moment to store utility functions. Should these go there? Or does the modularisation require a different approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should have a schema for this config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've created issue #347 to tackle this.
A schema is a great idea, but the module's configuration might change too much while we port features. We'd prefer to do it once the chemical and "other" sectors are ported, to ensure the configuration file is mature enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This config must go into the metadata that we create for model builds. See the build_metadata
rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This config must go into the metadata that we create for model builds. See the
build_metadata
rule.
The general config
already includes our module's configuration. So it is already in the metadata :)
# Include modules
configfile: "modules/industry/config.yaml"
Co-authored by: Meijun Chen (tud-mchen6)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an excited pull request, thanks!
In my review, I mainly focussed on the structure assuming @brynpickering has checked the data processing logic. There are a few consistency related issues that need addressing.
In addition, this change is not adding a feature and it's not show-casing the entire integration and no tests. Would it be possible to refactor the change so it's a feature? What I mean is: can we build this into the model? I understand you derive annual steal demand, so from there you could quite easily create flat steel demand profiles for all locations and put those into the model. Only then would we see the entire integration pipeline: model feature, documentation, and tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This config must go into the metadata that we create for model builds. See the build_metadata
rule.
|
||
import eurocalliopelib.utils as ec_utils | ||
import pandas as pd | ||
from utils import formatting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importing your own library functions in Snakemake requires careful consideration. Mainly, you want to make sure that changes in the library code trigger a re-execution of the rule. Currently, they don`t.
Would it make sense to move all re-usable code to the euro-calliope-lib? That will circumvent the problem. Currently the lib lives in the main repo, but we are considering to pull that into it's own repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a strong argument to use wrappers instead...
Moving these utils to another location (outside the module) sort of defeats the purpose of modularisation in the first place...
@irm-codebase and @tud-mchen6 maybe we can discuss the design briefly after the group meeting today? @brynpickering may be useful to include in the discussion. |
That would be good! No problem on my side 👍 |
For now only the steel sector configuration is validated.
…g' into add-industry-module
Co-authored with Ivan Ruiz Manuel (irm-codebase)
Relatively big update to this PR to fix #346. This was necessary because #354 made our pandas code go at a snail's pace. Moving to |
@irm-codebase if you can rebase this onto current |
@brynpickering updated to the newest JRC version (checksum passed, no difference 👍 ) I had to modify a couple of lines at the JRC level because the production dataset did not assign names to the variable. I also passed the attributes down to the variable level. Hope that's OK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, we're close to having this ready to merge in!
@brynpickering added most of your suggestions, with the exception of the gap filling test. Let me know if you find anything else during review. Otherwise, I'll merge this in and update the "chemical" and "other" cases. |
One thing that is possibly missing is documentation in the main euro-calliope docs about these new modules. But for now I think we can ignore that as we may well move this to a different repo before the next release... |
I would add documentation now. It's now in this repo and we may or may not move it somewhere else in the future. Like the model, the documentation is quite modular anyway, isn't it? So we could easily move the industry bit out of the documentation, too, should we decide to pull out this module into its own repo. We may as well decide to keep the documentation in the main repo and simply mention the optional module (one aspect we haven't discussed at all so far). |
@timtroendle I'd say that adding full documentation now is only additional overhead: this is only part of the module (we are still missing other industry and chemical industry), plus disaggregation, testing and scaling steps. Also, we have already added pre-eliminary documentation explaining usage (see the README.md). Besides, I agree with Bryn: moving module subworkflows into a separate repo makes a lot of sense. Still, I created issue #381 to make sure we stay on point with this. For now, let's finish this PR once and for all, please 🙏🙏🙏? |
Yeah fair point. Sorry, didn't want to be annoying, it's just easy to forget these things. |
…y-module Add industry module
…y-module Add industry module
…y-module Add industry module
Fixes #308, #310, #347, #345 and #346.
This pull request includes the "skeleton" of the industry module, and includes the steel industry sub-workflow as an example.
Checklist
Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.