Improve performance to create constraints over clustered year #1017
Conversation
Benchmark Results: benchmark in progress...
Force-pushed from dd763a1 to e14d510
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1017      +/-   ##
==========================================
+ Coverage   95.22%   95.27%   +0.04%
==========================================
  Files          29       29
  Lines        1152     1184      +32
==========================================
+ Hits         1097     1128      +31
- Misses         55       56       +1
```

View full report in Codecov by Sentry.
Hi @abelsiqueira, thanks for the PR. I tested it and we get the same .lp file for 100rp comparing this PR and PR #1015. The SQL queries can be followed, so that is good, since they follow the same type of structure as the others in the code. I also like that the inflows aggregation is not an expression; it is better as coefficients (maybe the same is useful for other aggregations, like the availability and demand profile aggregations).
I ran the benchmark locally with 100rp and the times to create the model are:
- PR #1015 (Improve performance v0.11.0): 1167 s
- This PR: 28.6 s
My main comment is that in these changes the `JuMP.add_to_expression!` is lost in the code, and I think we can still gain more by changing back to it (or am I not seeing/understanding something here?).
Finally, since the inflow profiles are calculated in the coefficients, it is better to clean the function and delete the commented parts that are not used anymore (even some TODOs might be obsolete).
Thanks again, great work here!
Diego
src/model-preparation.jl
Outdated
```julia
view = true,
grouped_var_table_name = "t_grouped_$(flow.table_name)_match_on_$(from_or_to)"
if !_check_if_table_exists(connection, grouped_var_table_name)
    DuckDB.query(
```
This was not executed, so Codecov is complaining. Indeed, this table should already be computed in the rp version - it's the same aggregation. This is only here in case the rp version is not called at all. Is that possible? If yes, then we should create an issue to add a test case where that happens. If not, then we can delete this and keep only the check and an error message.
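A minimal sketch of what the check-plus-error variant could look like (a suggestion only; `_check_if_table_exists` and the table name come from the diff above, and the error message text is made up):

```julia
grouped_var_table_name = "t_grouped_$(flow.table_name)_match_on_$(from_or_to)"
if !_check_if_table_exists(connection, grouped_var_table_name)
    # The rep_period aggregation should always have created this table already;
    # reaching this branch would indicate a bug in the calling order.
    error(
        "Table `$grouped_var_table_name` is missing; it should have been " *
        "created by the rep_period aggregation before this point.",
    )
end
```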
Thanks! The answer to the question is, no, it is not possible. The RP version will always call for the timeframe constraints. So, the check and an error message should be the way to go here.
@abelsiqueira Hi, I changed the code here to add the error check, but it looks a bit ugly, so feel free to suggest some changes. Thanks!
Hi Diego, I've replied to your comment. On doing the same aggregation for everyone (or the opposite, not doing it for any of them), it's a good question that should be investigated. Can you create an issue for that? I removed the other comments that I could see from this PR page and reduced the DuckDB query to remove the LEFT JOIN on the profiles. Hopefully the code still works. This should address all comments, but if it doesn't, feel free to pick it up while I'm back on the multi-year refactor. But if you do need me, feel free to ping me.
@abelsiqueira Thank you for the reply and explanation. My question also arose when I saw this message in the Julia console using this code (for hourly 36rp):

> Warning: The addition operator has been used on JuMP expressions a large number of times. This warning is safe to ignore but may indicate that model generation is slower than necessary. For performance reasons, you should not add expressions in a loop. Instead of x += y, use add_to_expression!(x, y) to modify x in place. If y is a single variable, you may also use add_to_expression!(x, coef, y) for x += coef*y.

So, if it is not this part, then I wonder where else we have the addition. To reproduce the warning, use the files in the benchmark, and before creating the model, use the following commands to change all the partitions to 1 hour:

```julia
DuckDB.query(connection, "UPDATE assets_rep_periods_partitions SET partition = 1")
DuckDB.query(connection, "UPDATE flows_rep_periods_partitions SET partition = 1")
```

Thanks!
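For reference, the pattern the JuMP warning refers to is accumulating expressions with `+=` in a loop; a minimal illustration of the two styles (hypothetical variable names and coefficients, not code from this PR):

```julia
using JuMP

# Slow: `+=` builds a brand-new expression object on every iteration.
function slow_sum(x)
    expr = AffExpr(0.0)
    for xi in x
        expr += 2.0 * xi
    end
    return expr
end

# Fast: `add_to_expression!` mutates `expr` in place, avoiding the copies
# (and the console warning).
function fast_sum(x)
    expr = AffExpr(0.0)
    for xi in x
        JuMP.add_to_expression!(expr, 2.0, xi)
    end
    return expr
end

model = Model()
@variable(model, x[1:1000])
fast_sum(x)  # same resulting expression as slow_sum(x), built in place
```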
No ideas, but indeed there is something somewhere. I'm probably not touching this until Monday, though.
Great! Thanks. Since it seems it is happening in another part of the code, we can address it in another issue and merge this one first.
@abelsiqueira, the PR is approved; feel free to merge it once you have looked at the error message that I suggested adding in the code.
@abelsiqueira, I forgot to mention: I am running the benchmarks of the different versions locally to get a full picture of the improvements. That is why I am not benchmarking this PR anymore and am returning the files to the 100rp benchmark.
Hi @datejada, I've updated your error message a bit. I have not found any reason for the expression warning yet. I looked into the ...
I will merge this now. Thanks for the help.
This is built on top of #1015.
Closes #1014
Notes:
- `graph` and other similar things were removed in favour of DuckDB.
- `rep_periods_mapping` table (and friends) to join the `period` from constraints with the `rep_period` from variables, taking into account the weight.
- `(asset, year)` and internally for `rep_period`.
- `:inflows_profile_aggregation` is also computed here. Although, in the end the solution I found was to create a completely separate query to aggregate the profile - and probably it doesn't make sense to compute this in this function (unless we need to make it faster and figure out a way to join them), because the inflows don't require any flow information.
- `@time`, but I would prefer that you test in your normal workflow.
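As a rough illustration of the mapping join described in the notes above (a sketch only: `rep_periods_mapping` with `period`, `rep_period`, and `weight` columns is assumed from the description, while the other table and column names are hypothetical):

```julia
DuckDB.query(
    connection,
    """
    SELECT cons.asset, cons.period, var.id AS var_id, map.weight
    FROM t_constraints AS cons
    JOIN rep_periods_mapping AS map ON map.period = cons.period
    JOIN t_variables AS var ON var.rep_period = map.rep_period
    """,
)
```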