(Thread-)Parallelize bounds check routine for subcell IDP limiting #1736
Review checklist
This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.
Purpose and scope
Code quality
Documentation
Testing
Performance
Verification
Created with ❤️ by the Trixi.jl community.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #1736 +/- ##
==========================================
- Coverage 96.34% 96.34% -0.00%
==========================================
Files 451 451
Lines 35979 35996 +17
==========================================
+ Hits 34662 34677 +15
- Misses 1317 1319 +2
EDIT: Outdated structure
The current thread implementation is not the cleanest.
First, the "Dictionary-Layer" has to be first, since the call of
EDIT: Outdated structure
I don't like this data structure, but I also don't have any better ideas.
In the last commit (78957b7), I revised the memory structure for the IDP bounds check.
One minor comment left, then go go go!
LGTM, thanks!
For larger simulations, the bounds check routine for IDP limiting takes more and more time because it was not parallelized. With this PR, the loop over all elements is thread-parallelized.
For that, the deviation memory had to be changed:
I divided it into a global component, idp_bounds_delta_global, and a local component, idp_bounds_delta_local. Both are dictionaries for the respective variable bounds.
- idp_bounds_delta_local (containing the maximum deviations in the current timestep interval) is now a vector and therefore parallel-safe. To avoid false sharing, we extend the vector and use a stride size.
- idp_bounds_delta_global (containing the global maximum deviations) doesn't need to be thread-safe, since it only uses the already parallel-computed result of the local maximum deviation.
Additionally, this PR parallelizes resetting the subcell limiting coefficients alpha.
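The memory layout described above can be illustrated with a minimal sketch. It is not the code from this PR: the function name check_bounds!, the key :rho_min, the sample data, and the stride of 8 (8 Float64 values, i.e. an assumed 64-byte cache line) are all hypothetical. The sketch also indexes slots by a chunk number instead of a thread id, which is a simplification chosen so that the example stays correct under any thread scheduling.

```julia
using Base.Threads: @threads, nthreads

# Assumed padding: 8 Float64 values = 64 bytes, a typical cache line size.
const STRIDE = 8

# Hypothetical bounds check for a lower bound `u_min`.
# `delta_local[key]` holds one padded slot per chunk, so no two chunks
# write to the same cache line (avoids false sharing).
function check_bounds!(delta_local, delta_global, key, u, u_min)
    slots = delta_local[key]
    nchunks = length(slots) ÷ STRIDE
    n = length(u)
    @threads for chunk in 1:nchunks
        slot = STRIDE * (chunk - 1) + 1
        # Contiguous range of elements handled by this chunk
        first_element = div((chunk - 1) * n, nchunks) + 1
        last_element = div(chunk * n, nchunks)
        for element in first_element:last_element
            deviation = u_min[element] - u[element]
            slots[slot] = max(slots[slot], deviation)
        end
    end
    # Serial reduction over the used slots; the global dictionary is only
    # touched here, so it does not need to be thread-safe.
    local_max = maximum(slots[1:STRIDE:end])
    delta_global[key] = max(delta_global[key], local_max)
    return local_max
end

# Usage with made-up data: one value falls below the bound 0.5.
delta_local = Dict(:rho_min => zeros(STRIDE * nthreads()))
delta_global = Dict(:rho_min => 0.0)
u = [1.0, 0.4, 0.9, 0.7]
u_min = fill(0.5, 4)
check_bounds!(delta_local, delta_global, :rho_min, u, u_min)
```

Only every STRIDE-th entry of the local vector is ever written; the remaining entries are padding. That costs a few extra bytes per bound but keeps the parallel loop free of locks and atomics.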