-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate #359
Conversation
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/359/index.html |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #359 +/- ##
==========================================
- Coverage 95.48% 95.48% -0.01%
==========================================
Files 185 185
Lines 38621 38651 +30
==========================================
+ Hits 36879 36906 +27
- Misses 1742 1745 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
35dabd0
to
8dbb177
Compare
Quick note: This is ready for review, but currently still fails ec-phys regression locally and will need some small fixes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is a really nice clean-up of that processing step that will make subsequent developments much easier and more flexible, and improve the ways how we can interact with and guide transformation pipelines manually when necessary.
One minor comment, nothing that strictly needs an action now.
if loop.pragma and any('nodep' in p.content.lower() for p in as_tuple(loop.pragma)): | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's currently no test that covers this functionality.
# TODO: This utility needs some serious clean-up, so we're just testing | ||
# the bare basics here and promise to do better next time ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting this must have felt good 😏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No comment 😏
Thanks for the review @reuterbal . However, I just confirmed that this still fails H24-dev locally, so will need to have another look. I'll remove the merge-ready label and rebase. Will ping once fixed. |
454e1d2
to
fc17ad2
Compare
Instead of SCAnnotate trying to find vector loops, the routine that creates them marks them with `!$loki vector`, which SCCAnnotate then translates to OpenACC pragmas. This also uses in-place updates in SCCAnnotate now to speed up processing.
A small bit of refactoring of the `SCCRevector` core routine also ensures that we now only detect `!$loki loop seq` loops inside driver-loops.
This also renames and refactors the `check_routine_pragmas` utility, which now only needs to check for genuine `!$loki routine seq` annotations.
We also change the static classmethods to regular methods to provide acces to the `directive` property in the follow-up.
Theres' a subtle bug, where they can be attributed to the body, and thus need moving explicitly. This was done by the provious utility, but never checked explciitly - so now we do test it!
Adds a small test and does not print data clauses if no arrays are passed to the routine.
fc17ad2
to
33514eb
Compare
Small integration note: I've needed to rebase manually, as one of the changes in this (the explicit removal of vector-section objects/labels) was implement in a similar, but separate implementation in a recent PR (#328 ). I've also found the bugs that were causing local H24-dev to fail, and added fixes+tests for those corner cases (top two commits). I think I'm happy with this now (confirmed to work locally with H24 ecphys regression), so please have another look and merge once ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks all good to me!
|
||
k = n | ||
do k=1, 3 | ||
n = k + 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be an integer literal 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it should. Fixed now
Note: Just testing for now...This PR changes the way we add OpenACC annotations in the SCC pipelines and generally refactors
SCCAnnotate
andSCCRevector
. The key change is that we now insert Loki-specific annotations for vector-loops, driver loops, routine annotations and vector reductions inSCCRevector
and then letSCCAnnotate
translate this into OpenACC directives. This is in preparation for the addition of an additional re-vectorisation scheme and the eventual support for OpenMP-offload in the SCC pipelines, as it will allowSCCAnnotate
to easily switch between directive flavours.There's also quite a bit of refactoring and clean-up, in particular in the upper control methods of
SCCAnnotate
, as well as the utility methodcheck_routine_pragmas
. Overall I hope this is now a bit cleaner and better structured for future development.In more detail:
wrap_vector_section
has been made a standalone routine (will be re-used in alternative re-vector scheme).SCCRevector
now annotates vector and sequential loops with respective!$loki loop
pragmas, as well as marking kernel routines!%loki routine seq|vector
and adding!$loki loop vector-reductions
for vector loops in marked vector-reduction regions.SCCRevector
also re-uses some of these utilities when re-vectoring kernel loops; the respective routines have been refactored to work on generic node "sections" (tuple ofNode
).SCCAnnotate
converts!$loki loop
and!$loki routine
directives into!$acc
equivalents and addsprivate
clauses to vector loops and!$acc data present
clauses to kernel routines.check_routine_pragmas
utility has been renamedcheck_routine_sequential
and no longer inserts pragmas at all. It merely checks if a routine has been marked with!$loki routine seq
. The second use case (check for repeated processing) is now handled explicitly inSCCAnnotate
when annotating kernel routines (which cannot happen twice!).SCCAnnotate
to make utility methods object-bound and re-use common ones in the driver processing code path. We still need theblock_dim
, but thehorizontal
is no longer needed inSCCAnnotate
.test_scc_vector.py
now also checks for the insertion of certain!$loki loop
pragmas.