Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate #359

Merged
merged 12 commits into from
Sep 19, 2024

Conversation

mlange05
Copy link
Collaborator

@mlange05 mlange05 commented Aug 9, 2024

Note: Just testing for now...

This PR changes the way we add OpenACC annotations in the SCC pipelines and generally refactors SCCAnnotate and SCCRevector. The key change is that we now insert Loki-specific annotations for vector-loops, driver loops, routine annotations and vector reductions in SCCRevector and then let SCCAnnotate translate this into OpenACC directives. This is in preparation for the addition of an additional re-vectorisation scheme and the eventual support for OpenMP-offload in the SCC pipelines, as it will allow SCCAnnotate to easily switch between directive flavours.

There's also quite a bit of refactoring and clean-up, in particular in the upper control methods of SCCAnnotate, as well as the utility method check_routine_pragmas. Overall I hope this is now a bit cleaner and better structured for future development.

In more detail:

  • wrap_vector_section has been made a standalone routine (will be re-used in alternative re-vector scheme).
  • SCCRevector now annotates vector and sequential loops with respective !$loki loop pragmas, as well as marking kernel routines !%loki routine seq|vector and adding !$loki loop vector-reductions for vector loops in marked vector-reduction regions.
  • SCCRevector also re-uses some of these utilities when re-vectoring kernel loops; the respective routines have been refactored to work on generic node "sections" (tuple of Node).
  • SCCAnnotate converts !$loki loop and !$loki routine directives into !$acc equivalents and adds private clauses to vector loops and !$acc data present clauses to kernel routines.
  • The check_routine_pragmas utility has been renamed check_routine_sequential and no longer inserts pragmas at all. It merely checks if a routine has been marked with !$loki routine seq. The second use case (check for repeated processing) is now handled explicitly in SCCAnnotate when annotating kernel routines (which cannot happen twice!).
  • A general tidy-up of SCCAnnotate to make utility methods object-bound and re-use common ones in the driver processing code path. We still need the block_dim, but the horizontal is no longer needed in SCCAnnotate.
  • test_scc_vector.py now also checks for the insertion of certain !$loki loop pragmas.

Copy link

github-actions bot commented Aug 9, 2024

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/359/index.html

Copy link

codecov bot commented Aug 9, 2024

Codecov Report

Attention: Patch coverage is 98.59155% with 3 lines in your changes missing coverage. Please review.

Project coverage is 95.48%. Comparing base (657fc0f) to head (208a99b).
Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
loki/transformations/single_column/vector.py 96.29% 2 Missing ⚠️
loki/transformations/single_column/annotate.py 98.14% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #359      +/-   ##
==========================================
- Coverage   95.48%   95.48%   -0.01%     
==========================================
  Files         185      185              
  Lines       38621    38651      +30     
==========================================
+ Hits        36879    36906      +27     
- Misses       1742     1745       +3     
Flag Coverage Δ
lint_rules 96.39% <ø> (ø)
loki 95.47% <98.59%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mlange05 mlange05 requested a review from reuterbal August 9, 2024 12:32
@mlange05 mlange05 marked this pull request as ready for review August 9, 2024 12:38
@mlange05 mlange05 force-pushed the naml-scc-annotate-translate branch from 35dabd0 to 8dbb177 Compare August 9, 2024 13:04
@mlange05
Copy link
Collaborator Author

Quick note: This is ready for review, but currently still fails ec-phys regression locally and will need some small fixes.

Copy link
Collaborator

@reuterbal reuterbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a really nice clean-up of that processing step that will make subsequent developments much easier and more flexible, and improve the ways how we can interact with and guide transformation pipelines manually when necessary.

One minor comment, nothing that strictly needs an action now.

Comment on lines +365 to +367
if loop.pragma and any('nodep' in p.content.lower() for p in as_tuple(loop.pragma)):
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's currently no test that covers this functionality.

Comment on lines -544 to -554
# TODO: This utility needs some serious clean-up, so we're just testing
# the bare basics here and promise to do better next time ;)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting this must have felt good 😏

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment 😏

@reuterbal reuterbal added the ready for merge This PR has been approved and is ready to be merged label Sep 5, 2024
@mlange05
Copy link
Collaborator Author

mlange05 commented Sep 6, 2024

Thanks for the review @reuterbal . However, I just confirmed that this still fails H24-dev locally, so will need to have another look. I'll remove the merge-ready label and rebase. Will ping once fixed.

@mlange05 mlange05 removed the ready for merge This PR has been approved and is ready to be merged label Sep 6, 2024
@mlange05 mlange05 force-pushed the naml-scc-annotate-translate branch from 454e1d2 to fc17ad2 Compare September 9, 2024 10:50
Instead of SCAnnotate trying to find vector loops, the routine that
creates them marks them with `!$loki vector`, which SCCAnnotate
then translates to OpenACC pragmas.

This also uses in-place updates in SCCAnnotate now to speed up
processing.
A small bit of refactoring of the `SCCRevector` core routine also
ensures that we now only detect `!$loki loop seq` loops inside
driver-loops.
This also renames and refactors the `check_routine_pragmas` utility,
which now only needs to check for genuine `!$loki routine seq`
annotations.
We also change the static classmethods to regular methods to
provide acces to the `directive` property in the follow-up.
Theres' a subtle bug, where they can be attributed to the body,
and thus need moving explicitly. This was done by the provious
utility, but never checked explciitly - so now we do test it!
Adds a small test and does not print data clauses if no arrays
are passed to the routine.
@mlange05 mlange05 force-pushed the naml-scc-annotate-translate branch from fc17ad2 to 33514eb Compare September 9, 2024 14:28
@mlange05
Copy link
Collaborator Author

mlange05 commented Sep 9, 2024

Small integration note: I've needed to rebase manually, as one of the changes in this (the explicit removal of vector-section objects/labels) was implement in a similar, but separate implementation in a recent PR (#328 ). I've also found the bugs that were causing local H24-dev to fail, and added fixes+tests for those corner cases (top two commits). I think I'm happy with this now (confirmed to work locally with H24 ecphys regression), so please have another look and merge once ready.

Copy link
Collaborator

@reuterbal reuterbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks all good to me!


k = n
do k=1, 3
n = k + 1.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be an integer literal 1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it should. Fixed now

@reuterbal reuterbal added the ready for merge This PR has been approved and is ready to be merged label Sep 19, 2024
@reuterbal reuterbal merged commit 4cf0f02 into main Sep 19, 2024
13 checks passed
@reuterbal reuterbal deleted the naml-scc-annotate-translate branch September 19, 2024 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for merge This PR has been approved and is ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants