-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] compiler: Add new HaloSpot optimization #2475
Conversation
@@ -1457,8 +1457,20 @@ def __init__(self, body, halo_scheme): | |||
self._halo_scheme = halo_scheme | |||
|
|||
def __repr__(self): | |||
functions = "(%s)" % ",".join(i.name for i in self.functions) | |||
return "<%s%s>" % (self.__class__.__name__, functions) | |||
fstrings = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we also print the slice for each halospot
i.e. v[t0] instead of just v.
What do you think?
self.halos == other.halos and | ||
self.dims == other.dims) | ||
|
||
def __hash__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was autocompleted by copilot.
Are people happy to have class instead of namedtuple?
@@ -95,8 +127,20 @@ def __init__(self, exprs, ispace): | |||
self._honored = frozendict(self._honored) | |||
|
|||
def __repr__(self): | |||
fnames = ",".join(i.name for i in set(self._mapper)) | |||
return "HaloScheme<%s>" % fnames | |||
fstrings = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same w above
@@ -366,6 +410,10 @@ def distributed_aindices(self): | |||
def loc_indices(self): | |||
return set().union(*[i.loc_indices.keys() for i in self.fmapper.values()]) | |||
|
|||
@cached_property | |||
def loc_values(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better way to get the values of loc_indices?
If so happy to drop this
|
||
return iet | ||
|
||
|
||
def _merge_halospots(iet): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved down, the diff is not "represetnative"
|
||
raw_loc_indices = {} | ||
# Since lifted, we need to update the loc_indices with known values | ||
# TODO: Can I get this in a more elegant way? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About getting w.g. (time%2)
def denest_halospots(iet): | ||
""" | ||
Denest nested HaloSpots. | ||
# TOFIX: This also merges HaloSpots that have different loc_indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a problem anymore, but I could not find some solution/custom idea that can trigger that even after the new pass
@@ -1005,6 +1008,93 @@ def test_avoid_haloupdate_if_distr_but_sequential(self, mode): | |||
calls = FindNodes(Call).visit(op) | |||
assert len(calls) == 0 | |||
|
|||
@pytest.mark.parallel(mode=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can further parametrize those, lots of code dupliucation
@@ -1357,7 +1446,7 @@ def test_merge_haloupdate_if_diff_locindices_v1(self, mode): | |||
op = Operator(eqns) | |||
|
|||
calls = FindNodes(Call).visit(op) | |||
assert len(calls) == 1 | |||
assert len(calls) == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improved as discussed in : https://devitocodes.slack.com/archives/C7JMLMSG0/p1729152715068149
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is that we're adding a lot new complexity but I don't have an alternative at the moment
|
||
return iet | ||
|
||
|
||
def _merge_halospots(iet): | ||
def _hoist_and_drop(iet): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks really complex, I hoped we could get away with something easier. It's a pity we have to add this much complexity and so much overlap with either hoist_halospot and merge_halospot
as in, there are hoist and merge already, we're adding something "in between" these two, which so much repetition, to me it looks weird, as if we're still failing to capture the right logic |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2475 +/- ##
==========================================
+ Coverage 87.19% 87.22% +0.03%
==========================================
Files 238 238
Lines 45201 45332 +131
Branches 4012 4029 +17
==========================================
+ Hits 39415 39543 +128
- Misses 5104 5108 +4
+ Partials 682 681 -1 ☔ View full report in Codecov by Sentry. |
Closing as superseded by #2477 |
[Requesting comments]
Fix for issue #2448, supplemented with some code edits