compiler: Improve IndexDerivatives lowering #2183

Closed
49 commits
83f892b
compiler: Emulate e.find(type) with (faster) search(e, type)
FabioLuporini Apr 28, 2023
a8e1743
compiler: Add IterationSpace.prefix
FabioLuporini May 3, 2023
a7cba2d
compiler: Use internal repr for IndexSums
FabioLuporini May 3, 2023
354cf5c
compiler: Add Bunch.__repr__
FabioLuporini May 3, 2023
379c571
compiler: Add ClusterGroup.rebuild
FabioLuporini May 4, 2023
f00228a
compiler: Add ClusterGroup.properties
FabioLuporini May 5, 2023
1189dcb
compiler: Add minmax_index()
FabioLuporini May 12, 2023
34daf46
compiler: Add DAG.all_predecessors
FabioLuporini Jun 9, 2023
d3c1d22
tools: Add DAG.find_paths
FabioLuporini Jun 14, 2023
7a34949
compiler: DROP ME AFTER REBASE WITH FUTURE PR: Fix Indexed._subs
FabioLuporini Jun 14, 2023
09b57eb
compiler: Generalize AffineIndexAccessFunction
FabioLuporini Jun 15, 2023
7ee89ef
compiler: Fix DefFunction printing
FabioLuporini Jun 20, 2023
6a9e320
compiler: aliases.Candidate -> ir.ExprGeometry
FabioLuporini Jun 22, 2023
30243f2
compiler: Generalize and enhance ExprGeometry
FabioLuporini Jun 26, 2023
dceaf1b
compiler: Tweak minmax_index
FabioLuporini Jun 27, 2023
bccca77
compiler: Add and_smart for guards auto-simplification
FabioLuporini Jul 4, 2023
87d0d2c
compiler: Improve Cluster.is_dense
FabioLuporini Jul 18, 2023
9d60dc8
compiler: Add IndexDerivative.base
FabioLuporini Jul 18, 2023
dc9bac4
compiler: Patch AffineIndexAccessFunction
FabioLuporini Jul 19, 2023
c9d3f96
compiler: Support 2-pass impls w/ unexpansion
FabioLuporini Jul 18, 2023
aa79d6d
compiler: Improve profiling of multipass implementations
FabioLuporini Jul 20, 2023
7d70ff9
compiler: Patch sync_sections
FabioLuporini Jul 21, 2023
7b039c6
compiler: Enhance CireIndexDerivatives
FabioLuporini Jul 21, 2023
998c9b3
compiler: Patch has_data_reuse to account for StencilDimension
FabioLuporini Jul 24, 2023
d246cb2
pep8 happiness
FabioLuporini Jul 24, 2023
e7ff141
compiler: Patch infer_dtype to support vector types
FabioLuporini Jul 27, 2023
8ecb7cc
compiler: Fix DDA involving ComponentAccesses
FabioLuporini Jul 28, 2023
a046fa3
api: Support pattern-matching par-tile
FabioLuporini Jul 31, 2023
756e0a1
compiler: Patch minimize_symbols for parlang backends
FabioLuporini Jul 31, 2023
3e7270b
compiler: Tweak pow_to_mul & factorize
FabioLuporini Aug 1, 2023
4b7c213
compiler: Expand along SteppingDimensions
FabioLuporini Aug 2, 2023
11ff50f
compiler: Update behavior of ClusterGroup.syncs
FabioLuporini Aug 2, 2023
44b752e
compiler: Add and exploit properties.is_parallel_atomic
FabioLuporini Aug 2, 2023
cc4d9ad
compiler: Enhance DDA across IndexDerivatives
FabioLuporini Aug 2, 2023
83d0924
compiler: Tidy up utilities
FabioLuporini Aug 4, 2023
0d18e85
compiler: Make IndexDerivatives homogeneous irrespective of matvec
FabioLuporini Aug 22, 2023
c9f006a
compiler: Fix 2-pass implementations with expand=False
FabioLuporini Aug 23, 2023
f1466d1
compiler: Tweak aliases selection
FabioLuporini Aug 25, 2023
6ed33ee
compiler: Patch CireIndexDerivatives
FabioLuporini Aug 25, 2023
e10d99f
compiler: Add DAG.roots
FabioLuporini Aug 29, 2023
ee501d5
compiler: Rename CIRE search/compose funcs
FabioLuporini Aug 29, 2023
1827ad4
compiler: Tweak cire-schedule behavior
FabioLuporini Aug 30, 2023
729bc3f
compiler: Improve DAG
FabioLuporini Aug 30, 2023
5242a1d
compiler: Make collect_derivatives stable
FabioLuporini Aug 30, 2023
3f17791
compiler: Tweak group aliases detection
FabioLuporini Aug 31, 2023
da40050
compiler: Make Bunch iterable
FabioLuporini Sep 4, 2023
6420df3
compiler: Drop .find where possible
FabioLuporini Apr 20, 2023
e4f186b
examples: Update expected output
FabioLuporini Sep 6, 2023
9041b7c
compiler: Fix deterministic codegen
FabioLuporini Sep 6, 2023
15 changes: 8 additions & 7 deletions devito/core/operator.py
@@ -329,9 +329,9 @@ class OptOption(object):

class ParTileArg(tuple):

def __new__(cls, items, shm=0, tag=None):
def __new__(cls, items, rule=None, tag=None):
obj = super().__new__(cls, items)
obj.shm = shm
obj.rule = rule
obj.tag = tag
return obj
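
Read as the post-change side of the hunk above, `ParTileArg` remains a plain tuple of tile sizes, with the integer `shm` attribute replaced by a `rule` attribute next to the existing `tag`. The following is a minimal standalone sketch of that shape, not the Devito module itself; the values `'rule'` and `'tag'` are the placeholders used in the diff comments, and what a rule actually matches against is not shown in this excerpt.

```python
class ParTileArg(tuple):
    """A tuple of tile sizes carrying an optional rule and tag."""

    def __new__(cls, items, rule=None, tag=None):
        # tuple is immutable, so the payload must be set in __new__
        obj = super().__new__(cls, items)
        obj.rule = rule
        obj.tag = tag
        return obj


# Tile sizes only: both attributes fall back to None
plain = ParTileArg((32, 4, 4))

# Tile sizes plus a (placeholder) rule and tag
arg = ParTileArg((32, 4, 8), rule='rule', tag='tag')
```

Because the object is still a tuple, code that only iterates over the tile sizes does not need to know about `rule` or `tag`.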

@@ -371,14 +371,15 @@ def __new__(cls, items, default=None):

try:
y = items[1]
if is_integer(y):
# E.g., ((32, 4, 8), 1)
# E.g., ((32, 4, 8), 1, 'tag')
if is_integer(y) or isinstance(y, str) or y is None:
# E.g., ((32, 4, 8), 'rule')
# E.g., ((32, 4, 8), 'rule', 'tag')
items = (ParTileArg(*items),)
else:
try:
# E.g., (((32, 4, 8), 1), ((32, 4, 4), 2))
# E.g., (((32, 4, 8), 1, 'tag0'), ((32, 4, 4), 2, 'tag1'))
# E.g., (((32, 4, 8), 'rule'), ((32, 4, 4), 'rule'))
# E.g., (((32, 4, 8), 'rule0', 'tag0'),
# ((32, 4, 4), 'rule1', 'tag1'))
items = tuple(ParTileArg(*i) for i in items)
except TypeError:
# E.g., ((32, 4, 8), (32, 4, 4))
13 changes: 12 additions & 1 deletion devito/finite_differences/differentiable.py
@@ -346,6 +346,10 @@ def __new__(cls, *args, **kwargs):
return obj

def subs(self, *args, **kwargs):
if len(args) == 2:
old, new = args
if self == old:
return new
return self.func(*[getattr(a, 'subs', lambda x: a)(*args, **kwargs)
for a in self.args], evaluate=False)

@@ -556,6 +560,9 @@ def __repr__(self):

__str__ = __repr__

def _sympystr(self, printer):
return str(self)

def _hashable_content(self):
return super()._hashable_content() + (self.dimensions,)

@@ -621,7 +628,7 @@ def __eq__(self, other):
__hash__ = sympy.Basic.__hash__

def _hashable_content(self):
return (self.name, self.dimension, hash(tuple(self.weights)))
return (self.name, self.dimension, str(self.weights))

@property
def dimension(self):
@@ -665,6 +672,10 @@ def __new__(cls, expr, mapper, **kwargs):
def _hashable_content(self):
return super()._hashable_content() + (self.mapper,)

@cached_property
def base(self):
return self.expr.func(*[a for a in self.expr.args if a is not self.weights])

@property
def weights(self):
return self._weights
19 changes: 15 additions & 4 deletions devito/finite_differences/finite_difference.py
@@ -207,9 +207,11 @@ def generic_derivative(expr, dim, fd_order, deriv_order, matvec=direct, x0=None,
matvec, x0, symbolic, expand)


def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic, expand):
def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic,
expand):
# The stencil indices
indices, x0 = generate_indices(expr, dim, fd_order, side=side, matvec=matvec, x0=x0)
indices, x0 = generate_indices(expr, dim, fd_order, side=side, matvec=matvec,
x0=x0)

# Finite difference weights from Taylor approximation given these positions
if symbolic:
@@ -221,15 +223,24 @@ def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic
weights = [sympify(w).evalf(_PRECISION) for w in weights]

# Transpose the FD, if necessary
if matvec:
indices = indices.scale(matvec.val)
indices = indices.scale(matvec.val)

# Shift index due to staggering, if any
indices = indices.shift(-(expr.indices_ref[dim] - dim))

# The user may wish to restrict expansion to selected derivatives
if callable(expand):
expand = expand(dim)

if not expand and indices.expr is not None:
weights = Weights(name='w', dimensions=indices.free_dim, initvalue=weights)

if matvec == transpose:
# For homogenity, always generate e.g. `x + i0` rather than `x - i0`
[Review comment, Contributor] Homogeneity (spelling of "homogenity" in the comment above)

# for transpose and `x + i0` for direct
indices = indices.transpose()
weights = weights._subs(indices.free_dim, -indices.free_dim)
[Review comment, Contributor] weights.transpose() ?
[Reply, Contributor Author] It was that way, just like the indices above, but I had to change it because... I don't remember exactly.


# Inject the StencilDimension
# E.g. `x + i*h_x` into `f(x)` s.t. `f(x + i*h_x)`
expr = expr._subs(dim, indices.expr)
18 changes: 18 additions & 0 deletions devito/finite_differences/tools.py
@@ -197,6 +197,24 @@ def scale(self, v):

return IndexSet(self.dim, indices, expr=expr, fd=self.free_dim)

def transpose(self):
"""
Transpose the IndexSet.
"""
indices = tuple(reversed(self))

free_dim = StencilDimension(self.free_dim.name,
-self.free_dim._max,
-self.free_dim._min,
backward=True)

try:
expr = self.expr._subs(self.free_dim, -free_dim)
except AttributeError:
expr = None

return IndexSet(self.dim, indices, expr=expr, fd=free_dim)

def shift(self, v):
"""
Construct a new IndexSet with all indices shifted by `v`.
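
The `transpose` method added in the hunk above reverses the indices and builds a new free dimension whose bounds are `-max` and `-min`, substituting the negated dimension into the expression. A small stdlib-only sketch of why that preserves the set of visited points: if `i` ranges over `[lo, hi]`, then `j = -i` ranges over `[-hi, -lo]`, and `x - j` visits the same points as `x + i` in reverse order. The helper names below are hypothetical and plain integers stand in for Devito's `StencilDimension` and `IndexSet`.

```python
def transpose_bounds(lo, hi):
    """Bounds of j = -i when i ranges over [lo, hi]."""
    return -hi, -lo


def points_plus(x, lo, hi):
    # Points visited by `x + i` for i in [lo, hi]
    return [x + i for i in range(lo, hi + 1)]


def points_minus(x, lo, hi):
    # Points visited by `x - j` for j in [lo, hi]
    return [x - j for j in range(lo, hi + 1)]


lo, hi = -1, 2
tlo, thi = transpose_bounds(lo, hi)

direct = points_plus(10, lo, hi)
transposed = points_minus(10, tlo, thi)
```

Here `transposed` holds the same points as `direct`, in reversed order, which mirrors the `reversed(self)` call in the hunk.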
26 changes: 17 additions & 9 deletions devito/ir/clusters/algorithms.py
@@ -6,8 +6,8 @@
import sympy

from devito.exceptions import InvalidOperator
from devito.ir.support import (Any, Backward, Forward, IterationSpace,
PARALLEL_IF_ATOMIC, pull_dims)
from devito.ir.support import (Any, Backward, Forward, IterationSpace, erange,
pull_dims)
from devito.ir.clusters.analysis import analyze
from devito.ir.clusters.cluster import Cluster, ClusterGroup
from devito.ir.clusters.visitors import Queue, QueueStateful, cluster_pass
@@ -121,10 +121,12 @@ def callback(self, clusters, prefix, backlog=None, known_break=None):
require_break = scope.d_flow.cause & maybe_break
if require_break:
backlog = [clusters[-1]] + backlog
# Try with increasingly smaller ClusterGroups until the ambiguity is gone
# Try with increasingly smaller ClusterGroups until the
# ambiguity is gone
return self.callback(clusters[:-1], prefix, backlog, require_break)

# Schedule Clusters over different IterationSpaces if this increases parallelism
# Schedule Clusters over different IterationSpaces if this increases
# parallelism
for i in range(1, len(clusters)):
if self._break_for_parallelism(scope, candidates, i):
return self.callback(clusters[:i], prefix, clusters[i:] + backlog,
@@ -146,8 +148,8 @@ def callback(self, clusters, prefix, backlog=None, known_break=None):
if not backlog:
return processed

# Handle the backlog -- the Clusters characterized by flow- and anti-dependences
# along one or more Dimensions
# Handle the backlog -- the Clusters characterized by flow- and
# anti-dependences along one or more Dimensions
idir = {d: Any for d in known_break}
stamp = Stamp()
for i, c in enumerate(list(backlog)):
@@ -278,7 +280,11 @@ def callback(self, clusters, prefix):
size = i.function.shape_allocated[d]
assert is_integer(size)

mapper[size][si].add(iaf)
# Resolve StencilDimensions in case of unexpanded expressions
# E.g. `i0 + t` -> `(t - 1, t, t + 1)`
iafs = erange(iaf)

mapper[size][si].update(iafs)

# Construct the ModuloDimensions
mds = []
@@ -288,7 +294,8 @@ def callback(self, clusters, prefix):
# SymPy's index ordering (t, t-1, t+1) after modulo replacement so
# that associativity errors are consistent. This corresponds to
# sorting offsets {-1, 0, 1} as {0, -1, 1} assigning -inf to 0
siafs = sorted(iafs, key=lambda i: -np.inf if i - si == 0 else (i - si))
key = lambda i: -np.inf if i - si == 0 else (i - si)
siafs = sorted(iafs, key=key)

for iaf in siafs:
name = '%s%d' % (si.name, len(mds))
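
The two hunks above first resolve an unexpanded access such as `i0 + t` into its concrete offsets and then sort them so that the zero offset comes first. A stdlib-only sketch of both steps, using integer offsets relative to the stepping index; `expand_offsets` is a hypothetical stand-in for what `erange` does in the PR, whose actual signature and return type are not shown in this excerpt.

```python
import math


def expand_offsets(lo, hi):
    """Offsets spanned by a stencil index ranging over [lo, hi]."""
    return tuple(range(lo, hi + 1))


def sort_offsets(offsets):
    # Offset 0 is mapped to -inf so it sorts first; every other offset
    # keeps its natural ascending order
    return sorted(offsets, key=lambda o: -math.inf if o == 0 else o)


# `i0 + t` with i0 in [-1, 1] resolves to offsets (-1, 0, 1),
# i.e. (t - 1, t, t + 1)
offsets = expand_offsets(-1, 1)

# Sorted as (0, -1, 1), i.e. (t, t - 1, t + 1)
ordered = sort_offsets(offsets)
```

Pulling the key out into a named function, as the diff does with `key = lambda ...`, changes only the line length, not the ordering.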
@@ -451,7 +458,8 @@ def normalize_reductions(cluster, sregistry, options):
"""
opt_mapify_reduce = options['mapify-reduce']

dims = [d for d, v in cluster.properties.items() if PARALLEL_IF_ATOMIC in v]
dims = [d for d in cluster.ispace.itdims
if cluster.properties.is_parallel_atomic(d)]

if not dims:
return cluster
62 changes: 40 additions & 22 deletions devito/ir/clusters/cluster.py
@@ -8,7 +8,7 @@
Forward, Interval, IntervalGroup, IterationSpace,
DataSpace, Guards, Properties, Scope, detect_accesses,
detect_io, normalize_properties, normalize_syncs,
sdims_min, sdims_max)
minimum, maximum)
from devito.mpi.halo_scheme import HaloScheme, HaloTouch
from devito.symbolics import estimate_cost
from devito.tools import as_tuple, flatten, frozendict, infer_dtype
@@ -52,13 +52,7 @@ def __init__(self, exprs, ispace=None, guards=None, properties=None, syncs=None,

# Normalize properties
properties = Properties(properties or {})
for d in ispace.itdimensions:
properties = properties.add(d)
for i in properties:
for d in as_tuple(i):
if d not in ispace.itdimensions:
properties = properties.drop(d)
self._properties = properties
self._properties = tailor_properties(properties, ispace)

self._halo_scheme = halo_scheme

@@ -85,10 +79,7 @@ def from_clusters(cls, *clusters):

guards = root.guards

properties = {}
for c in clusters:
for d, v in c.properties.items():
properties[d] = normalize_properties(properties.get(d, v), v)
properties = reduce_properties(clusters)

try:
syncs = normalize_syncs(*[c.syncs for c in clusters])
@@ -213,12 +204,10 @@ def is_dense(self):
# at most PARALLEL_IF_PVT). This is a quick and easy check so we try it first
try:
pset = {PARALLEL, PARALLEL_IF_PVT}
grid = self.grid
for d in grid.dimensions:
if not any(pset & v for k, v in self.properties.items()
if d in k._defines):
raise ValueError
return True
target = set(self.grid.dimensions)
dims = {d for d in self.properties if d._defines & target}
if any(pset & self.properties[d] for d in dims):
return True
except ValueError:
pass

@@ -276,8 +265,8 @@ def dspace(self):
continue

intervals = [Interval(d,
min([sdims_min(i) for i in offs]),
max([sdims_max(i) for i in offs]))
min([minimum(i) for i in offs]),
max([maximum(i) for i in offs]))
for d, offs in v.items()]
intervals = IntervalGroup(intervals)

@@ -418,15 +407,21 @@ def scope(self):
def ispace(self):
return self._ispace

@cached_property
def properties(self):
return tailor_properties(reduce_properties(self), self.ispace)

@cached_property
def guards(self):
"""The guards of each Cluster in self."""
return tuple(i.guards for i in self)

@cached_property
def syncs(self):
"""The synchronization operations of each Cluster in self."""
return tuple(i.syncs for i in self)
"""
A view of the ClusterGroup's synchronization operations.
"""
return normalize_syncs(*[c.syncs for c in self])

@cached_property
def dspace(self):
@@ -461,3 +456,26 @@ def meta(self):
The data type and the data space of the ClusterGroup.
"""
return (self.dtype, self.dspace)


# *** Utils

def reduce_properties(clusters):
properties = {}
for c in clusters:
for d, v in c.properties.items():
properties[d] = normalize_properties(properties.get(d, v), v)

return Properties(properties)


def tailor_properties(properties, ispace):
[Review comment, Contributor] tailor_*, reduce_*, normalize_*, relax_*: it would be better to add docstrings that make clear what each one is for. The names alone are not very helpful to someone reading the code. Of course they are all used in different ways, with different arguments and context; this is just how it reads.

for d in ispace.itdimensions:
properties = properties.add(d)

for i in properties:
for d in as_tuple(i):
if d not in ispace.itdimensions:
properties = properties.drop(d)

return properties
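
As a reading aid for `tailor_properties`, here is a minimal sketch with a docstring, using a plain dict in place of Devito's `Properties`. It assumes that `add(d)` leaves an existing entry untouched and creates an empty one otherwise, and that `drop(d)` removes the entry; neither behaviour is shown in this excerpt, so treat the sketch as an illustration of the intent rather than the actual implementation. The name `tailor` and the string keys are hypothetical.

```python
def tailor(properties, itdims):
    """
    Make `properties` match the iteration dimensions `itdims`.

    Every dimension in `itdims` gets an entry (empty if it had none),
    and entries for dimensions outside `itdims` are dropped.
    """
    # Rebuilding from `itdims` performs both the add and the drop step
    return {d: properties.get(d, frozenset()) for d in itdims}


props = {'x': frozenset({'PARALLEL'}), 'z': frozenset({'SEQUENTIAL'})}

# 'y' gains an empty entry, 'z' is dropped, 'x' is kept as it was
tailored = tailor(props, ['x', 'y'])
```

The input dict is left unmodified, in line with the hunk above, where `add` and `drop` return new objects that are reassigned to `properties`.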
2 changes: 1 addition & 1 deletion devito/ir/support/__init__.py
@@ -1,5 +1,5 @@
from .utils import * # noqa
from .vector import * # noqa
from .utils import * # noqa
from .basic import * # noqa
from .space import * # noqa
from .guards import * # noqa