Change strides move assignment outside if #1402

Merged

Commits
aaa13fe
Make other vertical loop examples run
Aug 2, 2023
386b839
Commented out check in node validation that accessNodes going in/out …
Sajohn-CH Aug 3, 2023
990f6ea
Merge remote-tracking branch 'upstream/fortran_frontend_candidate_2' …
Aug 3, 2023
4e72077
Remove debug print in RefineNestedAccess
Aug 3, 2023
88272c6
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 3, 2023
e335b69
Merge remote-tracking branch 'upstream/master' into thesis_playground
Sajohn-CH Aug 3, 2023
8f8e7f8
Fixed wrong memlet indices
Sajohn-CH Aug 3, 2023
fe123df
Remove more debug prints
Aug 7, 2023
5314b9d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 7, 2023
9bc158b
Make subgraph fusion able to deal with sympy sizes
Sajohn-CH Aug 7, 2023
47f3adc
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 7, 2023
ba5ea66
Fix change strides
Sajohn-CH Aug 7, 2023
c5cb801
Reset sdfg_nesting.py to state of upstream master
Aug 8, 2023
4398d7b
Fixed some mistakes in the subgraph fusion and vert_loop_10
Sajohn-CH Aug 8, 2023
9270cf5
Adjust data.py to size change of PLU
Aug 8, 2023
5e6d6fd
add sdfg.save in sdfg_nesting.py to avoid WARNING
Aug 8, 2023
7e276a4
Add --change-stride and --verbose-name to run_program.py
Aug 8, 2023
b123642
Update k-caching run in run2 with change_strides without k_caching
Aug 8, 2023
c51e23c
Add some better debug prints in utils/general.py
Sajohn-CH Aug 8, 2023
24c7e1f
Updated plotting scripts
Aug 9, 2023
30bd1e0
Added debug prints and fix for codegen error
Aug 9, 2023
010d694
Remove debug prints in mapFusion
Aug 9, 2023
ad285bf
some fixes in run_mwe regarding verbose_name
Sajohn-CH Aug 9, 2023
3a47bc1
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 9, 2023
94449f2
add device to gen_graphs.py
Sajohn-CH Aug 9, 2023
cb6fa2b
split plotting scripts into a part which needs ncu and a part which doesn't
Sajohn-CH Aug 9, 2023
d687b5d
Started to add info about ault25/A100
Aug 9, 2023
8b0add4
Added example MWE fortran program for use in thesis
Sajohn-CH Aug 11, 2023
a2f5319
Separated ncu and total time runs to limit scope of created input data
Aug 12, 2023
c24f287
Added simple logging framework
Aug 14, 2023
1983402
Set gpu block size manually
Aug 14, 2023
de142fc
Add schedule to maps for changing strides
Aug 14, 2023
1b19e52
Add validations to auto_opt
Sajohn-CH Aug 14, 2023
6814c07
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 14, 2023
ad18a10
Add save and store to optimize_sdfg
Sajohn-CH Aug 14, 2023
1bb9522
Add profile script for classes
Sajohn-CH Aug 14, 2023
03cc72a
Fix number of repetitions for run script for k_caching
Aug 14, 2023
f473956
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 14, 2023
d137ac4
Added changes in run config and my_auto_opt to toggle optimisations f…
Sajohn-CH Aug 14, 2023
b2122c3
Fix typo in log message
Aug 14, 2023
aa50e5d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 14, 2023
7a2ef38
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 14, 2023
88a3c2a
Rename variable
Sajohn-CH Aug 14, 2023
630a982
Pass new run_config parameters to optimize_sdfg
Sajohn-CH Aug 14, 2023
2797bf6
Move change of storage into auto_opt
Sajohn-CH Aug 14, 2023
f8e69fe
Moved viewing functions for results2 into different executable
Sajohn-CH Aug 15, 2023
67ce578
Adjustment in runscripts
Aug 15, 2023
bb3b402
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 15, 2023
6179a63
Fix logging file not appending
Aug 15, 2023
2a4836e
fix in subgraphfusion helpers when comparing symbolic to non symbolic…
Sajohn-CH Aug 15, 2023
b35b64e
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 15, 2023
fbc1e8a
add option to not use outer loop first in gen_graphs
Sajohn-CH Aug 16, 2023
5796785
Started to add plots for my transformations
Sajohn-CH Aug 16, 2023
1818562
Remove debug prints in subgraph helpers
Sajohn-CH Aug 16, 2023
0db747a
Added barplot for transpose kernel time
Sajohn-CH Aug 16, 2023
986d62d
Fix some things in the runscripts
Aug 17, 2023
8a13ac7
For full cloudsc, only remove symbols when they are there
Sajohn-CH Aug 17, 2023
f286605
Add script to continue autopt from sdfg file
Sajohn-CH Aug 17, 2023
670626c
Switched to builtin python logging framework
Aug 21, 2023
b43f41b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 21, 2023
df125e0
Add dace-auto-opt to run_mwe
Sajohn-CH Aug 21, 2023
e84b298
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 21, 2023
9add92d
Remove commented out code
Sajohn-CH Aug 21, 2023
edd96c8
Merge remote-tracking branch 'upstream/master' into thesis_playground
Sajohn-CH Aug 21, 2023
deca935
Added logfile with debug level
Aug 21, 2023
769075f
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 21, 2023
8eeb129
Fix transform map back to loop if needed to be done several times
Sajohn-CH Aug 21, 2023
d4b2c66
Full cloudsc code used by Lex
Sajohn-CH Aug 22, 2023
a014ce6
Change full cloudsc code to generate graph
Sajohn-CH Aug 22, 2023
cc88669
Increase size of testing parameters
Aug 22, 2023
c7ca159
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 22, 2023
8be55ed
Add fix for trivialmapexpansion
Sajohn-CH Aug 22, 2023
e586f0b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 22, 2023
08b3faf
Add lazy loading of basic SDFG
Sajohn-CH Aug 23, 2023
e2052dd
Split k_caching run-script into ncu and total time part
Aug 23, 2023
3816fa9
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 23, 2023
e3f69e9
Fixes regarding basic SDFG
Aug 24, 2023
56d796f
Fix some typos
Aug 25, 2023
c3ebf37
Move runconfig into own file
Sajohn-CH Aug 25, 2023
6223a4d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 25, 2023
071acb7
forgot to change one file
Sajohn-CH Aug 25, 2023
f48b0c5
Added some logging points and fixed some small mistakes in subgraph f…
Sajohn-CH Aug 26, 2023
5de3a1b
Cleaned up k-caching
Sajohn-CH Aug 26, 2023
5cc453d
add comments to my_auto_opt functions
Sajohn-CH Aug 26, 2023
5bd8b08
Add changes to cater for full cloudsc
Aug 27, 2023
df59ddb
Some printing and logging changes
Aug 27, 2023
5227a97
Some WIP changes to delete logs
Aug 27, 2023
d6f2ec1
Had to disable renaming of tmp arrays in subgraph fusion again
Aug 27, 2023
dd7b95a
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 27, 2023
c85299e
Don't drop empty rows
Sajohn-CH Aug 27, 2023
5fd3c01
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 27, 2023
a60a60c
Evaluate symbolic expression in subgraph fusion
Aug 27, 2023
d3a29c5
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Aug 27, 2023
837e133
Cannot transform intermediate arrays if one has symbolic range
Aug 27, 2023
f101add
Fix NCLV indices in vert loops and related changes
Aug 28, 2023
a9fdd75
Add cloudscexp3 with removed u, v, o3 in tendency_* arrays
Aug 28, 2023
1648084
Fix mistake in run2
Aug 28, 2023
20ee17d
Make storage_on_gpu optional
Aug 28, 2023
4735c64
Added more runscripts and fixes in profile_config
Aug 28, 2023
dc4158e
Fix composite in subgraph if not doing k_caching
Aug 28, 2023
752aaa0
Update KFLDX in params
Aug 28, 2023
ae7661d
Add debug build in gen_full_cloudsc.py
Aug 28, 2023
1d596e1
Add transfert to gpu option to gen graphs
Aug 28, 2023
199d938
Fix in map_expansion with wrong schedule
Sajohn-CH Aug 28, 2023
005441b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Aug 28, 2023
68585d4
Adapt change in strides to gpu copies
Sajohn-CH Aug 28, 2023
4d3a376
WIP RefineNestedAccess fixed and fixes regarding simplify
Aug 30, 2023
b7b462e
Some changes in to generate full cloudsc
Aug 30, 2023
f791573
Add some logging prints and fixes with symbols
Aug 30, 2023
a72c1e0
New cloudsc version
Aug 30, 2023
3911825
Added some log messages
Sajohn-CH Sep 14, 2023
2a95239
Changes in subgraph fusion: Improved check if arrays shape can be cha…
Sajohn-CH Sep 14, 2023
55812f8
Changes in cloudscexp4
Sajohn-CH Sep 14, 2023
0a41bf3
Add ScalarFission, RefineNestedAccess and splitting of interstated ed…
Sajohn-CH Sep 14, 2023
06abd26
Add NPROMA to params
Sajohn-CH Sep 14, 2023
9aa913b
Update graph generation scripts to include NCLDQR
Sajohn-CH Sep 14, 2023
cf0586d
Add clear-basic-sdfg option to gen_graphs
Sajohn-CH Sep 14, 2023
05e80bc
Update gitignore
Sajohn-CH Sep 14, 2023
cd7a94b
Some WIP custom cun scripts
Sajohn-CH Sep 14, 2023
d911235
Fixes in subgraph fusion regarding missing symbols and chaning sdfg a…
Sajohn-CH Sep 15, 2023
03fc93b
Fix min/max adjustment when fusing maps of different sizes
Sajohn-CH Sep 15, 2023
7aab2c1
Adapt script to gen full cloudsc code
Sajohn-CH Sep 16, 2023
7f791b4
Updates in utils
Sajohn-CH Sep 16, 2023
a0449f9
Add runscripts to fix memlets and print shapes of arrays
Sajohn-CH Sep 16, 2023
651d61c
Fix wrong argparse usage
Sajohn-CH Sep 16, 2023
e5a80c7
Forgot to add function in gen_full_cloudsc
Sajohn-CH Sep 16, 2023
d2bdeff
Fix gen_full_cloudsc.py
Sep 16, 2023
3c8a28d
Add logfile args to gen_full_cloudsc
Sajohn-CH Sep 16, 2023
fab97d4
Fix greedy fuse in my_auto_opt and disable MapToForLoop
Sajohn-CH Sep 16, 2023
38a7f90
Fix looking at all access nodes in subgraph fusion and fix init map d…
Sajohn-CH Sep 18, 2023
2dd9a2f
Better names for nsdfg
Sajohn-CH Sep 18, 2023
2032eef
Better names for state when transforming maps to loop
Sajohn-CH Sep 18, 2023
cf2f7c2
Forgot to remove a logger call
Sajohn-CH Sep 18, 2023
c44ccc0
gen_full_cloudsc: add option to compile custom
Sajohn-CH Sep 18, 2023
51ab575
Add k-caching and change_strides to gen_full_cloudsc.py
Sep 18, 2023
fbded49
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sep 18, 2023
75b1990
subgraph fusion: ZPFPLSX also treated as circular buffers and revert …
Sajohn-CH Sep 18, 2023
27e551e
Added helper scripts to print array shape and memlets
Sajohn-CH Sep 19, 2023
ce448be
print memlets can print memlet inside nsdfg
Sajohn-CH Sep 19, 2023
a393268
Adapted full cloudsc fixes to new names and set NCLDTOP=15
Sajohn-CH Sep 20, 2023
2db019a
Remove some unused parameters
Sajohn-CH Sep 20, 2023
b0c67a1
Add modulos to nsdfg going into access nodes
Sajohn-CH Sep 20, 2023
f88c97f
Disable memlet out-of-bounds validation for now
Sajohn-CH Sep 20, 2023
e659e7a
Reversed incorrect offset in adding modulo
Sajohn-CH Sep 20, 2023
55d3cf3
Enable out-of-bounds check and fix min/max value in min/max
Sajohn-CH Sep 20, 2023
a0cc6f8
For modulo need offset if outside memlet is a range
Sajohn-CH Sep 21, 2023
27cafcc
Run cloudsc script
Sep 21, 2023
8166e71
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sep 21, 2023
b6046bb
add test for loop to map and back
Sajohn-CH Sep 21, 2023
e6c409d
Add loop to map to loop test
Sajohn-CH Sep 21, 2023
e7c9cb2
Add a simplify
Sajohn-CH Sep 21, 2023
021701f
Started to add dependency case
Sajohn-CH Sep 21, 2023
6445d2d
Added simple dependency case
Sajohn-CH Sep 21, 2023
57f5bff
Add sdfg flag and remove -O3 from compile flags for full cloudsc
Sajohn-CH Sep 22, 2023
66d2b4b
Remove blacklisted arrays before computing data_intermediate
Sajohn-CH Sep 22, 2023
a41e99f
Fix subgraph fusion fix circular buffers
Sajohn-CH Sep 24, 2023
9eb169d
Fixed some small mistakes in helpers
Sajohn-CH Sep 24, 2023
3c7deb4
Add device to full cloudsc sdfg
Sajohn-CH Sep 24, 2023
756acac
Use lower case device
Sajohn-CH Sep 24, 2023
19ffaff
Update runscript with building and store results into own file
Sajohn-CH Sep 25, 2023
d0b92de
Add sh cmake_configure to runscript
Sajohn-CH Sep 25, 2023
333c989
Remove blocking of -O3
Sajohn-CH Sep 25, 2023
34d9a36
Remove build folder before building*
Sep 25, 2023
328e129
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sep 25, 2023
ea393e6
Add optional instrumentation when compiling
Sajohn-CH Sep 26, 2023
430a10c
Fix subgraph fusion helpers to better deal with min/max
Sajohn-CH Sep 27, 2023
4873300
Improve instrumentation code slightly
Sep 27, 2023
a56912e
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sep 27, 2023
0b97a36
Clean up code a bit and remove some more crutches
Sajohn-CH Sep 27, 2023
c7db81e
Add synchronize to instrument code
Sajohn-CH Sep 27, 2023
356102f
read nblocks size from generated code
Sep 27, 2023
ef14a7e
Fix problem with edge going out of global map
Sajohn-CH Sep 27, 2023
8890535
Also add cudaStreamSynchronize
Sep 28, 2023
e102d14
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sep 28, 2023
00aa9f0
Add NBLOCKS param to gen_full_cloudsc.py
Sajohn-CH Sep 28, 2023
c4df694
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Sep 28, 2023
cc3ee80
Added scripts to profile full cloudsc
Sajohn-CH Sep 28, 2023
c494b12
Update run script to allow for absolute paths
Sajohn-CH Sep 28, 2023
862f0b1
Changes in runscripts
Sep 29, 2023
220d2fb
Improvements in plotting full cloudsc
Sajohn-CH Sep 29, 2023
7ce5842
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Sep 29, 2023
d049982
Also log stdout if run fails
Sep 29, 2023
484de36
Add script to plot cloudsc cuda vs klon
Sajohn-CH Sep 30, 2023
a1185f8
Update plotting for full cloudsc
Sajohn-CH Sep 30, 2023
ad0a640
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH Sep 30, 2023
fa5fad2
Update plotscript for full cloudsc
Sajohn-CH Sep 30, 2023
c920bca
Update plotting scripts for full cloudsc
Sajohn-CH Sep 30, 2023
f4fba2c
Added logging message in my auto opt
Sajohn-CH Oct 4, 2023
7db834e
Updated plotting scripts
Sajohn-CH Oct 4, 2023
3174582
Added all the small helper scripts created during the thesis
Sajohn-CH Oct 4, 2023
b202d4c
Added all the fortran test programs created
Sajohn-CH Oct 4, 2023
1237668
Added an unfinished test
Sajohn-CH Oct 4, 2023
0caaeaf
Added a test for K-caching
Sajohn-CH Oct 4, 2023
c980888
Add change strides test
Sajohn-CH Oct 5, 2023
e06ec94
Merge branch 'master' into cloudsc_k_caching_strides
Sajohn-CH Oct 9, 2023
f964d65
Moved change strides into own file
Sajohn-CH Oct 9, 2023
e526894
Removed thesis_playground folder
Sajohn-CH Oct 9, 2023
0f0a3a3
Undo changes in gitignore
Sajohn-CH Oct 9, 2023
abfd85e
Once in AUTHORS is enough
Sajohn-CH Oct 9, 2023
999f99d
Remove function defined twice
Sajohn-CH Oct 9, 2023
af416c2
Removed some unused changes
Sajohn-CH Oct 9, 2023
c55ed43
Undid some more changes
Sajohn-CH Oct 10, 2023
f88df12
Some more undoing of changes
Sajohn-CH Oct 10, 2023
61d0f72
Removed cloudsc fortran tests
Sajohn-CH Oct 10, 2023
9e6acf0
Some more cleanup
Sajohn-CH Oct 10, 2023
ebca8aa
More cleanup
Sajohn-CH Oct 10, 2023
47688dd
Remove unfinished SwapLoopOrder
Sajohn-CH Oct 11, 2023
780ceb6
Removed some changes in testcases
Sajohn-CH Oct 11, 2023
4474740
Undo cleanup in ScalarToSymbol as this led to failures
Sajohn-CH Oct 11, 2023
839a843
Fix ScalarToSymbol
Sajohn-CH Oct 11, 2023
5dd17f4
Removed some log prints
Sajohn-CH Oct 11, 2023
37a2944
Filter out changes in auto_opt
Sajohn-CH Oct 11, 2023
a99cc79
Cleanup imports on aut opt
Sajohn-CH Oct 11, 2023
27ae7e0
Cleanup loopToMap
Sajohn-CH Oct 11, 2023
d79063a
Added some comments
Sajohn-CH Oct 11, 2023
3d421c5
Add description and runscripts
Sajohn-CH Oct 12, 2023
170ec60
Add some more runscripts
Sajohn-CH Oct 12, 2023
1bfff62
Extend README and undo config_schema change
Sajohn-CH Oct 12, 2023
35935b5
Added further instructions
Sajohn-CH Oct 13, 2023
ab9021a
Adjust link
Sajohn-CH Oct 13, 2023
0694c46
Improved help messages slightly
Sajohn-CH Oct 13, 2023
f9bc9e3
More comments and files for run2
Sajohn-CH Oct 16, 2023
fea3be2
Added gpu_general file
Sajohn-CH Oct 16, 2023
6492437
Add note regarding GPU dependency
Sajohn-CH Oct 16, 2023
d32cd21
Removed unused function in execute_dace
Sajohn-CH Oct 16, 2023
56a480a
Add ncu python utils
Sajohn-CH Oct 16, 2023
784e622
Adjusted path to my_auto_opt
Sajohn-CH Oct 16, 2023
4f665ff
add flop computation file
Sajohn-CH Oct 16, 2023
51cadc4
Add programs.json
Sajohn-CH Oct 16, 2023
a267f5a
Fix programs.json
Sajohn-CH Oct 16, 2023
90f136c
Add run_program.py
Sajohn-CH Oct 16, 2023
5c9f0be
Made sure in flop computation to adjust for 1-index at fortran
Sajohn-CH Oct 16, 2023
402f789
Guard against symbols being None
Sajohn-CH Oct 16, 2023
2ef4ba2
Add view2
Sajohn-CH Oct 16, 2023
9d9178f
Add print utils
Sajohn-CH Oct 16, 2023
781e2ff
Add tabulate dependency
Sajohn-CH Oct 16, 2023
3770c37
Add 2nd basic SDFGB
Sajohn-CH Oct 16, 2023
0f5f979
Adapt readme for cloudsc
Sajohn-CH Oct 16, 2023
0fcde5f
Add tests for move_assignment_outside_if
Sajohn-CH Oct 18, 2023
bf9d775
Remove cloudsc_thesis folder
Sajohn-CH Oct 18, 2023
17c5697
Boiled down cloudsc_auto_opt
Sajohn-CH Oct 18, 2023
dea8bd3
Removed unwanted changes
Sajohn-CH Oct 18, 2023
4723f5c
Remove duplicate change strides function
Sajohn-CH Oct 18, 2023
82131d5
Add copyright information
Sajohn-CH Oct 18, 2023
83db636
Added current not working state of outside loop first tests
Sajohn-CH Oct 18, 2023
85d7263
Remove outside loop first
Sajohn-CH Oct 18, 2023
f266dfc
Remove wrongfully comitted file
Sajohn-CH Oct 18, 2023
6ed77fd
Merge branch 'master' into change_strides_move_assignment_outside_if
alexnick83 Nov 8, 2023
d086922
Added property for selecting the inner map(s) schedule.
alexnick83 Nov 8, 2023
66db79e
Switched to EnumProperty, None means using original map's schedule.
alexnick83 Nov 8, 2023
210 changes: 210 additions & 0 deletions dace/transformation/change_strides.py
@@ -0,0 +1,210 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
""" This module provides a function to change the stride in a given SDFG """
from typing import List, Union, Tuple
import sympy

import dace
from dace.dtypes import ScheduleType
from dace.sdfg import SDFG, nodes, SDFGState
from dace.data import Array, Scalar
from dace.memlet import Memlet


def list_access_nodes(
sdfg: dace.SDFG,
array_name: str) -> List[Tuple[nodes.AccessNode, Union[SDFGState, dace.SDFG]]]:
"""
Find all access nodes in the SDFG with the given array name. Does not recurse into nested SDFGs.

:param sdfg: The SDFG to search through
:type sdfg: dace.SDFG
:param array_name: The name of the wanted array
:type array_name: str
:return: List of the found access nodes together with their state
:rtype: List[Tuple[nodes.AccessNode, Union[dace.SDFGState, dace.SDFG]]]
"""
found_nodes = []
for state in sdfg.states():
for node in state.nodes():
if isinstance(node, nodes.AccessNode) and node.data == array_name:
found_nodes.append((node, state))
return found_nodes


def change_strides(
sdfg: dace.SDFG,
stride_one_values: List[str],
schedule: ScheduleType) -> SDFG:
"""
Change the strides of the arrays on the given SDFG such that the given dimension has stride 1. Returns a new SDFG.

:param sdfg: The input SDFG
:type sdfg: dace.SDFG
:param stride_one_values: Names of the symbols that give the length of the dimension whose stride should be set
to one. Expects that each array has at most one dimension whose length appears in this list
:type stride_one_values: List[str]
:param schedule: Schedule to use to copy the arrays
:type schedule: ScheduleType
:return: SDFG with changed strides
:rtype: SDFG
"""
# Create new SDFG and copy constants and symbols
original_name = sdfg.name
sdfg.name = "changed_strides"
new_sdfg = SDFG(original_name)
for dname, value in sdfg.constants.items():
new_sdfg.add_constant(dname, value)
for dname, stype in sdfg.symbols.items():
new_sdfg.add_symbol(dname, stype)

changed_stride_state = new_sdfg.add_state("with_changed_strides", is_start_state=True)
inputs, outputs = sdfg.read_and_write_sets()
# Get all arrays which are persistent == not transient
persistent_arrays = {name: desc for name, desc in sdfg.arrays.items() if not desc.transient}

# Get the persistent arrays of all the transient arrays which get copied to GPU
for dname in persistent_arrays:
for access, state in list_access_nodes(sdfg, dname):
if len(state.out_edges(access)) == 1:
edge = state.out_edges(access)[0]
if isinstance(edge.dst, nodes.AccessNode):
if edge.dst.data in inputs:
inputs.remove(edge.dst.data)
inputs.add(dname)
if len(state.in_edges(access)) == 1:
edge = state.in_edges(access)[0]
if isinstance(edge.src, nodes.AccessNode):
if edge.src.data in outputs:
outputs.remove(edge.src.data)
outputs.add(dname)

# Only keep inputs and outputs which are persistent
inputs.intersection_update(persistent_arrays.keys())
outputs.intersection_update(persistent_arrays.keys())
nsdfg = changed_stride_state.add_nested_sdfg(sdfg, new_sdfg, inputs=inputs, outputs=outputs)
transform_state = new_sdfg.add_state_before(changed_stride_state, label="transform_data", is_start_state=True)
transform_state_back = new_sdfg.add_state_after(changed_stride_state, "transform_data_back", is_start_state=False)

# copy arrays
for dname, desc in sdfg.arrays.items():
if not desc.transient:
if isinstance(desc, Array):
new_sdfg.add_array(dname, desc.shape, desc.dtype, desc.storage,
desc.location, desc.transient, desc.strides,
desc.offset)
elif isinstance(desc, Scalar):
new_sdfg.add_scalar(dname, desc.dtype, desc.storage, desc.transient, desc.lifetime, desc.debuginfo)

new_order = {}
new_strides_map = {}

# Map of array names in the nested SDFG: key: name of the array inside the nested SDFG, value: its name in the parent SDFG (this sdfg)
# Assumes that name changes only appear in the first level of nsdfg nesting
array_names_map = {}
for graph in sdfg.sdfg_list:
if graph.parent_nsdfg_node is not None:
if graph.parent_sdfg == sdfg:
for connector in graph.parent_nsdfg_node.in_connectors:
for in_edge in graph.parent.in_edges_by_connector(graph.parent_nsdfg_node, connector):
array_names_map[str(connector)] = in_edge.data.data

for containing_sdfg, dname, desc in sdfg.arrays_recursive():
shape_str = [str(s) for s in desc.shape]
# Get index of the dimension we want to have stride 1
stride_one_idx = None
this_stride_one_value = None
for dim in stride_one_values:
if str(dim) in shape_str:
stride_one_idx = shape_str.index(str(dim))
this_stride_one_value = dim
break

if stride_one_idx is not None:
new_order[dname] = [stride_one_idx]

new_strides = list(desc.strides)
new_strides[stride_one_idx] = sympy.S.One

previous_size = dace.symbolic.symbol(this_stride_one_value)
previous_stride = sympy.S.One
for i in range(len(new_strides)):
if i != stride_one_idx:
new_order[dname].append(i)
new_strides[i] = previous_size * previous_stride
previous_size = desc.shape[i]
previous_stride = new_strides[i]

new_strides_map[dname] = {}
# Create a map entry for this data linking old strides to new strides. This assumes that each entry in
# strides is unique, which holds because otherwise there would be two dimensions i, j for which a[i, j] would
# point to the same address as a[j, i]
for new_stride, old_stride in zip(new_strides, desc.strides):
new_strides_map[dname][old_stride] = new_stride
desc.strides = tuple(new_strides)
else:
parent_name = array_names_map[dname] if dname in array_names_map else dname
if parent_name in new_strides_map:
new_strides = []
for stride in desc.strides:
new_strides.append(new_strides_map[parent_name][stride])
desc.strides = new_strides
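# Worked example (illustrative, not part of the original code): with stride_one_values=['N'],
# an array of shape (N, M, 3) with default C-order strides (3*M, 3, 1) is remapped to
# strides (1, N, M*N), which is exactly what the added change_strides test asserts for array 'B'.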

# Add new flipped arrays for every non-transient array
flipped_names_map = {}
for dname, desc in sdfg.arrays.items():
if not desc.transient:
flipped_name = f"{dname}_flipped"
flipped_names_map[dname] = flipped_name
new_sdfg.add_array(flipped_name, desc.shape, desc.dtype,
desc.storage, desc.location, True,
desc.strides, desc.offset)

# Deal with the flipped arrays: create a transpose map for each flipped input/output and connect the inputs
# to the nested SDFG via memlets
for input in set([*inputs, *outputs]):
if input in new_order:
flipped_data = flipped_names_map[input]
if input in inputs:
changed_stride_state.add_memlet_path(changed_stride_state.add_access(flipped_data), nsdfg,
dst_conn=input, memlet=Memlet(data=flipped_data))
# Simply need to copy the data, the different strides take care of the transposing
arr = sdfg.arrays[input]
tasklet, map_entry, map_exit = transform_state.add_mapped_tasklet(
name=f"transpose_{input}",
map_ranges={f"_i{i}": f"0:{s}" for i, s in enumerate(arr.shape)},
inputs={'_in': Memlet(data=input, subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
code='_out = _in',
outputs={'_out': Memlet(data=flipped_data,
subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
external_edges=True,
schedule=schedule,
)
# Do the same for the outputs
for output in outputs:
if output in new_order:
flipped_data = flipped_names_map[output]
changed_stride_state.add_memlet_path(nsdfg, changed_stride_state.add_access(flipped_data),
src_conn=output, memlet=Memlet(data=flipped_data))
# Simply need to copy the data, the different strides take care of the transposing
arr = sdfg.arrays[output]
tasklet, map_entry, map_exit = transform_state_back.add_mapped_tasklet(
name=f"transpose_{output}",
map_ranges={f"_i{i}": f"0:{s}" for i, s in enumerate(arr.shape)},
inputs={'_in': Memlet(data=flipped_data,
subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
code='_out = _in',
outputs={'_out': Memlet(data=output, subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
external_edges=True,
schedule=schedule,
)
# Deal with any arrays which have not been flipped (should only be scalars). Connect them directly
for dname, desc in sdfg.arrays.items():
if not desc.transient and dname not in new_order:
if dname in inputs:
changed_stride_state.add_memlet_path(changed_stride_state.add_access(dname), nsdfg, dst_conn=dname,
memlet=Memlet(data=dname))
if dname in outputs:
changed_stride_state.add_memlet_path(nsdfg, changed_stride_state.add_access(dname), src_conn=dname,
memlet=Memlet(data=dname))

return new_sdfg
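
For context, a minimal usage sketch of the new change_strides function (illustrative only; the program add_one and the symbols below are not part of the PR). It mirrors the structure that the added test checks: the original SDFG ends up nested inside a new three-state SDFG with transpose maps before and after it.

import dace
from dace.dtypes import ScheduleType
from dace.transformation.change_strides import change_strides

N, M = dace.symbol('N'), dace.symbol('M')

@dace.program
def add_one(A: dace.float64[N, M], B: dace.float64[N, M]):
    for i, j in dace.map[0:N, 0:M]:
        B[i, j] = A[i, j] + 1.0

sdfg = add_one.to_sdfg()
# Make the dimension whose length is the symbol 'N' the stride-1 dimension
# of every non-transient array that has such a dimension.
new_sdfg = change_strides(sdfg, ['N'], ScheduleType.Sequential)
# Resulting layout: transpose-in state, nested original SDFG, transpose-out state.
assert len(new_sdfg.states()) == 3
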
2 changes: 1 addition & 1 deletion dace/transformation/dataflow/map_expansion.py
@@ -47,7 +47,7 @@ def apply(self, graph: dace.SDFGState, sdfg: dace.SDFG):
new_maps = [
nodes.Map(current_map.label + '_' + str(param), [param],
subsets.Range([param_range]),
- schedule=dtypes.ScheduleType.Sequential)
+ schedule=current_map.schedule)
Collaborator:

will not work for GPU maps.

Contributor Author:

Could you elaborate on why you think this would not work for GPU maps? The reason I put this in is that I had non-sequential maps expanded and suddenly one of the resulting maps was sequential, which goes against my understanding of what MapExpansion should do. However, when I disable this change, some (maybe all, I didn't check all) of the failing tests pass again.

for param, param_range in zip(current_map.params[1:], current_map.range[1:])
]
current_map.params = [current_map.params[0]]
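For context, a small CPU-only sketch of what the changed line means in practice (illustrative only; the program scale below is not part of the PR). With the change, both one-dimensional maps produced by MapExpansion keep the original map's schedule instead of the inner one being forced to Sequential:

import dace
from dace import nodes
from dace.transformation.dataflow import MapExpansion

N, M = dace.symbol('N'), dace.symbol('M')

@dace.program
def scale(A: dace.float64[N, M], B: dace.float64[N, M]):
    for i, j in dace.map[0:N, 0:M]:
        B[i, j] = 2.0 * A[i, j]

sdfg = scale.to_sdfg()
sdfg.apply_transformations_repeated(MapExpansion)
# Both expanded maps now report the schedule of the original 2D map.
for node, _ in sdfg.all_nodes_recursive():
    if isinstance(node, nodes.MapEntry):
        print(node.map.params, node.map.schedule)

Note that the reviewer's GPU concern is not exercised by this sketch, and that the last two commits in the list above replace the hard-coded behavior with an EnumProperty for the inner maps' schedule (None meaning the original map's schedule is used), so the merged behavior may differ from the diff shown here.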
3 changes: 2 additions & 1 deletion dace/transformation/helpers.py
@@ -1137,7 +1137,8 @@ def traverse(state: SDFGState, treenode: ScopeTree):
ntree.state = nstate
treenode.children.append(ntree)
for child in treenode.children:
- traverse(getattr(child, 'state', state), child)
+ if hasattr(child, 'state') and child.state != state:
+     traverse(getattr(child, 'state', state), child)

traverse(state, stree)
return stree
1 change: 1 addition & 0 deletions dace/transformation/interstate/__init__.py
@@ -15,3 +15,4 @@
from .move_loop_into_map import MoveLoopIntoMap
from .trivial_loop_elimination import TrivialLoopElimination
from .multistate_inline import InlineMultistateSDFG
+ from .move_assignment_outside_if import MoveAssignmentOutsideIf
113 changes: 113 additions & 0 deletions dace/transformation/interstate/move_assignment_outside_if.py
@@ -0,0 +1,113 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
"""
Transformation to move assignments outside if statements to potentially avoid warp divergence. Speedup gained is
questionable.
"""

import ast
import sympy as sp

from dace import sdfg as sd
from dace.sdfg import graph as gr
from dace.sdfg.nodes import Tasklet, AccessNode
from dace.transformation import transformation


class MoveAssignmentOutsideIf(transformation.MultiStateTransformation):

if_guard = transformation.PatternNode(sd.SDFGState)
if_stmt = transformation.PatternNode(sd.SDFGState)
else_stmt = transformation.PatternNode(sd.SDFGState)

@classmethod
def expressions(cls):
sdfg = gr.OrderedDiGraph()
sdfg.add_nodes_from([cls.if_guard, cls.if_stmt, cls.else_stmt])
sdfg.add_edge(cls.if_guard, cls.if_stmt, sd.InterstateEdge())
sdfg.add_edge(cls.if_guard, cls.else_stmt, sd.InterstateEdge())
return [sdfg]

def can_be_applied(self, graph, expr_index, sdfg, permissive=False):
# The if-guard can only have two outgoing edges: to the if and to the else part
guard_outedges = graph.out_edges(self.if_guard)
if len(guard_outedges) != 2:
return False

# Outgoing edges must be a negation of each other
if guard_outedges[0].data.condition_sympy() != (sp.Not(guard_outedges[1].data.condition_sympy())):
return False

# The if guard should either have zero or one incoming edge
if len(sdfg.in_edges(self.if_guard)) > 1:
return False

# set of the variables which get a const value assigned
assigned_const = set()
# Dict which collects all AccessNodes for each variable together with its state
access_nodes = {}
# set of the variables which are only written to
self.write_only_values = set()
# Dictionary which stores additional information for the variables which are written only
self.assign_context = {}
for state in [self.if_stmt, self.else_stmt]:
for node in state.nodes():
if isinstance(node, Tasklet):
# If the node is a tasklet, check whether it assigns a constant value
assigns_const = True
for code_stmt in node.code.code:
if not (isinstance(code_stmt, ast.Assign) and isinstance(code_stmt.value, ast.Constant)):
assigns_const = False
if assigns_const:
for edge in state.out_edges(node):
if isinstance(edge.dst, AccessNode):
assigned_const.add(edge.dst.data)
self.assign_context[edge.dst.data] = {"state": state, "tasklet": node}
elif isinstance(node, AccessNode):
if node.data not in access_nodes:
access_nodes[node.data] = []
access_nodes[node.data].append((node, state))

# check that the found access nodes only get written to
for data, nodes in access_nodes.items():
write_only = True
for node, state in nodes:
if node.has_reads(state):
# The read is only a problem if it is not written before -> the access node has no incoming edge
if state.in_degree(node) == 0:
write_only = False
else:
# There is also a problem if any incoming or outgoing edge is an update (WCR) instead of a plain write
for edge in [*state.in_edges(node), *state.out_edges(node)]:
if edge.data.wcr is not None:
write_only = False

if write_only:
self.write_only_values.add(data)

# Keep only the values which are write-only and for which at least one branch assigns a constant value
self.write_only_values = assigned_const.intersection(self.write_only_values)

if len(self.write_only_values) == 0:
return False
return True

def apply(self, _, sdfg: sd.SDFG):
# Create a new state before the guard state where the constant assignment happens
new_assign_state = sdfg.add_state_before(self.if_guard, label="const_assignment_state")

# Move all the Tasklets together with the AccessNode
for value in self.write_only_values:
state = self.assign_context[value]["state"]
tasklet = self.assign_context[value]["tasklet"]
new_assign_state.add_node(tasklet)
for edge in state.out_edges(tasklet):
state.remove_edge(edge)
state.remove_node(edge.dst)
new_assign_state.add_node(edge.dst)
new_assign_state.add_edge(tasklet, edge.src_conn, edge.dst, edge.dst_conn, edge.data)

state.remove_node(tasklet)
# Remove the state if it was emptied
if state.is_empty():
sdfg.remove_node(state)
return sdfg
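
For context, a hand-built sketch of the pattern this transformation targets (illustrative only; the SDFG, array and symbol names below are not part of the PR): a guard state branching into an if and an else state, a value B that is only written, and one branch assigning it a constant.

import dace
from dace.transformation.interstate import MoveAssignmentOutsideIf

sdfg = dace.SDFG('move_assignment_example')
sdfg.add_array('A', [1], dace.float64)
sdfg.add_array('B', [1], dace.float64)
sdfg.add_array('C', [1], dace.float64)
sdfg.add_symbol('cond', dace.int32)

if_guard = sdfg.add_state('if_guard', is_start_state=True)
if_state = sdfg.add_state('if_body')
else_state = sdfg.add_state('else_body')
merge = sdfg.add_state('merge')
sdfg.add_edge(if_guard, if_state, dace.InterstateEdge(condition='cond > 0'))
sdfg.add_edge(if_guard, else_state, dace.InterstateEdge(condition='not (cond > 0)'))
sdfg.add_edge(if_state, merge, dace.InterstateEdge())
sdfg.add_edge(else_state, merge, dace.InterstateEdge())

# If branch: B is computed from A (not a constant assignment, so it stays put).
t_if = if_state.add_tasklet('compute_b', {'a'}, {'b'}, 'b = a + 1.0')
if_state.add_edge(if_state.add_access('A'), None, t_if, 'a', dace.Memlet('A[0]'))
if_state.add_edge(t_if, 'b', if_state.add_access('B'), None, dace.Memlet('B[0]'))

# Else branch: B is assigned a constant; this tasklet is the one that gets hoisted
# into a new state placed before the guard. A second, non-constant tasklet keeps
# the else branch non-empty.
t_else = else_state.add_tasklet('zero_b', {}, {'b'}, 'b = 0.0')
else_state.add_edge(t_else, 'b', else_state.add_access('B'), None, dace.Memlet('B[0]'))
t_else2 = else_state.add_tasklet('copy_c', {'a'}, {'c'}, 'c = a')
else_state.add_edge(else_state.add_access('A'), None, t_else2, 'a', dace.Memlet('A[0]'))
else_state.add_edge(t_else2, 'c', else_state.add_access('C'), None, dace.Memlet('C[0]'))

applied = sdfg.apply_transformations_repeated(MoveAssignmentOutsideIf)
print('applications:', applied)  # expected: 1
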
48 changes: 48 additions & 0 deletions tests/transformations/change_strides_test.py
@@ -0,0 +1,48 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
import dace
from dace import nodes
from dace.dtypes import ScheduleType
from dace.memlet import Memlet
from dace.transformation.change_strides import change_strides


def change_strides_test():
sdfg = dace.SDFG('change_strides_test')
N = dace.symbol('N')
M = dace.symbol('M')
sdfg.add_array('A', [N, M], dace.float64)
sdfg.add_array('B', [N, M, 3], dace.float64)
state = sdfg.add_state()

task1, mentry1, mexit1 = state.add_mapped_tasklet(
name="map1",
map_ranges={'i': '0:N', 'j': '0:M'},
inputs={'a': Memlet(data='A', subset='i, j')},
outputs={'b': Memlet(data='B', subset='i, j, 0')},
code='b = a + 1',
external_edges=True,
propagate=True)

# Check that states are as expected
changed_sdfg = change_strides(sdfg, ['N'], ScheduleType.Sequential)
assert len(changed_sdfg.states()) == 3
assert len(changed_sdfg.out_edges(changed_sdfg.start_state)) == 1
work_state = changed_sdfg.out_edges(changed_sdfg.start_state)[0].dst
nsdfg = None
for node in work_state.nodes():
if isinstance(node, nodes.NestedSDFG):
nsdfg = node
# Check shape and strides of data inside nested SDFG
assert nsdfg is not None
assert nsdfg.sdfg.data('A').shape == (N, M)
assert nsdfg.sdfg.data('B').shape == (N, M, 3)
assert nsdfg.sdfg.data('A').strides == (1, N)
assert nsdfg.sdfg.data('B').strides == (1, N, M*N)


def main():
change_strides_test()


if __name__ == '__main__':
main()