Change strides move assignment outside if #1402
Merged
alexnick83 merged 726 commits into spcl:master from Sajohn-CH:change_strides_move_assignment_outside_if on Nov 8, 2023
Changes from 250 commits
Commits
aaa13fe
Make other vertical loop examples run
386b839
Commented out check in node validation that accessNodes going in/out …
Sajohn-CH 990f6ea
Merge remote-tracking branch 'upstream/fortran_frontend_candidate_2' …
4e72077
Remove debug print in RefineNestedAccess
88272c6
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
e335b69
Merge remote-tracking branch 'upstream/master' into thesis_playground
Sajohn-CH 8f8e7f8
Fixed wrong memlet indices
Sajohn-CH fe123df
Remove more debug prints
5314b9d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
9bc158b
Make subgraph fusion able to deal with sympy sizes
Sajohn-CH 47f3adc
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH ba5ea66
Fix change strides
Sajohn-CH c5cb801
Reset sdfg_nesting.py to state of upstream master
4398d7b
Fixed some mistakes in the subgraph fusion and vert_loop_10
Sajohn-CH 9270cf5
Adjust data.py to size change of PLU
5e6d6fd
add sdfg.save in sdfg_nesting.py for avoid WARNING
7e276a4
Add --change-stride and --verbose-name to run_program.py
b123642
Update k-caching run in run2 with change_strides without k_caching
c51e23c
Add some better debug prints in utils/general.py
Sajohn-CH 24c7e1f
Updated plotting scripts
30bd1e0
Added debug prints and fix for codegen error
010d694
Remove debug prints in mapFusion
ad285bf
some fixes in run_mwe regarding verbose_name
Sajohn-CH 3a47bc1
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 94449f2
add device to gen_graphs.py
Sajohn-CH cb6fa2b
split plotting scripts into part which needs ncu and part which don't
Sajohn-CH d687b5d
Started to add info about ault25/A100
8b0add4
Added example MWE fortran program for use in thesis
Sajohn-CH a2f5319
Seprated ncu and tot time runs to limit scope of created input data
c24f287
Added simple logging framework
1983402
Set gpu block size manually
de142fc
Add schedule to maps for changing strides
1b19e52
Add validations to auto_opt
Sajohn-CH 6814c07
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH ad18a10
Add save and store to optimize_sdfg
Sajohn-CH 1bb9522
Add profile script for classes
Sajohn-CH 03cc72a
Fix number of repetitions for run script for k_caching
f473956
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
d137ac4
Added changes in run config and my_auto_opt to toggle optimisations f…
Sajohn-CH b2122c3
Fix typo in log message
aa50e5d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 7a2ef38
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
88a3c2a
Rename variable
Sajohn-CH 630a982
Pass new run_config parameters to optimize_sdfg
Sajohn-CH 2797bf6
Move change of storage into auto_opt
Sajohn-CH f8e69fe
Moved viewing functions for results2 into different executable
Sajohn-CH 67ce578
Adjustment in runscripts
bb3b402
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
6179a63
Fix logging file not appending
2a4836e
fix in subgraphfusion helpers when comparing symbolic to non symbolic…
Sajohn-CH b35b64e
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH fbc1e8a
add option to not use outer loop first in gen_graphs
Sajohn-CH 5796785
Started to add plots for my transformations
Sajohn-CH 1818562
Remove debug prints in subgrahp helpers
Sajohn-CH 0db747a
Added barplot for tranpose kernel time
Sajohn-CH 986d62d
Fix some things in the runscripts
8a13ac7
For full cloudsc, only remove symbols when they are there
Sajohn-CH f286605
Add script to continue autopt from sdfg file
Sajohn-CH 670626c
Switched to builtin python logging framework
b43f41b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
df125e0
Add dace-auto-opt to run_mwe
Sajohn-CH e84b298
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 9add92d
Remove commented out code
Sajohn-CH edd96c8
Merge remote-tracking branch 'upstream/master' into thesis_playground
Sajohn-CH deca935
Added logfile with debug level
769075f
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
8eeb129
Fix transform map back to loop if needed to be done several times
Sajohn-CH d4b2c66
Full cloudsc code used by Lex
Sajohn-CH a014ce6
Change full cloudsc code to generate graph
Sajohn-CH cc88669
Increase size of testing parameters
c7ca159
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
8be55ed
Add fix for trivialmapexpansion
Sajohn-CH e586f0b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 08b3faf
Add lazy loading of basic SDFG
Sajohn-CH e2052dd
Split k_caching run-script into ncu and total time part
3816fa9
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
e3f69e9
Fixes regarding basic SDFG
56d796f
Fix some typos
c3ebf37
Move runconfig into own file
Sajohn-CH 6223a4d
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 071acb7
forgot to change one file
Sajohn-CH f48b0c5
Added some logging points and fixed some small mistakes in subgraph f…
Sajohn-CH 5de3a1b
Cleaned up k-caching
Sajohn-CH 5cc453d
add comments to my_auto_opt functions
Sajohn-CH 5bd8b08
Add changes to cater for full cloudsc
df59ddb
Some printing and logging changes
5227a97
Some WIP changes to delete logs
d6f2ec1
Had to disable renaming of tmp arrays in subgraph fusion again
dd7b95a
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
c85299e
Don't drop empty rows
Sajohn-CH 5fd3c01
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH a60a60c
Evaluate symbolic expression in subgraph fusion
d3a29c5
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
837e133
Cannot transform intermediate arrays if one has symbolic range
f101add
Fix NCLV indices in vert loops and related changes
a9fdd75
Add cloudscexp3 with removed u, v, o3 in tendency_* arrays
1648084
Fix mistake in run2
20ee17d
Make storage_on_gpu optional
4735c64
Added more runscripts and fixes in profile_config
dc4158e
Fix composite in subgraph if not doing k_caching
752aaa0
Update KFLDX in params
ae7661d
Add debug build in gen_full_cloudsc.py
1d596e1
Add transfert to gpu option to gen graphs
199d938
Fix in map_expansion with wrong schedule
Sajohn-CH 005441b
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH 68585d4
Adadpt change in strides to gpu copies
Sajohn-CH 4d3a376
WIP RefineNestedAccess fixed and fixes regarding simplify
b7b462e
Some changes in to generate full cloudsc
f791573
Add some logging prints and fixes with symbols
a72c1e0
New cloudsc version
3911825
Added some log messages
Sajohn-CH 2a95239
Changes in subgraph fusion: Improved check if arrays shape can be cha…
Sajohn-CH 55812f8
Changes in cloudscexp4
Sajohn-CH 0a41bf3
Add ScalarFission, RefineNestedAccess and splitting of interstated ed…
Sajohn-CH 06abd26
Add NPROMA to params
Sajohn-CH 9aa913b
Update graph generation scripts to include NCLDQR
Sajohn-CH cf0586d
Add clear-basic-sdfg option to gen_graphs
Sajohn-CH 05e80bc
Update gitignore
Sajohn-CH cd7a94b
Some WIP custom cun scripts
Sajohn-CH d911235
Fixes in subgraph fusion regarding missing symbols and chaning sdfg a…
Sajohn-CH 03fc93b
Fix min/max adjustment when fusing maps of different sizes
Sajohn-CH 7aab2c1
Adapt script to gen full cloudsc code
Sajohn-CH 7f791b4
Updates in utils
Sajohn-CH a0449f9
Add runscripts to fix memlets and print shapes of arrays
Sajohn-CH 651d61c
Fix wrong argparse usage
Sajohn-CH e5a80c7
Forgot to add function in gen_full_cloudsc
Sajohn-CH d2bdeff
Fix gen_full_cloudsc.py
3c8a28d
Add logfile args to gen_full_cloudsc
Sajohn-CH fab97d4
Fix greedy fuse in my_auto_opt and disable MapToForLoop
Sajohn-CH 38a7f90
Fix looking at all access nodes in subgraph fusion and fix init map d…
Sajohn-CH 2dd9a2f
Better names for nsdfg
Sajohn-CH 2032eef
Better names for state when transforming maps to loop
Sajohn-CH cf2f7c2
Forgot to remove a logger call
Sajohn-CH c44ccc0
gen_full_cloudsc add optiont to compile custom
Sajohn-CH 51ab575
Add k-caching and change_strides to gen_full_cloudsc.py
fbded49
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
75b1990
subgraph fusion: ZPFPLSX also treated as circular buffers and revert …
Sajohn-CH 27e551e
Added helper scripts to print array shape and memlets
Sajohn-CH ce448be
print memlets can print memlet inside nsdfg
Sajohn-CH a393268
Adapted full cloudsc fixes to new names and set NCLDTOP=15
Sajohn-CH 2db019a
Remove some unused parameters
Sajohn-CH b0c67a1
Add modulos to nsdfg going into access nodes
Sajohn-CH f88c97f
Disable memlet out of bond validation for now
Sajohn-CH e659e7a
Reversed incorrect offset in adding modulo
Sajohn-CH 55d3cf3
Enable out-of-bonds check and fix min/max value in min/max
Sajohn-CH a0cc6f8
For modulo need offset if outside memlet is a range
Sajohn-CH 27cafcc
Run cloudsc script
8166e71
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
b6046bb
add test for loop to map and back
Sajohn-CH e6c409d
Add loop to map to loop test
Sajohn-CH e7c9cb2
Add a simplify
Sajohn-CH 021701f
Started to add dependency case
Sajohn-CH 6445d2d
Added simple dependency case
Sajohn-CH 57f5bff
Add sdfg flag and remove -O3 from compile flags for full cloudsc
Sajohn-CH 66d2b4b
Remove blacklisted arrays before computing data_intermediate
Sajohn-CH a41e99f
Fix subgraph fusion fix circular buffers
Sajohn-CH 9eb169d
Fixed some small mistakes in helpers
Sajohn-CH 3c7deb4
Add device to full cloudsc sdfg
Sajohn-CH 756acac
Use lower case device
Sajohn-CH 19ffaff
Update runscript with building and store results into own file
Sajohn-CH d0b92de
Add sh cmake_configure to runscript
Sajohn-CH 333c989
Remove blocking of -O3
Sajohn-CH 34d9a36
Remove build folder before building*
328e129
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
ea393e6
Add optionall yinstrumentation when compiling
Sajohn-CH 430a10c
Fix subgrah fusio helpers to better deal with min/max
Sajohn-CH 4873300
Improve instrumentation code slightly
a56912e
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
0b97a36
Clean up code a bit and remove some more cruches
Sajohn-CH c7db81e
Add synchronize to instrument code
Sajohn-CH 356102f
read nblocks size from generated code
ef14a7e
Fix problem with edge going out of global map
Sajohn-CH 8890535
Also add cudaStreamSynchronize
e102d14
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
00aa9f0
Add NBLOCKS param to gen_full_cloudsc.py
Sajohn-CH c4df694
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH cc3ee80
Added scripts to profile full cloudsc
Sajohn-CH c494b12
Update run script to allow for absolute paths
Sajohn-CH 862f0b1
Changes in runscripts
220d2fb
Improvements in plotting full cloudsc
Sajohn-CH 7ce5842
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH d049982
Also log stdout if run fails
484de36
Add script to plot cloudsc cuda vs klon
Sajohn-CH a1185f8
Update plotting for full cloudsc
Sajohn-CH ad0a640
Merge branch 'thesis_playground' of github.com:Sajohn-CH/dace into th…
Sajohn-CH fa5fad2
Update plotscript for full cloudsc
Sajohn-CH c920bca
Update plotting scripts for full cloudsc
Sajohn-CH f4fba2c
Added logging message in my auto opt
Sajohn-CH 7db834e
Updated plotting scripts
Sajohn-CH 3174582
Added all the small helper scripts created during the thesis
Sajohn-CH b202d4c
Added all the fortran test programs created
Sajohn-CH 1237668
Added a unfinished test created
Sajohn-CH 0caaeaf
Added a test for K-caching
Sajohn-CH c980888
Add change strides test
Sajohn-CH e06ec94
Merge branch 'master' into cloudsc_k_caching_strides
Sajohn-CH f964d65
Moved change strides into own file
Sajohn-CH e526894
Removed thesis_playground folder
Sajohn-CH 0f0a3a3
Undo changes in gitignore
Sajohn-CH abfd85e
Once in AUTHORS is enough
Sajohn-CH 999f99d
Remove function defined twice
Sajohn-CH af416c2
Removed some unused changes
Sajohn-CH c55ed43
Undid some more changes
Sajohn-CH f88df12
Some more undoing of changes
Sajohn-CH 61d0f72
Removed cloudsc fortran tests
Sajohn-CH 9e6acf0
Some more cleanup
Sajohn-CH ebca8aa
More cleanup
Sajohn-CH 47688dd
Remove unfinished SwapLoopOrder
Sajohn-CH 780ceb6
Removed some changes in testcases
Sajohn-CH 4474740
Undo cleanup in ScalarToSymbol as this lead to failing
Sajohn-CH 839a843
Fix ScalarToSymbol
Sajohn-CH 5dd17f4
Removed some log prints
Sajohn-CH 37a2944
Filter out changes in auto_opt
Sajohn-CH a99cc79
Cleanup imports on aut opt
Sajohn-CH 27ae7e0
Cleanup loopToMap
Sajohn-CH d79063a
Added some comments
Sajohn-CH 3d421c5
Add description and runscripts
Sajohn-CH 170ec60
Add some more runscripts
Sajohn-CH 1bfff62
Extend README and undo config_schema change
Sajohn-CH 35935b5
Added further instructions
Sajohn-CH ab9021a
Adjust link
Sajohn-CH 0694c46
Improved help messages slightly
Sajohn-CH f9bc9e3
More comments and files for run2
Sajohn-CH fea3be2
Added gpu_general file
Sajohn-CH 6492437
Add note regarding GPU dependency
Sajohn-CH d32cd21
Removed unused function in execute_dace
Sajohn-CH 56a480a
Add ncu python utils
Sajohn-CH 784e622
Adjusted path to my_auto_opt
Sajohn-CH 4f665ff
add flop computation file
Sajohn-CH 51cadc4
Add programs.json
Sajohn-CH a267f5a
Fix programs.json
Sajohn-CH 90f136c
Add run_program.py
Sajohn-CH 5c9f0be
Made sure in flop computation to adjust for 1-index at fortran
Sajohn-CH 402f789
Guard against symbols being None
Sajohn-CH 2ef4ba2
Add view2
Sajohn-CH 9d9178f
Add print utils
Sajohn-CH 781e2ff
Add tabulate dependency
Sajohn-CH 3770c37
Add 2nd basic SDFGB
Sajohn-CH 0f5f979
Adapt readme for cloudsc
Sajohn-CH 0fcde5f
Add tests for move_assignment_outside_if
Sajohn-CH bf9d775
Remove cloudsc_thesis folder
Sajohn-CH 17c5697
Boiled down cloudsc_auto_opt
Sajohn-CH dea8bd3
Removed unwanted changes
Sajohn-CH 4723f5c
Remove duplicate change strides function
Sajohn-CH 82131d5
Add copyright information
Sajohn-CH 83db636
Added current not working state of outside loop first tests
Sajohn-CH 85d7263
Remove outside loop first
Sajohn-CH f266dfc
Remove wrongfully comitted file
Sajohn-CH 6ed77fd
Merge branch 'master' into change_strides_move_assignment_outside_if
alexnick83 d086922
Added property for selecting the inner map(s) schedule.
alexnick83 66db79e
Switched to EnumProperty, None means using original map's schedule.
alexnick83
@@ -0,0 +1,210 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
""" This module provides a function to change the stride in a given SDFG """
from typing import List, Union, Tuple
import sympy

import dace
from dace.dtypes import ScheduleType
from dace.sdfg import SDFG, nodes, SDFGState
from dace.data import Array, Scalar
from dace.memlet import Memlet


def list_access_nodes(
        sdfg: dace.SDFG,
        array_name: str) -> List[Tuple[nodes.AccessNode, Union[SDFGState, dace.SDFG]]]:
    """
    Find all access nodes in the SDFG of the given array name. Does not recurse into nested SDFGs.

    :param sdfg: The SDFG to search through
    :type sdfg: dace.SDFG
    :param array_name: The name of the wanted array
    :type array_name: str
    :return: List of the found access nodes together with their state
    :rtype: List[Tuple[nodes.AccessNode, Union[dace.SDFGState, dace.SDFG]]]
    """
    found_nodes = []
    for state in sdfg.states():
        for node in state.nodes():
            if isinstance(node, nodes.AccessNode) and node.data == array_name:
                found_nodes.append((node, state))
    return found_nodes


def change_strides(
        sdfg: dace.SDFG,
        stride_one_values: List[str],
        schedule: ScheduleType) -> SDFG:
    """
    Change the strides of the arrays on the given SDFG such that the given dimension has stride 1. Returns a new SDFG.

    :param sdfg: The input SDFG
    :type sdfg: dace.SDFG
    :param stride_one_values: Lengths of the dimensions whose stride should be set to one. Expects that each array has
                              only one dimension whose length is in this list. Expects the list to contain the names
                              of symbols
    :type stride_one_values: List[str]
    :param schedule: Schedule to use to copy the arrays
    :type schedule: ScheduleType
    :return: SDFG with changed strides
    :rtype: SDFG
    """
    # Create new SDFG and copy constants and symbols
    original_name = sdfg.name
    sdfg.name = "changed_strides"
    new_sdfg = SDFG(original_name)
    for dname, value in sdfg.constants.items():
        new_sdfg.add_constant(dname, value)
    for dname, stype in sdfg.symbols.items():
        new_sdfg.add_symbol(dname, stype)

    changed_stride_state = new_sdfg.add_state("with_changed_strides", is_start_state=True)
    inputs, outputs = sdfg.read_and_write_sets()
    # Get all arrays which are persistent == not transient
    persistent_arrays = {name: desc for name, desc in sdfg.arrays.items() if not desc.transient}

    # Get the persistent arrays of all the transient arrays which get copied to GPU
    for dname in persistent_arrays:
        for access, state in list_access_nodes(sdfg, dname):
            if len(state.out_edges(access)) == 1:
                edge = state.out_edges(access)[0]
                if isinstance(edge.dst, nodes.AccessNode):
                    if edge.dst.data in inputs:
                        inputs.remove(edge.dst.data)
                    inputs.add(dname)
            if len(state.in_edges(access)) == 1:
                edge = state.in_edges(access)[0]
                if isinstance(edge.src, nodes.AccessNode):
                    # Check membership in outputs (not inputs) before removing, otherwise remove() can raise KeyError
                    if edge.src.data in outputs:
                        outputs.remove(edge.src.data)
                    outputs.add(dname)

    # Only keep inputs and outputs which are persistent
    inputs.intersection_update(persistent_arrays.keys())
    outputs.intersection_update(persistent_arrays.keys())
    nsdfg = changed_stride_state.add_nested_sdfg(sdfg, new_sdfg, inputs=inputs, outputs=outputs)
    transform_state = new_sdfg.add_state_before(changed_stride_state, label="transform_data", is_start_state=True)
    transform_state_back = new_sdfg.add_state_after(changed_stride_state, "transform_data_back", is_start_state=False)

    # copy arrays
    for dname, desc in sdfg.arrays.items():
        if not desc.transient:
            if isinstance(desc, Array):
                new_sdfg.add_array(dname, desc.shape, desc.dtype, desc.storage,
                                   desc.location, desc.transient, desc.strides,
                                   desc.offset)
            elif isinstance(desc, Scalar):
                new_sdfg.add_scalar(dname, desc.dtype, desc.storage, desc.transient, desc.lifetime, desc.debuginfo)

    new_order = {}
    new_strides_map = {}

    # Map of array names in the nested sdfg: key: array name in parent sdfg (this sdfg), value: name in the nsdfg
    # Assumes that name changes only appear in the first level of nsdfg nesting
    array_names_map = {}
    for graph in sdfg.sdfg_list:
        if graph.parent_nsdfg_node is not None:
            if graph.parent_sdfg == sdfg:
                for connector in graph.parent_nsdfg_node.in_connectors:
                    for in_edge in graph.parent.in_edges_by_connector(graph.parent_nsdfg_node, connector):
                        array_names_map[str(connector)] = in_edge.data.data

    for containing_sdfg, dname, desc in sdfg.arrays_recursive():
        shape_str = [str(s) for s in desc.shape]
        # Get index of the dimension we want to have stride 1
        stride_one_idx = None
        this_stride_one_value = None
        for dim in stride_one_values:
            if str(dim) in shape_str:
                stride_one_idx = shape_str.index(str(dim))
                this_stride_one_value = dim
                break

        if stride_one_idx is not None:
            new_order[dname] = [stride_one_idx]

            new_strides = list(desc.strides)
            new_strides[stride_one_idx] = sympy.S.One

            previous_size = dace.symbolic.symbol(this_stride_one_value)
            previous_stride = sympy.S.One
            for i in range(len(new_strides)):
                if i != stride_one_idx:
                    new_order[dname].append(i)
                    new_strides[i] = previous_size * previous_stride
                    previous_size = desc.shape[i]
                    previous_stride = new_strides[i]

            new_strides_map[dname] = {}
            # Create a map entry for this data linking old strides to new strides. This assumes that each entry in
            # strides is unique, which holds because otherwise there would be two dimensions i, j where a[i, j]
            # would point to the same address as a[j, i]
            for new_stride, old_stride in zip(new_strides, desc.strides):
                new_strides_map[dname][old_stride] = new_stride
            desc.strides = tuple(new_strides)
        else:
            parent_name = array_names_map[dname] if dname in array_names_map else dname
            if parent_name in new_strides_map:
                new_strides = []
                for stride in desc.strides:
                    new_strides.append(new_strides_map[parent_name][stride])
                desc.strides = tuple(new_strides)

    # Add new flipped arrays for every non-transient array
    flipped_names_map = {}
    for dname, desc in sdfg.arrays.items():
        if not desc.transient:
            flipped_name = f"{dname}_flipped"
            flipped_names_map[dname] = flipped_name
            new_sdfg.add_array(flipped_name, desc.shape, desc.dtype,
                               desc.storage, desc.location, True,
                               desc.strides, desc.offset)

    # Deal with the inputs: Create tasklet to flip them and connect via memlets
    # for input in inputs:
    for input in set([*inputs, *outputs]):
        if input in new_order:
            flipped_data = flipped_names_map[input]
            if input in inputs:
                changed_stride_state.add_memlet_path(changed_stride_state.add_access(flipped_data), nsdfg,
                                                     dst_conn=input, memlet=Memlet(data=flipped_data))
            # Simply need to copy the data, the different strides take care of the transposing
            arr = sdfg.arrays[input]
            tasklet, map_entry, map_exit = transform_state.add_mapped_tasklet(
                name=f"transpose_{input}",
                map_ranges={f"_i{i}": f"0:{s}" for i, s in enumerate(arr.shape)},
                inputs={'_in': Memlet(data=input, subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
                code='_out = _in',
                outputs={'_out': Memlet(data=flipped_data,
                                        subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
                external_edges=True,
                schedule=schedule,
            )
    # Do the same for the outputs
    for output in outputs:
        if output in new_order:
            flipped_data = flipped_names_map[output]
            changed_stride_state.add_memlet_path(nsdfg, changed_stride_state.add_access(flipped_data),
                                                 src_conn=output, memlet=Memlet(data=flipped_data))
            # Simply need to copy the data, the different strides take care of the transposing
            arr = sdfg.arrays[output]
            tasklet, map_entry, map_exit = transform_state_back.add_mapped_tasklet(
                name=f"transpose_{output}",
                map_ranges={f"_i{i}": f"0:{s}" for i, s in enumerate(arr.shape)},
                inputs={'_in': Memlet(data=flipped_data,
                                      subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
                code='_out = _in',
                outputs={'_out': Memlet(data=output, subset=", ".join(f"_i{i}" for i, _ in enumerate(arr.shape)))},
                external_edges=True,
                schedule=schedule,
            )
    # Deal with any arrays which have not been flipped (should only be scalars). Connect them directly
    for dname, desc in sdfg.arrays.items():
        if not desc.transient and dname not in new_order:
            if dname in inputs:
                changed_stride_state.add_memlet_path(changed_stride_state.add_access(dname), nsdfg, dst_conn=dname,
                                                     memlet=Memlet(data=dname))
            if dname in outputs:
                changed_stride_state.add_memlet_path(nsdfg, changed_stride_state.add_access(dname), src_conn=dname,
                                                     memlet=Memlet(data=dname))

    return new_sdfg
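The stride recomputation in `change_strides` can be followed in isolation with plain integers. The sketch below is not part of the PR: the helper name `strides_with_unit_dim` and the concrete sizes are assumptions made for illustration, and integers stand in for the symbolic sizes used by DaCe. It gives the chosen dimension stride 1 and assigns every other dimension, in its original order, the product of the sizes placed before it.

```python
from typing import List, Sequence


def strides_with_unit_dim(shape: Sequence[int], stride_one_idx: int) -> List[int]:
    """Return strides in which dimension `stride_one_idx` is contiguous.

    Hypothetical helper that follows the same loop structure as the
    stride update in change_strides, using integers instead of symbols.
    """
    strides = [0] * len(shape)
    strides[stride_one_idx] = 1
    previous_size = shape[stride_one_idx]
    previous_stride = 1
    for i in range(len(shape)):
        if i != stride_one_idx:
            # The next dimension is laid out after everything placed so far.
            strides[i] = previous_size * previous_stride
            previous_size = shape[i]
            previous_stride = strides[i]
    return strides


# Assumed concrete sizes standing in for the symbols N and M.
N, M = 4, 5
print(strides_with_unit_dim([N, M], 0))     # corresponds to (1, N)
print(strides_with_unit_dim([N, M, 3], 0))  # corresponds to (1, N, M*N)
```

With `N = 4` and `M = 5` the two calls yield `[1, 4]` and `[1, 4, 20]`, which are the integer counterparts of the strides `(1, N)` and `(1, N, M*N)`.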
113 changes: 113 additions & 0 deletions
dace/transformation/interstate/move_assignment_outside_if.py
@@ -0,0 +1,113 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
"""
Transformation to move assignments outside if statements to potentially avoid warp divergence. Speedup gained is
questionable.
"""

import ast
import sympy as sp

from dace import sdfg as sd
from dace.sdfg import graph as gr
from dace.sdfg.nodes import Tasklet, AccessNode
from dace.transformation import transformation


class MoveAssignmentOutsideIf(transformation.MultiStateTransformation):

    if_guard = transformation.PatternNode(sd.SDFGState)
    if_stmt = transformation.PatternNode(sd.SDFGState)
    else_stmt = transformation.PatternNode(sd.SDFGState)

    @classmethod
    def expressions(cls):
        sdfg = gr.OrderedDiGraph()
        sdfg.add_nodes_from([cls.if_guard, cls.if_stmt, cls.else_stmt])
        sdfg.add_edge(cls.if_guard, cls.if_stmt, sd.InterstateEdge())
        sdfg.add_edge(cls.if_guard, cls.else_stmt, sd.InterstateEdge())
        return [sdfg]

    def can_be_applied(self, graph, expr_index, sdfg, permissive=False):
        # The if-guard can only have two outgoing edges: to the if and to the else part
        guard_outedges = graph.out_edges(self.if_guard)
        if len(guard_outedges) != 2:
            return False

        # Outgoing edges must be a negation of each other
        if guard_outedges[0].data.condition_sympy() != (sp.Not(guard_outedges[1].data.condition_sympy())):
            return False

        # The if guard should either have zero or one incoming edge
        if len(sdfg.in_edges(self.if_guard)) > 1:
            return False

        # set of the variables which get a const value assigned
        assigned_const = set()
        # Dict which collects all AccessNodes for each variable together with its state
        access_nodes = {}
        # set of the variables which are only written to
        self.write_only_values = set()
        # Dictionary which stores additional information for the variables which are written only
        self.assign_context = {}
        for state in [self.if_stmt, self.else_stmt]:
            for node in state.nodes():
                if isinstance(node, Tasklet):
                    # If node is a tasklet, check if it assigns a constant value
                    assigns_const = True
                    for code_stmt in node.code.code:
                        if not (isinstance(code_stmt, ast.Assign) and isinstance(code_stmt.value, ast.Constant)):
                            assigns_const = False
                    if assigns_const:
                        for edge in state.out_edges(node):
                            if isinstance(edge.dst, AccessNode):
                                assigned_const.add(edge.dst.data)
                                self.assign_context[edge.dst.data] = {"state": state, "tasklet": node}
                elif isinstance(node, AccessNode):
                    if node.data not in access_nodes:
                        access_nodes[node.data] = []
                    access_nodes[node.data].append((node, state))

        # check that the found access nodes only get written to
        for data, nodes in access_nodes.items():
            write_only = True
            for node, state in nodes:
                if node.has_reads(state):
                    # The read is only a problem if it is not written before -> the access node has no incoming edge
                    if state.in_degree(node) == 0:
                        write_only = False
                    else:
                        # There is also a problem if any edge is an update instead of write.
                        # Both incoming and outgoing edges are checked (the original listed out_edges twice).
                        for edge in [*state.in_edges(node), *state.out_edges(node)]:
                            if edge.data.wcr is not None:
                                write_only = False

            if write_only:
                self.write_only_values.add(data)

        # Want only the values which are only written to and one option uses a constant value
        self.write_only_values = assigned_const.intersection(self.write_only_values)

        if len(self.write_only_values) == 0:
            return False
        return True

    def apply(self, _, sdfg: sd.SDFG):
        # create a new state before the guard state where the constant assignment happens
        new_assign_state = sdfg.add_state_before(self.if_guard, label="const_assignment_state")

        # Move all the Tasklets together with the AccessNode
        for value in self.write_only_values:
            state = self.assign_context[value]["state"]
            tasklet = self.assign_context[value]["tasklet"]
            new_assign_state.add_node(tasklet)
            for edge in state.out_edges(tasklet):
                state.remove_edge(edge)
                state.remove_node(edge.dst)
                new_assign_state.add_node(edge.dst)
                new_assign_state.add_edge(tasklet, edge.src_conn, edge.dst, edge.dst_conn, edge.data)

            state.remove_node(tasklet)
            # Remove the state if it was emptied
            if state.is_empty():
                sdfg.remove_node(state)
        return sdfg
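The constant-assignment test in `can_be_applied` relies only on the standard `ast` module, so it can be tried without DaCe. The following is a minimal sketch under that assumption: the helper name `assigns_only_constants` is hypothetical, and it parses a source string instead of reading the already parsed `node.code.code` of a tasklet.

```python
import ast


def assigns_only_constants(code: str) -> bool:
    """True if every statement is an assignment whose right-hand side is a literal constant.

    Hypothetical helper that applies the same isinstance checks as the
    tasklet loop in can_be_applied to a plain source string.
    """
    statements = ast.parse(code).body
    return all(
        isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Constant)
        for stmt in statements)


print(assigns_only_constants("out = 0.0"))    # True
print(assigns_only_constants("out = a + 1"))  # False
print(assigns_only_constants("out = -1"))     # False
```

The last case is worth noting: `ast.parse` represents `-1` as a unary minus applied to the constant `1`, so a check of this form does not treat negative literals as constants.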
@@ -0,0 +1,48 @@
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.
import dace
from dace import nodes
from dace.dtypes import ScheduleType
from dace.memlet import Memlet
from dace.transformation.change_strides import change_strides


def change_strides_test():
    sdfg = dace.SDFG('change_strides_test')
    N = dace.symbol('N')
    M = dace.symbol('M')
    sdfg.add_array('A', [N, M], dace.float64)
    sdfg.add_array('B', [N, M, 3], dace.float64)
    state = sdfg.add_state()

    task1, mentry1, mexit1 = state.add_mapped_tasklet(
        name="map1",
        map_ranges={'i': '0:N', 'j': '0:M'},
        inputs={'a': Memlet(data='A', subset='i, j')},
        outputs={'b': Memlet(data='B', subset='i, j, 0')},
        code='b = a + 1',
        external_edges=True,
        propagate=True)

    # Check that states are as expected
    changed_sdfg = change_strides(sdfg, ['N'], ScheduleType.Sequential)
    assert len(changed_sdfg.states()) == 3
    assert len(changed_sdfg.out_edges(changed_sdfg.start_state)) == 1
    work_state = changed_sdfg.out_edges(changed_sdfg.start_state)[0].dst
    nsdfg = None
    for node in work_state.nodes():
        if isinstance(node, nodes.NestedSDFG):
            nsdfg = node
    # Check shape and strides of data inside nested SDFG
    assert nsdfg is not None
    assert nsdfg.sdfg.data('A').shape == (N, M)
    assert nsdfg.sdfg.data('B').shape == (N, M, 3)
    assert nsdfg.sdfg.data('A').strides == (1, N)
    assert nsdfg.sdfg.data('B').strides == (1, N, M*N)


def main():
    change_strides_test()


if __name__ == '__main__':
    main()
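What the asserted strides mean for the memory layout, and why an element-wise copy is enough to transpose, can be shown with flat Python lists. This is only an illustrative sketch: `flat_offset` and the concrete sizes are assumptions, and the loop mimics the `_out = _in` tasklet by using the same logical index on both sides.

```python
from itertools import product
from typing import Sequence


def flat_offset(index: Sequence[int], strides: Sequence[int]) -> int:
    """Linear offset of a multi-dimensional index for the given strides."""
    return sum(i * s for i, s in zip(index, strides))


# Assumed concrete sizes standing in for the symbols N and M.
N, M = 3, 2
row_major = (M, 1)  # default layout of A[N, M]
unit_in_n = (1, N)  # layout asserted in the test for A

src = list(range(N * M))  # A stored with the default strides
dst = [None] * (N * M)    # flipped copy: same shape, new strides
for i, j in product(range(N), range(M)):
    # Same logical index on both sides; only the strides differ.
    dst[flat_offset((i, j), unit_in_n)] = src[flat_offset((i, j), row_major)]

print(dst)
```

In the resulting buffer, neighbouring values of the first index sit next to each other in memory, which is the property the `(1, N)` strides in the test express.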
will not work for GPU maps.
Could you elaborate on why you think this would not work for GPU maps? I added this because I had expanded maps that were not sequential, and suddenly one of the maps was sequential, which goes against my understanding of what MapExpansion should do. However, when I disable this change, some of the failing tests pass again (maybe all; I didn't check all of them).
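The last two commits in the list describe the outcome of this thread: a property selects the schedule of the inner maps, and `None` means the original map's schedule is used. The fallback rule can be sketched in plain Python as follows. The `Schedule` enum and the `inner_schedule` function are hypothetical stand-ins and do not reproduce the actual MapExpansion code or DaCe's `ScheduleType`.

```python
from enum import Enum, auto
from typing import Optional


class Schedule(Enum):
    """Hypothetical stand-in for a schedule enumeration."""
    DEFAULT = auto()
    SEQUENTIAL = auto()
    GPU_DEVICE = auto()


def inner_schedule(original: Schedule, override: Optional[Schedule]) -> Schedule:
    """Schedule for the inner maps: None keeps the original map's schedule."""
    return original if override is None else override


print(inner_schedule(Schedule.GPU_DEVICE, None))
print(inner_schedule(Schedule.GPU_DEVICE, Schedule.SEQUENTIAL))
```

Under this rule a map keeps its schedule unless the caller explicitly asks for a different one, so an expansion never becomes sequential by accident.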