-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New and Improved MapFusion #1629
base: main
Are you sure you want to change the base?
New and Improved MapFusion #1629
Conversation
Now using the 3.9 type hints.
But it is too restrictive.
When the function was fixing the innteriour of the second map, it did not remove the readiong.
It almost passes all fuction. However, the one that needs renaming are not yet done.
…t in the input and output set. However, it is very simple.
Before it was going to look for the memlet of the consumer or producer. However, one should actually only look at the memlets that are adjacent to the scope node. At least this is how the original worked. I noticed this because of the `buffer_tiling_test.py::test_basic()` test. I was not yet focused on maps that were nested and not multidimensional. It seems that the transformation has some problems there.
Whet it now cheks for covering (i.e. if the information to exchange is enough) it will now no longer decend into the maps, but only inspect the first outgoing/incomming edges of the map entrie and exit. I noticed that the other way was to restrictive, especially for map tiling.
Otherwise we can end up in recursion.
Before it was replacing the elimated variables by zero. Which actually worked pretty good, but I have now changed that such that `offset()` is used. I am not sure why I used `replace` in the first place, but I think that there was an issue. However, I am not sure.
…and_write_sets()`.
…so Issue spcl#1634 for more details. As it is written in the issue, I can not simply remove the check but I also have to adapte the tests. The main important one is `tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map` where the behaviour has changed. However, after carefull examination I am sure that the test is still correct, or better now works correct as there is no dependency.
…sabled in 5c49eee. The reason is because `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` fails (in auto optimizer mode). I remember that now. Also the issue is 1643.
Before it was a DaCe Property, but I relaized now that it should actually be a plain data member. This also solves lots of issues I had with serialization.
I will now add a very complicated test to ensure that it realy does what I want.
Let's see if the CI can handle it.
I have some trouble understanding the difference between -Serial and -Parallel and why both are necessary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that MapFusion has issues - but I am slightly confused by the current approach to provide multiple versions - it seems as though fixes to the original should be preferrable. Why should we maintain many different implementations, and not have one maintained one, and deprecate the rest? @tbennun - what is your view?
dace/sdfg/state.py
Outdated
# Union all subgraphs, so an array that was excluded from the read | ||
# set because it was written first is still included if it is read | ||
# in another subgraph | ||
for data, accesses in rs.items(): | ||
read_set[data] += accesses | ||
for data, accesses in ws.items(): | ||
write_set[data] += accesses | ||
return read_set, write_set | ||
return copy.deepcopy((read_set, write_set)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it necessary to make a copy here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because, the subsets are still linked to the Memlets, so if you modify them then you change them at the Memlets.
This might be useful but since it is not possible to determine which subset belongs to which memlet, it does not make sense to maintain this link.
You could potentially copy them above, because some of them are constructed on demand anyway, however:
- The on demand construction is the minority case.
- Doing it here allows to do it in one big sweep.
dace/sdfg/state.py
Outdated
for e in out_edges: | ||
# skip empty memlets | ||
if e.data.is_empty(): | ||
if not isinstance(n, nd.AccessNode): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add more comments to make it easier to follow what is being done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% about the naming solution - having 3/5 different kinds of MapFusion seems like a poor solution that will lead to confusion - between Serial,Parallel, OTF and the original MapFusions versions are proliferating. Would it not be preferrable to fix/choose one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but the original map fusion was unable to do parallel fusing, the available solutions are:
- Combining serial and parallel map fusion into one
- Doing the above one but making parallel an opt in option.
The best solution is probably 2. because then everything will work as before. But I do not care what do you think @tbennun @acalotoiu
Now only the serial version is there. Also integrated the helper into the serial file.
This reverts commit 259d17c.
…f data can be removed. This is because the function is much less strict.
Before the intermediate scalar was a Register, but not the array. I noticed that some rediundand array removal transformation have problems with that. An alternative would be to make the array intermediate a register.
A new and improved version of the map fusion transformation.
The transformation is implemented in a class named
MapFusionSerial
, furthermore theMapFusionParallel
transformation is added, that allows to fuse parallel maps together.The new transformation analyses the graph more carefully when it checks if and how it should perform the fusing.
Special consideration was given about the correction of memlets.
However, there is still some aspects that should be improved and allowed to handle.
The new transformation produces graphs that are slightly different from before, and certain (other) transformations can not handle the resulting SDFG. For that reason a compatibility flag
strict_dataflow
was introduced. However, by default this flag is disabled. The only place where it is activated is inside the auto optimization function.Furthermore, the
SDFGState._read_and_write_sets()
function has been rewritten to handle the new SDFGs, because of some bugs. However, one bug has been kept because of other transformations that would fail otherwise.But it is a bug, tests were written to demonstrate this.
Collection of known issues in other transformation:
RefineNestedAccess
and `SDFGState._read_and_write_sets()