Replies: 1 comment
Yes, if we have a remsi, we will end up with two
You're right. The remsi support we recently added handles code that is similar in nature to the Triton matmul tutorial. Basically, if we distribute a tensor of [ Our implementation is very primitive: it only supports the particular use case above, and we mod by either row or column, but not both. For more details, please take a look at these two diagrams from our
Important assumptions:
This feature has a lot of complexity due to its interaction with Triton masks, incrementing the offsets during a loop, and so on, so we aim to have only the basic and important cases working. It is hard to figure everything out statically when Triton is dynamic by nature. 😄
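To make the wraparound case concrete, here is a minimal plain-Python sketch of how a block of offsets taken modulo one dimension splits into at most two contiguous ranges. It is only an illustration, not the project's implementation: the helper name `split_wrapped_block` and the sizes used are hypothetical, and it assumes non-negative offsets and a block no larger than the dimension.

```python
def split_wrapped_block(start, block_size, dim_size):
    """Split the offsets (start + i) % dim_size, for i in range(block_size),
    into at most two contiguous (first_offset, length) ranges.

    Hypothetical helper for illustration; assumes 0 < block_size <= dim_size
    and a non-negative start.
    """
    if not 0 < block_size <= dim_size:
        raise ValueError("block_size must be in (0, dim_size]")
    start %= dim_size
    # Elements that fit before the end of the dimension.
    first_len = min(block_size, dim_size - start)
    # Elements that wrap around to the beginning of the dimension.
    second_len = block_size - first_len
    ranges = [(start, first_len)]
    if second_len:
        ranges.append((0, second_len))
    return ranges


# A block of 4 offsets starting at 6 in a dimension of size 8
# touches offsets 6, 7, 0, 1: one range at the end, one at the start.
print([(6 + i) % 8 for i in range(4)])
print(split_wrapped_block(6, 4, 8))
```

When the block does not cross the end of the dimension, the second range is empty and only one range is returned.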
I have added some more support, as we recently discussed in #68. A 1D tensor doesn't work due to an assert, but we can get around that by doing
Technically we can't support
I hope this helps! Thanks again for your interest in the project.
Hello community,
I have some questions about the recently added implementation of remsi.
From my understanding, if we have an array of ptrs with offsets like:
then with a remsi [2, 2, 2, 2] applied we will get:
So in the end we have 2 contiguous memrefs to handle, each of size 2.
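As a rough way to check this reasoning, the sketch below applies an elementwise remainder to a list of offsets and groups the result into contiguous runs. The input offsets `[0, 1, 2, 3]` are a hypothetical example chosen for illustration, not the values from the post, and `contiguous_runs` is a made-up helper name. Python's `%` matches a signed remainder only for non-negative operands, which is what this assumes.

```python
def contiguous_runs(offsets, moduli):
    """Apply an elementwise remainder, then group the wrapped offsets into
    contiguous (first_offset, length) runs.

    Hypothetical helper; assumes non-negative offsets and positive moduli.
    """
    wrapped = [o % m for o, m in zip(offsets, moduli)]
    runs = []
    for value in wrapped:
        if runs and value == runs[-1][0] + runs[-1][1]:
            # The value directly follows the current run, so extend it.
            first, length = runs[-1]
            runs[-1] = (first, length + 1)
        else:
            # Otherwise a new contiguous run starts here.
            runs.append((value, 1))
    return wrapped, runs


wrapped, runs = contiguous_runs([0, 1, 2, 3], [2, 2, 2, 2])
print(wrapped)  # wrapped offsets after the remainder
print(runs)     # contiguous (first_offset, length) runs
```

With these assumed inputs the wrapped offsets are `[0, 1, 0, 1]`, which form two runs of length 2.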