Why use floor divide in shape_div? #1770
Q: I am using the composition() function and got an unexpected result. My input was: lhs = (_256,(_32,_4)):(_32,(_1,_8192)). The result was: (200,(4,3)):(_32,(_8,_8192)). I was expecting: (200,(4,4)):(_32,(_8,_8192)). I believe the issue arises because domain<1> of rhs is 13 instead of 12. I also found that shape_div(), which is used in composition_impl(), uses floor division rather than ceiling division. Can you help clarify this behavior? Thanks
Replies: 1 comment 3 replies
Good catch. At the moment, this is considered a violation of the "divisibility condition" mentioned in the documentation. The static version fails
That said, I agree that the divisibility condition is actually too tight in cases like these and the above SHOULD work; this is a known class of bugs. Fortunately, we have not found any applications that need this generalization, but, unfortunately, supporting it is a bit more complex than simply rounding up. In the near future, I plan to release a much more formal treatment of CuTe in a whitepaper along with some non-critical code updates and generalizations like this one.
After a cup of coffee on an actual workday, I realize that this particular `composition` case should fail and the current behavior is correct. One post-condition of `composition` is that the result is `compatible` with the rhs input, so `composition` can never perform any rounding at all. Because there is no possible output that satisfies all of the post-conditions of `composition`, it should fail on these inputs (perhaps with better runtime assertions, of course).

There is a related set of known artificial limitations around `composition` and `logical_divide` that can be loosened, but this problem is not an example of them.
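
To make the arithmetic behind that conclusion concrete, here is a minimal sketch in plain C++. It does not use CuTe. It assumes, from the shapes in the question, that the second rhs mode has extent 13 and that the first factor of the corresponding result mode is 4. It also assumes that compatibility requires the result mode to cover exactly as many elements as the rhs mode. The helper names `floor_div`, `ceil_div`, `splits_exactly`, and `exact_outer_extent` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; these are not CuTe functions.
// Both assume a >= 0 and b > 0.
constexpr int floor_div(int a, int b) { return a / b; }
constexpr int ceil_div(int a, int b)  { return (a + b - 1) / b; }

// Numbers read off the question (an assumption, see the text above).
constexpr int rhs_extent   = 13;  // extent of the second rhs mode
constexpr int inner_extent = 4;   // first factor of the result mode

// Floor rounding gives the reported (4,3) mode, which covers 12 elements.
static_assert(inner_extent * floor_div(rhs_extent, inner_extent) == 12, "");
// Ceil rounding would give the expected (4,4) mode, which covers 16 elements.
static_assert(inner_extent * ceil_div(rhs_extent, inner_extent) == 16, "");
// Neither choice covers exactly the 13 elements of the rhs mode.
static_assert(inner_extent * floor_div(rhs_extent, inner_extent) != rhs_extent, "");
static_assert(inner_extent * ceil_div(rhs_extent, inner_extent) != rhs_extent, "");

// True only when an (inner, outer) split covers the extent exactly.
constexpr bool splits_exactly(int extent, int inner) {
  return extent % inner == 0;
}

// Runtime analogue: refuse to split instead of rounding in either direction.
inline int exact_outer_extent(int extent, int inner) {
  assert(extent % inner == 0 && "divisibility condition violated");
  return extent / inner;
}
```

Under these assumptions, an rhs mode of extent 12 would split cleanly into (4,3), while extent 13 has no exact split, which matches the statement that no output can satisfy all of the post-conditions.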