-
cutlass/include/cutlass/gemm/threadblock/index_remat.h, lines 65 to 69 in 5c447dd

Can you help me understand why it helps reduce register liveness, as the comment says?
Replies: 2 comments 2 replies
-
Instead of computing a value once and keeping it in a register across all of its uses, we prefer to recompute simple values each time they are used, so that more registers stay free in between.
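A minimal sketch of the two patterns being contrasted, not the actual CUTLASS code. The kernel names and the helper `remat_thread_idx_x` are hypothetical, and the loop is only a stand-in for a register-hungry main loop. The source-level pattern is a hint, so the compiler may still choose to keep or re-read the value; compiling with `nvcc -Xptxas -v` and comparing the reported register counts is one way to check what happened.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: re-read the special register at each use instead of
// holding its value in a general-purpose register.
__device__ __forceinline__ int remat_thread_idx_x() { return threadIdx.x; }

// "Instead of": tid is defined before the loop and used again after it,
// so its value is live across the whole loop.
__global__ void keep_live(const float* in, float* out, int iters) {
  int tid = threadIdx.x;
  float acc = in[tid];
  for (int i = 0; i < iters; ++i) {
    acc = acc * 1.0001f + 1.0f;  // stand-in for the real main loop
  }
  out[tid] = acc;
}

// "We prefer": the index is recomputed at each use, so nothing related to it
// has to stay live across the loop.
__global__ void rematerialize(const float* in, float* out, int iters) {
  float acc = in[remat_thread_idx_x()];
  for (int i = 0; i < iters; ++i) {
    acc = acc * 1.0001f + 1.0f;
  }
  out[remat_thread_idx_x()] = acc;
}

int main() {
  const int n = 32;
  float h_in[n], h_a[n], h_b[n];
  for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

  float *d_in, *d_a, *d_b;
  cudaMalloc(&d_in, n * sizeof(float));
  cudaMalloc(&d_a, n * sizeof(float));
  cudaMalloc(&d_b, n * sizeof(float));
  cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

  keep_live<<<1, n>>>(d_in, d_a, 100);
  rematerialize<<<1, n>>>(d_in, d_b, 100);

  cudaMemcpy(h_a, d_a, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaMemcpy(h_b, d_b, n * sizeof(float), cudaMemcpyDeviceToHost);

  // Both kernels perform the same arithmetic; only the liveness hint differs.
  bool same = true;
  for (int i = 0; i < n; ++i) same = same && (h_a[i] == h_b[i]);
  std::printf("%s\n", same ? "results match" : "results differ");

  cudaFree(d_in);
  cudaFree(d_a);
  cudaFree(d_b);
  return same ? 0 : 1;
}
```

Reading `threadIdx.x` again is cheap because it comes from a special register, whereas a value held across a long loop occupies a general-purpose register the loop body could otherwise use.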
-
Drat, I was hoping the CUDA compiler did that optimization on its own. I can think of many places in my ML library where I failed to apply it.