-
I have a question: does the 4090 support FP8 GEMM? There seems to be no specialization for SM89 (like GMMA); I only found the SM90 specialization in CuTe. Does sm89 use the same MMA instruction as SM90? I also want to ask whether CUDA Math will support FP8 arithmetic instructions like add, multiply, etc. in the future. I only found fp8 data-type conversion functions in the CUDA documentation.
-
`sm_90`/`sm_90a` has a lot more features than `sm_89`. At the moment, CUTLASS only supports fp8 for `sm_90`/`sm_90a`, due to `nvcc` lacking `sm_89` fp8 support. See the previous discussion: "it is coming."
-
Not on Ada; only on Hopper. You could convert them to fp16 first and use hadd2 and hmul2.
-
FWIW, this seems to be supported in the latest CUTLASS 3.5 (requires CUDA 12.4+).