Replies: 1 comment 16 replies
-
Hi, yes, Triton supports int8. Could you please share the error that you see? Maybe I can help fix it.
-
Is anyone doing fp32-to-int8 quantization on x86? I've been looking at some examples and trying to do this for test_matmul.py in the triton_shared/python/examples dir, but haven't been able to get it working. Does Triton support int8 at all? For example, if I modify the MLIR file produced by --triton-to-linalg-experimental by changing f32 to i8, I run into translation errors when lowering to affine loops. If I instead change float32 to int8 at the source level in test_matmul.py, I get an error suggesting int8 isn't supported in triton_shared: RuntimeError: "normal_kernel_cpu" not implemented for 'Char'. I have seen https://pytorch.org/blog/int8-quantization/ but could not get that approach to work for test_matmul.py.
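For what it's worth, that particular RuntimeError comes from PyTorch's input setup rather than from Triton or triton_shared: torch.randn samples a normal distribution, which is only implemented for floating-point dtypes, so requesting int8 ('Char') fails before any kernel runs. A minimal sketch of a workaround, assuming the test inputs are built with torch.randint instead (shapes and value ranges here are illustrative, not taken from test_matmul.py):

```python
import torch

# torch.randn(...) with dtype=torch.int8 raises:
#   RuntimeError: "normal_kernel_cpu" not implemented for 'Char'
# because normal sampling only exists for float dtypes.
# Draw random integers directly instead (high bound is exclusive,
# so 128 keeps values in the int8 range [-128, 127]):
a = torch.randint(-128, 128, (512, 512), dtype=torch.int8)
b = torch.randint(-128, 128, (512, 512), dtype=torch.int8)

# Reference matmul: accumulate in int32 to avoid overflowing int8
# when summing products of int8 values.
c_ref = torch.matmul(a.to(torch.int32), b.to(torch.int32))
```

This only fixes the host-side initialization; whether the lowered kernel itself handles i8 operands is a separate question about the triton-to-linalg pipeline.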