By MMA, I mean the matrix multiplication API in the Tensor Core library.
Since I am working on binary neural networks, I am wondering whether it is possible to write a 1-bit implementation of LLM acceleration using the Triton library.
Hi, really good work. I appreciate it a lot.
I am curious whether Triton can support 1-bit acceleration for MMA, and whether that could be extended to 1-bit GPTQ.
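For context, here is a rough sketch of the arithmetic a 1-bit matrix multiply usually performs in binary neural networks, assuming both operands take values in {-1, +1} and are packed one bit per element. The dot product then reduces to an XOR followed by a popcount. This is plain Python for illustration only, not Triton code, and the function names (`pack_signs`, `binary_dot`, `binary_matmul`) are hypothetical.

```python
def pack_signs(row):
    """Pack a list of +1/-1 values into one int: +1 -> bit 1, -1 -> bit 0."""
    word = 0
    for i, v in enumerate(row):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, k):
    """Dot product of two packed {-1, +1} vectors of length k."""
    # XOR sets a bit wherever the two signs differ; popcount counts those bits.
    mismatches = bin(a_word ^ b_word).count("1")
    # Matching positions contribute +1, mismatching ones -1.
    return k - 2 * mismatches


def binary_matmul(a, b):
    """Compute a @ b^T for a (M x K) and b (N x K), both lists of +1/-1 rows."""
    k = len(a[0])
    a_words = [pack_signs(r) for r in a]
    b_words = [pack_signs(r) for r in b]
    return [[binary_dot(x, y, k) for y in b_words] for x in a_words]


if __name__ == "__main__":
    a = [[1, -1, 1], [-1, -1, 1]]
    b = [[1, 1, 1], [-1, 1, -1]]
    print(binary_matmul(a, b))
```

A hardware 1-bit MMA would do the same XOR and popcount on packed words, so the question is essentially whether Triton can express or lower to that kind of operation.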