1-bit acceleration support #7

NicoNico6 · 2023-04-14T23:11:09Z

Hi, this is really good work, and I appreciate it a lot.

I am curious whether Triton can support 1-bit acceleration for MMA, and whether that could then be applied to 1-bit GPTQ.

fpgaminer · 2023-04-14T23:14:58Z

Thanks.

What do you mean by MMA?

I might add support for more bit widths if there's demand for it. AFAIK 4-bit is "optimal", which is why I've focused there so far.

NicoNico6 · 2023-04-17T14:25:21Z

Hi,

By MMA I mean the matrix multiplication API in the Tensor Core library.

Since I am working on binary neural networks, I am wondering whether it is possible to write a 1-bit implementation of LLM acceleration using Triton.
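
To make the arithmetic concrete, here is a minimal pure-Python sketch of a 1-bit matrix multiply, assuming weights and activations are both constrained to +1/-1 and packed one value per bit. It is only an illustration of the XOR-and-popcount identity that binary neural networks typically rely on; it is not Triton code and says nothing about what Triton or the tensor core API actually exposes. The helper names (`pack_signs`, `binary_dot`, `binary_matmul`) are hypothetical.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int, one bit per value (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Matching bit positions contribute +1 and differing positions contribute -1,
    so dot = n - 2 * popcount(a XOR b).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


def binary_matmul(a_rows, b_cols):
    """Multiply A (list of rows) by B (given as a list of columns); all entries are +1/-1."""
    n = len(a_rows[0])
    a_packed = [pack_signs(r) for r in a_rows]
    b_packed = [pack_signs(c) for c in b_cols]
    return [[binary_dot(a, b, n) for b in b_packed] for a in a_packed]


# Small example: A is 2x4, B is 4x2 (stored column by column).
A = [[1, -1, 1, 1], [-1, -1, 1, -1]]
B_cols = [[1, 1, -1, 1], [-1, 1, 1, 1]]
result = binary_matmul(A, B_cols)
```

The question is whether the same inner loop (XOR plus popcount over packed words) can be mapped onto hardware matrix instructions from a Triton kernel, rather than computed with ordinary integer operations as above.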

Thanks a lot for your answer and help!
