
8-bit quantization MVP #347

Closed · 21 of 22 tasks
robertknight opened this issue Sep 6, 2024 · 1 comment

Labels
quantization Issues related to support for quantized data types or operations

Comments

robertknight (Owner) commented Sep 6, 2024

This issue tracks the work involved in an MVP of 8-bit quantization support. The goal is to be able to convert and run:

@robertknight robertknight pinned this issue Oct 13, 2024
@robertknight robertknight added the quantization Issues related to support for quantized data types or operations label Feb 3, 2025
robertknight (Owner, Author) commented

Initial support for running quantized models has been released as part of v0.16.0. The quantization guide has more details, along with steps for quantizing ONNX models using the recommended settings.
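
For illustration, here is a minimal sketch of one way to produce an 8-bit quantized ONNX model using ONNX Runtime's dynamic quantization API. The file paths and the choice of dynamic quantization with `QInt8` weights are assumptions for this example; the quantization guide is the authoritative source for the recommended settings.

```python
# Illustrative sketch: quantize an ONNX model's weights to 8 bits using
# ONNX Runtime's dynamic quantization tooling. File names are placeholders;
# see the project's quantization guide for the recommended settings.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",         # original float32 model (placeholder path)
    model_output="model.quant.onnx",  # quantized output (placeholder path)
    weight_type=QuantType.QInt8,      # store weights as signed 8-bit integers
)
```

The quantized ONNX file can then be converted to the runtime's own model format and run as described in the guide.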

@robertknight robertknight unpinned this issue Feb 8, 2025