GPTQModel v1.5.0

Released by @Qubitium on 24 Dec 02:01 · commit 4197cd8

What's Changed

⚡ Added optimized quantization support for the multi-modal (image-to-text) models Qwen 2-VL and Ovis 1.6-VL. Earlier image-to-text model quantizations did not use image calibration data, which led to suboptimal post-quantization results. Version 1.5.0 is the first release to provide a stable path for multi-modal quantization; only the text layers are quantized.
🐛 Fixed VRAM usage during Qwen 2-VL model quantization and the post-quantization copy of relevant config files.
🐛 Fixed install and compilation in environments with a wrong TORCH_CUDA_ARCH_LIST set (NVIDIA Docker images).
🐛 Added a warning for a bad torch[cuda] install on Windows.
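
For readers hitting the TORCH_CUDA_ARCH_LIST issue in their own build scripts, here is a minimal sketch of how such a value could be sanitized before compiling. It is not the code used in GPTQModel: the helper name `filter_arch_list` and the minimum compute capability of 6.0 are assumptions made for illustration. The sketch only relies on the documented format of the variable, a list of entries such as `8.0` or `8.6+PTX` separated by spaces or semicolons.

```python
import re

# Matches numeric entries such as "8.0" or "8.6+PTX".
_ARCH_RE = re.compile(r"^(\d+)\.(\d+)(\+PTX)?$")


def filter_arch_list(value, min_capability=(6, 0)):
    """Drop numeric entries below min_capability (hypothetical threshold).

    Entries that are not numeric (for example named architectures) are
    kept unchanged. The result is joined with semicolons.
    """
    kept = []
    for entry in re.split(r"[;\s]+", value.strip()):
        if not entry:
            continue
        match = _ARCH_RE.match(entry)
        if match and (int(match.group(1)), int(match.group(2))) < min_capability:
            continue  # too old for the assumed minimum, skip it
        kept.append(entry)
    return ";".join(kept)


if __name__ == "__main__":
    print(filter_arch_list("5.2 6.0 8.6+PTX"))  # prints 6.0;8.6+PTX
```

A build script could assign the filtered string back to `os.environ["TORCH_CUDA_ARCH_LIST"]` before invoking the extension build; whether 6.0 is the right cutoff depends on the kernels being compiled.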

Full Changelog: v1.4.5...v1.5.0