Release GPTQModel v1.5.0 · ModelCloud/GPTQModel

What's Changed

⚡ Added optimized quantization support for the multi-modal (image-to-text) models Qwen 2-VL and Ovis 1.6-VL. Earlier image-to-text model quantizations did not use image calibration data, which led to suboptimal post-quantization results. Version 1.5.0 is the first release to provide a stable path for multi-modal quantization: only text layers are quantized.
🐛 Fixed VRAM usage during Qwen 2-VL model quantization and the post-quantization copy of relevant config files.
🐛 Fixed install/compilation in environments with a wrong TORCH_CUDA_ARCH_LIST set (Nvidia Docker images).
🐛 Added a warning about a bad torch[cuda] install on Windows.

Fix backend not ipex by @CSY-ModelCloud in #930
Fix broken ipex check by @Qubitium in #933
Fix dynamic_cuda validation by @CSY-ModelCloud in #936
Fix bdist_wheel does not exist on old setuptools by @CSY-ModelCloud in #939
Add cuda warning on windows by @CSY-ModelCloud in #942
Add torch inference benchmark by @CL-ModelCloud in #940
Add modality to BaseModel by @ZX-ModelCloud in #937
[FIX] qwen_vl_utils should be locally import by @ZX-ModelCloud in #946
Filter torch cuda arch < 6.0 by @CSY-ModelCloud in #955
[FIX] wrong filepath was used when model_id_or_path was hugging model id by @ZX-ModelCloud in #956
Fix import error was not caught by @CSY-ModelCloud in #961
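
To illustrate the kind of filtering that "Filter torch cuda arch < 6.0" refers to, here is a minimal sketch of dropping compute capabilities below 6.0 from a TORCH_CUDA_ARCH_LIST value. It is not the code from #955; the function name `filter_cuda_arch_list` and its `min_arch` parameter are hypothetical, and the sketch assumes the usual space- or semicolon-separated format with an optional `+PTX` suffix.

```python
import re

def filter_cuda_arch_list(arch_list: str, min_arch: tuple = (6, 0)) -> str:
    """Return arch_list without numeric entries below min_arch.

    Hypothetical helper for illustration only; not GPTQModel's implementation.
    """
    kept = []
    # Entries may be separated by spaces or semicolons, e.g. "6.0 7.0+PTX".
    for entry in re.split(r"[;\s]+", arch_list.strip()):
        if not entry:
            continue
        # Ignore an optional "+PTX" suffix when reading the version.
        version = entry.split("+", 1)[0]
        try:
            major, minor = (int(part) for part in version.split("."))
        except ValueError:
            # Assumption: entries that are not "major.minor" (such as
            # architecture names) are passed through untouched.
            kept.append(entry)
            continue
        # Compare as integer tuples so that "10.0" sorts above "6.0".
        if (major, minor) >= min_arch:
            kept.append(entry)
    return " ".join(kept)
```

A setup script could apply such a helper to `os.environ["TORCH_CUDA_ARCH_LIST"]` before compiling extensions, so that a preset list inherited from a Docker image does not request unsupported architectures.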

Full Changelog: v1.4.5...v1.5.0
