This repository contains the code for the NeurIPS 2024 paper *MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization*. The current release includes the following files:
- `MagR.py`: the main functions of MagR. These functions expose two hyper-parameters: $\alpha$, set to 0.001 for per-channel and 0.0001 for per-group quantization, and `n_iter`, set to 200. In theory, more iterations are better; 200 was chosen to balance quality against running time. (A sketch of the underlying algorithm follows this list.)
- `modelutils.py`: model utilities
- `datautils.py`: data utilities
- `quant.py`: quantizer
- `optq.py`: the implementation of OPTQ
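Per the paper, MagR preprocesses each layer's weights by solving an $\ell_\infty$-regularized least-squares problem with proximal gradient descent, where the prox of the $\ell_\infty$ norm reduces, via Moreau decomposition, to a projection onto an $\ell_1$ ball. The following is a minimal sketch of that procedure, not the repository's code: the names `magr_preprocess` and `proj_l1_ball`, the weight layout, and the $1/L$ step size are assumptions for illustration.

```python
import torch


def proj_l1_ball(V: torch.Tensor, radius: float) -> torch.Tensor:
    """Project each column of V onto the l1 ball of the given radius
    (sorting-based algorithm of Duchi et al., 2008)."""
    d = V.shape[0]
    u, _ = torch.sort(V.abs(), dim=0, descending=True)
    css = torch.cumsum(u, dim=0)
    k = torch.arange(1, d + 1, device=V.device, dtype=V.dtype).unsqueeze(1)
    # number of active coordinates per column (the condition holds on a prefix)
    rho = (u * k > css - radius).sum(dim=0).clamp(min=1)
    theta = (css.gather(0, (rho - 1).unsqueeze(0)).squeeze(0) - radius) / rho.to(V.dtype)
    theta = theta.clamp(min=0)  # columns already inside the ball stay unchanged
    return torch.sign(V) * (V.abs() - theta).clamp(min=0)


def magr_preprocess(W: torch.Tensor, H: torch.Tensor,
                    alpha: float = 1e-3, n_iter: int = 200) -> torch.Tensor:
    """Proximal gradient descent on
        0.5 * ||X (W_hat - W)||_F^2 + alpha * sum_j ||W_hat[:, j]||_inf,
    assuming the layer computes X @ W (W: [d_in, d_out]) and
    H = X^T X is the Gram matrix of the calibration activations."""
    lr = 1.0 / torch.linalg.matrix_norm(H, ord=2)  # 1/L step, L = lambda_max(H)
    W_hat = W.clone()
    for _ in range(n_iter):
        V = W_hat - lr * (H @ (W_hat - W))  # gradient step on the quadratic term
        # prox of (alpha*lr)*||.||_inf via Moreau decomposition:
        # prox_{lam*||.||_inf}(v) = v - projection of v onto the l1 ball of radius lam
        W_hat = V - proj_l1_ball(V, alpha * lr)
    return W_hat
```

In a GPTQ-style pipeline, `H` would be accumulated as `X.T @ X` over calibration batches, and the returned `W_hat` would replace `W` before the quantizer runs.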
Dependencies:

- `torch`: v2.3.0
- `transformers`: v4.36.0
- `datasets`: v2.18.0
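Assuming a standard pip environment, the pinned versions can be installed with:

```
pip install torch==2.3.0 transformers==4.36.0 datasets==2.18.0
```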
All experiments were run on a single 80GB NVIDIA A100; however, most can be reproduced on a GPU with significantly less memory.
```
# Quantize LLaMA-2-7B to 4 bits with MagR+GPTQ per-channel quantization
python llama.py meta-llama/llama-2-7b-hf wikitext2 --wbits 4 --magr
```
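If the script keeps the command-line interface of the GPTQ-for-LLaMa code it builds on, per-group quantization would presumably use that interface's `--groupsize` flag together with the per-group $\alpha$ noted above; the flag is an assumption here, not confirmed by this README:

```
# Hypothetical invocation, assuming a GPTQ-for-LLaMa-style --groupsize flag
python llama.py meta-llama/llama-2-7b-hf wikitext2 --wbits 4 --magr --groupsize 128
```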
If you found this work useful, please consider citing:
```
@article{zhang2024magr,
  title={MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization},
  author={Zhang, Aozhong and Wang, Naigang and Deng, Yanxia and Li, Xin and Yang, Zi and Yin, Penghang},
  journal={arXiv preprint arXiv:2406.00800},
  year={2024}
}
```
This code is based on GPTQ and GPTQ-for-LLaMa.
Thanks to Meta AI for releasing LLaMA, a powerful LLM.