Quantization
===

7. Quantization Principles

Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases