Set vllm-hpu-extension to 6ac93fb (#684)
* remove expert_max hard code (#47)
* vLLM-Ext: Full enabling of ALiBi (#34)
* Add version inference via setuptools-scm (#58)
* Revert "vLLM-Ext: Full enabling of ALiBi (#34)" (#59)
* Remove punica_hpu.py from vllm_hpu_extension (#66)
* Removed previous (not-pipelined) pa implementation (#72)
* Add flag to enable running softmax in fp32 (#71)
* Update calibration readme link (#73)
* allow lm_head quantization in calibration process (#65)
* Pad to bmin if value is less (#67)
* Update pyproject.toml (#75)

---------

Co-authored-by: Michał Kuligowski <[email protected]>