v0.0.2

github-actions released this 06 Sep 20:28

9304af9

What's Changed

Refactor fused modules by @casper-hansen in #18
fuse_layers bug fix by @qwopqwop200 in #21
Support speedtest to benchmark FP16 model by @wanzhenchn in #25
Implement batch size for speed test by @casper-hansen in #26
[BUG] Fix illegal memory access + Quantized Multi-GPU support by @casper-hansen in #28
YaRN support for LLaMa models by @casper-hansen in #23

New Contributors

@wanzhenchn made their first contribution in #25

Full Changelog: v0.0.1...v0.0.2

Contributors

wanzhenchn, casper-hansen, and qwopqwop200
