v2.6.1

@DefTruth released this 14 Oct 05:08
· 51 commits to main since this release
7ba03a6

What's Changed

  • [From Author] Link CacheGen and CacheBlend to LMCache by @KuntaiDu in #80
  • 🔥[LORC] Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy by @DefTruth in #81
  • Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation by @DefTruth in #82
  • [LLM Inference] Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective by @DefTruth in #83
  • 🔥[PARALLELSPEC] ParallelSpec: Parallel Drafter for Efficient Speculative Decoding by @DefTruth in #84

New Contributors

Full Changelog: v2.6...v2.6.1