v1.9

@DefTruth DefTruth released this 12 Aug 01:27
e6b8cf4

What's Changed

  • 🔥[DynamoLLM] DynamoLLM: Designing LLM Inference Clusters for Performa… by @DefTruth in #28
  • 🔥[Zero-Delay QKV Compression] Zero-Delay QKV Compression for Mitigati… by @DefTruth in #29
  • 🔥[Automatic Inference Engine Tuning] Towards SLO-Optimized LLM Servin… by @DefTruth in #30
  • 🔥🔥[500xCompressor] 500xCompressor: Generalized Prompt Compression for… by @DefTruth in #31
  • Bump up to v1.9 by @DefTruth in #32

Full Changelog: v1.8...v1.9