Skip to content

LMDeploy Release V0.3.0

Compare
Choose a tag to compare
@lvhan028 lvhan028 released this 03 Apr 01:55
· 459 commits to main since this release
4822fba

Highlight

  • Refactor attention and optimize GQA(#1258 #1307 #1116), achieving 22+ and 16+ RPS for internlm2-7b and internlm2-20b, about 1.8x faster than vLLM
  • Support new models, including Qwen1.5-MOE(#1372), DBRX(#1367), DeepSeek-VL(#1335)

What's Changed

🚀 Features

💥 Improvements

🐞 Bug fixes

📚 Documentations

🌐 Other

Full Changelog: v0.2.6...v0.3.0