LMDeploy Release V0.2.4
What's Changed
💥 Improvements
- use stricter rules to get weight file by @irexyc in #1070
- check pytorch engine environment by @grimoire in #1107
- Update Dockerfile order to launch the http service by `docker run` directly by @AllentDan in #1162
- Support torch cache_max_entry_count by @grimoire in #1166
- Remove the manual model conversion during benchmark by @lvhan028 in #953
- update llama triton example by @zhyncs in #1153
🐞 Bug fixes
- fix embedding copy size by @irexyc in #1036
- fix pytorch engine with peft==0.8.2 by @grimoire in #1122
- support triton2.2 by @grimoire in #1137
- Add `top_k` in ChatCompletionRequest by @lvhan028 in #1174
- minor fix benchmark generation guide and script by @lvhan028 in #1175
📚 Documentation
🌐 Other
- Add eval ci by @RunningLeon in #1060
- Ete testcase add more models by @zhulinJulia24 in #1077
- Fix win ci by @irexyc in #1132
- bump version to v0.2.4 by @lvhan028 in #1171
Full Changelog: v0.2.3...v0.2.4