The Inference Perf project aims to provide a GenAI inference performance benchmarking tool. It came out of wg-serving and is sponsored by SIG Scalability. See the proposal for more info.
This project is currently in development.
This repository uses the PDM Python package manager for dependency management.
- Set up a virtual environment with `pdm` and install the dependencies: `make all-deps`
- Run the inference-perf CLI: `pdm run inference-perf`
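For reference, a minimal end-to-end sketch of the steps above, assuming PDM is installed via pip and the repository has already been cloned (the repository URL shown is an assumption for illustration):

```sh
# Install the PDM package manager (assumes pip is available)
pip install pdm

# Clone the repository (URL assumed for illustration) and enter it
git clone https://github.com/kubernetes-sigs/inference-perf.git
cd inference-perf

# Create the virtual environment and install dependencies via the Makefile target
make all-deps

# Run the inference-perf CLI inside the PDM-managed environment
pdm run inference-perf
```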
Our community meeting is held weekly on Thursdays at 11:30 PDT (Zoom Link, Meeting Notes).
We currently use the #wg-serving Slack channel for communications.
Contributions are welcome; thanks for joining us!
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.