!pip install vllm
Run the following in a terminal. As an example we will use Llama 2 7B; you can omit the --host argument, but that might cause errors when working with a remote server (it did in my case).
You can check the vllm.entrypoints source at https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py
python -m vllm.entrypoints.api_server --host 127.0.0.1 --model NousResearch/Llama-2-7b-chat-hf
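Once the server is running, you can query it over HTTP. Below is a minimal sketch using requests, assuming the server is listening on the default port 8000 and exposes the demo /generate endpoint shown in the linked source; the prompt and sampling values are placeholders.

# Minimal sketch: query the demo API server (assumes default port 8000)
import requests

response = requests.post(
    "http://127.0.0.1:8000/generate",
    json={
        "prompt": "San Francisco is a",  # placeholder prompt
        "max_tokens": 64,
        "temperature": 0.7,
    },
)
# The demo server returns a JSON object with a "text" field containing the output
print(response.json()["text"])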
!python -m vllm.entrypoints.openai.api_server --host 127.0.0.1 --model NousResearch/Llama-2-7b-chat-hf
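This second command starts the OpenAI-compatible server instead of the demo one. A minimal sketch of calling it, assuming the default port 8000 and the standard /v1/completions route; the model name must match the one passed to --model.

# Minimal sketch: query the OpenAI-compatible server (assumes default port 8000)
import requests

response = requests.post(
    "http://127.0.0.1:8000/v1/completions",
    json={
        "model": "NousResearch/Llama-2-7b-chat-hf",  # must match --model
        "prompt": "San Francisco is a",  # placeholder prompt
        "max_tokens": 64,
        "temperature": 0.7,
    },
)
# OpenAI-style response: completions live under "choices"
print(response.json()["choices"][0]["text"])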