Quickly run an LLM locally as a development backend, along with a chat UI. Everything is installed via Docker Compose.
- Docker Compose (V2 recommended).
- nvidia-container-toolkit installed, if you have a GPU.
- Configure `.env`:
  - `COMPOSE_PROFILES`: `gpu` (requires nvidia-container-toolkit) or `cpu`.
  - `MODEL`: one model from the Ollama model library.
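A minimal `.env` could look like this (the model name is just an example; any model from the Ollama library works):

```shell
COMPOSE_PROFILES=gpu
MODEL=mixtral
```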
- Run Docker Compose:

```shell
docker compose up -d
```
- UI: http://localhost:3000
- OpenAI-compatible API: http://localhost:8000
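Since the API is OpenAI-compatible, any OpenAI client or plain `curl` can query it once the stack is up. A sketch, assuming the server exposes the standard `/v1/chat/completions` route and `mixtral` is the configured model:

```shell
# Requires the stack to be running (docker compose up -d)
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mixtral", "messages": [{"role": "user", "content": "Who are you?"}]}'
```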
Common Docker Compose commands useful in daily work:

- Stop:

```shell
docker compose stop
```

- Follow the logs:

```shell
docker compose logs -f
```

- Remove everything, including volumes:

```shell
docker compose down -v
```
Example using LangChain:

```python
from langchain_openai import ChatOpenAI

# Point the client at the local OpenAI-compatible server.
# The API key is required by the client but not checked by the local server.
llm = ChatOpenAI(
    openai_api_base="http://localhost:8000",
    openai_api_key="ignored",
    model="mixtral",
    temperature=0.1,
)
print(llm.invoke("Who are you?"))
```