Using the openai LLM provider and setting the base URL and API key as below sends the request to llama-server:

base_url="http://localhost:5050/v1",  # "http://<host>:<port>"
api_key="sk-no-key-required"

However, the request fails with a 500 Internal Server Error because the openai LLM provider sends a multimodal-style payload, and multimodal support is disabled in llama-server.

The ollama provider works, but Ollama doesn't support integrated GPU offload, which llama-server provides and which makes it faster for smaller models.