
[FEATURE REQUEST] Enable llama-server api option in llm_provider #359

Open
Raul824 opened this issue Feb 27, 2025 · 0 comments
Raul824 commented Feb 27, 2025

Using the openai llm_provider and supplying the base_url and api_key below sends the request to llama-server:

base_url="http://localhost:5050/v1",  # "http://<host>:<port>"
api_key="sk-no-key-required"

However, the request fails with a 500 Internal Server Error because the openai llm_provider expects multimodal support, which has been disabled in llama-server.

Ollama works, but it doesn't allow integrated GPU offload, which llama-server provides and which makes it faster for smaller models.
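For reference, llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint, so a plain text-only request already works against it with the standard openai Python client. The sketch below assumes llama-server is running on port 5050 as in the config above; the model name is just a placeholder. The 500 error only appears when the provider wraps the message content in the multimodal format.

```python
# Minimal sketch: a text-only chat completion against llama-server's
# OpenAI-compatible endpoint, using the base_url/api_key from above.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5050/v1",  # llama-server's OpenAI-compatible API (assumed port)
    api_key="sk-no-key-required",         # llama-server does not validate the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; llama-server serves whatever model it was started with
    messages=[{"role": "user", "content": "Hello"}],  # plain text content, no multimodal parts
)
print(response.choices[0].message.content)
```

A text-only code path like this in llm_provider (or a dedicated llama-server option) would avoid the multimodal payload that llama-server rejects.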
