[feature request]: support for ollama self-hosted llm #35
I use LM Studio to start a local server with success; you can try using http://localhost:11434 as the API URL in the extension options.
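
For reference, a minimal sketch of what such a request could look like from the extension side, assuming the local server exposes an OpenAI-compatible chat completions path (the base URL, path, and model name here are assumptions, not verified defaults):

// Sketch: OpenAI-compatible request against a local server such as LM Studio.
// BASE_URL follows the comment above; the model name is a placeholder, since
// many local servers simply answer with whichever model is currently loaded.
const BASE_URL = 'http://localhost:11434';

async function classifyTab(url: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'local-model', // placeholder
      messages: [
        {
          role: 'system',
          content: 'You are a URL classifier. Classify the tab as Development, Utilities, or Entertainment. Respond with a single word.',
        },
        { role: 'user', content: url },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content.trim();
}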
Only fill the API field with the local server address and leave the model name unchanged? It looks like OpenAI and Ollama use different API paths. I don't think this can work, but I will give it a try.
It doesn't work. Maybe because the LM Studio server you used exposes the same API path as OpenAI?
It has something to do with the prompt format and OpenAI API compatibility, which LM Studio can handle. I don't have experience with Ollama, so you might need to figure it out.
Yep, Ollama uses a different API path (you can check it out in its docs: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion), or you can see it in my screenshot in the description of the issue. Example using Mistral with the recommended prompt template:

curl -s http://localhost:11434/api/generate -d '{
"model": "mistral",
"stream": false,
"prompt":"<s>[INST] You are a url classifier, you based on the given url to classify the browser tab type as one of the following: Development, Utilities, Entertainment. Respond with only one single word (without any explaination or punctuation) from the given list. So for instance the following: https://github.com/skyf0cker/ai-group-tags will belong to: [/INST]Development</s>[INST]https://reddit.com[/INST]"
}' | jq '.response'
response:
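
For comparison, the same request from the extension side; a minimal sketch that mirrors the curl command above, assuming Ollama's native /api/generate endpoint and the prompt template shown in the example (classifyWithOllama is a hypothetical helper):

// Sketch: mirrors the curl command above against Ollama's native endpoint.
async function classifyWithOllama(prompt: string): Promise<string> {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'mistral', stream: false, prompt }),
  });
  const data = await res.json();
  return data.response.trim(); // same field that `jq '.response'` extracts
}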
I definitely think supporting a local LLM is the ideal choice. The task is well-suited for a small local LLM. Any contributions are welcome! 👍
Right now you can use https://nitro.jan.ai/, which supports an OpenAI-compatible endpoint.
I think the task doesn't even need a local LLM; it can be done with traditional embeddings. Just run the embedding and classification inside the browser with JavaScript. It's faster and protects users' privacy.
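
As a rough sketch of that idea, assuming the transformers.js library (@xenova/transformers) and an illustrative embedding model; the category list comes from the prompt earlier in the thread:

// Sketch: zero-shot classification via embeddings, fully in the browser.
import { pipeline } from '@xenova/transformers';

const CATEGORIES = ['Development', 'Utilities', 'Entertainment'];

// Small sentence-embedding model; the name is illustrative.
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

async function embed(text: string): Promise<number[]> {
  const out = await extractor(text, { pooling: 'mean', normalize: true });
  return Array.from(out.data as Float32Array);
}

// Vectors are normalized, so the dot product equals cosine similarity.
const cosine = (a: number[], b: number[]) =>
  a.reduce((sum, v, i) => sum + v * b[i], 0);

async function classify(url: string): Promise<string> {
  // Category embeddings could be computed once and cached.
  const catVecs = await Promise.all(CATEGORIES.map(embed));
  const urlVec = await embed(url);
  let best = 0;
  for (let i = 1; i < CATEGORIES.length; i++) {
    if (cosine(urlVec, catVecs[i]) > cosine(urlVec, catVecs[best])) best = i;
  }
  return CATEGORIES[best];
}

Whether embedding the raw URL (rather than the page title or content) separates these categories well enough is an open question and would need testing.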
How about adding keywords for classification, and processing them just like Filter Rules? |
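
Something like the following, a minimal sketch of keyword rules checked before any model is called (the rule shape and example keywords are hypothetical, not the extension's actual Filter Rules format):

// Sketch: keyword rules checked first; unmatched URLs could fall back to an LLM.
type KeywordRule = { group: string; keywords: string[] };

const rules: KeywordRule[] = [
  { group: 'Development', keywords: ['github.com', 'stackoverflow.com'] },
  { group: 'Entertainment', keywords: ['youtube.com', 'reddit.com'] },
];

function classifyByKeywords(url: string): string | null {
  const lower = url.toLowerCase();
  for (const rule of rules) {
    if (rule.keywords.some((kw) => lower.includes(kw))) return rule.group;
  }
  return null; // no rule matched
}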
Yes, using an LLM is a bit of overkill; our task is relatively simple.
I think the best solution is training or using a small model that runs in the browser.
I hold the same opinion. I tried the Microsoft Phi-2 model (very small, around 2.7 GB), but unfortunately it did not perform well on this classification task.
After a few tests, I found that open-source models like Mistral 7B can also do this job well. By supporting these self-hosted models, users don't need to worry about networking issues when connecting to OpenAI, or the potential costs of using its models.