[feature request]: support for ollama self-hosted llm #35
I use LM Studio to start a local server with success; you can try using http://localhost:11434 as the API URL in the extension options.
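
For reference, a minimal sketch of what such a request could look like from the extension side, assuming the local server exposes an OpenAI-compatible chat completions path (the base URL, path, and model name here are assumptions, not verified defaults):

// Sketch: OpenAI-compatible request against a local server such as LM Studio.
// BASE_URL follows the comment above; the model name is a placeholder, since
// many local servers simply answer with whichever model is currently loaded.
const BASE_URL = 'http://localhost:11434';

async function classifyTab(url: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'local-model', // placeholder
      messages: [
        {
          role: 'system',
          content: 'You are a URL classifier. Classify the tab as Development, Utilities, or Entertainment. Respond with a single word.',
        },
        { role: 'user', content: url },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content.trim();
}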
Only fill the API field with the local server address and leave the model name unchanged? It looks like OpenAI and Ollama use different API paths. I don't think this can work, but I will give it a try.
It doesn't work. Maybe because the LM Studio server you used exposes the same API path as OpenAI?
It has something to do with the prompt format and OpenAI API compatibility, which LM Studio can handle. I don't have experience with Ollama, so you might need to figure it out.
Yep, Ollama uses a different API path (you can check it out in its docs: https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion), or you can see it in my screenshot in the description of the issue. Example using Mistral with the recommended prompt template:

curl -s http://localhost:11434/api/generate -d '{
"model": "mistral",
"stream": false,
"prompt":"<s>[INST] You are a url classifier, you based on the given url to classify the browser tab type as one of the following: Development, Utilities, Entertainment. Respond with only one single word (without any explaination or punctuation) from the given list. So for instance the following: https://github.com/skyf0cker/ai-group-tags will belong to: [/INST]Development</s>[INST]https://reddit.com[/INST]"
}' | jq '.response'
response:
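
For comparison, the same request from the extension side; a minimal sketch that mirrors the curl command above, assuming Ollama's native /api/generate endpoint and the prompt template shown in the example (classifyWithOllama is a hypothetical helper):

// Sketch: mirrors the curl command above against Ollama's native endpoint.
async function classifyWithOllama(prompt: string): Promise<string> {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'mistral', stream: false, prompt }),
  });
  const data = await res.json();
  return data.response.trim(); // same field that `jq '.response'` extracts
}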
I definitely think supporting a local LLM is the ideal choice. The task is well-suited for a small local LLM. Any contributions are welcome! 👍
Right now you can use https://nitro.jan.ai/, which supports an OpenAI-compatible endpoint.
I think the task doesn't even need a local LLM; it can be done with traditional embeddings. Just run the embedding and classification inside the browser with JavaScript. It's faster and protects users' privacy.
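
As a rough sketch of that idea, assuming the transformers.js library (@xenova/transformers) and an illustrative embedding model; the category list comes from the prompt earlier in the thread:

// Sketch: zero-shot classification via embeddings, fully in the browser.
import { pipeline } from '@xenova/transformers';

const CATEGORIES = ['Development', 'Utilities', 'Entertainment'];

// Small sentence-embedding model; the name is illustrative.
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

async function embed(text: string): Promise<number[]> {
  const out = await extractor(text, { pooling: 'mean', normalize: true });
  return Array.from(out.data as Float32Array);
}

// Vectors are normalized, so the dot product equals cosine similarity.
const cosine = (a: number[], b: number[]) =>
  a.reduce((sum, v, i) => sum + v * b[i], 0);

async function classify(url: string): Promise<string> {
  // Category embeddings could be computed once and cached.
  const catVecs = await Promise.all(CATEGORIES.map(embed));
  const urlVec = await embed(url);
  let best = 0;
  for (let i = 1; i < CATEGORIES.length; i++) {
    if (cosine(urlVec, catVecs[i]) > cosine(urlVec, catVecs[best])) best = i;
  }
  return CATEGORIES[best];
}

Whether embedding the raw URL (rather than the page title or content) separates these categories well enough is an open question and would need testing.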
How about adding keywords for classification, and processing them just like Filter Rules? |
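
Something like the following, a minimal sketch of keyword rules checked before any model is called (the rule shape and example keywords are hypothetical, not the extension's actual Filter Rules format):

// Sketch: keyword rules checked first; unmatched URLs could fall back to an LLM.
type KeywordRule = { group: string; keywords: string[] };

const rules: KeywordRule[] = [
  { group: 'Development', keywords: ['github.com', 'stackoverflow.com'] },
  { group: 'Entertainment', keywords: ['youtube.com', 'reddit.com'] },
];

function classifyByKeywords(url: string): string | null {
  const lower = url.toLowerCase();
  for (const rule of rules) {
    if (rule.keywords.some((kw) => lower.includes(kw))) return rule.group;
  }
  return null; // no rule matched
}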
Yes, using an LLM is a bit of overkill; our task is relatively simple.
I think the best solution is training or using a small model that runs in the browser.
I hold the same opinion. I tried the Microsoft Phi-2 model (very small, around 2.7 GB), but unfortunately it did not perform well on this classification task.
After a few tests, I found that open-source models like Mistral 7B can also do this job well. By supporting these self-hosted models, users don't need to worry about networking issues when connecting to OpenAI, or the potential costs of using its models.