
Pull missing model for Ollama LLM generation #148

Closed
erhant opened this issue May 10, 2024 · 4 comments · Fixed by #156

Comments

@erhant
Contributor

erhant commented May 10, 2024

Is your feature request related to a problem? Please describe.

While using Ollama for LLM generation, if the model does not exist locally the request fails with an error saying the model is not found.

For example, the llm_ollama.rs example uses llama2 (https://github.com/Abraxas-365/langchain-rust/blob/main/examples/llm_ollama.rs#L15), and if you run it you will get:

OpenAIError(ApiError(ApiError { message: "model 'llama2' not found, try pulling it first", type: Some("api_error"), param: None, code: None }))

Describe the solution you'd like
I would perhaps suggest that we provide a simple wrapper for Ollama in particular. We already have an OllamaEmbedder struct that handles the client setup to some extent. We could have a more fine-grained setup where the embedder and LLM models are each passed in on their own.

  • For embedders, we don't need to pull explicitly, as Ollama does it on its own (https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings)
  • For generation, we can always pull the model before generation is invoked. If the model does not exist, the download will of course take some time; if it exists, loading it into memory takes a few seconds. Either way, generation will then not error out on a missing model. See the sketch after this list.
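
A minimal sketch of that pull-then-generate flow using ollama-rs (this assumes the crate's `pull_model` and `generate` APIs and a tokio runtime; the model name is just an example):

```rust
use ollama_rs::{generation::completion::request::GenerationRequest, Ollama};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ollama = Ollama::default(); // http://localhost:11434
    let model = "llama2:latest".to_string();

    // Pull first: downloads the model if it is missing; otherwise this
    // returns quickly and only (re)loads the model into memory.
    ollama.pull_model(model.clone(), false).await?;

    // Generation can no longer fail with "model 'llama2' not found".
    let res = ollama
        .generate(GenerationRequest::new(model, "Why is the sky blue?".into()))
        .await?;
    println!("{}", res.response);
    Ok(())
}
```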

Describe alternatives you've considered
As a workaround, we are currently using LangChain and ollama-rs together in our project https://github.com/andthattoo/dkn-search-node/, with the latter pulling the model.

@prabirshrestha
Collaborator

I'm open to having Ollama for LLM generation. Currently Ollama doesn't support OpenAI-compatible API endpoints, hence we have OllamaEmbedder. Even if it did, we might still want our own wrapper, since there are other features, such as keep-alive, that aren't supported by OpenAI. As long as we have a flag for disabling auto-pull and loading, I'm OK with having this. Let us know if you are interested in a PR.
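
A hypothetical builder-style sketch of that opt-out flag (the `Ollama` LLM type and `with_auto_pull` are illustrative names, not an existing API):

```rust
// Illustrative only: shows the shape of the opt-out flag, not a real API.
pub struct Ollama {
    model: String,
    auto_pull: bool, // when false, fail fast instead of pulling a missing model
}

impl Ollama {
    pub fn new(model: impl Into<String>) -> Self {
        Self { model: model.into(), auto_pull: true } // auto-pull by default
    }

    pub fn with_auto_pull(mut self, auto_pull: bool) -> Self {
        self.auto_pull = auto_pull;
        self
    }
}

fn main() {
    // Opt out of auto-pull, keeping today's "model not found" behavior.
    let _llm = Ollama::new("llama2").with_auto_pull(false);
}
```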

@erhant
Contributor Author

erhant commented May 13, 2024

Sure, I'd like to give it a go! I feel like we would be repeating much of the setup that https://github.com/pepperoni21/ollama-rs already does, especially around things like keep-alive and the many other generation request settings.

It could perhaps be feature-gated: if an ollama feature is enabled, ollama-rs is used for all these tasks, embeddings as well.
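
A rough sketch of what that gate could look like (the module layout and type names here are hypothetical):

```rust
// Everything that depends on ollama-rs sits behind an `ollama` Cargo
// feature, so default builds don't pull in the extra dependency.
#[cfg(feature = "ollama")]
pub mod ollama {
    use ollama_rs::Ollama;

    // Both the LLM and the embedder wrap the same ollama-rs client, so
    // keep-alive and the other request settings come along for free.
    pub struct OllamaLLM {
        pub(crate) client: Ollama,
        pub(crate) model: String,
    }

    pub struct OllamaEmbedder {
        pub(crate) client: Ollama,
        pub(crate) model: String,
    }
}
```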

@prabirshrestha
Collaborator

Seems like ollama-rs might support function calling soon, based on pepperoni21/ollama-rs#50 (comment). It would also be good to reuse it instead of creating our own.

@erhant would you like to contribute to the Ollama LLM?

@erhant
Contributor Author

erhant commented May 17, 2024

> Seems like ollama-rs might support function calling soon, based on pepperoni21/ollama-rs#50 (comment). It would also be good to reuse it instead of creating our own.
>
> @erhant would you like to contribute to the Ollama LLM?

#149 (comment)
