Add Ollama multimodal llm support for image with prompt #14811

DohOnGit · 2023-12-17T08:00:23Z

Description: Adds a class `OllamaMultiModal` to the Ollama LLMs file, langchain/llms/ollama.py. The class supports sending an image together with a prompt to the Ollama endpoint, for models that accept image input.
Twitter handle: Daniel_OHeron

Passes make format, make lint, make test.

If this change is useful, I will add unit tests, a notebook, and documentation, and build out additional features. Please let me know.

Example usage:
```python
import base64

from OllamaMultiModalMain import OllamaMultiModal


def encode_image_to_base64(image_path):
    """Read an image file and return its contents as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


llm = OllamaMultiModal(model="bakllava")

while True:
    user_input = input("User (text): ")

    image_input = input("User (image path, press enter to skip): ")
    image_data = encode_image_to_base64(image_input) if image_input else None

    if image_data:
        # Send the prompt together with the base64-encoded image.
        response = llm.invoke(input=user_input, images=[image_data])
    else:
        # No image supplied: fall back to a text-only prompt.
        response = llm.invoke(input=user_input)

    print("Chat:", response)
```
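For context, the Ollama REST API accepts images as a list of base64-encoded strings in the `images` field of a `/api/generate` request. Below is a minimal sketch of the request body a class like this would need to produce, assuming that endpoint shape. `build_generate_payload` is a hypothetical helper for illustration, not code from this PR, and nothing is sent over the network:

```python
import base64
import json


def build_generate_payload(model, prompt, image_bytes_list=None):
    """Build a JSON-serializable body for Ollama's /api/generate endpoint.

    Hypothetical helper for illustration; not part of this PR.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes_list:
        # Each image is passed as a base64-encoded string.
        payload["images"] = [
            base64.b64encode(raw).decode("utf-8") for raw in image_bytes_list
        ]
    return payload


# Stand-in bytes; a real call would read them from an image file.
example = build_generate_payload(
    "bakllava", "What is in this image?", [b"fake-image-bytes"]
)
print(json.dumps(example, indent=2))
```

When no images are supplied, the `images` key is omitted entirely, so the same helper covers text-only prompts.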

vercel · 2023-12-17T08:00:27Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment

Name	Status	Preview	Comments	Updated (UTC)
langchain	⬜️ Ignored (Inspect)	Visit Preview		Dec 18, 2023 6:45am

DohOnGit · 2023-12-18T07:50:06Z

The last push was the feature branch rebased onto master, so it includes everything added to langchain since the pull request was opened, not just my own changes. Going forward, I should probably not rebase my feature branch onto the langchain master branch during a pull request. Apologies.

jacoblee93 · 2023-12-19T02:46:53Z

Hey @DanielOHeron, thanks for the PR!

I've added support for this via a bound image param:

https://python.langchain.com/docs/integrations/llms/ollama#multi-modal

I think that is nicer for now since we don't need to create a second class. Going to close this for now.
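A minimal sketch of the bound-image approach, assuming `langchain-community` is installed and a local Ollama server has a multimodal model such as `bakllava` pulled. `describe_image` is a hypothetical wrapper name used only for this example; the `bind(images=[...])` call follows the linked docs:

```python
import base64


def encode_image_to_base64(image_path):
    """Read an image file and return its contents as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


def describe_image(image_path, prompt, model="bakllava"):
    """Hypothetical wrapper: send one image plus a prompt to Ollama."""
    # Imported lazily so the encoder above works without langchain-community.
    from langchain_community.llms import Ollama

    llm = Ollama(model=model)
    # bind() returns a runnable that passes images=[...] on every call,
    # so the existing Ollama class is reused without a second class.
    llm_with_image = llm.bind(images=[encode_image_to_base64(image_path)])
    return llm_with_image.invoke(prompt)
```

Calling `describe_image("photo.jpg", "What is in this image?")` would require the Ollama server to be running; the file name here is a placeholder.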

DohOnGit · 2023-12-19T18:34:00Z

Hey @jacoblee93, this works great!

Thanks for solving this. I just had to install langchain-community to get the latest Ollama integrations.

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Dec 17, 2023

dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Dec 17, 2023

DohOnGit added 2 commits December 18, 2023 00:33

Add Ollama multimodal llm support for images. Added class OllamaMultiModal.

366dfdd

Added Spaces to Resolve Linting and Keep with Format Guidelines

3cbc34c

DohOnGit force-pushed the feature/ollama-multi-modal branch from 28c2b8f to 3cbc34c Compare December 18, 2023 06:45

jacoblee93 closed this Dec 19, 2023
