Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: OllamaDocumentEmbedder - allow batching embeddings #1224

Merged
merged 7 commits into from
Nov 28, 2024

Conversation

latifboubyan
Copy link
Contributor

Proposed Changes:

Use Ollama's batch embedding

How did you test it?

Manual verification as well as adding a unit test.

Checklist

@latifboubyan latifboubyan requested a review from a team as a code owner November 28, 2024 08:38
@latifboubyan latifboubyan requested review from davidsbatista and removed request for a team November 28, 2024 08:38
@CLAassistant
Copy link

CLAassistant commented Nov 28, 2024

CLA assistant check
All committers have signed the CLA.

@davidsbatista
Copy link
Contributor

@latifboubyan thanks for your contribution!

can you update the docstring of the def _embed_batch method?

]
reply = embedder.run(list_of_docs)
assert isinstance(reply, dict)
assert all(isinstance(element, float) for element in reply["documents"][0].embedding)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems you are only checking if the first document was embedded - can you make sure are asserting that all the 3 documents have an embedding?

@davidsbatista
Copy link
Contributor

overall, looks good to me, I've left too minor comments though - I will also ask @anakin87 to have a quick look

@github-actions github-actions bot added the type:documentation Improvements or additions to documentation label Nov 28, 2024
@anakin87 anakin87 changed the title refactor: use Ollama's batch embedding feat: OllamaDocumentEmbedder - allow batching embeddings Nov 28, 2024
Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I pushed some minor refinements.

@anakin87 anakin87 merged commit eb2cfb1 into deepset-ai:main Nov 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration:ollama type:documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants