ChromaDocumentStore fails to search if no metadata is given #462

Closed
JohnnyRacer opened this issue Feb 21, 2024 · 5 comments · Fixed by #863
Labels: bug, contributions wanted!, integration:chroma, P1

Comments

@JohnnyRacer

Hello, I am trying to use the ChromaDocumentStore in my pipeline. I've noticed that if I do not add any metadata and try to perform a search, it will fail with the following error:

File /usr/local/lib/python3.10/dist-packages/haystack_integrations/document_stores/chroma/document_store.py:193, in ChromaDocumentStore.search(self, queries, top_k)
    187 """
    188 Perform vector search on the stored documents
    189 """
    190 results = self._collection.query(
    191     query_texts=queries, n_results=top_k, include=["embeddings", "documents", "metadatas", "distances"]
    192 )
--> 193 return self._query_result_to_documents(results)

File /usr/local/lib/python3.10/dist-packages/haystack_integrations/document_stores/chroma/document_store.py:331, in ChromaDocumentStore._query_result_to_documents(self, result)
    329 # prepare metadata
    330 if metadatas := result.get("metadatas"):
--> 331     document_dict["meta"] = dict(metadatas[i][j])
    333 if embeddings := result.get("embeddings"):
    334     document_dict["embedding"] = np.array(embeddings[i][j])

TypeError: 'NoneType' object is not iterable

This is the snippet that reproduces this error:

from haystack import Document
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

ds = ChromaDocumentStore()
ds.write_documents([Document(content=e) for e in ["Hello world", "Whats up", "How are you"]])
ds.search(["Hello world"], top_k=1)

The Document object does not seem to require the metadata argument so I assume this is unexpected behavior.
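
The failing line can be reproduced in isolation, without Chroma installed. This is a minimal sketch, assuming (as the traceback suggests, but not confirmed against the Chroma source) that the query result holds `None` in place of a metadata dict for documents written without metadata. The `result` dict below is hand-built for illustration only:

```python
# Hand-built stand-in for a Chroma query result: one query, one hit, no metadata.
# The nested shape (one list per query, one entry per hit) is an assumption
# inferred from the metadatas[i][j] indexing shown in the traceback.
result = {"documents": [["Hello world"]], "metadatas": [[None]]}

i, j = 0, 0  # query index, hit index
error_message = None
try:
    # [[None]] is a non-empty list, so this check passes even though
    # the per-document entry is None.
    if metadatas := result.get("metadatas"):
        meta = dict(metadatas[i][j])  # dict(None) raises TypeError
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The outer `if` only checks that the `metadatas` list exists and is non-empty. It does not check the individual entry, which is why the conversion still runs on `None`.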

@masci
Contributor

masci commented Feb 22, 2024

That definitely looks like a bug, I'll look into it. Moving this issue to the integration repo for my convenience, thanks for reporting!

masci transferred this issue from deepset-ai/haystack Feb 22, 2024
@TuanaCelik
Contributor

Hey @JohnnyRacer - thanks for reporting some bugs lately both here and on the haystack repo.
We'd love to hear what you're working on with Haystack 2.0 🚀
Feel free to join us on Discord or connect with me on LinkedIn

@JohnnyRacer
Author

@TuanaCelik Thanks, I will check out the links!

@anakin87
Member

See also #668

@MarcSchluperAtIntel

MarcSchluperAtIntel commented Apr 18, 2024

On line 404 of chroma/document_store.py, insert a line.

if metadatas := result.get("metadatas"):
    if metadatas[i][j] is not None:  # avoid issue #462
        document_dict["meta"] = dict(metadatas[i][j])
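
To show the effect of that guard outside the library, here is a rough sketch using a simplified, hypothetical helper (`query_result_to_dicts`, not part of the integration) and a hand-built result dict. The result shape is assumed from the traceback earlier in the thread, and this is not the code that was eventually merged:

```python
from typing import Any, Dict, List


def query_result_to_dicts(result: Dict[str, Any]) -> List[List[Dict[str, Any]]]:
    """Hypothetical, simplified stand-in for _query_result_to_documents."""
    converted = []
    for i, hits in enumerate(result.get("documents") or []):
        converted_hits = []
        for j, content in enumerate(hits):
            document_dict: Dict[str, Any] = {"content": content}
            # Copy metadata only when this hit's entry is an actual mapping;
            # a None entry is skipped instead of being passed to dict().
            if metadatas := result.get("metadatas"):
                if metadatas[i][j] is not None:
                    document_dict["meta"] = dict(metadatas[i][j])
            converted_hits.append(document_dict)
        converted.append(converted_hits)
    return converted


# Hand-built result: the first hit has no metadata, the second has some.
fake_result = {
    "documents": [["Hello world", "How are you"]],
    "metadatas": [[None, {"lang": "en"}]],
}
converted = query_result_to_dicts(fake_result)
print(converted)
```

With the guard in place, a hit without metadata simply has no `meta` key, while hits that do carry metadata are converted as before.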
