Describe the bug
Calling Pipeline.run() raises KeyError: 'blob_data'. According to the traceback, the error originates in the _to_document method of the WeaviateDocumentStore class in the haystack_integrations package.
Error message
# File "C:\RAG\issue_1.py", line 68, in <module>
# # router for different pipeline
# ^^^^^^^^^^^^^^^^^^^^^^^
# File "c:\Users\ttim3\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\pipeline.py", line 771, in run
# res = comp.run(**last_inputs[name])
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# File "c:\Users\ttim3\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack_integrations\components\retrievers\weaviate\embedding_retriever.py", line 74, in run
# documents = self._document_store._embedding_retrieval(
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# File "c:\Users\ttim3\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack_integrations\document_stores\weaviate\document_store.py", line 470, in _embedding_retrieval
# return [self._to_document(doc) for doc in result["data"]["Get"][collection_name]]
# ^^^^^^^^^^^^^^^^^^^^^^
# File "c:\Users\ttim3\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack_integrations\document_stores\weaviate\document_store.py", line 232, in _to_document
# data.pop("blob_data")
# KeyError: 'blob_data'
Expected behavior
The query runs without raising an error and returns the retrieved documents.
Additional context
The script initializes a WeaviateDocumentStore, sets up documents with embeddings, and adds them to the document store. It then sets up a query pipeline with a text embedder and a WeaviateEmbeddingRetriever. When running a query through this pipeline, the error occurs.
It seems that WeaviateDocumentStore tries to remove the "blob_data" key from a dictionary that does not contain it, because my custom collection settings do not define a "blob_data" property.
I did not see a requirement to add "blob_data" in the Weaviate or Haystack documentation.
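To illustrate the failure mode in isolation, here is a minimal sketch in plain Python. It is not the integration's code: the record dictionary and both helper functions are hypothetical stand-ins, and the only thing taken from the traceback is the data.pop("blob_data") call. It shows that dict.pop without a default raises KeyError when the key is missing, while passing a default does not.

```python
# Hypothetical stand-in for a record returned from a collection
# that defines no "blob_data" property (not actual Weaviate output).
record = {"title": "hello", "abstract": "hello", "content": "This is first"}


def strip_blob_strict(data: dict) -> dict:
    """Mirror the call in the traceback: pop without a default."""
    cleaned = dict(data)
    cleaned.pop("blob_data")  # raises KeyError if the key is absent
    return cleaned


def strip_blob_tolerant(data: dict) -> dict:
    """Hypothetical tolerant variant: pop with a default value."""
    cleaned = dict(data)
    cleaned.pop("blob_data", None)  # no error if the key is absent
    return cleaned


try:
    strip_blob_strict(record)
except KeyError as exc:
    strict_error = repr(exc)
else:
    strict_error = None

tolerant_result = strip_blob_tolerant(record)

print(strict_error)
print(tolerant_result)
```

Whether a tolerant pop is the right fix inside the integration, or whether custom collections are expected to declare the property, is something I cannot tell from the documentation.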
To Reproduce
from haystack import Document, Pipeline
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack_integrations.document_stores.weaviate.document_store import (
    WeaviateDocumentStore,
)
from haystack_integrations.components.retrievers.weaviate.embedding_retriever import (
    WeaviateEmbeddingRetriever,
)
# initialize weaviate----
document_store = WeaviateDocumentStore(
    url="http://localhost:8080",
    collection_settings={
        "class": "Article",
        "properties": [
            {
                "name": "title",
                "dataType": ["text"],
            },
            {
                "name": "abstract",
                "dataType": ["text"],
            },
        ],
        "vectorizer": "none",
    },
)
# set up documents----
documents = [
    Document(content="This is first", meta={"title": "hello", "abstract": "hello"}),
    Document(content="This is second", meta={"name": "second"}),
]
text_embedder = SentenceTransformersTextEmbedder()
document_embedder = SentenceTransformersDocumentEmbedder()
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)["documents"]
document_store.write_documents(documents_with_embeddings)
# set up query pipeline----
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever", WeaviateEmbeddingRetriever(document_store=document_store)
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
# query, retrieve and print result----
query = "Who lives in Berlin?"
result = query_pipeline.run({"text_embedder": {"text": query}})