AstraDB filtering by id doesn't work #1037

Closed
davidsbatista opened this issue Aug 29, 2024 · 0 comments · Fixed by #1053
Labels
bug Something isn't working integration:astra P3

Describe the bug
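Filtering documents by id does not work with AstraDocumentStore: a filter on the "id" field returns an empty list even when a document with that id exists. Proof of concept below.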

To Reproduce

  • Load some documents into AstraDB:
from haystack import Document
from haystack_experimental.components.splitters import HierarchicalDocumentSplitter
from haystack_integrations.document_stores.astra import AstraDocumentStore

doc_store = AstraDocumentStore()

docs = [Document(content="The monarch of the wild blue yonder rises from the eastern side of the horizon.")]
builder = HierarchicalDocumentSplitter(block_sizes={10, 3}, split_overlap=0, split_by="word")
docs = builder.run(docs)

for doc in docs["documents"]:
    if doc.meta["__level"] == 1:
        doc_store.write_documents([doc])
  • Confirm the documents are there:
doc_store.filter_documents()
  • Filter for one of the documents by its id; this returns nothing:
doc_store.filter_documents({"field": "id", "operator": "==", "value": "07d0889d132dec2420df0bf11e7db0aa8ad48d14649297e24196f9248cfdbfc6"})
>> []
  • Changing the field from "id" to "_id" makes the filter return the document:
doc_store.filter_documents({"field": "_id", "operator": "==", "value": "07d0889d132dec2420df0bf11e7db0aa8ad48d14649297e24196f9248cfdbfc6"})
Out[8]: [Document(id=07d0889d132dec2420df0bf11e7db0aa8ad48d14649297e24196f9248cfdbfc6, content: 'eastern side of the horizon.', meta: {'__block_size': 10, '__parent_id': 'a137aa4ef681e00b66e16e93d48b8544dc0bd4c358cf980bd9a376d2520395a3', '__children_ids': ['f0f2d98a6298facd3066dad8ba36342b8d4cb5a9824d74deb3527c006a2137aa', 'f3c4a2e0c9a1e268dc97db1e89584dc8affac3bce1fb33d262b14b3c827c942f'], '__level': 1, 'source_id': 'a137aa4ef681e00b66e16e93d48b8544dc0bd4c358cf980bd9a376d2520395a3', 'page_number': 1, 'split_id': 1, 'split_idx_start': 51})]
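
Until a fix is available, the observation above suggests a client-side workaround: rename "id" to "_id" in the filter before passing it to filter_documents. The sketch below is only an illustration of that idea, not the change merged in #1053. The helper name remap_id_field is hypothetical, and it assumes the usual Haystack 2.x filter layout, where comparisons carry a "field" key and logical operators nest comparisons under "conditions".

```python
from typing import Any, Dict


def remap_id_field(filters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a filter dict with every "id" field renamed to "_id".

    Hypothetical helper for illustration; not part of astra-haystack.
    """
    remapped = dict(filters)  # shallow copy so the caller's dict is untouched
    # Logical filters (e.g. AND/OR) are assumed to nest comparisons under "conditions".
    if "conditions" in remapped:
        remapped["conditions"] = [remap_id_field(c) for c in remapped["conditions"]]
    # Comparison filters are assumed to name their target in "field".
    if remapped.get("field") == "id":
        remapped["field"] = "_id"
    return remapped


example = {"field": "id", "operator": "==", "value": "abc123"}
print(remap_id_field(example))
```

With such a helper, the failing call from the reproduction would become doc_store.filter_documents(remap_id_field({...})). Whether the integration should do this mapping internally is a design question for the maintainers.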

Describe your environment (please complete the following information):

  • OS: macOS Sonoma 14.1
  • Haystack version: 2.4
  • Integration version: astra-haystack-0.9.2