Using multiple arguments for FilterRetriever #1010

Closed
hammerdirt-analyst opened this issue Aug 20, 2024 · 0 comments · Fixed by #1072
Labels
bug Something isn't working integration:chroma P2

Comments

@hammerdirt-analyst

Describe the bug
Hi, I am using the FilterRetriever and having problems with applying more than one filter.

This is with a Chroma vector store. With one filter it works, but I cannot figure out how to pass multiple filters.

the pseudocode:
retriever = FilterRetriever(context_document_store)

# param_a and param_b stand in for the actual metadata values
report_filters = {
    'meta-key-a': param_a,
    'meta-key-b': param_b
}

this does not work:
result = retriever.run(report_filters)

this does work:
result = retriever.run({'meta-key-a': param_a})

In other words, I can only pass one argument to the filter. Is this a problem with Chroma, or what am I missing here?
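For reference, the reproduction further down uses the explicit Haystack 2.x filter syntax (a logical `operator` plus a list of `conditions`) rather than a flat dict. A minimal sketch of converting the flat form into that syntax follows. `to_haystack_filters` is a hypothetical helper, not part of Haystack, and it assumes every key is a metadata field that should match exactly:

```python
from typing import Any, Dict


def to_haystack_filters(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {'key': value, ...} into an AND of equality conditions.

    Hypothetical helper for illustration; not part of Haystack.
    """
    conditions = [
        {"field": f"meta.{key}", "operator": "==", "value": value}
        for key, value in flat.items()
    ]
    return {"operator": "AND", "conditions": conditions}


report_filters = to_haystack_filters(
    {"doc-id": "Lac Léman 2015-11-15 2021-12-31", "topic": "reports"}
)

# Assumes the retriever from the pseudocode above:
# result = retriever.run(filters=report_filters)
```

Whether the Chroma integration accepts the resulting dict is exactly what this issue is about, so the sketch only shows the shape of the filter, not a confirmed workaround.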

To Reproduce
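from haystack import Pipeline
from haystack.components.converters import MarkdownToDocument, PyPDFToDocument
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.retrievers import FilterRetriever
from haystack.components.writers import DocumentWriter
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

# One Chroma store shared by both indexing pipelines
context_document_store = ChromaDocumentStore()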

document_pipeline = Pipeline()
document_pipeline.add_component("converter", MarkdownToDocument())
document_pipeline.add_component("cleaner", DocumentCleaner())
document_pipeline.add_component("splitter", DocumentSplitter(split_by="sentence", split_length=10))
document_pipeline.add_component("writer", DocumentWriter(document_store=context_document_store))
document_pipeline.connect("converter", "cleaner")
document_pipeline.connect("cleaner", "splitter")
document_pipeline.connect("splitter", "writer")

document_pipeline.run({"converter": {
    "sources": ["vaud_report_results.md", "report_results.md", "lac_leman_g70.md"],
    "meta": [
        {'doc-id': 'Vaud 2015-11-15 2021-12-31', 'topic': 'reports'},
        {'doc-id': 'Bern 2015-11-15 2021-12-31', 'topic': 'reports'},
        {'doc-id': 'Lac Léman 2015-11-15 2021-12-31', 'topic': 'reports'}
    ]
}})

pipeline = Pipeline()
pipeline.add_component("converter", PyPDFToDocument())
pipeline.add_component("cleaner", DocumentCleaner())
pipeline.add_component("splitter", DocumentSplitter(split_by="page", split_length=1))
pipeline.add_component("writer", DocumentWriter(document_store=context_document_store))
pipeline.connect("converter", "cleaner")
pipeline.connect("cleaner", "splitter")
pipeline.connect("splitter", "writer")

file_names = [
    'resources/brief_history_marine_litter.pdf',
    'resources/coastline_litter_threshold_value_report_14_9_2020_final.pdf',
    'resources/revealing_the_role_of_landuse.pdf',
    'resources/Walvoort-ea-2021-Modelling-Forecasting-Beach-Litter-Assessment-Values-1.pdf',
    'resources/eu-guide-marine-litter-2023.pdf',
    'resources/land-use-marine-litter-malaysia.pdf'
]
metas = [
    {'topic': 'history of research, methods of research'},
    {'topic': 'threshold values, methods of calculation'},
    {'topic': 'geospatial analysis, land use, feature evaluation'},
    {'topic': 'threshold values, methods of calculation'},
    {'topic': 'sampling protocols, methods of research'},
    {'topic': 'geospatial analysis, land use, feature evaluation'}
]

pipeline.run({"converter": {"sources": file_names, "meta": metas}})

filters = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.topic", "operator": "in", "value": ['threshold values, methods of calculation']},
        {"field": "meta.doc-id", "operator": "in", "value": ['Lac Léman 2015-11-15 2021-12-31']},
    ]
}
retriever = FilterRetriever(context_document_store)
result = retriever.run(filters)
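
One thing that may be worth checking separately from how the integration handles multiple conditions: judging only by the metadata shown above, no single source carries both `topic` = 'threshold values, methods of calculation' and `doc-id` = 'Lac Léman 2015-11-15 2021-12-31', so an `AND` of these two conditions would be expected to match nothing even when filtering works. The sketch below illustrates this with a small pure-Python evaluator. `matches` is an illustrative stand-in written for this example, not the Chroma integration's code, and it only covers the `AND`, `OR`, `==` and `in` operators:

```python
from typing import Any, Dict, List


def matches(meta: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Evaluate a Haystack-style filter dict against one metadata dict."""
    operator = filters["operator"]
    if "conditions" in filters:
        # Logical node: combine the results of the nested conditions
        results = [matches(meta, cond) for cond in filters["conditions"]]
        if operator == "AND":
            return all(results)
        if operator == "OR":
            return any(results)
        raise ValueError(f"unsupported logical operator: {operator}")
    # Comparison node: strip the "meta." prefix to get the metadata key
    field = filters["field"]
    key = field[len("meta."):] if field.startswith("meta.") else field
    value = meta.get(key)
    if operator == "==":
        return value == filters["value"]
    if operator == "in":
        return value in filters["value"]
    raise ValueError(f"unsupported comparison operator: {operator}")


# Metadata as passed to the two indexing pipelines in the reproduction
all_metas: List[Dict[str, Any]] = [
    {"doc-id": "Vaud 2015-11-15 2021-12-31", "topic": "reports"},
    {"doc-id": "Bern 2015-11-15 2021-12-31", "topic": "reports"},
    {"doc-id": "Lac Léman 2015-11-15 2021-12-31", "topic": "reports"},
    {"topic": "history of research, methods of research"},
    {"topic": "threshold values, methods of calculation"},
    {"topic": "geospatial analysis, land use, feature evaluation"},
    {"topic": "threshold values, methods of calculation"},
    {"topic": "sampling protocols, methods of research"},
    {"topic": "geospatial analysis, land use, feature evaluation"},
]

conditions = [
    {"field": "meta.topic", "operator": "in",
     "value": ["threshold values, methods of calculation"]},
    {"field": "meta.doc-id", "operator": "in",
     "value": ["Lac Léman 2015-11-15 2021-12-31"]},
]

and_hits = [m for m in all_metas if matches(m, {"operator": "AND", "conditions": conditions})]
or_hits = [m for m in all_metas if matches(m, {"operator": "OR", "conditions": conditions})]
print(len(and_hits), len(or_hits))  # prints: 0 3
```

If the intent is "documents on this topic or from this report", `"operator": "OR"` expresses that. The counts here are per source; after splitting, the real store would hold more documents per source.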
Describe your environment (please complete the following information):

  • OS: Pop!_OS 22.04
  • Haystack version: haystack-ai 2.31 (from conda list)
  • Integration version: