Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement filtering for WeaviateDocumentStore #278

Merged
merged 13 commits into from
Feb 14, 2024
Merged

Conversation

silvanocerza
Copy link
Contributor

This PR add support for filters in WeaviateDocumentStore.filter_documents() method.

All tests are passing except for test_comparison_not_equal_with_dataframe.
For some reason that test is not returning what we expect from Weaviate.
I decided to skip it for the time being as I wanted to get this out of the door.

@silvanocerza silvanocerza requested a review from a team as a code owner January 26, 2024 17:24
@silvanocerza silvanocerza requested review from anakin87 and removed request for a team January 26, 2024 17:24
@silvanocerza silvanocerza self-assigned this Jan 26, 2024
@github-actions github-actions bot added the type:documentation Improvements or additions to documentation label Feb 13, 2024
@@ -28,6 +26,7 @@ dependencies = [
"haystack-ai",
"weaviate-client==3.*",
"haystack-pydoc-tools",
"python-dateutil",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to add this to have proper parsing of ISO 8601 dates as datetime.fromisoformat() supports it properly only from 3.11.

Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A great but challenging job!

I left two minor comments.

Comment on lines 126 to 137
if value is None:
# When the value is None and '>' is used we create a filter that would return a Document
# if it has a field set and not set at the same time.
# This will cause the filter to match no Document.
# This way we keep the behavior consistent with other Document Stores.
return {
"operator": "And",
"operands": [
{"path": field, "operator": "IsNull", "valueBoolean": False},
{"path": field, "operator": "IsNull", "valueBoolean": True},
],
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood correctly, this is a trick to create a filter that returns no Documents.
It is done to keep the global filtering logic unaffected. Right?

If so, can we centralize that logic in a utility function?

if "field" not in condition:
# 'field' key is only found in comparison dictionaries.
# We assume this is a logic dictionary since it's not present.
return _parse_logical_condition(condition)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some time ago, in another PR, Massi made me notice that this might be better handled in the caller. 🙂

@anakin87 anakin87 self-requested a review February 14, 2024 09:28
Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@silvanocerza silvanocerza merged commit 3fad8ca into main Feb 14, 2024
9 checks passed
@silvanocerza silvanocerza deleted the weaviate-filters branch February 14, 2024 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration:weaviate type:documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants