Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Adjust serialization to handle PEP-585 generic types #7690

Merged
merged 4 commits into from
May 15, 2024

Conversation

vblagoje
Copy link
Member

@vblagoje vblagoje commented May 13, 2024

Why:

Improves type serialization, particularly focusing on better support for PEP 585 types (e.g., list[Document], including nested versions). This enhancement is crucial for enabling more accurate serialization of generics and nested types, which directly impacts the ability to match types like list[X] and List[X] in component connections after serialization.

What:

  • haystack/utils/type_serialization.py modification: Introduced better handling of PEP 585 types
  • Test enhancements: Added multiple test cases in test/utils/test_type_serialization.py to verify the serialization and deserialization of PEP 585 types, including more complex nested types like list[list[int]].

How can it be used:

  • Serialization of PEP 585 and nested types: Now users can serialize types like list[int] or more complex nested structures like list[list[str]] directly.
    # Example: Serializing a nested list type
    serialized_type = serialize_type(list[list[int]])
  • Deserialization for advanced type checks: Following serialization, types can be accurately deserialized, maintaining their structure for further type checks or processing.
    # Example: Deserializing nested types back to their original form
    original_type = deserialize_type("list[list[int]]")

How did you test it:

  • Added unit tests for both serialization and deserialization processes.
  • Verified the serialization of PEP 585 types like list[int], including deeply nested types (e.g., list[list[list[int]]]).
  • Ensured the deserialization process could accurately reconstruct the original types from their serialized form.
  • These tests validate the changes by ensuring that both serialization and deserialization work as expected with PEP 585 types and their nested versions.

Notes for the reviewer:

  • Impact on existing serialization: Reviewers should consider how these changes might interact with existing serialized types, particularly in edge cases not covered by the test scenarios.

@vblagoje vblagoje added this to the 2.1.2 milestone May 13, 2024
@vblagoje vblagoje requested review from a team as code owners May 13, 2024 14:36
@vblagoje vblagoje requested review from dfokina and julian-risch and removed request for a team May 13, 2024 14:36
@vblagoje
Copy link
Member Author

Update: this fails for python 3.8 only, other versions work. Please hold the review @julian-risch

@vblagoje
Copy link
Member Author

@julian-risch this should be ready now, please test in both 3.8 and 3.9 Python version and verify everything.

@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 9069137605

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 15 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.06%) to 90.357%

Files with Coverage Reduction New Missed Lines %
components/fetchers/link_content.py 5 78.31%
utils/type_serialization.py 10 83.58%
Totals Coverage Status
Change from base Build 9030307905: -0.06%
Covered Lines: 6531
Relevant Lines: 7228

💛 - Coveralls

Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 I also tried out the code example from the issue with python 3.8 and 3.10.
Only thing I wondered about is why we raise a ValueError instead of a SerializationError, which we define in haystack.core.errors. It doesn't affect this PR though and I would prefer keeping this PR as simple as small as possible.

@vblagoje
Copy link
Member Author

LGTM! 👍 I also tried out the code example from the issue with python 3.8 and 3.10. Only thing I wondered about is why we raise a ValueError instead of a SerializationError, which we define in haystack.core.errors. It doesn't affect this PR though and I would prefer keeping this PR as simple as small as possible.

SerializationError was defined around the time when this type serialization code was written, lack of awareness most likely. Let's convert errors in a separate PR.

@vblagoje vblagoje merged commit c8d53b3 into main May 15, 2024
25 checks passed
@vblagoje vblagoje deleted the pep_585_type_ser branch May 15, 2024 12:25
davidsbatista pushed a commit that referenced this pull request May 16, 2024
* Adjust serialization to handle PEP-585 generic types

* Add reno note

* Simplify

* PEP 585 serialization handling in sys.version_info < (3, 9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Type serialization/deserialization failure with PEP-585 generics
3 participants