-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: [FC-0074] add support for annotated python dicts as avro map type #433
Merged
Merged
Changes from all commits
Commits
Show all changes
17 commits
Select commit
Hold shift + click to select a range
c3a47d0
feat: add support for Avro dict
Ian2012 ee00977
refactor: add support for dicts as avro maps instead of record
mariajgrimaldi aa080fc
test: add more test cases for serializing dicts
mariajgrimaldi a71cddd
refactor: address quality issues from latest commits
mariajgrimaldi a76066e
refactor: address PR reviews
mariajgrimaldi 29b9530
fix: address quality issues
mariajgrimaldi c3d9c29
test: add deserializer tests for dicts
mariajgrimaldi 0bb5686
test: add tests for deserialization errors
mariajgrimaldi 019c3fc
test: add test cases for schema generation with dict types
mariajgrimaldi 2470afe
fix: address quality issues
mariajgrimaldi cf60a69
refactor: rewrite comment with suggestion
mariajgrimaldi 32fcff0
refactor: include map type instead of replacing record
mariajgrimaldi a2c76f5
refactor: drop changes in forum events
mariajgrimaldi dbbdd06
refactor: address PR reviews
mariajgrimaldi 2769920
test: add missing test for de-serializing complex types
mariajgrimaldi 4cd3481
refactor: check for keys and values simple types while de-serializing
mariajgrimaldi bfe255c
docs: update docs for next release
mariajgrimaldi File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,4 +5,4 @@ | |
more information about the project. | ||
""" | ||
|
||
__version__ = "9.15.2" | ||
__version__ = "9.16.0" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
import io | ||
import os | ||
from datetime import datetime | ||
from typing import List | ||
from typing import List, Union | ||
from unittest import TestCase | ||
from uuid import UUID, uuid4 | ||
|
||
|
@@ -43,6 +43,7 @@ def generate_test_data_for_schema(schema): # pragma: no cover | |
'string': "default", | ||
'double': 1.0, | ||
'null': None, | ||
'map': {'key': 'value'}, | ||
} | ||
|
||
def get_default_value_or_raise(schema_field_type): | ||
|
@@ -71,6 +72,9 @@ def get_default_value_or_raise(schema_field_type): | |
elif sub_field_type == "record": | ||
# if we're dealing with a record, recurse into the record | ||
data_dict.update({key: generate_test_data_for_schema(field_type)}) | ||
elif sub_field_type == "map": | ||
# if we're dealing with a map, "values" will be the type of values in the map | ||
data_dict.update({key: {"key": get_default_value_or_raise(field_type["values"])}}) | ||
else: | ||
raise Exception(f"Unsupported type {field_type}") # pylint: disable=broad-exception-raised | ||
|
||
|
@@ -112,6 +116,24 @@ def generate_test_event_data_for_data_type(data_type): # pragma: no cover | |
datetime: datetime.now(), | ||
CCXLocator: CCXLocator(org='edx', course='DemoX', run='Demo_course', ccx='1'), | ||
UUID: uuid4(), | ||
dict[str, str]: {'key': 'value'}, | ||
dict[str, int]: {'key': 1}, | ||
dict[str, float]: {'key': 1.0}, | ||
dict[str, bool]: {'key': True}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what about union types: dict[str, Union[str, int]]: ..., There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added more test cases and this covers even more than I initially thought: a701f78. Thanks for the suggestion! |
||
dict[str, CourseKey]: {'key': CourseKey.from_string("course-v1:edX+DemoX.1+2014")}, | ||
dict[str, UsageKey]: {'key': UsageKey.from_string( | ||
"block-v1:edx+DemoX+Demo_course+type@video+block@UaEBjyMjcLW65gaTXggB93WmvoxGAJa0JeHRrDThk", | ||
)}, | ||
dict[str, LibraryLocatorV2]: {'key': LibraryLocatorV2.from_string('lib:MITx:reallyhardproblems')}, | ||
dict[str, LibraryUsageLocatorV2]: { | ||
'key': LibraryUsageLocatorV2.from_string('lb:MITx:reallyhardproblems:problem:problem1'), | ||
}, | ||
dict[str, List[int]]: {'key': [1, 2, 3]}, | ||
dict[str, List[str]]: {'key': ["hi", "there"]}, | ||
dict[str, dict[str, str]]: {'key': {'key': 'value'}}, | ||
dict[str, dict[str, int]]: {'key': {'key': 1}}, | ||
dict[str, Union[str, int]]: {'key': 'value'}, | ||
dict[str, Union[str, int, float]]: {'key': 1.0}, | ||
} | ||
data_dict = {} | ||
for attribute in data_type.__attrs_attrs__: | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change, right? Were dicts only introduced and used with the forums events, which you plan to break anyway? You don't have a changelog entry or a major version change (yet), but is that your plan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, dictionaries were never fully supported. Besides this list with known unserializable events, which contain dictionaries in their payloads, we can confirm this by generating a schema for an event using dictionaries. Let's take this event as an example:
Which immediately raises a serialization error:
Because dicts were not a type supported in event_bus/avro/schema.py::_create_avro_field_definition before this PR. So no events with type dicts were ever sent through the event bus. Forum events were not supported either since they were listed in the known unserialiazable events, but I included them here as an example and will remove them shortly.
To maintain backward compatibility, I'll include the map type as an additional type instead of changing "record" to "map".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @mariajgrimaldi.
/schemas
introduced in https://github.com/openedx/openedx-events/pull/225/files for unit testing? Is that part of the missing coverage?UPDATE: If we are missing test coverage, it would be great to get that back. Or, if this is a non-breaking change, then you should drop "record" to clean up the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some tests to solve the missing coverage. However, I still wonder if we're breaking something by replacing "record" with "map". I also checked the /schemas added in https://github.com/openedx/openedx-events/pull/225/files and they are still to be in the repository. Could you clarify which schemas are missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. I now see the schemas. In theory, if our tests are working as planned, these should be snapshots of all the schemas and if there is no change detected, all would be good.
You say you fear dropping record is a breaking change. Is it possible to create a breaking test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, changing "record" to "map" should not be a breaking change since dicts were not supported before this PR. By testing the code with dicts without any support, I can infer that
L66
was only used to warn developers that dicts (mapped to avro records, which is a type we can't use because it clashes with the data attributes mapping) without annotations were not allowed, but either dicts with annotations since they were not supported. In any case, I added a test case that checks that the schema generated for dicts should map tomap
for future reference.Although I said I wondered whether this was a breaking change given my findings and considering what you mentioned before, I think we should trust the repo's coverage.