Fix Union type with dataclass ambiguous error and support superset comparison #5858

Merged 30 commits on Nov 20, 2024
Commits
f06cdc6
feat: fix Union type with dataclass ambiguous error
mao3267 Oct 18, 2024
8660db5
Merge branch 'master' of https://github.com/mao3267/flyte into fix/#5…
mao3267 Nov 1, 2024
47ccbd1
fix: direct json comparison for superset
mao3267 Nov 1, 2024
85489dc
fix: go.mod missing entry for error
mao3267 Nov 1, 2024
cc685bb
fix: update go module and sum
mao3267 Nov 1, 2024
3a629e1
refactor: gci format
mao3267 Nov 1, 2024
aa4d98e
test: add dataset casting tests for same (one/two levels) and supers…
mao3267 Nov 1, 2024
b282e5f
Merge branch 'master' of https://github.com/mao3267/flyte into fix/#5…
mao3267 Nov 8, 2024
818afb7
fix: support Pydantic BaseModel comparison
mao3267 Nov 8, 2024
d6468b6
fix: handle nested pydantic basemodel
mao3267 Nov 8, 2024
ada05ed
Reviews from Eduardo
Future-Outlier Nov 11, 2024
56623e3
fix: support strict subset match
mao3267 Nov 15, 2024
b8f38a7
test: update strict subset match test
mao3267 Nov 15, 2024
b698769
fix: missing go mod entry
mao3267 Nov 15, 2024
70ad767
fix: missing go mod entry
mao3267 Nov 15, 2024
9dc3fa6
fix: go mod entry
mao3267 Nov 15, 2024
b224a02
make go-tidy
Future-Outlier Nov 15, 2024
7ed9be2
comments
Future-Outlier Nov 15, 2024
6fe8871
Merge branch 'fix/#5489-dataclass-mismatch' of https://github.com/mao…
mao3267 Nov 15, 2024
64343c8
fix: strict subset match with draft 2020-12 mashumaro
mao3267 Nov 18, 2024
8ccced5
Merge branch 'master' of https://github.com/mao3267/flyte into fix/#5…
mao3267 Nov 18, 2024
0def0ad
refactor: make go-tidy
mao3267 Nov 18, 2024
81445a7
fix: support strict subset match with ambiguity
mao3267 Nov 19, 2024
9b19f04
fix: change test name and fix err
mao3267 Nov 19, 2024
30aa096
Add comments
Future-Outlier Nov 20, 2024
e62ba6e
nit
Future-Outlier Nov 20, 2024
c6ac729
add flytectl go-tidy in makefile
Future-Outlier Nov 20, 2024
ba4d6f1
nit
Future-Outlier Nov 20, 2024
86a395e
fix: add comment for error checking
mao3267 Nov 20, 2024
7f28c35
test: basemodel castable test, two level dataclass and ParentToChild …
mao3267 Nov 20, 2024
2 changes: 2 additions & 0 deletions flytepropeller/pkg/compiler/validators/typing.go
@@ -19,10 +19,12 @@
}

func removeTitleFieldFromProperties(schema map[string]*structpb.Value) {
	// TODO: Explain why we need this
	// TODO: give me an example of dataclass vs. Pydantic BaseModel
Contributor Author

This is an example comparing dataclass and Pydantic BaseModel. As shown, the schema for dataclass includes a title field that records the name of the class. Additionally, the additionalProperties field is absent from the Pydantic BaseModel schema because its value is false. cc @eapolinario

[Two images: the dataclass schema and the Pydantic.BaseModel schema, side by side]

Contributor Author

To add the comment, writing the entire schema would make it too lengthy. Would it be acceptable to use something like this instead?

class A:
	a: int

Pydantic.BaseModel: 	{"properties": {"a": {"title": "A", "type": "integer"}}}
dataclass: 			{"properties": {"a": {"type": "integer"}}, "additionalProperties": false}
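
The difference shown above can be illustrated with a small, self-contained sketch. It assumes the schemas are decoded into plain Go maps rather than the `structpb` values used in `typing.go`, and the names `stripPropertyTitles` and `sameProperties` are hypothetical, not part of the PR. It only shows that the two `properties` sections become equal once the per-property `title` is ignored.

```go
package main

import (
	"encoding/json"
	"reflect"
)

// stripPropertyTitles deletes the "title" key from every entry under
// "properties" and recurses into nested object schemas.
// Hypothetical helper: it operates on decoded JSON, not on structpb values.
func stripPropertyTitles(schema map[string]any) {
	props, ok := schema["properties"].(map[string]any)
	if !ok {
		return
	}
	for _, raw := range props {
		prop, ok := raw.(map[string]any)
		if !ok {
			continue
		}
		delete(prop, "title")
		stripPropertyTitles(prop)
	}
}

// sameProperties reports whether two JSON schemas have identical
// "properties" sections once per-property titles are ignored.
// Top-level keys such as "additionalProperties" are not compared here.
func sameProperties(a, b string) (bool, error) {
	var sa, sb map[string]any
	if err := json.Unmarshal([]byte(a), &sa); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(b), &sb); err != nil {
		return false, err
	}
	stripPropertyTitles(sa)
	stripPropertyTitles(sb)
	return reflect.DeepEqual(sa["properties"], sb["properties"]), nil
}
```

Calling `sameProperties` with the two schema strings from the comment above should report a match, while changing one property's `type` should not.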

Member

@fg91 Nov 12, 2024

Are you proposing to preprocess the schemas so that one can mix and match dataclasses and base models given their schemas are aligned? I.e. task expects a dataclass with schema "A" and I pass a base model that has the same schema.

I personally feel this is not necessary and think it would be totally acceptable to consider a dataclass and a base model not a match by default. Especially if this makes things a lot more complicated in the backend otherwise because the schemas need to be aligned. What do you think about this?

If you are confident in the logic I'm of course not opposing the feature but if you feel this makes things complicated and brittle, I'd rather keep it simple and more robust.

Contributor Author

I think it actually makes things more complicated; I will remove the related logic.

	properties, ok := schema["properties"]
	if !ok {
		return
	}

Check warning on line 27 (Codecov / codecov/patch): added lines #L26-L27 were not covered by tests.

	for _, p := range properties.GetStructValue().Fields {
		if _, ok := p.GetStructValue().Fields["properties"]; ok {
@@ -32,20 +34,20 @@
	}
}

func resolveRef(schema, defs map[string]*structpb.Value) {
	// Schema from Pydantic BaseModel includes a $defs field, which holds the actual schemas that $ref entries refer to.
	// We need to resolve the reference to compare the schema with those from mashumaro.
	// https://github.com/flyteorg/flytekit/blob/3475ddc41f2ba31d23dd072362be704d7c2470a0/flytekit/core/type_engine.py#L632-L641
	for _, p := range schema["properties"].GetStructValue().Fields {
		if _, ok := p.GetStructValue().Fields["$ref"]; ok {
			propName := strings.TrimPrefix(p.GetStructValue().Fields["$ref"].GetStringValue(), "#/$defs/")
			p.GetStructValue().Fields = defs[propName].GetStructValue().Fields
			resolveRef(p.GetStructValue().Fields, defs)
			delete(p.GetStructValue().Fields, "$ref")
		}

Check warning on line 47 (Codecov / codecov/patch): added lines #L37-L47 were not covered by tests.

	}

	delete(schema, "$defs")

Check warning on line 50 (Codecov / codecov/patch): added line #L50 was not covered by tests.

}
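
The reference resolution performed by `resolveRef` can be sketched without `structpb`. The version below is an illustration under stated assumptions, not the PR's implementation: it works on plain decoded JSON maps, returns a new schema instead of mutating its input, assumes every `$ref` has the form `#/$defs/Name`, and assumes the definitions are acyclic (a self-referencing model would recurse forever). The name `inlineRefs` is hypothetical.

```go
package main

import "strings"

// inlineRefs returns a copy of schema in which every property of the form
// {"$ref": "#/$defs/Name"} is replaced by the resolved definition defs["Name"].
// The top-level "$defs" key is dropped from the result.
// Assumption: definitions do not reference themselves, directly or indirectly.
func inlineRefs(schema, defs map[string]any) map[string]any {
	out := make(map[string]any, len(schema))
	for k, v := range schema {
		if k != "$defs" {
			out[k] = v
		}
	}

	props, ok := schema["properties"].(map[string]any)
	if !ok {
		return out
	}

	resolved := make(map[string]any, len(props))
	for name, raw := range props {
		prop, isObject := raw.(map[string]any)
		if !isObject {
			resolved[name] = raw
			continue
		}
		if ref, ok := prop["$ref"].(string); ok {
			key := strings.TrimPrefix(ref, "#/$defs/")
			if def, ok := defs[key].(map[string]any); ok {
				// Replace the reference with the fully resolved definition.
				resolved[name] = inlineRefs(def, defs)
				continue
			}
		}
		// No resolvable reference: recurse so nested objects are handled too.
		resolved[name] = inlineRefs(prop, defs)
	}
	out["properties"] = resolved
	return out
}
```

Returning a fresh map avoids aliasing the shared definitions, which the in-place approach has to reason about when the same definition is referenced more than once.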

func isSuperTypeInJSON(sourceMetaData, targetMetaData *structpb.Struct) bool {
@@ -56,8 +58,8 @@

	// We only support super type check for draft 2020-12
	if upstreamIsDraft7 || downstreamIsDraft7 {
		return false
	}

Check warning on line 62 (Codecov / codecov/patch): added lines #L61-L62 were not covered by tests.

	copySrcSchema := make(map[string]*structpb.Value)
	copyTgtSchema := make(map[string]*structpb.Value)
@@ -72,11 +74,11 @@

	// For nested Pydantic BaseModel, we need to resolve the reference to compare the schema.
	if _, ok := copySrcSchema["$defs"]; ok {
		resolveRef(copySrcSchema, copySrcSchema["$defs"].GetStructValue().Fields)
	}

Check warning on line 78 (Codecov / codecov/patch): added lines #L77-L78 were not covered by tests.

	if _, ok := copyTgtSchema["$defs"]; ok {
		resolveRef(copyTgtSchema, copyTgtSchema["$defs"].GetStructValue().Fields)
	}

Check warning on line 81 (Codecov / codecov/patch): added lines #L80-L81 were not covered by tests.

	// The JSON schema generated by Pydantic.BaseModel includes a title field in its properties, repeatedly recording the property name.
	// Since this title field is absent in the JSON schema generated for dataclass, we need to remove the title field from the properties to ensure equivalence.
	removeTitleFieldFromProperties(copySrcSchema)
@@ -90,16 +92,16 @@
		// If additionalProperties is false, the field is not present in the schema from Pydantic.BaseModel.
		// We handle this case by checking the relationships ourselves.
		if p.Type != jsondiff.OperationAdd && strings.Contains(p.Path, "additionalProperties") {
			if p.Type == jsondiff.OperationRemove || p.Type == jsondiff.OperationReplace {
				if p.OldValue != false {
					return false
				}

Check warning on line 98 (Codecov / codecov/patch): added lines #L95-L98 were not covered by tests.

			}
		} else if p.Type != jsondiff.OperationAdd {
			return false

Check warning on line 101 (Codecov / codecov/patch): added line #L101 was not covered by tests.

		} else if strings.Contains(p.Path, "required") {
			return false
		}

Check warning on line 104 (Codecov / codecov/patch): added lines #L103-L104 were not covered by tests.

	}
	return true
}
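
The branch structure of the patch loop above can be exercised in isolation. The sketch below is a simplified model, not the PR's code: `op` is a hypothetical stand-in for a JSON-patch operation (the real code uses the `jsondiff` operation type), only the `add`, `remove` and `replace` operation types are modeled, and `patchAllowsSuperset` is an invented name. It accepts a patch only if every operation adds something other than a `required` entry, with the one exception that removing or replacing an `additionalProperties` value of `false` is tolerated.

```go
package main

import "strings"

// op is a hypothetical, simplified stand-in for one JSON-patch operation.
type op struct {
	Type     string // "add", "remove" or "replace"
	Path     string // JSON pointer, for example "/properties/b"
	OldValue any    // value before the change; nil for "add"
}

// patchAllowsSuperset mirrors the three branches of the loop above:
//  1. a non-add operation on additionalProperties passes only if the
//     previous value was false;
//  2. any other non-add operation fails;
//  3. an add operation fails if it touches "required".
func patchAllowsSuperset(patch []op) bool {
	for _, p := range patch {
		switch {
		case p.Type != "add" && strings.Contains(p.Path, "additionalProperties"):
			if p.OldValue != false {
				return false
			}
		case p.Type != "add":
			return false
		case strings.Contains(p.Path, "required"):
			return false
		}
	}
	return true
}
```

Under this model, a patch that only adds `/properties/b` is accepted, while one that removes `/properties/a` or adds an entry under `/required` is rejected.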
@@ -112,8 +114,8 @@

	// If the schema version is different, we can't compare them.
	if upstreamIsDraft7 != downstreamIsDraft7 {
		return false
	}

Check warning on line 118 (Codecov / codecov/patch): added lines #L117-L118 were not covered by tests.

	copySrcSchema := make(map[string]*structpb.Value)
	copyTgtSchema := make(map[string]*structpb.Value)