feat(substrait): add set operations to consumer, update substrait to 0.45.0
#12863
@@ -764,6 +768,7 @@ pub fn operator_to_name(op: Operator) -> &'static str {
}
}

#[allow(deprecated)]
Is this #[allow(deprecated)] required because of changes in the substrait dependency?
I think so. The grouping_expressions field was deprecated in the proto definitions, but should still be used for some time so as not to break backwards compatibility. Maybe there's some other way; I just saw similar deprecated markers and followed suit.
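For readers unfamiliar with the marker, here is a minimal sketch of why it is needed. The `Grouping` struct below is a hypothetical, hand-written stand-in for a generated protobuf message (it is not the real type from the substrait crate), and `legacy_grouping` and `legacy_len` are illustrative helper names. Any code that reads or writes a field marked `#[deprecated]` triggers the `deprecated` lint, so a producer that still fills the old field has to opt out explicitly:

```rust
/// Hypothetical stand-in for a generated message; NOT the real substrait type.
pub struct Grouping {
    /// Old field, kept so that existing consumers can still read the plan.
    #[deprecated(note = "use `expression_references` instead")]
    pub grouping_expressions: Vec<String>,
    /// New field: indices into a list owned by the enclosing relation.
    pub expression_references: Vec<u32>,
}

/// Builds a grouping the old way. Without the attribute, writing to
/// `grouping_expressions` would emit a deprecation warning.
#[allow(deprecated)]
pub fn legacy_grouping(exprs: Vec<String>) -> Grouping {
    Grouping {
        grouping_expressions: exprs,
        expression_references: vec![],
    }
}

/// Reading the deprecated field needs the same opt-out.
#[allow(deprecated)]
pub fn legacy_len(g: &Grouping) -> usize {
    g.grouping_expressions.len()
}
```

Scoping the attribute to the individual functions that touch the old field, rather than to a whole module, keeps the lint active everywhere else.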
Yeah, it's not a rush, and might be better to stick with the old approach for now for producers until consumers have had a chance to catch up.
The new approach puts the grouping_expressions in the AggregateRel (so in to_substrait_agg_measure) and not in the Grouping. Instead, the Grouping has a list of indices into the AggregateRel's grouping_expressions.
E.g. instead of...
AggregateRel = {
"groupings": [
{ "grouping_expressions": [expr_1, expr_2] }
]
}
You would have...
AggregateRel = {
"grouping_expressions": [expr_1, expr_2],
"groupings": [
{ "expression_references": [0, 1] }
]
}
This makes it easier to recognize something like a rollup:
AggregateRel = {
"grouping_expressions": [expr_1, expr_2],
"groupings": [
{ "expression_references": [0, 1] },
{ "expression_references": [0] },
{ "expression_references": [] }
]
}
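To make the index-based layout concrete, here is a rough sketch in Rust. The `AggregateRel` and `Grouping` types below are simplified, hypothetical stand-ins (the generated types in the substrait crate carry more fields), and `resolve_groupings` and `is_rollup` are illustrative names, not existing DataFusion functions. The first helper expands the references back into per-grouping expression lists (the old layout); the second checks for the rollup pattern shown above, assuming the groupings are listed from most to least specific:

```rust
/// Hypothetical, simplified stand-in for a Substrait grouping.
#[derive(Debug, Clone, PartialEq)]
pub struct Grouping {
    /// Indices into `AggregateRel::grouping_expressions`.
    pub expression_references: Vec<u32>,
}

/// Hypothetical, simplified stand-in for a Substrait aggregate relation.
#[derive(Debug, Clone, PartialEq)]
pub struct AggregateRel<E> {
    pub grouping_expressions: Vec<E>,
    pub groupings: Vec<Grouping>,
}

/// Expands every grouping's references into the expressions they point at.
/// Returns `None` if any reference is out of range.
pub fn resolve_groupings<E: Clone>(rel: &AggregateRel<E>) -> Option<Vec<Vec<E>>> {
    rel.groupings
        .iter()
        .map(|g| {
            g.expression_references
                .iter()
                .map(|&i| rel.grouping_expressions.get(i as usize).cloned())
                .collect::<Option<Vec<E>>>()
        })
        .collect()
}

/// Returns true if the groupings are [0..n), [0..n-1), ..., [] in that order,
/// i.e. each grouping drops the last expression of the previous one.
pub fn is_rollup<E>(rel: &AggregateRel<E>) -> bool {
    let n = rel.grouping_expressions.len();
    rel.groupings.len() == n + 1
        && rel.groupings.iter().enumerate().all(|(k, g)| {
            g.expression_references
                .iter()
                .copied()
                .eq(0..(n - k) as u32)
        })
}
```

With the old layout, the same rollup check would have to compare full expression trees across groupings; with shared indices it reduces to comparing small integer lists.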
I think so.. grouping_expressions field was deprecated in the proto definitions, but should still be used for some time not to break backwards-compatibility.
+1 for retaining both for backwards compatibility.
I've created #12957 to track that we deferred implementing this.
Looks good to me @tokoko -- thank you 🙏 Well commented and tested
cc @Blizzara @vbarua and @westonpace
Overall, looks reasonable to me. Left some minor comments.
0.44.0
0.45.0
Awesome -- thank you @tokoko
Rationale for this change
Adds support for set operations to the Substrait consumer, except for Multiset Minus (still unsure what that one does exactly...).
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?
Yes