-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move sum test to slt or optimizer_integration #10807
Conversation
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@@ -861,105 +861,6 @@ mod test { | |||
assert_eq!(expected, formatted_plan); | |||
} | |||
|
|||
#[test] | |||
fn id_array_visitor() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand why you moved these tests to the core crate (as they rely on aggregate functions that will soon not be direct dependencies on datafusion-optimizer
However, with this change now the tests for individual passes may be spread in two places (core and optimizer) so evaluating test coverage will be harder. I think we should try to keep all the tests together
Also, I think it is quite nice that the tests are with each optimizer pass (I found it quite helpful when I was working on avoiding as many copies in the optimizer).
Options I can see:
- Move all the other optimizer tests over to
datafusion/core
- find a way to leave the tests in
datafusion-optimizer
One idea I had about keeping the tests in datafusion-optimizer
would be to add stu aggregate functions to the optimizer crate (for example we could add a sum()
AggregateUDF
that had the same signature as sum but didn't actually run (would panic if we tried to create an accumulator, etc)?
The downside is that then there is a danger that the stubs don't behave the same way as the actual functions and there are bugs / limitations when they are used together...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example we could add a sum() AggregateUDF that had the same signature as sum
Introduce another similar function for test only increase the maintain cost unless they are not changed frequently 🤔 I also prefer that the tests stay close to the code, maybe it is a acceptable tradeoff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, there are tradeoffs both way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update is that @jayzhan211 prototyped the stub approach here #10816
Which issue does this PR close?
Part of #10731
Rationale for this change
What changes are included in this PR?
I move the test to slt if possible and trivial. Others are moved to core/tests
Are these changes tested?
Are there any user-facing changes?