Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move sum test to slt or optimizer_integration #10807

Closed
wants to merge 10 commits into from

Conversation

jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Jun 5, 2024

Which issue does this PR close?

Part of #10731

Rationale for this change

What changes are included in this PR?

I move the test to slt if possible and trivial. Others are moved to core/tests

Are these changes tested?

Are there any user-facing changes?

Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Jun 5, 2024
@jayzhan211 jayzhan211 marked this pull request as ready for review June 6, 2024 11:37
@@ -861,105 +861,6 @@ mod test {
assert_eq!(expected, formatted_plan);
}

#[test]
fn id_array_visitor() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand why you moved these tests to the core crate (as they rely on aggregate functions that will soon not be direct dependencies on datafusion-optimizer

However, with this change now the tests for individual passes may be spread in two places (core and optimizer) so evaluating test coverage will be harder. I think we should try to keep all the tests together

Also, I think it is quite nice that the tests are with each optimizer pass (I found it quite helpful when I was working on avoiding as many copies in the optimizer).

Options I can see:

  1. Move all the other optimizer tests over to datafusion/core
  2. find a way to leave the tests in datafusion-optimizer

One idea I had about keeping the tests in datafusion-optimizer would be to add stu aggregate functions to the optimizer crate (for example we could add a sum() AggregateUDF that had the same signature as sum but didn't actually run (would panic if we tried to create an accumulator, etc)?

The downside is that then there is a danger that the stubs don't behave the same way as the actual functions and there are bugs / limitations when they are used together...

Copy link
Contributor Author

@jayzhan211 jayzhan211 Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example we could add a sum() AggregateUDF that had the same signature as sum

Introduce another similar function for test only increase the maintain cost unless they are not changed frequently 🤔 I also prefer that the tests stay close to the code, maybe it is a acceptable tradeoff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there are tradeoffs both way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update is that @jayzhan211 prototyped the stub approach here #10816

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants