Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-50377][SQL] Allow to evaluate foldable RuntimeReplaceable #48912

Closed
wants to merge 2 commits into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This is to fix a regression caused by #47143 . The problem is, in some places, we want to get a constant from a foldable expression before the query execution starts. #47143 brings two problems:

  1. UnaryPositive is no longer a UnaryExpression, which means it's not foldable anymore even if its child is foldable.
  2. UnaryPositive is no longer evaluable.

Lag is such a place. It may evaluate the inputOffset parameter eagerly. lag(..., +1) no longer works after #47143 .

Instead of fixing Lag, this PR makes two changes and hopefully we can avoid all similar problems:

  1. Make UnaryPositive extend UnaryExpression again. We need follow-up PRs to check other RuntimeReplaceable expressions and see if they should extend UnaryExpression or BinaryExpression, etc.
  2. Implement RuntimeReplaceable#eval so that we can evaluate folding RuntimeReplaceable eagerly when needed.

Why are the changes needed?

Fix the regression on lag and avoid similar issues in the future.

Does this PR introduce any user-facing change?

No, the regression is not released yet.

How was this patch tested?

new test

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Nov 21, 2024
@cloud-fan
Copy link
Contributor Author

cc @yaooqinn @LuciferYang

final override def eval(input: InternalRow = null): Any = {
// For convenience, we allow to evaluate `RuntimeReplaceable` expressions, in case we need to
// get a constant from foldable expression before the query execution starts.
assert(input == null)
Copy link
Member

@MaxGekk MaxGekk Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you clarify why do you require null instead of just bypass input to replacement.eval()

Copy link
Contributor

@LuciferYang LuciferYang Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the premise here must be that input == null, then would it be better to throw a meaningful QueryExecutionError instead of an AssertionError when input is not null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally people should never eagerly evaluate a RuntimeReplaceable, the only case I see is evaluating foldable expressions. I think it's better to only allow it, to make sure that there is no RuntimeReplaceable leaked to the runtime execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use assert as this means bug, not a user error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use assert as this means bug, not a user error.

Got it,thanks

@cloud-fan
Copy link
Contributor Author

thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in 0fbd34c Nov 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants