-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-6691] Query with QUALIFY on CTE generates a plan that references wrong columns #4061
base: main
Are you sure you want to change the base?
Conversation
Also add support for a QUALIFY clause in a query with a GROUP BY. In this case, the QUALIFY expression may reference aggregate functions. Previously such queries would give a ClassCastException (trying to convert a LogicalAggregate to a LogicalProject). If a QUALIFY expression references or duplicates an expression in the SELECT clause, we no longer detect and deduplicate that. This has made one or two plans more verbose. Potentially we would add back deduplication. Also converted a few collections from SqlNodeList to List<SqlNode>. Wrapping the lists as a SqlNode was not buying us much. Added method `SqlBasicVisitor.visitAll(List<SqlNode>)`. Close apache#4061
a0daaba
to
cdff494
Compare
Also add support for a QUALIFY clause in a query with a GROUP BY. In this case, the QUALIFY expression may reference aggregate functions. Previously such queries would give a ClassCastException (trying to convert a LogicalAggregate to a LogicalProject). If a QUALIFY expression references or duplicates an expression in the SELECT clause, we no longer detect and deduplicate that. This has made one or two plans more verbose. Potentially we would add back deduplication. Also converted a few collections from SqlNodeList to List<SqlNode>. Wrapping the lists as a SqlNode was not buying us much. Added method `SqlBasicVisitor.visitAll(List<SqlNode>)`. Close apache#4061
cdff494
to
295ea9d
Compare
Quality Gate passedIssues Measures |
*/ | ||
protected void convertAgg(Blackboard bb, SqlSelect select, | ||
List<SqlNode> orderExprList) { | ||
List<SqlNode> orderExprList, List<SqlNode> extraList) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe we call these orderByExpressions
and quailfyExpressions
?
LogicalFilter(condition=[$5]) | ||
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2], RANK_VAL=[$3], EXPR$0=[$4], QualifyExpression=[=($3, $4)]) | ||
LogicalFilter(condition=[$4]) | ||
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7], RANK_VAL=[RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC)], QualifyExpression=[=(RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC), $9)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a performance regression, since we are computing the window function multiple times? I believe this is the scenario that the comment "This is a very specific application of Common Subexpression Elimination" was trying to handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made note of this regression, and I'd rather it had not happened, but I don't view it as very serious. Project
(unlike Calc
) is not able to represent common subexpressions, so the appearance of duplicate expressions in Project
does not mean that they are going to be executed separately.
A reasonable planner configuration will do subexpression elimination. Sql-to-rel shouldn't try too hard to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
ImmutableList.of(addAlias(qualify, "QualifyExpression")), null); | ||
bb.setRoot( | ||
relBuilder.push(bb.root()) | ||
.filter(last(relBuilder.fields()))// filter on last column |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a neat way of doing this :)
LogicalFilter(condition=[$5]) | ||
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2], RANK_VAL=[$3], EXPR$0=[$4], QualifyExpression=[=($3, $4)]) | ||
LogicalFilter(condition=[$4]) | ||
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7], RANK_VAL=[RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC)], QualifyExpression=[=(RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC), $9)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the JIRA mentioned something about the ProjectMerge logic in RelBuilder not handling the indexes in the window function. Did that ever get resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. The cause of this bug seems simple - the qualify expression needs to be translated relative to the inputs to the SELECT clause, not relative to the outputs - and when I did that, the query seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I that makes sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making this fix!
No description provided.