Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6691] Query with QUALIFY on CTE generates a plan that references wrong columns #4061

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

julianhyde
Copy link
Contributor

No description provided.

julianhyde added a commit to julianhyde/calcite that referenced this pull request Nov 22, 2024
Also add support for a QUALIFY clause in a query with a GROUP BY. In
this case, the QUALIFY expression may reference aggregate functions.
Previously such queries would give a ClassCastException (trying to
convert a LogicalAggregate to a LogicalProject).

If a QUALIFY expression references or duplicates an expression in the
SELECT clause, we no longer detect and deduplicate that. This has
made one or two plans more verbose. Potentially we would add back
deduplication.

Also converted a few collections from SqlNodeList to List<SqlNode>.
Wrapping the lists as a SqlNode was not buying us much. Added
method `SqlBasicVisitor.visitAll(List<SqlNode>)`.

Close apache#4061
Also add support for a QUALIFY clause in a query with a GROUP BY. In
this case, the QUALIFY expression may reference aggregate functions.
Previously such queries would give a ClassCastException (trying to
convert a LogicalAggregate to a LogicalProject).

If a QUALIFY expression references or duplicates an expression in the
SELECT clause, we no longer detect and deduplicate that. This has
made one or two plans more verbose. Potentially we would add back
deduplication.

Also converted a few collections from SqlNodeList to List<SqlNode>.
Wrapping the lists as a SqlNode was not buying us much. Added
method `SqlBasicVisitor.visitAll(List<SqlNode>)`.

Close apache#4061
Copy link

sonarcloud bot commented Nov 22, 2024

*/
protected void convertAgg(Blackboard bb, SqlSelect select,
List<SqlNode> orderExprList) {
List<SqlNode> orderExprList, List<SqlNode> extraList) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we call these orderByExpressions and quailfyExpressions?

LogicalFilter(condition=[$5])
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2], RANK_VAL=[$3], EXPR$0=[$4], QualifyExpression=[=($3, $4)])
LogicalFilter(condition=[$4])
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7], RANK_VAL=[RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC)], QualifyExpression=[=(RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC), $9)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a performance regression, since we are computing the window function multiple times? I believe this is the scenario that the comment "This is a very specific application of Common Subexpression Elimination" was trying to handle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made note of this regression, and I'd rather it had not happened, but I don't view it as very serious. Project (unlike Calc) is not able to represent common subexpressions, so the appearance of duplicate expressions in Project does not mean that they are going to be executed separately.

A reasonable planner configuration will do subexpression elimination. Sql-to-rel shouldn't try too hard to do that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

ImmutableList.of(addAlias(qualify, "QualifyExpression")), null);
bb.setRoot(
relBuilder.push(bb.root())
.filter(last(relBuilder.fields()))// filter on last column

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a neat way of doing this :)

LogicalFilter(condition=[$5])
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2], RANK_VAL=[$3], EXPR$0=[$4], QualifyExpression=[=($3, $4)])
LogicalFilter(condition=[$4])
LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7], RANK_VAL=[RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC)], QualifyExpression=[=(RANK() OVER (PARTITION BY $1 ORDER BY $7 DESC), $9)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the JIRA mentioned something about the ProjectMerge logic in RelBuilder not handling the indexes in the window function. Did that ever get resolved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. The cause of this bug seems simple - the qualify expression needs to be translated relative to the inputs to the SELECT clause, not relative to the outputs - and when I did that, the query seems to work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I that makes sense to me

Copy link

@bchong95 bchong95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making this fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants