Update TSQL grammar so that INTERSECT precedence is handled there #1259

Draft
wants to merge 1 commit into
base: main
@@ -2789,14 +2789,21 @@ predicate
;

queryExpression
: unionExpression sqlIntersection*
Contributor

I suspect this rule and the ones below can be greatly simplified with something like:

queryExpression
 : querySpecification
 | queryExpression UNION ALL queryExpression
 | queryExpression INTERSECT queryExpression
etc...

Contributor Author

@ericvergnaud: This is interesting. I don't fully understand why the grammar is as it is. I already came up with this, which also seems to be correct and is closer to what I would think of as the 'classical' way to express these rules:

queryExpression
    : LPAREN queryExpression RPAREN
    | queryExpression (UNION ALL? | EXCEPT) queryExpression
    | queryExpression INTERSECT queryExpression
    | querySpecification
    ;

Are there performance/implementation reasons for preferring the existing folding (??) style over this?
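
For comparison, the "folding" style of the existing rules (`querySpecification sqlUnion*`) gives the visitor a head operand plus a flat list of (operator, operand) pairs, and the visitor then has to nest them itself. A minimal Python sketch of that fold, assuming a hypothetical tuple-based tree (`fold_left` and the tuple shape are illustrative, not the project's real IR):

```python
from functools import reduce

def fold_left(head, tail):
    """Fold `head (op operand)*` into a left-nested (op, left, right) tree."""
    # Each step wraps the tree built so far as the left operand of the next operator.
    return reduce(lambda left, pair: (pair[0], left, pair[1]), tail, head)

# "a UNION b EXCEPT c" in the flat shape a `head tail*` rule produces.
tree = fold_left("a", [("UNION", "b"), ("EXCEPT", "c")])
print(tree)  # ('EXCEPT', ('UNION', 'a', 'b'), 'c')
```

With a single left-recursive rule the parser builds this nesting directly, so the fold disappears from the visitor.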

Contributor

@jimidle jimidle Nov 29, 2024

This is probably correct. Precedence is generally best handled in a single rule, where it is then explicit; label the alternatives and the visitor becomes simpler as well.

Let me look at the existing grammar
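
As a rough illustration of what "precedence in one rule" has to produce, here is a small precedence-climbing parser in Python for the set operators. It is a toy sketch, not the ANTLR-generated parser: the `PRECEDENCE` table, `tokenize`, and `parse` are hypothetical names, and single words stand in for full `SELECT` statements. Note that in an ANTLR 4 left-recursive rule, alternatives listed earlier bind more tightly, so the grammar equivalent of this table would presumably list the `INTERSECT` alternative ahead of the `UNION`/`EXCEPT` one.

```python
import re

# Higher number binds tighter. T-SQL evaluates INTERSECT before UNION / EXCEPT,
# and UNION / EXCEPT left to right.
PRECEDENCE = {"UNION": 1, "UNION ALL": 1, "EXCEPT": 1, "INTERSECT": 2}

def tokenize(sql):
    # Toy lexer: parentheses, the two-word UNION ALL operator, or a bare word.
    raw = re.findall(r"\(|\)|UNION\s+ALL\b|\w+", sql)
    return [" ".join(tok.split()) for tok in raw]

def parse_primary(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse(tokens)
        tokens.pop(0)  # consume the matching ")"
        return inner
    return tok  # a word standing in for a querySpecification

def parse(tokens, min_prec=1):
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Requiring a strictly higher precedence on the right keeps operators left-associative.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left

print(parse(tokenize("a UNION b INTERSECT c")))
# ('UNION', 'a', ('INTERSECT', 'b', 'c'))
```

Parentheses override the table: `parse(tokenize("(a UNION b) INTERSECT c"))` nests the `UNION` under the `INTERSECT` instead.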

Contributor Author

@asnare asnare Nov 29, 2024

Thanks @jimidle. @vil1 indeed suggested:

queryExpression
    : LPAREN queryExpression RPAREN                                         #inParen
    | left = queryExpression (UNION ALL? | EXCEPT) right = queryExpression  #union
    | left = queryExpression INTERSECT right = queryExpression              #intersect
    | querySpecification                                                    #simple
    ;

Contributor

@jimidle jimidle Nov 29, 2024

Generally, don't use the labels left/right, as they generate more code; ctx.queryExpression(0) and ctx.queryExpression(1) are more efficient. See the README about working with the ANTLR grammar: labels are useful for disambiguation, but even then splitting rules or labeling the alternatives may be better.
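
To make the indexed-accessor idea concrete, here is a small Python sketch of a visitor that picks operands by position rather than through `left=`/`right=` labels. `QueryExpressionContext` below is a hypothetical stand-in written for this example, not the class ANTLR generates; only the `queryExpression(i)` call shape is taken from the comment above, and the alternative names reuse the `#inParen`, `#union`, `#intersect`, and `#simple` labels suggested earlier.

```python
class QueryExpressionContext:
    """Hypothetical stand-in for a generated parse-tree context."""

    def __init__(self, alt, children):
        self.alt = alt            # which labelled alternative matched, e.g. "union"
        self.children = children  # sub-contexts and token text, in source order

    def queryExpression(self, i):
        # Indexed accessor: the i-th child that is itself a queryExpression.
        subs = [c for c in self.children if isinstance(c, QueryExpressionContext)]
        return subs[i]

def visit(ctx):
    if ctx.alt == "simple":
        return ctx.children[0]
    if ctx.alt == "inParen":
        return visit(ctx.queryExpression(0))
    # "union" / "intersect": operands come from their position, no labels required.
    op = next(c for c in ctx.children if isinstance(c, str))
    return (op, visit(ctx.queryExpression(0)), visit(ctx.queryExpression(1)))

a = QueryExpressionContext("simple", ["a"])
b = QueryExpressionContext("simple", ["b"])
union = QueryExpressionContext("union", [a, "UNION", b])
print(visit(union))  # ('UNION', 'a', 'b')
```

Because each alternative is labelled, the visitor already knows which shape it is looking at, so positional access stays unambiguous.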

Contributor

> @ericvergnaud: This is interesting. I don't fully understand why the grammar is as it is. I already came up with this which also seems to be correct and is closer to what I would think of as the 'classical' way to express these rules:
>
> queryExpression
>     : LPAREN queryExpression RPAREN
>     | queryExpression (UNION ALL? | EXCEPT) queryExpression
>     | queryExpression INTERSECT queryExpression
>     | querySpecification
>     ;
>
> Are there performance/implementation reasons for preferring the existing folding (??) style over this?

BTW, the grammar is like this because it was typed in directly from the MS docs, which don't encapsulate such things. You should have seen it 6 months ago ;). These are the occasions where we get to fix it up.

| LPAREN queryExpression RPAREN (INTERSECT queryExpression)?
;

unionExpression
: querySpecification sqlUnion*
| LPAREN queryExpression RPAREN (UNION ALL? queryExpression)?
;

sqlIntersection
: INTERSECT (unionExpression | (LPAREN queryExpression RPAREN))
;

sqlUnion
// TODO: Handle INTERSECT precedence in the grammar; it has higher precedence than EXCEPT and UNION ALL.
// Reference: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql?view=sql-server-ver16#:~:text=following%20precedence
: (UNION ALL? | EXCEPT | INTERSECT) (querySpecification | (LPAREN queryExpression RPAREN))
: (UNION ALL? | EXCEPT) (querySpecification | (LPAREN queryExpression RPAREN))
;

querySpecification