-
Notifications
You must be signed in to change notification settings - Fork 551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support IGNORE|RESPECT
NULLs clause in window functions
#998
Conversation
tests/sqlparser_snowflake.rs
Outdated
let real_sql = "SELECT col_1, col_2, LAG(col_2) IGNORE NULLS OVER (ORDER BY col_1) FROM t1"; | ||
assert_eq!(snowflake().verified_stmt(real_sql).to_string(), real_sql); | ||
|
||
let first_sql = "SELECT column1, column2, FIRST_VALUE(column2) OVER (PARTITION BY column1 ORDER BY column2 NULLS LAST) AS column2_first FROM t1"; | ||
assert_eq!(snowflake().verified_stmt(first_sql).to_string(), first_sql); | ||
|
||
let sql_only_select = "SELECT LAG(col_2,1,0) OVER (PARTITION BY col_3 ORDER BY col_1)"; | ||
let select = snowflake().verified_only_select(sql_only_select); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this rather extends support for window function options Ignore/Respect NULLs vs snowflake specific, since the options are standard sql (e.g the test first_sql
and sql_select_only
should already be parsed without the MR)? If so it might be reasonable to instead extend the parse_function()
to optionally accept an option where applicable like the following?
enum WindowFunctionOption {
RespectNulls,
IgnoreNulls
}
Since parse_function already contains most of the functionality being added, and prevents window functions from being parsed into potentially different types in the AST which might not be desirable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @iffyio !
I extracted the null clause to an enum, and moved it to parse_function
.
I will appreciate any feedback.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice!
One thing that stood out was the mismatch between the field name and the typename:
nulls_clause: Option<WindowFunctionOption>
was curious and took a quick look and it turns out the option is mostly called null treatment clause - maybe we can rename accordingly and so they're consistent?
null_treatment: Option<NullTreatment>
Other than that I think the changes look good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @iffyio ! I looked for a better name but didn't found one, you did it!
Renamed :)
Pull Request Test Coverage Report for Build 6626368384
💛 - Coveralls |
IGNORE|RESPECT
NULLs clause in window functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks you for the contribution @yuval-illumex -- the basic idea is good but I think it needs a few cleanups
@@ -2259,18 +2263,34 @@ fn parse_agg_with_order_by() { | |||
Box::new(MsSqlDialect {}), | |||
Box::new(AnsiDialect {}), | |||
Box::new(HiveDialect {}), | |||
Box::new(SnowflakeDialect {}), | |||
], | |||
options: None, | |||
}; | |||
|
|||
for sql in [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why were these tests changed?
This PR adds support for window functions (aka those that have an OVER
clause) which is different than the ORDER BY
clause in the aggregate function argument.
I think you should keep the existing tests and add a new one for the IGNORE NULLS / RESPECT NULLs syntax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @alamb .
Please look at the syntax of FIRST_VALUE
(for example) in some of the dialects:
https://docs.snowflake.com/en/sql-reference/functions/first_value
https://learn.microsoft.com/en-us/sql/t-sql/functions/first-value-transact-sql?view=sql-server-ver16
https://www.postgresqltutorial.com/postgresql-window-function/postgresql-first_value-function/
The Over
Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x)
is not valid.
Even if the tests pass, those tests are not aligned with the syntax.
That's why I added two arrays of dialects - for each dialect his supported syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Over Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x) is not valid.
I think that is true when the FIRST_VALUE function is being used as a window function
Some systems allow FIRST_VALUE
to be used as normal aggregate functions -- so like
SELECT
FIRST_VALUE(amount ORDER BY time),
LAST_VALUE(amount ORDER BY time)
FROM
t
GROUP BY
currency
Perhaps @mustafasrepo can offer some perspective as the author of #882
But basically I think we should retain these tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually FIRST_VALUE
and LAST_VALUE
can be used as window or aggregate function (similar to SUM
, MIN
, etc.). In these tests, they are used as aggregate function. Hence these tests should remain as is.
But having the window tests FIRST_VALUE
and LAST_VALUE
is also helpful. I think you shouldn't modify existing test, and add your test a a new test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alamb and @mustafasrepo :) Learned something about these functions :)
I maintained the old tests
Thank you for the review @iffyio -- I appreciate it |
Thank you @alamb for your feedback! Please see my comment on the tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yuval-illumex -- we are getting close I think
@@ -2259,18 +2263,34 @@ fn parse_agg_with_order_by() { | |||
Box::new(MsSqlDialect {}), | |||
Box::new(AnsiDialect {}), | |||
Box::new(HiveDialect {}), | |||
Box::new(SnowflakeDialect {}), | |||
], | |||
options: None, | |||
}; | |||
|
|||
for sql in [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Over Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x) is not valid.
I think that is true when the FIRST_VALUE function is being used as a window function
Some systems allow FIRST_VALUE
to be used as normal aggregate functions -- so like
SELECT
FIRST_VALUE(amount ORDER BY time),
LAST_VALUE(amount ORDER BY time)
FROM
t
GROUP BY
currency
Perhaps @mustafasrepo can offer some perspective as the author of #882
But basically I think we should retain these tests
Co-authored-by: Andrew Lamb <[email protected]>
Thanks @alamb - it's ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thanks again @yuval-illumex and @mustafasrepo
Co-authored-by: Andrew Lamb <[email protected]>
Adding Support for some Rank function of Snowflake:
FIRST_VALUE, LAG, LAST_VALUE,LEAD