Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support IGNORE|RESPECT NULLs clause in window functions #998

Merged
merged 14 commits into from
Oct 24, 2023

Conversation

yuval-illumex
Copy link
Contributor

Adding Support for some Rank function of Snowflake:
FIRST_VALUE, LAG, LAST_VALUE,LEAD

Comment on lines 1070 to 1077
let real_sql = "SELECT col_1, col_2, LAG(col_2) IGNORE NULLS OVER (ORDER BY col_1) FROM t1";
assert_eq!(snowflake().verified_stmt(real_sql).to_string(), real_sql);

let first_sql = "SELECT column1, column2, FIRST_VALUE(column2) OVER (PARTITION BY column1 ORDER BY column2 NULLS LAST) AS column2_first FROM t1";
assert_eq!(snowflake().verified_stmt(first_sql).to_string(), first_sql);

let sql_only_select = "SELECT LAG(col_2,1,0) OVER (PARTITION BY col_3 ORDER BY col_1)";
let select = snowflake().verified_only_select(sql_only_select);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this rather extends support for window function options Ignore/Respect NULLs vs snowflake specific, since the options are standard sql (e.g the test first_sql and sql_select_only should already be parsed without the MR)? If so it might be reasonable to instead extend the parse_function() to optionally accept an option where applicable like the following?

enum WindowFunctionOption {
	RespectNulls,
	IgnoreNulls
}

Since parse_function already contains most of the functionality being added, and prevents window functions from being parsed into potentially different types in the AST which might not be desirable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @iffyio !
I extracted the null clause to an enum, and moved it to parse_function.
I will appreciate any feedback.

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice!

One thing that stood out was the mismatch between the field name and the typename:

nulls_clause: Option<WindowFunctionOption>

was curious and took a quick look and it turns out the option is mostly called null treatment clause - maybe we can rename accordingly and so they're consistent?

null_treatment: Option<NullTreatment>

Other than that I think the changes look good!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iffyio ! I looked for a better name but didn't found one, you did it!
Renamed :)

@coveralls
Copy link

coveralls commented Oct 12, 2023

Pull Request Test Coverage Report for Build 6626368384

  • 61 of 62 (98.39%) changed or added relevant lines in 12 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.04%) to 87.415%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/parser/mod.rs 10 11 90.91%
Totals Coverage Status
Change from base Build 6626248059: 0.04%
Covered Lines: 17066
Relevant Lines: 19523

💛 - Coveralls

@alamb alamb changed the title Snowflake - Add Rank Function Support Support IGNORE|RESPECT NULLs clause in window functions Oct 20, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you for the contribution @yuval-illumex -- the basic idea is good but I think it needs a few cleanups

test.sql Outdated Show resolved Hide resolved
src/ast/mod.rs Show resolved Hide resolved
src/ast/mod.rs Show resolved Hide resolved
@@ -2259,18 +2263,34 @@ fn parse_agg_with_order_by() {
Box::new(MsSqlDialect {}),
Box::new(AnsiDialect {}),
Box::new(HiveDialect {}),
Box::new(SnowflakeDialect {}),
],
options: None,
};

for sql in [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were these tests changed?

This PR adds support for window functions (aka those that have an OVER clause) which is different than the ORDER BY clause in the aggregate function argument.

I think you should keep the existing tests and add a new one for the IGNORE NULLS / RESPECT NULLs syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @alamb .
Please look at the syntax of FIRST_VALUE (for example) in some of the dialects:
https://docs.snowflake.com/en/sql-reference/functions/first_value
https://learn.microsoft.com/en-us/sql/t-sql/functions/first-value-transact-sql?view=sql-server-ver16
https://www.postgresqltutorial.com/postgresql-window-function/postgresql-first_value-function/

The Over Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x) is not valid.
Even if the tests pass, those tests are not aligned with the syntax.
That's why I added two arrays of dialects - for each dialect his supported syntax.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Over Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x) is not valid.

I think that is true when the FIRST_VALUE function is being used as a window function

Some systems allow FIRST_VALUE to be used as normal aggregate functions -- so like

SELECT 
  FIRST_VALUE(amount ORDER BY time), 
  LAST_VALUE(amount ORDER BY time)
FROM 
  t
GROUP BY
  currency

Perhaps @mustafasrepo can offer some perspective as the author of #882

But basically I think we should retain these tests

Copy link
Contributor

@mustafasrepo mustafasrepo Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually FIRST_VALUE and LAST_VALUE can be used as window or aggregate function (similar to SUM, MIN, etc.). In these tests, they are used as aggregate function. Hence these tests should remain as is.
But having the window tests FIRST_VALUE and LAST_VALUE is also helpful. I think you shouldn't modify existing test, and add your test a a new test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alamb and @mustafasrepo :) Learned something about these functions :)
I maintained the old tests

tests/sqlparser_snowflake.rs Outdated Show resolved Hide resolved
tests/sqlparser_snowflake.rs Outdated Show resolved Hide resolved
@alamb
Copy link
Contributor

alamb commented Oct 20, 2023

Thank you for the review @iffyio -- I appreciate it

@yuval-illumex
Copy link
Contributor Author

Thank you @alamb for your feedback! Please see my comment on the tests

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yuval-illumex -- we are getting close I think

@@ -2259,18 +2263,34 @@ fn parse_agg_with_order_by() {
Box::new(MsSqlDialect {}),
Box::new(AnsiDialect {}),
Box::new(HiveDialect {}),
Box::new(SnowflakeDialect {}),
],
options: None,
};

for sql in [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Over Keyword is a must. so the syntax FIRST_VALUE(x ORDER BY x) is not valid.

I think that is true when the FIRST_VALUE function is being used as a window function

Some systems allow FIRST_VALUE to be used as normal aggregate functions -- so like

SELECT 
  FIRST_VALUE(amount ORDER BY time), 
  LAST_VALUE(amount ORDER BY time)
FROM 
  t
GROUP BY
  currency

Perhaps @mustafasrepo can offer some perspective as the author of #882

But basically I think we should retain these tests

src/ast/mod.rs Outdated Show resolved Hide resolved
@yuval-illumex
Copy link
Contributor Author

Thanks @alamb - it's ready for another review

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thanks again @yuval-illumex and @mustafasrepo

@alamb alamb merged commit b89edaa into apache:main Oct 24, 2023
10 checks passed
serprex pushed a commit to serprex/sqlparser-rs that referenced this pull request Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants