-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for TABLESAMPLE #1580
Conversation
789e1db
to
84b58d7
Compare
src/ast/query.rs
Outdated
sample: Option<Box<TableSample>>, | ||
/// Position of the table sample modifier in the table factor. Default is after the table alias | ||
/// e.g. `SELECT * FROM tbl t TABLESAMPLE (10 ROWS)`. See `Dialect::supports_table_sample_before_alias`. | ||
sample_before_alias: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use an enum here? e.g
enum TableSampleKind {
BeforeTableAlias(Box<TableSample>)
// ...
}
sample: Option<TableSampleKind>
thinking that avoids the surplus flag on the table factor and would lend itself to be extensible if required later on
src/parser/mod.rs
Outdated
self.expect_token(&Token::RParen)?; | ||
TableSample::Bucket(TableSampleBucket { bucket, total, on }) | ||
} else { | ||
let value = match self.try_parse(|p| p.parse_number_value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the try_parse rather be maybe_parse (we seem to be ignoring the returned error otherwise)?
src/parser/mod.rs
Outdated
let value = match self.maybe_parse(|p| p.parse_number_value()) { | ||
Ok(Some(num)) => num, | ||
_ => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let value = match self.maybe_parse(|p| p.parse_number_value()) { | |
Ok(Some(num)) => num, | |
_ => { | |
let value = match self.maybe_parse(|p| p.parse_number_value())? { | |
Some(num) => num, | |
None => { |
I think this is usually the recursion limit or similar fatal error we can propagate
src/parser/mod.rs
Outdated
if self.peek_token().token == Token::RParen | ||
&& !self.dialect.supports_implicit_table_sample_method() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this could be simplified as if self.dialect.supports_implicit_table_sample_method() && self.consume_token(Token::RParen)
it would also let us skip the expect_token that follows as well?
src/parser/mod.rs
Outdated
repeatable: seed, | ||
}) | ||
// Try to parse without an explicit table sample method keyword | ||
} else if self.peek_token().token == Token::LParen { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} else if self.peek_token().token == Token::LParen { | |
} else if self.consume_token(Token::LParen) { |
tests/sqlparser_snowflake.rs
Outdated
"SELECT * FROM testtable SAMPLE (10)", | ||
"SELECT * FROM testtable TABLESAMPLE BERNOULLI (10)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the scenarios that currently rely on one_statement_parse_to
, could we represent them faithfully when displaying? e.g this and the ROW
vs BERNOULLI
variants etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I understand the question, but if you mean whether the variants I added as interchangeable are really interchangeable in the dialect, as far as I understand yes. If not, please elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah so I was rather thinking that ideally we preserve the syntax roundtrip, even though they'r interchangeable in some dialects
snowflake_and_generic().verified_stmt("SELECT * FROM testtable TABLESAMPLE ROW (20.3)");
snowflake_and_generic().verified_stmt("SELECT * FROM testtable SAMPLE BLOCK (3) SEED (82)");
@iffyio please see latest commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yoavcloud! Left a comment to distinguish sample vs tablesample. Otherwise this looks good to me!
tests/sqlparser_snowflake.rs
Outdated
snowflake_and_generic().one_statement_parses_to( | ||
"SELECT * FROM testtable SAMPLE (10)", | ||
"SELECT * FROM testtable TABLESAMPLE (10)", | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh could we do the same here as well? clickhouse for example doesn't have has SAMPLE
but not TABLESAMPLE
so that the roundtrip stays correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added support for clickhouse as well, with a different approach to how to model the it in the AST
a0fdfbf
to
ef04565
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @yoavcloud
cc @alamb
This PR adds support for the
TABLESAMPLE
option in the following dialects:Collateral work includes expanding the use of a constructor function to create
Table
structs in unit tests to avoid modifying many files when adding default options to theTable
struct.