-
Notifications
You must be signed in to change notification settings - Fork 556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for BigQuery struct
, array
and bytes
, int64
, float64
datatypes
#1003
Conversation
Pull Request Test Coverage Report for Build 6523097396
💛 - Coveralls |
This builds on top of apache#817 - `STRUCT` literal support via `STRUCT` keyword https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#constructing_a_struct - `STRUCT` and `ARRAY` type declarations https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#declaring_an_array_type https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#declaring_a_struct_type It works around the issue of not being able to parse nested types like `ARRAY<ARRAY<INT>>` due to the right angle bracket ambiguity where the tokenizer chooses the right-shift operator (this affects other dialects like Hive that have similar syntax). When parsing such types, we accept a closing `>` or `>>` and track which variant is in use in order to preserve correctness. Fixes apache#901 Closes apache#817
Note there appears to be a similar PR: #966 |
struct
, array
and bytes
datatype
struct
, array
and bytes
datatypestruct
, array
and bytes
datatype
struct
, array
and bytes
datatypestruct
, array
and bytes
, int64
, float64
datatypes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for this contribution @iffyio 👏 -- it is very impressive both well documented and well tested. I had a few minor comments, but nothing that would prevent merging.
Please let me know if you have time to address the comments this week (I plan to make a release later this week)
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)] | ||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] | ||
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))] | ||
pub enum ArrayElemTypeDef { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
src/parser/mod.rs
Outdated
@@ -816,6 +836,10 @@ impl<'a> Parser<'a> { | |||
Keyword::MATCH if dialect_of!(self is MySqlDialect | GenericDialect) => { | |||
self.parse_match_against() | |||
} | |||
Keyword::STRUCT if dialect_of!(self is BigQueryDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to also support this syntax in the Generic dialect? Rationale is explained in https://github.com/sqlparser-rs/sqlparser-rs#new-syntax
Keyword::STRUCT if dialect_of!(self is BigQueryDialect) => { | |
Keyword::STRUCT if dialect_of!(self is BigQueryDialect | GenericDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
src/parser/mod.rs
Outdated
/// expr [AS name] | ||
/// ``` | ||
/// [1]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#constructing_a_struct | ||
pub fn parse_struct_field_expr(&mut self, typed_syntax: bool) -> Result<Expr, ParserError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be pub
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
Ok(Expr::Struct { values, fields }) | ||
} | ||
|
||
/// Parse an expression value for a bigquery struct [1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a note to the documentation about the meaning of the typed_syntax
parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the description to cover the parameter
Thanks again @iffyio |
…4` datatypes (apache#1003) # Conflicts: # src/ast/mod.rs # src/parser/mod.rs # tests/sqlparser_common.rs # tests/sqlparser_snowflake.rs
This builds on top of #817
STRUCT
literal support viaSTRUCT
keyword https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#constructing_a_structSTRUCT
andARRAY
type declarations https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#declaring_an_array_type https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#declaring_a_struct_typeIt works around the issue of not being able to parse nested types like
ARRAY<ARRAY<INT>>
due to the right angle bracket ambiguity where the tokenizer chooses the right-shift operator (this affects other dialects like Hive that have similar syntax). When parsing such types, we accept a closing>
or>>
and track which variant is in use in order to preserve correctness.Fixes #901
Closes #817