-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add all missing table options to be handled in any order #1747
base: main
Are you sure you want to change the base?
Conversation
src/ast/dml.rs
Outdated
@@ -138,6 +143,30 @@ pub struct CreateTable { | |||
pub engine: Option<TableEngine>, | |||
pub comment: Option<CommentDef>, | |||
pub auto_increment_offset: Option<u32>, | |||
pub key_block_size: Option<u32>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's quite a few of these options and its not clear what dialect they're coming from, I assume there will be more across the various dialects.
I'm thinking to @mvzink's point, it will be good to represent this as a Vec
, also think that we can represent this similar to the existing with_options: Vec<SqlOption>
on this struct (maybe we can just reuse and merge this into that field if it makes sense, otherwise it could be a new field)? essentially that it would be ideal to not have to enumerate and keep in sync all possible table options across dialects but instead always accept them as opaque entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will be taking over this PR, diving into this suggestion:
Currently the CreateTable
struct has a few types of options which are AFAIU mainly dialect driven:
- with_options is formatted as "WITH (....options...)" -> Postgres & other
- options which is formatted as "OPTIONS (....options...)" -> BigQuery
- table_properties which is formatted as "TBLPROPERTIES (....options...)" -> Databricks
Proposed design to support the mysql variant:
- Add another parameter Vec which will include properties which do not require a preceding keyword (WITH/OPTIONS/...), e.g. plain_options
- The Vec will include optional parameters which are currently assigned to the
CreateTable
struct, e.g. engine / collation / ... - Parameter parsing will be delegated to the respective Dialect to avoid clutter in the main flow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a reasonable approach to me. Thanks for pushing this forward! I would suggest that whatever parsing routine/subparser produces that vector of options be written in such a way that it could be used for ALTER TABLE
parsing as well; I believe their syntax is basically the same, at least for MySQL, and that struct should be updated as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add another parameter Vec which will include properties which do not require a preceding keyword (WITH/OPTIONS/...), e.g. plain_options
seeing as there's a few similar parameters, thinking it could be nice to merge the variants into an enum something like
enum CreateTableOptions {
TableProperties(Vec<SqlOption>),
WithOptions(Vec<SqlOption>),
Options(Vec<SqlOption>),
Plain(Vec<SqlOption>),
}
that doesn't have to be in this PR however, we could introduce a new field if it could be a hassle to merge the existing ones.
Parameter parsing will be delegated to the respective Dialect to avoid clutter in the main flow
How do we mean by this part? I imagined it would be reusing self.parse_options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it makes sense to reuse parse_options
, it's designed to parse something in the format of KEYWORD(opt1 , opt2, opt3)
while mysql syntax is opt1 opt2 opt3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of an enum, will handle that on a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iffyio added a commit with merging of all variants into the (pre-existing) CreateTableOptions enum
tests/sqlparser_duckdb.rs
Outdated
@@ -721,6 +721,30 @@ fn test_duckdb_union_datatype() { | |||
comment: Default::default(), | |||
auto_increment_offset: Default::default(), | |||
default_charset: Default::default(), | |||
insert_method: Default::default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use ..Default::default() for all these members?
tests/sqlparser_mssql.rs
Outdated
@@ -1590,6 +1590,30 @@ fn parse_create_table_with_valid_options() { | |||
comment: None, | |||
auto_increment_offset: None, | |||
default_charset: None, | |||
insert_method: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
..Default::default() ?
Marking the PR as draft in the meantime as it is not currently pending review, @benrsatori feel free to undraft and ping when ready! |
a3f2732
to
0cd38de
Compare
0cd38de
to
9d3528d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Wishlist: I would still like to see this used in ALTER TABLE
as well.)
src/parser/mod.rs
Outdated
fn parse_plain_option(&mut self) -> Result<Option<SqlOption>, ParserError> { | ||
if let Some(option) = self.dialect.parse_plain_option(self)? { | ||
return Ok(Some(option)); | ||
} | ||
|
||
// Parse table options in any order | ||
if let Some(keyword) = self.parse_one_of_keywords(&[ | ||
Keyword::ENGINE, | ||
Keyword::AUTO_INCREMENT, | ||
Keyword::DEFAULT, | ||
Keyword::CHARSET, | ||
Keyword::COLLATE, | ||
Keyword::INSERT_METHOD, | ||
]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the motivation for making this a dialect parsing method. Is there a problem with putting all the supported options across all dialects in here? It especially raises my eyebrow to see the two lists have so much overlap, including with (afaik) MySQL-specific options like ENGINE
and AUTO_INCREMENT
in the generic set. If we are going to have Postgres and MSSQL parse those, why not have them parse DATA DIRECTORY
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bigger question if the library should branch out to dialect specific parsing or maintain common code supporting all dialects, looking at SnowflakeDialect
is seems like a specific implementation exists (parse_create_table
) and looking at the common code i can see several places where the code diverges based on dialect (mostly hive related).
Having parse_create_table
as part of the dialect is what guided the decision to pull this code into the dialect, I may have overlooked ENGINE
and AUTO_INCREMENT
, prefer to pull those into mysql code rather than take mysql code out.
LMKWYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that's a bigger question. I don't have a strong suggestion at the moment and this certainly seems servicable to me. My 2¢ on the bigger question is that introducing more dialect overrides (and actually this is more similar to parse_column_option
, which is global, than Snowflake's parse_create_table
which is only called from SnowflakeDialect::parse_statement
, the normal top-level override) is a slippery slope toward fully dynamically dispatched parsing, but it's more justifiable if it's parsing something with a substantially different syntactic structure. However, this is mostly just parsing a different list of acceptable keywords but doing it in the same way. In such cases, I personally think more permissive generic parsing is preferable; e.g. does it really matter if SnowflakeDialect
accepts ROW_FORMAT = COMPACT
even though it doesn't make sense, so long as it doesn't introduce syntactic ambiguity? The exceptions of special parsing routines for a few options like DATA DIRECTORY
and TABLESPACE
are a good argument though. If anything, I would probably favor putting all the not-so-special options in the default implementation and only using the override for these options that require special parsing. Honestly, if it wouldn't introduce ambiguity (mostly due to options not being comma separated), I would probably even prefer the default/fallback routine for non-special options just be "parse ident, optionally parse equals sign, parse expr"—that's all thats's going to end up in the SqlOption::KeyValue
at the end anyway, so why do we care which specific list of idents are acceptable? That is more of a semantic than syntactic question IMO.
src/ast/mod.rs
Outdated
|
||
Union(Vec<Ident>), | ||
|
||
TableEngine(TableEngine), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I haven't checked where else SqlOption
is used, but it really seems like it should be called TableOption
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be used all over ( ColumnOption
, CreateConnector
, CreateSchema
, Cache
& more)
looks like it's mainly used for the key/value variant.
IMO it makes sense to split out into a key/value type and a TableOption type (which has a K/V variant).
I think we should save this for another PR, there is a lot of scope in this one as is.
src/ast/helpers/stmt_create_table.rs
Outdated
pub file_format: Option<FileFormat>, | ||
pub location: Option<String>, | ||
pub query: Option<Box<Query>>, | ||
pub without_rowid: bool, | ||
pub like: Option<ObjectName>, | ||
pub clone: Option<ObjectName>, | ||
pub engine: Option<TableEngine>, | ||
pub comment: Option<CommentDef>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if comment
should be folded in to options as well (with a new SqlOption::Comment(CommentDef)
variant)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the latest commit,
note that the hive dialect still has some specific requirements
src/parser/mod.rs
Outdated
let value = self.next_token(); | ||
|
||
match (keyword, value.token) { | ||
(Keyword::AUTO_INCREMENT, Token::Number(n, l)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is missing the optional =
parsing that was present in the old version (and I believe was correct).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the let _ = self.consume_token(&Token::Eq);
a could of lines earlier or is it something else i've missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I totally missed that!
src/ast/mod.rs
Outdated
TableSpace(TablespaceOption), | ||
|
||
Union(Vec<Ident>), | ||
|
||
TableEngine(TableEngine), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a description (ideally with syntax example) and links to the docs where these variants come from?
this might not be valid given my other comment
src/dialect/mysql.rs
Outdated
//Some consts | ||
const COMPRESSION_OPTS: [&str; 3] = ["ZLIB", "LZ4", "NONE"]; | ||
const INSERT_METHODS_OPTS: [&str; 3] = ["NO", "FIRST", "LAST"]; | ||
|
||
let keyword = match parser.parse_one_of_keywords(&[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to clarify my comment here my thinking is that the options are plain key value pairs and idealy would be parsed as such. So that I imagined the parser shouldnt look for explicit keywords, similar to other dialects like bigquery it only looks to parse a key and a value.
Essentially, that it looks to parse and represent something like this I was thinking
SqlOption::Plain { // Similar to SqlOption::KeyValue except it doesnt format with =
key: Ident,
value: Expr,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it would be better to parse more accurately where is is easy to implement (i.e. only a limited set of values are allowed). it does provide some value and I do not see the downside given that it's in dialect specific code.
LMKWYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside is that there's a bit of code and API to power and maintain in any case, not isolated to the dialect's parsing, the keyword list and the AST are affected as well. But it also implies that we have to model and keep up with all options across all dialects in order to be correct/consistent going forward which would be a bigger ask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I philosophically agree with @iffyio's ask here, as I mention in my other big ranty comment (sorry 😛). However, I am not 100% convinced in practice, mostly because these options aren't comma separated and this could produce some really confusing parses if users make errors. I imagined the default parsing routine would be "parse ident, optionally parse equals sign, parse expr". But suppose a user intended to send ROW_FORMAT COMPACT DEFAULT CHARACTER SET latin1
, but accidentally deleted COMPACT
. Instead of a parse error like they would get if options were comma separated or hardcoded, they get a successful parse with ROW_FORMAT = DEFAULT
followed by CHARACTER SET = latin1
. Is it our fault? Not really. Is it better than the maintenance burden of keeping a list of valid one-word option names up to date? I'm honestly not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that behavior makes sense for the parser to mitigate, maybe we could explicitly track the one word options and multiple keyword options to avoid ambiguity? something like:
fn parse_plain_option(&self, parser) -> Result<Option<SqlOption>> {
if self.parse_keywords(START, TRANSACTION) {
return Ok(Some(SqlOption::Ident("START TRANSACTION")))
} // ...
let key = if self.parse_keywords(DEFAULT, CHARACTER, SET) {
Ident::new("DEFAULT CHARACTER SET")
} // ...
else {
self.parse_ident()?
};
let _ = self.consume_token('=')?;
let value = self.parse_expr()?;
Ok(Some(SqlOption::KeyValue {
key, value
}))
}
In that, the special cases are only a handful, and we're only tracking their keys, values are opaque.
Hmm interesting with the example, I wonder how mysql interprets ROW_FORMAT DEFAULT CHARACTER SET latin1
. Since DEFAULT
seems to also be a valid value for ROW_FORMAT
from their docs
46deb3b
to
f1cacd2
Compare
src/dialect/mysql.rs
Outdated
|
||
// COMPRESSION [=] {'ZLIB' | 'LZ4' | 'NONE'} | ||
(Keyword::COMPRESSION, Token::SingleQuotedString(s)) | ||
if COMPRESSION_OPTS.contains(&s.to_uppercase().as_str()) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine but honestly I would skip this check entirely for reasons in a similar vein to what I mentioned in my other comment about dialect overrides. Since they are all single quoted strings anyway, it's not really a syntactic question as to which compression algorithms are allowed, but a semantic one. What if I were writing a MySQL-compatible DBMS with added compression options? Are you really going to make me override and copy and paste this method when the syntax is identical but the semantic content has changed? It's also a maintainability hazard if MySQL (or any MySQL-compatible system that users are parsing with MySqlDialect
) adds a new compression algorithm. I think I feel the same way about INSERT_METHODS_OPTS
but it's a tad less clear since those are identifiers not strings.
If you are gonna keep these checks, I would inline the arrays instead of defining them at the top of the function, to keep that info local to where it's used.
src/parser/mod.rs
Outdated
fn parse_plain_option(&mut self) -> Result<Option<SqlOption>, ParserError> { | ||
if let Some(option) = self.dialect.parse_plain_option(self)? { | ||
return Ok(Some(option)); | ||
} | ||
|
||
// Parse table options in any order | ||
if let Some(keyword) = self.parse_one_of_keywords(&[ | ||
Keyword::ENGINE, | ||
Keyword::AUTO_INCREMENT, | ||
Keyword::DEFAULT, | ||
Keyword::CHARSET, | ||
Keyword::COLLATE, | ||
Keyword::INSERT_METHOD, | ||
]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that's a bigger question. I don't have a strong suggestion at the moment and this certainly seems servicable to me. My 2¢ on the bigger question is that introducing more dialect overrides (and actually this is more similar to parse_column_option
, which is global, than Snowflake's parse_create_table
which is only called from SnowflakeDialect::parse_statement
, the normal top-level override) is a slippery slope toward fully dynamically dispatched parsing, but it's more justifiable if it's parsing something with a substantially different syntactic structure. However, this is mostly just parsing a different list of acceptable keywords but doing it in the same way. In such cases, I personally think more permissive generic parsing is preferable; e.g. does it really matter if SnowflakeDialect
accepts ROW_FORMAT = COMPACT
even though it doesn't make sense, so long as it doesn't introduce syntactic ambiguity? The exceptions of special parsing routines for a few options like DATA DIRECTORY
and TABLESPACE
are a good argument though. If anything, I would probably favor putting all the not-so-special options in the default implementation and only using the override for these options that require special parsing. Honestly, if it wouldn't introduce ambiguity (mostly due to options not being comma separated), I would probably even prefer the default/fallback routine for non-special options just be "parse ident, optionally parse equals sign, parse expr"—that's all thats's going to end up in the SqlOption::KeyValue
at the end anyway, so why do we care which specific list of idents are acceptable? That is more of a semantic than syntactic question IMO.
src/dialect/mysql.rs
Outdated
//Some consts | ||
const COMPRESSION_OPTS: [&str; 3] = ["ZLIB", "LZ4", "NONE"]; | ||
const INSERT_METHODS_OPTS: [&str; 3] = ["NO", "FIRST", "LAST"]; | ||
|
||
let keyword = match parser.parse_one_of_keywords(&[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I philosophically agree with @iffyio's ask here, as I mention in my other big ranty comment (sorry 😛). However, I am not 100% convinced in practice, mostly because these options aren't comma separated and this could produce some really confusing parses if users make errors. I imagined the default parsing routine would be "parse ident, optionally parse equals sign, parse expr". But suppose a user intended to send ROW_FORMAT COMPACT DEFAULT CHARACTER SET latin1
, but accidentally deleted COMPACT
. Instead of a parse error like they would get if options were comma separated or hardcoded, they get a successful parse with ROW_FORMAT = DEFAULT
followed by CHARACTER SET = latin1
. Is it our fault? Not really. Is it better than the maintenance burden of keeping a list of valid one-word option names up to date? I'm honestly not sure.
83c6333
to
5474eb9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for keeping on with this!
src/parser/mod.rs
Outdated
} else if self.parse_keywords(&[Keyword::CHARACTER, Keyword::SET]) { | ||
// [DEFAULT] CHARACTER SET [=] charset_name | ||
Ident::new("CHARACTER SET") | ||
} else if self.parse_keyword(Keyword::CHARSET) { | ||
// [DEFAULT] CHARACTER SET [=] charset_name | ||
Ident::new("CHARSET") | ||
} else if self.parse_keyword(Keyword::COLLATE) { | ||
// [DEFAULT] CHARACTER SET [=] charset_name | ||
Ident::new("COLLATE") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed
src/parser/mod.rs
Outdated
let key = if self.parse_keywords(&[Keyword::DEFAULT, Keyword::CHARSET]) { | ||
// [DEFAULT] CHARACTER SET [=] charset_name | ||
Ident::new("DEFAULT CHARSET") | ||
} else if self.parse_keywords(&[Keyword::DEFAULT, Keyword::CHARACTER, Keyword::SET]) { | ||
// [DEFAULT] CHARACTER SET [=] charset_name | ||
Ident::new("DEFAULT CHARACTER SET") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR IMO, but I'd be curious to hear your thoughts & @iffyio's: would it be appropriate to add another SqlOption
variant like TableSpace
and TableEngine
for a few of these options to make inspecting them easier for users? I'm specifically thinking of [DEFAULT] {CHARSET | CHARACTER SET}
because as it is if I wanted to extract charset info from DDL, I would have to check for 4 different hardcoded strings which aren't represented in the public API (i.e. there's no DEFAULT_CHARACTER_SET_IDENT
or a set of idents I could expect to see). Does the fact that it has 4 possible representations make it worth special casing with a SqlOption::CharSet { default, shorthand, value }
? Since I will in fact need to inspect character set info, I may post a PR for that.
For this PR though, to at least make it harder to miss one of those 4 variants, can we put all the charset related branches next to each other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grouped the branches by topic.
IMHO another SqlOption
makes sense, in the same way Expr::Substring
supports both the shorthand flavor (SUBSTR) and the long format (SUBSTRING)
… CreateTableOptions enum
5474eb9
to
d91b70e
Compare
} else if self.parse_keyword(Keyword::KEY_BLOCK_SIZE) { | ||
// KEY_BLOCK_SIZE [=] value | ||
Ident::new("KEY_BLOCK_SIZE") | ||
} else if self.parse_keyword(Keyword::ROW_FORMAT) { | ||
// ROW_FORMAT [=] {DEFAULT | DYNAMIC | FIXED | COMPRESSED | REDUNDANT | COMPACT} | ||
Ident::new("ROW_FORMAT") | ||
} else if self.parse_keyword(Keyword::PACK_KEYS) { | ||
// PACK_KEYS [=] {0 | 1 | DEFAULT} | ||
Ident::new("PACK_KEYS") | ||
} else if self.parse_keyword(Keyword::STATS_AUTO_RECALC) { | ||
// STATS_AUTO_RECALC [=] {DEFAULT | 0 | 1} | ||
Ident::new("STATS_AUTO_RECALC") | ||
} else if self.parse_keyword(Keyword::STATS_PERSISTENT) { | ||
//STATS_PERSISTENT [=] {DEFAULT | 0 | 1} | ||
Ident::new("STATS_PERSISTENT") | ||
} else if self.parse_keyword(Keyword::STATS_SAMPLE_PAGES) { | ||
// STATS_SAMPLE_PAGES [=] value | ||
Ident::new("STATS_SAMPLE_PAGES") | ||
} else if self.parse_keyword(Keyword::DELAY_KEY_WRITE) { | ||
// DELAY_KEY_WRITE [=] {0 | 1} | ||
Ident::new("DELAY_KEY_WRITE") | ||
} else if self.parse_keyword(Keyword::COMPRESSION) { | ||
// COMPRESSION [=] {'ZLIB' | 'LZ4' | 'NONE'} | ||
Ident::new("COMPRESSION") | ||
} else if self.parse_keyword(Keyword::ENCRYPTION) { | ||
// ENCRYPTION [=] {'Y' | 'N'} | ||
Ident::new("ENCRYPTION") | ||
} else if self.parse_keyword(Keyword::MAX_ROWS) { | ||
// MAX_ROWS [=] value | ||
Ident::new("MAX_ROWS") | ||
} else if self.parse_keyword(Keyword::MIN_ROWS) { | ||
// MIN_ROWS [=] value | ||
Ident::new("MIN_ROWS") | ||
} else if self.parse_keyword(Keyword::AUTOEXTEND_SIZE) { | ||
// AUTOEXTEND_SIZE [=] value | ||
Ident::new("AUTOEXTEND_SIZE") | ||
} else if self.parse_keyword(Keyword::AVG_ROW_LENGTH) { | ||
// AVG_ROW_LENGTH [=] value | ||
Ident::new("AVG_ROW_LENGTH") | ||
} else if self.parse_keyword(Keyword::CHECKSUM) { | ||
// CHECKSUM [=] {0 | 1} | ||
Ident::new("CHECKSUM") | ||
} else if self.parse_keyword(Keyword::CONNECTION) { | ||
// CONNECTION [=] 'connect_string' | ||
Ident::new("CONNECTION") | ||
} else if self.parse_keyword(Keyword::ENGINE_ATTRIBUTE) { | ||
// ENGINE_ATTRIBUTE [=] 'string' | ||
Ident::new("ENGINE_ATTRIBUTE") | ||
} else if self.parse_keyword(Keyword::PASSWORD) { | ||
// PASSWORD [=] 'string' | ||
Ident::new("PASSWORD") | ||
} else if self.parse_keyword(Keyword::SECONDARY_ENGINE_ATTRIBUTE) { | ||
// SECONDARY_ENGINE_ATTRIBUTE [=] 'string' | ||
Ident::new("SECONDARY_ENGINE_ATTRIBUTE") | ||
} else if self.parse_keyword(Keyword::INSERT_METHOD) { | ||
// INSERT_METHOD [=] { NO | FIRST | LAST } | ||
Ident::new("INSERT_METHOD") | ||
} else if self.parse_keyword(Keyword::AUTO_INCREMENT) { | ||
Ident::new("AUTO_INCREMENT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so for this part I think these can be handled transparently, its what I meant by this section in my previous comment
let _ = self.consume_token('=')?;
let value = self.parse_expr()?;
Ok(Some(SqlOption::KeyValue {
key, value
}))
in that the ideal scenario avoids enumerating the various options where possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Value part is handled transparently, not sure how we can prevent enumeration for the different keys (or maybe i've missed something ....)
Interesting to note that 'parse_expr' cannot be used in this context as it has explicit handling for the COLLATE keyword
} | ||
|
||
// Key/Value parameter option | ||
let key = if self.parse_keywords(&[Keyword::DEFAULT, Keyword::CHARSET]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I thought that it made a lot of sense with the previous version that delegated parsing the options to the (at least mysql) dialect, where the dialect only handles the special cases while the parser handles the generic cases. The behavior would be self documenting too. It would be nice to somehow flag where each parameter is coming from and link to the docs. No need to move things back to the dialect though if that's a hassle, could be done as a follow up afterwards so that part doesn't need to go into this PR I dont think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conflicting feedback here :-) Prefer we save this to a future PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please defer to @iffyio, avoiding parsers inside Dialect
is just my personal preference 😛
src/parser/mod.rs
Outdated
self.parse_comma_separated0(Parser::parse_identifier, Token::RParen)?; | ||
self.expect_token(&Token::RParen)?; | ||
|
||
return Ok(Some(SqlOption::Union(tables))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling this variant Union
can make it reusable for future dialects or similar options? something like this maybe
SqlOption::NamedParenthesizedList {
name: Ident, // Union, TABLEENGINE
key: Option<Ident>, // table engine name
values: Vec<Expr>, // (val1, val2 ...)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, see latest commit
Ok(Some(SqlOption::TableSpace(TablespaceOption { | ||
name, | ||
storage, | ||
}))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking similar to Union
it would be good to have a reusable representation for this, maybe?
SqlOption::NamedKeyValue {
name: Ident, // TABLESPACE
key: Ident,
value: Expr
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's a little premature to generalize, not sure what future variations will look like.
(also, this type is different than others as it does not have the optional '=', see tablespace_option )
…eEngine into SqlOption::NamedParenthesizedList
I will have limited internet connectivity / dev time for the rest of this month, appreciate if we could get this PR merged today. |
No description provided.