Enhance object name path segments #1539

ayman-sigma · 2024-11-20T22:05:32Z

Right now ObjectName is just a list of identifiers. We parse each object name path segment as a string identifier. Some dialects have richer types for each path segment. This PR reworks the object name to allow different types for each path segment.

Examples this PR will make it easier to support:

Databricks IDENTIFIER clause. Example: SELECT * FROM myschema.IDENTIFIER(:mytab). Right now, (:mytab) is wrongly parsed as TableFunctionArgs. More details: https://docs.databricks.com/en/sql/language-manual/sql-ref-names-identifier-clause.html
Snowflake double-dot notation. Example: SELECT * FROM db..table_name. This indicates use of the default schema PUBLIC. With this PR, we can use a DefaultSchema variant for the path segment instead of using an empty identifier. More details: https://docs.snowflake.com/en/sql-reference/name-resolution#resolution-when-schema-omitted-double-dot-notation

Most changes are mechanical, except for a couple of locations I commented on below, in addition to ast/mod.rs.
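To make the shape of the change concrete, here is a minimal sketch of an object name whose path segments are an enum rather than bare identifiers. The types are simplified stand-ins, not the crate's actual definitions (the real `Ident` carries more fields), and the `DefaultSchema` variant is only the possibility mentioned in the description above, shown here as an assumption.

```rust
use std::fmt;

/// Simplified stand-in for the parser's identifier type (assumption:
/// the real type has more fields, such as quoting information).
#[derive(Debug, Clone, PartialEq)]
pub struct Ident {
    pub value: String,
}

impl Ident {
    pub fn new(value: &str) -> Self {
        Ident { value: value.to_string() }
    }
}

/// One segment of a dotted object name.
#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    /// A plain identifier such as `myschema`.
    Identifier(Ident),
    /// Hypothetical variant for the omitted schema in `db..table_name`.
    DefaultSchema,
}

/// An object name is now a list of typed segments.
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectName(pub Vec<ObjectNamePart>);

impl fmt::Display for ObjectNamePart {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ObjectNamePart::Identifier(ident) => write!(f, "{}", ident.value),
            // The omitted schema renders as an empty segment.
            ObjectNamePart::DefaultSchema => Ok(()),
        }
    }
}

impl fmt::Display for ObjectName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let parts: Vec<String> = self.0.iter().map(|p| p.to_string()).collect();
        write!(f, "{}", parts.join("."))
    }
}
```

With this shape, `db..table_name` can be represented as three segments (identifier, default schema, identifier) and still round-trips to the same text, without storing an empty identifier in the middle.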

ayman-sigma · 2024-11-20T22:36:17Z

src/parser/mod.rs

@@ -4294,7 +4312,9 @@ impl<'a> Parser<'a> {
        let mut data_type = self.parse_data_type()?;
        if let DataType::Custom(n, _) = &data_type {
            // the first token is actually a name
-            name = Some(n.0[0].clone());
+            match n.0[0].clone() {
+                ObjectNamePart::Identifier(ident) => name = Some(ident),


Once we start adding more variants to the ObjectNamePart enum, we will return a parsing error for the other variants here.
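A sketch of what that error path might look like once more variants exist. Everything here is a simplified stand-in: `leading_identifier` is a hypothetical helper name, `ParserError` is a one-field placeholder, and `DefaultSchema` is included only to exercise the non-identifier branch.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    Identifier(Ident),
    /// Hypothetical extra variant, present only to show the error path.
    DefaultSchema,
}

/// Placeholder error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct ParserError(pub String);

/// Returns the first segment if it is a plain identifier,
/// and a parse error for any other variant or an empty name.
pub fn leading_identifier(parts: &[ObjectNamePart]) -> Result<Ident, ParserError> {
    match parts.first() {
        Some(ObjectNamePart::Identifier(ident)) => Ok(ident.clone()),
        Some(other) => Err(ParserError(format!(
            "expected an identifier, found {other:?}"
        ))),
        None => Err(ParserError("empty object name".to_string())),
    }
}
```

Matching exhaustively (rather than indexing and unwrapping) means the compiler flags this site whenever a new variant is added.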

ayman-sigma · 2024-11-20T22:38:26Z

src/parser/mod.rs

@@ -10778,7 +10798,7 @@ impl<'a> Parser<'a> {
        self.expect_token(&Token::LParen)?;
        let aggregate_functions = self.parse_comma_separated(Self::parse_aliased_function_call)?;
        self.expect_keyword(Keyword::FOR)?;
-        let value_column = self.parse_object_name(false)?.0;
+        let value_column = self.parse_period_separated_identifiers()?;


Given that this is a column name, we should parse it as period-separated identifiers and not as an object name.
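To illustrate the distinction, here is a toy sketch that works on a string rather than the parser's token stream. It is not the parser's `parse_period_separated_identifiers`; the function name `split_period_separated_identifiers` and its validation rule are assumptions made for the example. The point is only that a column reference yields a flat list of plain identifiers and rejects anything richer.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

/// Splits `a.b.c` into plain identifiers. Any segment that is empty or
/// contains characters other than letters, digits, or `_` is an error,
/// so object-name-only forms such as `db..col` are rejected.
pub fn split_period_separated_identifiers(input: &str) -> Result<Vec<Ident>, String> {
    input
        .split('.')
        .map(|segment| {
            let segment = segment.trim();
            let valid = !segment.is_empty()
                && segment.chars().all(|c| c.is_alphanumeric() || c == '_');
            if valid {
                Ok(Ident(segment.to_string()))
            } else {
                Err(format!("expected an identifier, found {segment:?}"))
            }
        })
        .collect()
}
```

Because the result type is `Vec<Ident>` and not an object name, callers of a column parser never have to handle non-identifier segments.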

mvzink · 2024-11-20T23:29:59Z

I think ObjectNamePart::Wildcard or something would be better than what I did in #1538, so this seems like a good idea to me.

iffyio

Thanks @ayman-sigma! Left some minor comments; this looks good to me overall.

iffyio

LGTM! cc @alamb

alamb · 2024-11-30T13:10:00Z

Hi @ayman-sigma this PR appears to have some conflicts. Is there any chance you can resolve them so we can merge it in?

Thank you!

ayman-sigma · 2024-12-02T03:54:44Z

> Hi @ayman-sigma this PR appears to have some conflicts. Is there any chance you can resolve them so we can merge it in?
>
> Thank you!

@alamb, Done.

alamb

I started trying to update DataFusion to use this change -- it turns out to be fairly invasive.

You can try here: apache/datafusion#13546

(The issue is that we have a lot of code that handles ObjectName --> Idents conversion.)

I think we can make the DataFusion code better / easier to follow
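For downstream code that previously treated an object name as a list of identifiers, one possible migration pattern is a single conversion helper that fails when a segment is not a plain identifier. This is a sketch with simplified stand-in types, not DataFusion's code; `object_name_to_idents` is a hypothetical name, and the `as_ident` accessor and `DefaultSchema` variant are defined locally here as assumptions.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    Identifier(Ident),
    /// Hypothetical non-identifier segment.
    DefaultSchema,
}

impl ObjectNamePart {
    /// Returns the inner identifier, if this segment is one.
    pub fn as_ident(&self) -> Option<&Ident> {
        match self {
            ObjectNamePart::Identifier(ident) => Some(ident),
            ObjectNamePart::DefaultSchema => None,
        }
    }
}

pub struct ObjectName(pub Vec<ObjectNamePart>);

/// Converts an object name to a flat identifier list, or `None` if any
/// segment is not a plain identifier.
pub fn object_name_to_idents(name: &ObjectName) -> Option<Vec<Ident>> {
    name.0.iter().map(|part| part.as_ident().cloned()).collect()
}
```

Routing every old `name.0` access through one helper like this keeps the churn in one place and makes the "unsupported segment" case explicit.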

alamb · 2024-12-11T22:37:12Z

Given the potential for non trivial downstream conflicts due to this change (look at the list of conflicts it has already collected) I would like to consider it for the next release

Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517

ayman-sigma · 2024-12-12T02:49:56Z

> Given the potential for non trivial downstream conflicts due to this change (look at the list of conflicts it has already collected) I would like to consider it for the next release
>
> Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517

Sounds good. Thanks @alamb!

iffyio · 2025-01-18T08:14:45Z

@alamb just wanted to double check status of this PR if there were reservations you had or if you feel this is something we would be able to land?

alamb · 2025-01-18T21:45:20Z

> @alamb just wanted to double check status of this PR if there were reservations you had or if you feel this is something we would be able to land?

My biggest reservation was that it would cause substantial downstream churn (I briefly tried to make the changes to DataFusion and it was painful). So I just haven't had the heart to click the merge button.

I was mentally prepared that if you merged it I would sort it out downstream, but I couldn't bring myself to inflict the pain on myself ...

iffyio · 2025-01-19T10:40:09Z

Ah yeah this is indeed an invasive change. Alright that makes sense!

In that case @ayman-sigma please take a look at resolving the conflicts when you have some time to pick this back up and we can look to merge it? Sorry for the delay in getting to it

alamb · 2025-01-19T14:08:08Z

FWIW I did a test upgrade to DataFusion to prepare for the next release and it already had some non trivial changes needed (changes to FieldAccess specifically)

Test upgrade to sqlparser-rs 0.54 datafusion#14198

ayman-sigma · 2025-01-21T06:37:43Z

> FWIW I did a test upgrade to DataFusion to prepare for the next release and it already had some non trivial changes needed (changes to FieldAccess specifically)
>
> Test upgrade to sqlparser-rs 0.54 datafusion#14198

I'm fine with dropping this PR if it will cause too much pain downstream. Let me know if we still want to go with this PR and I can fix the conflicts.

iffyio · 2025-01-22T08:24:13Z

@ayman-sigma The FieldAccess changes being referred to were part of the 0.54 release. Those changes were breaking, but the syntax isn't tied to the ObjectName changes in this PR.

I think it would be good to get this PR in, since it would let us support the Identifier syntax. Unfortunately that syntax isn't uncommon, in Snowflake especially, and I couldn't come up with a variant that improves upon this PR in terms of being backwards compatible.
I figured an alternative could be to let the parser accept the syntax but drop the Identifier keyword so that the actual AST isn't affected. But afaict it would only be able to support Identifier(<string_literal> | <variable>) variants. So I figure it would make sense with the changes as in this PR that lets us support the full syntax going forward.

So my thinking was to get this PR in for the next 0.55 release, and then any further extensions to the representation can come in later releases afterwards

ayman-sigma · 2025-01-25T04:28:43Z

@iffyio I rebased onto main and resolved all conflicts. I made some changes in the last commit to the best of my knowledge. Please make sure to review the last commit. Thanks!

iffyio · 2025-01-26T14:14:08Z

Thanks @ayman-sigma!

ayman-sigma mentioned this pull request Nov 20, 2024

Support snowflake double dot notation for object name #1540

Merged

mvzink reviewed Nov 20, 2024

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from 4b4998e to 18ca48f Compare November 21, 2024 20:19

iffyio mentioned this pull request Nov 23, 2024

How to best add support for IDENTIFIER() clause #1412

Open

iffyio reviewed Nov 23, 2024

ayman-sigma requested a review from iffyio November 24, 2024 20:33

iffyio approved these changes Nov 25, 2024

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from e22e3d8 to 176cf13 Compare November 26, 2024 02:39

ayman-sigma force-pushed the ayman/improveObjectNameParts branch 2 times, most recently from 6f05bcf to 7791973 Compare December 2, 2024 03:52

alamb reviewed Dec 2, 2024

ayman-sigma added 6 commits January 24, 2025 12:42

improve object name parts

861cfcd

update readme

d0aa469

fmt

55baf17

remove comment

fef73ce

fix vistior tests and address comment

a6009ab

rebase and fix tests

c84dd37

ayman-sigma added 4 commits January 24, 2025 12:43

address comments

55a8151

fix after rebase

fe05be7

implement spanned for ObjectNamePart

b35cd68

more changes after rebase

1c25ec4

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from 7791973 to 1c25ec4 Compare January 25, 2025 04:24

ayman-sigma requested a review from iffyio January 25, 2025 04:27

fmt

603860b

iffyio merged commit 211b15e into apache:main Jan 26, 2025
9 checks passed

Vedin pushed a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025

Enhance object name path segments (apache#1539)

a2921a7

Vedin pushed a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025

Enhance object name path segments (apache#1539)

3ec49b2

Vedin added a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025

Revert "Enhance object name path segments (apache#1539)"

10a63ea

This reverts commit 3ec49b2.

alamb added the api change label Feb 26, 2025
