Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Set case flag for names #28

Open
BeadW opened this issue Feb 21, 2023 · 8 comments
Open

Feature: Set case flag for names #28

BeadW opened this issue Feb 21, 2023 · 8 comments

Comments

@BeadW
Copy link
Contributor

BeadW commented Feb 21, 2023

Seems like at the moment dbt-invoke is not doing any treatment to resource names.

Simple use case here is column naming coming from Snowflake.

It always comes back as upper case (unless quoted identifiers were on) and we have a lower case convention in our yaml and our sql.

It would be nice to be able to set a flag so that we conform the case of resources and columns.

@BeadW BeadW changed the title Set case flag for names Feature: Set case flag for names Feb 21, 2023
@BeadW
Copy link
Contributor Author

BeadW commented May 2, 2023

Adding to this one. When names are in a different case dbt-invoke treats them as being different causing it to nuke metadata. This might actually be a bug given that scenario.

@BeadW
Copy link
Contributor Author

BeadW commented May 2, 2023

@robastel interested in your thoughts on expected behavior when the name in yaml is a different case to that of the column in the db.
For example a column id in yaml in snowflake is going to most likely be ID.
Currently this is treated as id is no longer in the object so its removed.
A new column called ID is added to the yaml.

@robastel
Copy link
Member

Hi @BeadW, sorry my focus was away from dbt-invoke for a bit. Thanks for reporting this interesting issue.

In Snowflake, when using quoted identifiers, can you have multiple columns in the same table that would all be the same if converted to the same case?

  • For example: "mycolumn", "myColumn", and "MYCOLUMN"

@HiltonDay
Copy link

HiltonDay commented Dec 9, 2023

Hi @robastel

Firstly - awesome tool.. we developed an in-house python script to pre-create YML files over a year ago (using information_schema from database, rather than thru DBT), but its a "first time only" thing as we haven't spent time sorting out preserving descriptions etc. dbt-invoke is likely going to replace it completely.

Can confirm the issue - dbt-invoke is nuking metadata from files with the enforced case change.
image

Info on DBT and snowflake quoting here: https://docs.getdbt.com/reference/project-configs/quoting#snowflake
(recommendation is to default quoting to false in dbt-project.yml to make using objects in snowflake much easier)

And snowflake behaviour here: https://community.snowflake.com/s/article/Unquoted-object-names-are-stored-in-upper-case-letters

@HiltonDay
Copy link

Hi @BeadW, sorry my focus was away from dbt-invoke for a bit. Thanks for reporting this interesting issue.

In Snowflake, when using quoted identifiers, can you have multiple columns in the same table that would all be the same if converted to the same case?

* For example: "mycolumn", "myColumn", and "MYCOLUMN"

@robastel intersting question, I'll go check manually and report back 🤓

@HiltonDay
Copy link

Wow thats ugly.. (generated manually, not with DBT)

image

image

@robastel
Copy link
Member

Thanks @HiltonDay for doing some research on this issue!

Given your Text_Case/text_case/TEXT_CASE example and the possible collisions that could result from altering the case of identifiers, I'm inclined to leave dbt-invoke's behavior as is.

I'll leave this issue open for a bit in case there are any further proposals to account for collisions.

@HiltonDay
Copy link

HiltonDay commented May 6, 2024

Given the general propensity to snake case over all-caps, this is likely an issue for anyone using snowflake+dbt?

The default behaviour for snowflake is to turn off the case sensitivty for DBT (https://docs.getdbt.com/reference/project-configs/quoting#default).

Might I suggest adding the behaviour via a separate command line flag like "--ignore-case-sensitivity" so that the 99% of Snowflake users who are sane, might benefit from your tool? That way there's an explicit user opt-in at runtime to the behaviour, and no general risk of people murdering their documentation by accident? :)

My take is, any organisation who are overloading the same-named columns with other variations in case sensitivity, deserve all kinds of special punishment, but they're already doing that to their future selves, so agree that this shouldn't be a default behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants