You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We had previously attempted to address this within Issue #77. However, this is still an issue for Redshift users. Fortunately, we were able to get to the bottom of the root issue and have found a more appropriate solution.
At the root of this issue, Redshift does not support partitioning by constant expressions. This is only rendered as an issue when materializing as a table. Therefore, we do not see this issue until the end models (for example the problem window function in the stg_shopify__abandoned_checkout_discount_code doesn't show as an error until the shopify__discounts model is executed. The reason Redshift throws this error is because a constant expression is essentially useless when added to either a partition or order by statement in a window function. Redshift is telling us that there is no reason for a window function in this scenario, so it throws an error.
While this is understandable that we shouldn't do a window function for constant expressions, we do still need to handle this because the constant expressions are only created because we generate an empty table when the source table doesn't yet exist. This will only be an issue until source data becomes available so we will need to address this in the interim period before data is available at the source.
Relevant error log or model output
constant expressions are not supported in partition by clauses
Expected behavior
The package is able to handle window functions both when there is no source data available and when there is source data available without throwing a constant expression error.
Possible solution
When exploring this I found the reason for this error is because when we insert the null records in the fields cte via the fill_staging_columns macro we are essentially creating a data frame with one row (all being null), then when we attempt to apply the window function in the final cte we are hit with an error because essentially all fields are null.
What I found the be the root of this issue is that the fields cte artificially produces 1 row, when in fact there should be 0 records. If we add a condition to limit 0 the results of the field cte when there are no records in the source, then we are able to avoid this issue altogether. Since the window function does not actual need to process anything as a result of there being no records. Therefore, I propose the following changes:
Leverage the result_if_table_exists macro after the from statement of the fields cte to check if there is any data in the tmp model. If there are no records, then we apply a limit 0. If not then we do nothing.
Revert the case when statement and add back the normal window function as this will no longer be necessary with the new limit 0 approach in the above cte.
dbt Project configurations
Nothing specific
Package versions
Latest
What database are you using dbt with?
redshift
How are you running this dbt package?
Fivetran Quickstart Data Model, Fivetran Transformations, dbt Core™, dbt Cloud™
dbt Version
Latest
Additional Context
No response
Are you willing to open a PR to help address this issue?
Yes.
Yes, but I will need assistance.
No.
The text was updated successfully, but these errors were encountered:
To summarize the internal discussion with the team:
After exploring various potential solutions to the Redshift constant expression error and not finding an ideal umbrella outcome for all iterations of this issue, we reached the conclusion that based on the low frequency of this incident, we will be approaching this on a case-by-case basis.
For Shopify, the issue can arise due to empty metafield or abandoned_checkout_discount_code tables. Therefore we added enable/disable configs for these so that they may be disabled if not used by the customer. In this PR here.
Is there an existing issue for this?
Describe the issue
We had previously attempted to address this within Issue #77. However, this is still an issue for Redshift users. Fortunately, we were able to get to the bottom of the root issue and have found a more appropriate solution.
At the root of this issue, Redshift does not support partitioning by constant expressions. This is only rendered as an issue when materializing as a table. Therefore, we do not see this issue until the end models (for example the problem window function in the
stg_shopify__abandoned_checkout_discount_code
doesn't show as an error until theshopify__discounts
model is executed. The reason Redshift throws this error is because a constant expression is essentially useless when added to either a partition or order by statement in a window function. Redshift is telling us that there is no reason for a window function in this scenario, so it throws an error.While this is understandable that we shouldn't do a window function for constant expressions, we do still need to handle this because the constant expressions are only created because we generate an empty table when the source table doesn't yet exist. This will only be an issue until source data becomes available so we will need to address this in the interim period before data is available at the source.
Relevant error log or model output
constant expressions are not supported in partition by clauses
Expected behavior
The package is able to handle window functions both when there is no source data available and when there is source data available without throwing a constant expression error.
Possible solution
When exploring this I found the reason for this error is because when we insert the
null
records in the fields cte via the fill_staging_columns macro we are essentially creating a data frame with one row (all being null), then when we attempt to apply the window function in the final cte we are hit with an error because essentially all fields are null.What I found the be the root of this issue is that the fields cte artificially produces 1 row, when in fact there should be 0 records. If we add a condition to
limit 0
the results of the field cte when there are no records in the source, then we are able to avoid this issue altogether. Since the window function does not actual need to process anything as a result of there being no records. Therefore, I propose the following changes:dbt Project configurations
Nothing specific
Package versions
Latest
What database are you using dbt with?
redshift
How are you running this dbt package?
Fivetran Quickstart Data Model, Fivetran Transformations, dbt Core™, dbt Cloud™
dbt Version
Latest
Additional Context
No response
Are you willing to open a PR to help address this issue?
The text was updated successfully, but these errors were encountered: