add explicit datatype casts #12

Merged
merged 13 commits into from
Jan 6, 2025
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,12 @@
# dbt_qualtrics_source v0.3.0
[PR #12](https://github.com/fivetran/dbt_qualtrics_source/pull/12) includes the following updates:

## Under the Hood
- Explicitly casts all boolean fields as `{{ dbt.type_boolean() }}`.
- (Affects Redshift only) Creates new `qualtrics_union_data` macro to accommodate Redshift's treatment of empty tables.
- For each staging model, if the source table is not found in any of your schemas, the package will create an empty table with 0 rows for non-Redshift warehouses and a table with 1 all-`null` row for Redshift destinations.
- This is necessary because Redshift will ignore explicit data casts when a table is completely empty and materialize every column as a `varchar`. This throws errors in downstream transformations in the `qualtrics` package. The 1 row will ensure that Redshift will respect the package's datatype casts.
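
The fallback behavior described above comes down to a single `limit` clause in the `qualtrics_union_data` macro. As a rough illustration only, the Python sketch below mirrors that Jinja expression. `fallback_limit`, `fallback_select`, and the `string_type` default are hypothetical names standing in for the macro's logic and for `dbt.type_string()`; they are not part of the package.

```python
def fallback_limit(target_type: str) -> int:
    """Mirror of the Jinja expression `'0' if target.type != 'redshift' else '1'`."""
    return 1 if target_type == "redshift" else 0


def fallback_select(target_type: str, string_type: str = "varchar") -> str:
    """Build the placeholder query used when no source table is found.

    `string_type` is an assumption standing in for dbt.type_string(),
    whose real value depends on the adapter.
    """
    return (
        "select\n"
        f"    cast(null as {string_type}) as _dbt_source_relation\n"
        f"limit {fallback_limit(target_type)}"
    )


if __name__ == "__main__":
    # Redshift keeps one all-null row; other warehouses keep zero rows.
    for warehouse in ("redshift", "snowflake"):
        print(f"-- {warehouse}")
        print(fallback_select(warehouse))
```

The single retained row on Redshift carries only `null` values, so downstream models that filter or join on real keys should still see no data.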

# dbt_qualtrics_source v0.2.2
[PR #9](https://github.com/fivetran/dbt_qualtrics_source/pull/9) includes the following updates:

10 changes: 5 additions & 5 deletions README.md
@@ -45,7 +45,7 @@ If you are **not** using the [Qualtrics transformation package](https://github.
```yml
packages:
- package: fivetran/qualtrics_source
-version: [">=0.2.0", "<0.3.0"] # we recommend using ranges to capture non-breaking changes automatically
+version: [">=0.3.0", "<0.4.0"] # we recommend using ranges to capture non-breaking changes automatically
```

### Step 3: Define database and schema variables
@@ -78,12 +78,12 @@ By default, this package does not bring in data from the Qualtrics Research Core

```yml
vars:
-qualtrics__using_core_contacts: False # default = True
-qualtrics__using_core_mailing_lists: False # default = True
+qualtrics__using_core_contacts: True # default = False
+qualtrics__using_core_mailing_lists: True # default = False
```

### (Optional) Step 5: Additional configurations
-<details><summary>Expand to view configurations</summary>
+<details open><summary>Expand to view configurations</summary>

#### Passing Through Additional Fields
This package includes all source columns defined in the macros folder. You can add more columns using our pass-through column variables. These variables allow the pass-through fields to be aliased (`alias`) and cast (`transform_sql`) if desired, but neither is required. Datatype casting is configured via a SQL snippet within the `transform_sql` key. You may add the desired SQL while omitting the `as field_name` at the end, and your custom pass-through fields will be cast accordingly. Use the format below to declare the respective pass-through variables:
@@ -122,7 +122,7 @@ models:
```

#### Change the source table references
-If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
+If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable. This config is available only when running the package on a single connector.
> IMPORTANT: See this project's [`src_qualtrics.yml`](https://github.com/fivetran/dbt_qualtrics_source/blob/main/models/src_qualtrics.yml) for the default names.
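
For a rough sense of how these variables are resolved, the Python sketch below mirrors the lookup in the single-connector branch of the `qualtrics_union_data` macro. It builds the variable name from the default schema and table name, falls back to the misspelled `_identifer` suffix that the macro keeps for backwards compatibility, and otherwise uses the default table name. The function name, the `project_vars` dict, and the `qualtrics` default schema value are assumptions made for illustration.

```python
def resolve_identifier(default_schema: str, table_identifier: str, project_vars: dict) -> str:
    """Return the table name the macro would look up for one source table."""
    # Preferred variable name: <default_schema>_<table>_identifier
    identifier_var = f"{default_schema}_{table_identifier}_identifier"

    # If that variable is unset, try the legacy misspelled suffix instead.
    if project_vars.get(identifier_var) is None:
        identifier_var = f"{default_schema}_{table_identifier}_identifer"

    # Fall back to the default table name when neither variable is set.
    return project_vars.get(identifier_var, table_identifier)


if __name__ == "__main__":
    # "qualtrics" as the default schema is an assumption for this example.
    overrides = {"qualtrics_survey_identifier": "survey_custom"}
    print(resolve_identifier("qualtrics", "survey", overrides))
    print(resolve_identifier("qualtrics", "block", overrides))
```

Under these assumptions, only tables with an explicit override change name; every other table keeps the default name.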

```yml
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'qualtrics_source'
-version: '0.2.2'
+version: '0.3.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]

2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

253 changes: 214 additions & 39 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion integration_tests/dbt_project.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
name: 'qualtrics_source_integration_tests'
-version: '0.2.2'
+version: '0.3.0'
profile: 'integration_tests'
config-version: 2

137 changes: 137 additions & 0 deletions macros/qualtrics_union_data.sql
@@ -0,0 +1,137 @@
{%- macro qualtrics_union_data(table_identifier, database_variable, schema_variable, default_database, default_schema, default_variable, union_schema_variable='union_schemas', union_database_variable='union_databases') -%}

{{ adapter.dispatch('qualtrics_union_data', 'qualtrics_source') (
table_identifier,
database_variable,
schema_variable,
default_database,
default_schema,
default_variable,
union_schema_variable,
union_database_variable
) }}

{%- endmacro -%}

{%- macro default__qualtrics_union_data(
table_identifier,
database_variable,
schema_variable,
default_database,
default_schema,
default_variable,
union_schema_variable,
union_database_variable
) -%}

{%- if var(union_schema_variable, none) -%}

{%- set relations = [] -%}

{%- if var(union_schema_variable) is string -%}
{%- set trimmed = var(union_schema_variable)|trim('[')|trim(']') -%}
{%- set schemas = trimmed.split(',')|map('trim'," ")|map('trim','"')|map('trim',"'") -%}
{%- else -%}
{%- set schemas = var(union_schema_variable) -%}
{%- endif -%}

{%- for schema in schemas -%}
{%- set relation=adapter.get_relation(
database=source(schema, table_identifier).database if var('has_defined_sources', false) else var(database_variable, default_database),
schema=source(schema, table_identifier).schema if var('has_defined_sources', false) else schema,
identifier=source(schema, table_identifier).identifier if var('has_defined_sources', false) else table_identifier
) -%}

{%- set relation_exists=relation is not none -%}

{%- if relation_exists -%}
{%- do relations.append(relation) -%}
{%- endif -%}

{%- endfor -%}

{%- if relations != [] -%}
{{ dbt_utils.union_relations(relations) }}
{%- else -%}
{% if execute and not var('fivetran__remove_empty_table_warnings', false) -%}
{{ exceptions.warn("\n\nPlease be aware: The " ~ table_identifier|upper ~ " table was not found in your " ~ default_schema|upper ~ " schema(s). The Fivetran dbt package will create a completely empty " ~ table_identifier|upper ~ " staging model as to not break downstream transformations. To turn off these warnings, set the `fivetran__remove_empty_table_warnings` variable to TRUE (see https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source for details).\n") }}
{% endif -%}
select
cast(null as {{ dbt.type_string() }}) as _dbt_source_relation
limit {{ '0' if target.type != 'redshift' else '1' }}
{%- endif -%}

{%- elif var(union_database_variable, none) -%}

{%- set relations = [] -%}

{%- for database in var(union_database_variable) -%}
{%- set relation=adapter.get_relation(
database=source(schema, table_identifier).database if var('has_defined_sources', false) else database,
schema=source(schema, table_identifier).schema if var('has_defined_sources', false) else var(schema_variable, default_schema),
identifier=source(schema, table_identifier).identifier if var('has_defined_sources', false) else table_identifier
) -%}

{%- set relation_exists=relation is not none -%}

{%- if relation_exists -%}
{%- do relations.append(relation) -%}
{%- endif -%}

{%- endfor -%}

{%- if relations != [] -%}
{{ dbt_utils.union_relations(relations) }}
{%- else -%}
{% if execute and not var('fivetran__remove_empty_table_warnings', false) -%}
{{ exceptions.warn("\n\nPlease be aware: The " ~ table_identifier|upper ~ " table was not found in your " ~ default_schema|upper ~ " schema(s). The Fivetran dbt package will create a completely empty " ~ table_identifier|upper ~ " staging model as to not break downstream transformations. To turn off these warnings, set the `fivetran__remove_empty_table_warnings` variable to TRUE (see https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source for details).\n") }}
{% endif -%}
select
cast(null as {{ dbt.type_string() }}) as _dbt_source_relation
limit {{ '0' if target.type != 'redshift' else '1' }}
{%- endif -%}

{%- else -%}
{% set exception_schemas = {"linkedin_company_pages": "linkedin_pages", "instagram_business_pages": "instagram_business"} %}
{% set relation = namespace(value="") %}
{% if default_schema in exception_schemas.keys() %}
{% for corrected_schema_name in exception_schemas.items() %}
{% if default_schema in corrected_schema_name %}
{# In order for this macro to effectively work within upstream integration tests (mainly used by the Fivetran dbt package maintainers), this identifier variable selection is required to use the macro with different identifier names. #}
{% set identifier_var = corrected_schema_name[1] + "_" + table_identifier + "_identifier" %}
{%- set relation.value=adapter.get_relation(
database=source(corrected_schema_name[1], table_identifier).database,
schema=source(corrected_schema_name[1], table_identifier).schema,
identifier=var(identifier_var, table_identifier)
) -%}
{% endif %}
{% endfor %}
{% else %}
{# In order for this macro to effectively work within upstream integration tests (mainly used by the Fivetran dbt package maintainers), this identifier variable selection is required to use the macro with different identifier names. #}
{% set identifier_var = default_schema + "_" + table_identifier + "_identifier" %}
{# Unfortunately the Twitter Organic identifiers were misspelled. As such, we will need to account for this in the model. This will be adjusted in the Twitter Organic package, but to ensure backwards compatibility, this needs to be included. #}
{% if var(identifier_var, none) is none %}
{% set identifier_var = default_schema + "_" + table_identifier + "_identifer" %}
{% endif %}
{%- set relation.value=adapter.get_relation(
database=source(default_schema, table_identifier).database,
schema=source(default_schema, table_identifier).schema,
identifier=var(identifier_var, table_identifier)
) -%}
{% endif %}
{%- set table_exists=relation.value is not none -%}

{%- if table_exists -%}
select *
from {{ relation.value }}
{%- else -%}
{% if execute and not var('fivetran__remove_empty_table_warnings', false) -%}
{{ exceptions.warn("\n\nPlease be aware: The " ~ table_identifier|upper ~ " table was not found in your " ~ default_schema|upper ~ " schema(s). The Fivetran dbt package will create a completely empty " ~ table_identifier|upper ~ " staging model as to not break downstream transformations. To turn off these warnings, set the `fivetran__remove_empty_table_warnings` variable to TRUE (see https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source for details).\n") }}
{% endif -%}
select
cast(null as {{ dbt.type_string() }}) as _dbt_source_relation
limit {{ '0' if target.type != 'redshift' else '1' }}
{%- endif -%}
{%- endif -%}

{%- endmacro -%}
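
The string branch at the top of the macro normalizes a `union_schemas` value that arrives as a single string (for example, when passed on the command line) into a list of schema names. The following Python sketch is an approximation of that chain of `trim` and `split` filters, not the package's implementation; `parse_union_schemas` and the sample schema names are hypothetical.

```python
def parse_union_schemas(value):
    """Normalize a union_schemas value into a list of schema names.

    Mirrors the macro's filters: strip the surrounding brackets, split on
    commas, then strip spaces, double quotes, and single quotes in that order.
    """
    if isinstance(value, str):
        trimmed = value.strip("[").strip("]")
        return [
            part.strip(" ").strip('"').strip("'")
            for part in trimmed.split(",")
        ]
    # Already a list (the usual case when set in dbt_project.yml).
    return list(value)


if __name__ == "__main__":
    # Hypothetical schema names, used only for illustration.
    print(parse_union_schemas("['qualtrics_us', 'qualtrics_eu']"))
    print(parse_union_schemas(["qualtrics_us", "qualtrics_eu"]))
```

Both calls in the usage example yield the same two schema names, which is the point of the normalization step.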
2 changes: 1 addition & 1 deletion models/stg_qualtrics__block.sql
@@ -33,7 +33,7 @@ final as (
randomize_questions,
survey_id,
type,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__block_question.sql
@@ -29,7 +29,7 @@ final as (
block_id,
question_id,
survey_id,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__contact_mailing_list_membership.sql
@@ -33,7 +33,7 @@ final as (
name,
owner_id as owner_user_id,
cast(unsubscribe_date as {{ dbt.type_timestamp() }}) as unsubscribed_at,
-unsubscribed as is_unsubscribed,
+cast(unsubscribed as {{ dbt.type_boolean() }}) as is_unsubscribed,
_fivetran_synced,
source_relation

4 changes: 2 additions & 2 deletions models/stg_qualtrics__core_contact.sql
@@ -35,8 +35,8 @@ final as (
{{ dbt.split_part('email', "'@'", 2) }} as email_domain,
external_data_reference,
language,
-unsubscribed as is_unsubscribed,
-_fivetran_deleted as is_deleted,
+cast(unsubscribed as {{ dbt.type_boolean() }}) as is_unsubscribed,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__core_mailing_list.sql
@@ -32,7 +32,7 @@ final as (
name,
category,
folder,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__directory.sql
@@ -34,7 +34,7 @@ final as (
id as directory_id,
is_default,
name,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__directory_contact.sql
@@ -29,7 +29,7 @@ final as (
cast(creation_date as {{ dbt.type_timestamp() }}) as created_at,
directory_id,
cast(directory_unsubscribe_date as {{ dbt.type_timestamp() }}) as unsubscribed_from_directory_at,
-directory_unsubscribed as is_unsubscribed_from_directory,
+cast(directory_unsubscribed as {{ dbt.type_boolean() }}) as is_unsubscribed_from_directory,
lower(email) as email,
lower(email_domain) as email_domain,
ext_ref,
2 changes: 1 addition & 1 deletion models/stg_qualtrics__directory_mailing_list.sql
@@ -32,7 +32,7 @@ final as (
cast(last_modified_date as {{ dbt.type_timestamp() }}) as last_modified_at,
name,
owner_id as owner_user_id,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__distribution.sql
@@ -49,7 +49,7 @@ final as (
cast(survey_link_expiration_date as {{ dbt.type_timestamp() }}) as survey_link_expires_at,
survey_link_link_type as survey_link_type,
survey_link_survey_id as survey_id,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__question.sql
@@ -43,7 +43,7 @@ final as (
validation_setting_force_response,
validation_setting_force_response_type,
validation_setting_type,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation
from fields
2 changes: 1 addition & 1 deletion models/stg_qualtrics__question_option.sql
@@ -31,7 +31,7 @@ final as (
key,
recode_value,
text,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__sub_question.sql
@@ -31,7 +31,7 @@ final as (
question_id,
survey_id,
text,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

2 changes: 1 addition & 1 deletion models/stg_qualtrics__survey.sql
@@ -49,7 +49,7 @@ final as (
cast(last_accessed as {{ dbt.type_timestamp() }}) as last_accessed_at,
cast(last_activated as {{ dbt.type_timestamp() }}) as last_activated_at,
cast(last_modified as {{ dbt.type_timestamp() }}) as last_modified_at,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

6 changes: 3 additions & 3 deletions models/stg_qualtrics__survey_version.sql
@@ -29,12 +29,12 @@ final as (
cast(creation_date as {{ dbt.type_timestamp() }}) as created_at,
description as version_description,
id as version_id,
-published as is_published,
+cast(published as {{ dbt.type_boolean() }}) as is_published,
survey_id,
user_id as publisher_user_id,
version_number,
-was_published,
-_fivetran_deleted as is_deleted,
+cast(was_published as {{ dbt.type_boolean() }}) as was_published,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation

4 changes: 2 additions & 2 deletions models/stg_qualtrics__user.sql
@@ -43,10 +43,10 @@ final as (
response_count_deleted,
response_count_generated,
time_zone,
-unsubscribed as is_unsubscribed,
+cast(unsubscribed as {{ dbt.type_boolean() }}) as is_unsubscribed,
user_type,
username,
-_fivetran_deleted as is_deleted,
+cast(_fivetran_deleted as {{ dbt.type_boolean() }}) as is_deleted,
_fivetran_synced,
source_relation
from fields
2 changes: 1 addition & 1 deletion models/tmp/stg_qualtrics__block_question_tmp.sql
@@ -1,5 +1,5 @@
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='block_question',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',
2 changes: 1 addition & 1 deletion models/tmp/stg_qualtrics__block_tmp.sql
@@ -1,5 +1,5 @@
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='block',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',
@@ -1,5 +1,5 @@
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='contact_mailing_list_membership',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',
2 changes: 1 addition & 1 deletion models/tmp/stg_qualtrics__core_contact_tmp.sql
@@ -1,7 +1,7 @@
{{ config(enabled=var('qualtrics__using_core_contacts', false)) }}
-- can disable
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='core_contact',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',
2 changes: 1 addition & 1 deletion models/tmp/stg_qualtrics__core_mailing_list_tmp.sql
@@ -1,7 +1,7 @@
{{ config(enabled=var('qualtrics__using_core_mailing_lists', false)) }}
-- can disable
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='core_mailing_list',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',
2 changes: 1 addition & 1 deletion models/tmp/stg_qualtrics__directory_contact_tmp.sql
@@ -1,5 +1,5 @@
{{
-fivetran_utils.union_data(
+qualtrics_union_data(
table_identifier='directory_contact',
database_variable='qualtrics_database',
schema_variable='qualtrics_schema',