Feature/performance enhancement #41

Merged
merged 42 commits into from
Feb 21, 2024
42 commits
29c24b7
Update README.md
fivetran-dejantucakov Dec 5, 2023
3497935
feature/performance-enhancement
fivetran-catfritz Jan 2, 2024
a9119ce
update to incremental
fivetran-catfritz Jan 3, 2024
0de2a07
update to incremental
fivetran-catfritz Jan 3, 2024
1129752
feature/performance-enhancement
fivetran-catfritz Jan 23, 2024
198081e
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
46fd5c4
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
18ac897
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
c6cd6a8
update clustering
fivetran-catfritz Jan 25, 2024
0fd82bf
update clustering
fivetran-catfritz Jan 25, 2024
334010b
Merge pull request #40 from fivetran/fivetran-dejantucakov-patch-1
fivetran-catfritz Jan 26, 2024
f049d87
update changelog & readme
fivetran-catfritz Jan 26, 2024
bcc310d
update ymls
fivetran-catfritz Jan 26, 2024
19901af
update readme
fivetran-catfritz Jan 26, 2024
d1ae3f1
updates
fivetran-catfritz Jan 26, 2024
ba53a3e
update changelog, ymls, regen docs
fivetran-catfritz Jan 30, 2024
00e2ccc
update changelog
fivetran-catfritz Jan 30, 2024
5f374f7
update changelog
fivetran-catfritz Jan 30, 2024
b17a9bd
update lookbacks
fivetran-catfritz Feb 2, 2024
702694b
update lookbacks
fivetran-catfritz Feb 2, 2024
dcb14e7
update lookbacks
fivetran-catfritz Feb 6, 2024
6345dc1
update readme
fivetran-catfritz Feb 6, 2024
2cb15a3
update
fivetran-catfritz Feb 6, 2024
823e2d0
update
fivetran-catfritz Feb 6, 2024
1b66f95
updates
fivetran-catfritz Feb 21, 2024
4b760c0
delete extra macro
fivetran-catfritz Feb 21, 2024
6f4c906
updates
fivetran-catfritz Feb 21, 2024
59e2434
updates
fivetran-catfritz Feb 21, 2024
2fac3fd
Merge pull request #43 from fivetran/feature/test-materializations
fivetran-catfritz Feb 21, 2024
6574992
Merge branch 'main' into feature/performance-enhancement
fivetran-catfritz Feb 21, 2024
5e91f92
update var names
fivetran-catfritz Feb 21, 2024
7b54bdb
update macro
fivetran-catfritz Feb 21, 2024
9bfcffa
remove extra comma
fivetran-catfritz Feb 21, 2024
ee45a9a
Apply suggestions from code review
fivetran-catfritz Feb 21, 2024
ea07fae
Update models/staging/stg_mixpanel__user_event_date_spine.sql
fivetran-catfritz Feb 21, 2024
befd1be
Apply suggestions from code review
fivetran-catfritz Feb 21, 2024
c8fe97d
update models, readme, changelog
fivetran-catfritz Feb 21, 2024
6e4fbc5
update changelog and regen docs
fivetran-catfritz Feb 21, 2024
ed92bba
update yml
fivetran-catfritz Feb 21, 2024
f9ae48d
update changelog
fivetran-catfritz Feb 21, 2024
d818a3a
add autoreleaser
fivetran-catfritz Feb 21, 2024
4f245f0
update changelog
fivetran-catfritz Feb 21, 2024
update var names
fivetran-catfritz committed Feb 21, 2024
commit 5e91f921c1421c52c28538c25027498c59d09834
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -17,6 +17,7 @@
## Feature Updates
- Added a default 7-day look-back to incremental models to accommodate late-arriving events. The number of days can be changed by setting the var `lookback_window` in your dbt_project.yml. See the [Lookback Window section of the README](https://github.com/fivetran/dbt_mixpanel/blob/main/README.md#lookback-window) for more details.
- Note: this replaces the variable `sessionization_trailing_window`, which was previously used in the `mixpanel__sessions` model. This variable was replaced due to the change in the incremental and lookback strategy.
Review comment (Contributor):
The removal of this variable should be treated as a breaking change, in case users rely on it in their current workflow. Could you move this to the breaking changes section?

- Added column `dbt_run_date` to incremental models to capture the date a record was added or updated by this package.

# dbt_mixpanel v0.8.0
>Note: If you run into issues with this update, we suggest trying a **full refresh**.
17 changes: 17 additions & 0 deletions macros/date_today.sql
@@ -0,0 +1,17 @@
{% macro date_today(col_name) %}

{{ adapter.dispatch('date_today', 'mixpanel') (col_name) }}

{% endmacro %}

{% macro default__date_today(col_name) %}

cast( {{ dbt.date_trunc('day', dbt.current_timestamp_backcompat()) }} as date) as {{ col_name }}

{% endmacro %}

{% macro sqlserver__date_today(col_name) %}

cast( {{ dbt.date_trunc('day', dbt.current_timestamp()) }} as date) as {{ col_name }}

{% endmacro %}
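
As an illustration of what `date_today` evaluates to, here is a minimal Python sketch that is not part of this PR. It truncates a timestamp to the day and returns it as a date, mirroring `cast(date_trunc('day', current_timestamp) as date)`. The function name `run_date` and the UTC default are assumptions; in the warehouse, the session time zone determines the actual value.

```python
from datetime import date, datetime, timezone
from typing import Optional


def run_date(now: Optional[datetime] = None) -> date:
    """Return the calendar date of `now`, truncated to the day.

    Hypothetical helper: defaults to the current UTC time, which is an
    assumption and may differ from the warehouse's current_timestamp.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Equivalent of date_trunc('day', ...): zero out the time-of-day parts.
    truncated = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Equivalent of cast(... as date).
    return truncated.date()
```

Every row written in the same run gets the same value, so the column records when the package last touched the record rather than when the event occurred.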
12 changes: 6 additions & 6 deletions macros/mixpanel_lookback.sql
@@ -1,20 +1,20 @@
-{% macro mixpanel_lookback(from_date, datepart, interval, default_start_date=var('default_start_date', '2010-01-01')) %}
+{% macro mixpanel_lookback(from_date, datepart, interval, safety_date='2010-01-01') %}
Review comment (Contributor):
I like the rename to `safety_date`. It makes it much clearer what the argument is used for. Thanks!


-{{ adapter.dispatch('mixpanel_lookback', 'mixpanel') (from_date, datepart, interval, default_start_date='2010-01-01') }}
+{{ adapter.dispatch('mixpanel_lookback', 'mixpanel') (from_date, datepart, interval, safety_date='2010-01-01') }}

{%- endmacro %}

-{% macro default__mixpanel_lookback(from_date, datepart, interval, default_start_date=var('default_start_date', '2010-01-01')) %}
+{% macro default__mixpanel_lookback(from_date, datepart, interval, safety_date='2010-01-01') %}

coalesce(
(select {{ dbt.dateadd(datepart=datepart, interval=-interval, from_date_or_timestamp=from_date) }}
from {{ this }}),
-{{ "'" ~ default_start_date ~ "'" }}
+{{ "'" ~ safety_date ~ "'" }}
)

{% endmacro %}

-{% macro bigquery__fivetran_log_lookback(from_date, datepart, interval, default_start_date='2010-01-01') %}
+{% macro bigquery__fivetran_log_lookback(from_date, datepart, interval, safety_date='2010-01-01') %}

-- Capture the latest timestamp in a call statement instead of a subquery for optimizing BQ costs on incremental runs
{%- call statement('date_agg', fetch_result=True) -%}
@@ -29,7 +29,7 @@

coalesce(
{{ dbt.dateadd(datepart=datepart, interval=-interval, from_date_or_timestamp="'" ~ date_agg ~ "'") }},
-{{ "'" ~ default_start_date ~ "'" }}
+{{ "'" ~ safety_date ~ "'" }}
)

{% endmacro %}
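
The `coalesce(dateadd(...), safety_date)` pattern in the macro above can be sketched in plain Python. This is an illustrative sketch, not code from the package: the function name `lookback_start` is hypothetical, it only handles a `datepart` of days, and it assumes the latest loaded date is `None` when the target table is empty.

```python
from datetime import date, timedelta
from typing import Optional


def lookback_start(
    max_loaded_date: Optional[date],
    lookback_days: int,
    safety_date: date = date(2010, 1, 1),
) -> date:
    """Return the earliest date an incremental run should reprocess.

    Hypothetical helper mirroring coalesce(max_date - interval, safety_date).
    """
    if max_loaded_date is None:
        # Nothing loaded yet: fall back to the safety date, like the
        # second argument of coalesce() in the macro.
        return safety_date
    # Step back by the lookback window to pick up late-arriving events.
    return max_loaded_date - timedelta(days=lookback_days)


# With the default 7-day window, a table loaded through 2024-02-21
# is reprocessed from 2024-02-14 onward.
start = lookback_start(date(2024, 2, 21), 7)
```

The safety date only matters when no previous maximum exists, which is why the rename from `default_start_date` reads more accurately.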
16 changes: 14 additions & 2 deletions models/mixpanel.yml
@@ -185,7 +185,10 @@ models:

- name: has_bluetooth_enabled
description: Boolean that is true if Bluetooth is enabled, false if not.


- name: dbt_run_date
description: The date of the dbt run when the record was added.

- name: mixpanel__daily_events
description: >
Table of each **event type's** daily history of activity, as reflected in user retention and event metrics.
@@ -225,7 +228,10 @@ models:
tests:
- unique
- not_null


- name: dbt_run_date
description: The date of the dbt run when the record was added.

- name: mixpanel__monthly_events
description: >
Table of each **event type's** monthly history of activity, as reflected in user retention and event metrics.
@@ -266,6 +272,9 @@ models:
- unique
- not_null

- name: dbt_run_date
description: The date of the dbt run when the record was added.

- name: mixpanel__sessions
description: >
Table aggregating events into unique user sessions, according to the `sessionization_inactivity` timeout length.
@@ -304,6 +313,9 @@ models:
- name: user_id
description: Coalescing of `device_id` and `people_id`.

- name: dbt_run_date
description: The date of the dbt run when the record was added.

macros:
- name: analyze_funnel
description: >
3 changes: 2 additions & 1 deletion models/mixpanel__daily_events.sql
@@ -118,7 +118,8 @@ final as (
number_of_users - number_of_new_users - number_of_repeat_users as number_of_return_users,
trailing_users_28d,
trailing_users_7d,
-{{ dbt_utils.generate_surrogate_key(['event_type', 'date_day']) }} as unique_key
+{{ dbt_utils.generate_surrogate_key(['event_type', 'date_day']) }} as unique_key,
Review comment (Contributor):
Do we want to generate a surrogate key here? The surrogate key is a hash, whereas the previous key was a concatenation of the two fields. We lose some decipherable information if we use the surrogate key, although I am not sure whether this change was made to work better with the incremental updates.

If we do end up changing this field, we will need to update the docs and call this out as a breaking change, since it will drastically change the previous results.

What are your thoughts?

+{{ mixpanel.date_today('dbt_run_date')}}

from agg_event_days
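
To make the reviewer's trade-off concrete, the sketch below contrasts a readable concatenated key with a hashed surrogate key. It is a Python approximation, not the package's SQL: the `-` separator, the MD5 hash, and the null placeholder are assumed to mirror how `dbt_utils.generate_surrogate_key` behaves, and the concatenated format shown for the earlier key is hypothetical.

```python
import hashlib
from typing import Optional

# Assumed placeholder that dbt_utils substitutes for null fields.
NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"


def concatenated_key(event_type: str, date_day: str) -> str:
    """Readable key: the parts remain visible (hypothetical prior format)."""
    return f"{event_type}-{date_day}"


def surrogate_key(*fields: Optional[object]) -> str:
    """Hashed key: fixed width and opaque, approximating generate_surrogate_key."""
    parts = [NULL_PLACEHOLDER if f is None else str(f) for f in fields]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


readable = concatenated_key("page_view", "2024-02-21")
hashed = surrogate_key("page_view", "2024-02-21")
```

The hashed key is always 32 hexadecimal characters and deterministic, which suits an incremental `unique_key`, but it no longer reveals the event type or date. Any downstream query that parses the old key would break, which is the basis for treating the change as breaking.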

3 changes: 2 additions & 1 deletion models/mixpanel__event.sql
@@ -51,7 +51,8 @@ dedupe as (
pivot_properties as (

select
-*
+*,
+{{ mixpanel.date_today('dbt_run_date')}}
{% if var('event_properties_to_pivot') %}
, {{ fivetran_utils.pivot_json_extract(string = 'event_properties', list_of_properties = var('event_properties_to_pivot')) }}
{% endif %}
3 changes: 2 additions & 1 deletion models/mixpanel__monthly_events.sql
@@ -99,7 +99,8 @@ final as (
-- subtract the returned users from the previous month's total users to get the # churned
-- note: churned users refer to users who did something last month and not this month
coalesce(lag(number_of_users, 1) over(partition by event_type order by date_month asc) - number_of_repeat_users, 0) as number_of_churn_users,
-{{ dbt_utils.generate_surrogate_key(['event_type', 'date_month']) }} as unique_key
+{{ dbt_utils.generate_surrogate_key(['event_type', 'date_month']) }} as unique_key,
+{{ mixpanel.date_today('dbt_run_date')}}

from monthly_metrics
)
3 changes: 2 additions & 1 deletion models/mixpanel__sessions.sql
@@ -140,7 +140,8 @@ session_join as (
session_ids.user_id, -- coalescing of device_id and people_id
session_ids.device_id,
session_ids.total_number_of_events,
-agg_event_types.event_frequencies
+agg_event_types.event_frequencies,
+{{ mixpanel.date_today('dbt_run_date')}}

{% if var('session_passthrough_columns', []) != [] %}
,