Feature/performance enhancement #41

Merged
merged 42 commits into from
Feb 21, 2024
29c24b7
Update README.md
fivetran-dejantucakov Dec 5, 2023
3497935
feature/performance-enhancement
fivetran-catfritz Jan 2, 2024
a9119ce
update to incremental
fivetran-catfritz Jan 3, 2024
0de2a07
update to incremental
fivetran-catfritz Jan 3, 2024
1129752
feature/performance-enhancement
fivetran-catfritz Jan 23, 2024
198081e
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
46fd5c4
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
18ac897
feature/performance-enhancement
fivetran-catfritz Jan 25, 2024
c6cd6a8
update clustering
fivetran-catfritz Jan 25, 2024
0fd82bf
update clustering
fivetran-catfritz Jan 25, 2024
334010b
Merge pull request #40 from fivetran/fivetran-dejantucakov-patch-1
fivetran-catfritz Jan 26, 2024
f049d87
update changelog & readme
fivetran-catfritz Jan 26, 2024
bcc310d
update ymls
fivetran-catfritz Jan 26, 2024
19901af
update readme
fivetran-catfritz Jan 26, 2024
d1ae3f1
updates
fivetran-catfritz Jan 26, 2024
ba53a3e
update changelog, ymls, regen docs
fivetran-catfritz Jan 30, 2024
00e2ccc
update changelog
fivetran-catfritz Jan 30, 2024
5f374f7
update changelog
fivetran-catfritz Jan 30, 2024
b17a9bd
update lookbacks
fivetran-catfritz Feb 2, 2024
702694b
update lookbacks
fivetran-catfritz Feb 2, 2024
dcb14e7
update lookbacks
fivetran-catfritz Feb 6, 2024
6345dc1
update readme
fivetran-catfritz Feb 6, 2024
2cb15a3
update
fivetran-catfritz Feb 6, 2024
823e2d0
update
fivetran-catfritz Feb 6, 2024
1b66f95
updates
fivetran-catfritz Feb 21, 2024
4b760c0
delete extra macro
fivetran-catfritz Feb 21, 2024
6f4c906
updates
fivetran-catfritz Feb 21, 2024
59e2434
updates
fivetran-catfritz Feb 21, 2024
2fac3fd
Merge pull request #43 from fivetran/feature/test-materializations
fivetran-catfritz Feb 21, 2024
6574992
Merge branch 'main' into feature/performance-enhancement
fivetran-catfritz Feb 21, 2024
5e91f92
update var names
fivetran-catfritz Feb 21, 2024
7b54bdb
update macro
fivetran-catfritz Feb 21, 2024
9bfcffa
remove extra comma
fivetran-catfritz Feb 21, 2024
ee45a9a
Apply suggestions from code review
fivetran-catfritz Feb 21, 2024
ea07fae
Update models/staging/stg_mixpanel__user_event_date_spine.sql
fivetran-catfritz Feb 21, 2024
befd1be
Apply suggestions from code review
fivetran-catfritz Feb 21, 2024
c8fe97d
update models, readme, changelog
fivetran-catfritz Feb 21, 2024
6e4fbc5
update changelog and regen docs
fivetran-catfritz Feb 21, 2024
ed92bba
update yml
fivetran-catfritz Feb 21, 2024
f9ae48d
update changelog
fivetran-catfritz Feb 21, 2024
d818a3a
add autoreleaser
fivetran-catfritz Feb 21, 2024
4f245f0
update changelog
fivetran-catfritz Feb 21, 2024
43 changes: 11 additions & 32 deletions .github/PULL_REQUEST_TEMPLATE/maintainer_pull_request_template.md
@@ -4,48 +4,27 @@
**This PR will result in the following new package version:**
<!--- Please add details around your decision for breaking vs non-breaking version upgrade. If this is a breaking change, were backwards-compatible options explored? -->

**Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:**
**Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:**
<!--- Copy/paste the CHANGELOG for this version below. -->

## PR Checklist
### Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
- [ ] dbt compile
- [ ] dbt run --full-refresh
- [ ] dbt run
- [ ] dbt test
- [ ] dbt run --vars (if applicable)
- [ ] dbt run --full-refresh && dbt test
- [ ] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:
- [ ] The appropriate issue has been linked and tagged
- [ ] You are assigned to the corresponding issue and this PR
- [ ] BuildKite integration tests are passing
- [ ] The appropriate issue has been linked, tagged, and properly assigned.
- [ ] All necessary documentation and version upgrades have been applied.
<!--- Be sure to update the package version in the dbt_project.yml, integration_tests/dbt_project.yml, and README if necessary. -->
- [ ] docs were regenerated (unless this PR does not include any code or yml updates).
- [ ] BuildKite integration tests are passing.
- [ ] Detailed validation steps have been provided below.

### Detailed Validation
Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":
- [ ] You have validated these changes and assure this PR will address the respective Issue/Feature.
- [ ] You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
- [ ] You have provided details below around the validation steps performed to gain confidence in these changes.
Please share any and all of your validation steps:
<!--- Provide the steps you took to validate your changes below. -->

### Standard Updates
Please acknowledge that your PR contains the following standard updates:
- Package versioning has been appropriately indexed in the following locations:
- [ ] indexed within dbt_project.yml
- [ ] indexed within integration_tests/dbt_project.yml
- [ ] CHANGELOG has individual entries for each respective change in this PR
<!--- If there is a parallel upstream change, remember to reference the corresponding CHANGELOG as an individual entry. -->
- [ ] README updates have been applied (if applicable)
<!--- Remember to check the following README locations for common updates. -->
<!--- Suggested install range (needed for breaking changes) -->
<!--- Dependency matrix is appropriately updated (if applicable) -->
<!--- New variable documentation (if applicable) -->
- [ ] DECISIONLOG updates have been applied (if applicable)
- [ ] Appropriate yml documentation has been added (if applicable)

### dbt Docs
Please acknowledge that after the above were all completed the below were applied to your branch:
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)

### If you had to summarize this PR in an emoji, which would it be?
<!--- For a complete list of markdown compatible emojis check out this git repo (https://gist.github.com/rxaviers/7360908) -->
:dancer:
13 changes: 13 additions & 0 deletions .github/workflows/auto-release.yml
@@ -0,0 +1,13 @@
name: 'auto release'
on:
pull_request:
types:
- closed
branches:
- main

jobs:
call-workflow-passing-data:
if: github.event.pull_request.merged
uses: fivetran/dbt_package_automations/.github/workflows/auto-release.yml@main
secrets: inherit
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,33 @@
# dbt_mixpanel v0.9.0
[PR #41](https://github.com/fivetran/dbt_mixpanel/pull/41) includes the following updates:

## 🚨 Breaking Changes 🚨

> ⚠️ Since the following changes are breaking, a `--full-refresh` after upgrading will be required.

- Added a default 7-day look-back to incremental models to accommodate late arriving events. The number of days can be changed by setting the var `lookback_window` in your dbt_project.yml. See the [Lookback Window section of the README](https://github.com/fivetran/dbt_mixpanel/blob/main/README.md#lookback-window) for more details.
- **Note:** This replaces the variable `sessionization_trailing_window`, which was previously used in the `mixpanel__sessions` model. This variable was replaced due to the change in the incremental and lookback strategy.

- Performance improvements:
- Updated the incremental strategy for the following models to `insert_overwrite` for BigQuery and Databricks and `delete+insert` for all other supported warehouses.
- `stg_mixpanel__user_event_date_spine`
- `mixpanel__event`
- `mixpanel__daily_events`
- `mixpanel__monthly_events`
- `mixpanel__sessions`
- Removed `stg_mixpanel__event_tmp` in favor of ephemeral model `stg_mixpanel__event`. This is to reduce redundancy of models created and reduce the number of full scans.
- Updated the materialization of `stg_mixpanel__user_first_event` from a table to a view. This model is used in one downstream model, so a view will reduce storage requirements while not significantly hindering performance.
- For Snowflake and BigQuery destinations, added `cluster_by` columns to the configs for incremental models.
- For Databricks destinations, updated incremental model file formats to `parquet` for compatibility with the `insert_overwrite` strategy.

## Feature Updates
- Added column `dbt_run_date` to incremental end models to capture the date a record was added or updated by this package.
- Added `_fivetran_id` to the `mixpanel__event` model, since this is the source `event` table's primary key as of the [March 2023 connector release notes](https://fivetran.com/docs/applications/mixpanel/changelog#march2023).

## Contributors
- [@jasongroob](https://github.com/jasongroob) ([#41](https://github.com/fivetran/dbt_mixpanel/pull/41))
- [@CraigWilson-ZOE](https://github.com/CraigWilson-ZOE) ([#38](https://github.com/fivetran/dbt_mixpanel/issues/38))

# dbt_mixpanel v0.8.0
>Note: If you run into issues with this update, we suggest trying a **full refresh**.
## 🎉 Feature Updates 🎉
25 changes: 11 additions & 14 deletions README.md
@@ -56,11 +56,13 @@ dispatch:
```

### Database Incremental Strategies
Some end models in this package are materialized incrementally. We currently use the `merge` strategy as the default strategy for BigQuery, Snowflake, and Databricks databases. For Redshift and Postgres databases, we use `delete+insert` as the default strategy.
Many of the end models in this package are materialized incrementally, so we have configured our models to work with the different strategies available to each supported warehouse.

We recognize there are some limitations with these strategies, particularly around updated records in the past which cause duplicates, and are assessing using a different strategy in the future.
For **BigQuery** and **Databricks** destinations, we have chosen `insert_overwrite` as the default strategy, which benefits from the partitioning capability.

> For either of these strategies, we highly recommend that users periodically run a `--full-refresh` to ensure a high level of data quality.
For **Snowflake**, **Redshift**, and **Postgres** databases, we have chosen `delete+insert` as the default strategy.

> Regardless of strategy, we recommend that users periodically run a `--full-refresh` to ensure a high level of data quality.
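
As a rough illustration of how the two strategies differ, here is a minimal Python sketch. It is not the package's implementation or how any warehouse executes these strategies internally; it only simulates the row-level outcome. The row layout and the `date_day` and `unique_event_id` column names are assumptions chosen for the example.

```python
def insert_overwrite(existing, new_rows, partition_key="date_day"):
    """Replace every partition that appears in new_rows with the new rows."""
    touched = {row[partition_key] for row in new_rows}
    kept = [row for row in existing if row[partition_key] not in touched]
    return kept + list(new_rows)


def delete_insert(existing, new_rows, unique_key="unique_event_id"):
    """Delete existing rows whose key appears in new_rows, then insert new_rows."""
    incoming = {row[unique_key] for row in new_rows}
    kept = [row for row in existing if row[unique_key] not in incoming]
    return kept + list(new_rows)


# Hypothetical table contents before an incremental run.
existing = [
    {"unique_event_id": "a", "date_day": "2024-02-01", "event_type": "play_song"},
    {"unique_event_id": "b", "date_day": "2024-02-02", "event_type": "stop_song"},
    {"unique_event_id": "c", "date_day": "2024-02-02", "event_type": "exit"},
]

# Hypothetical incremental batch: one re-sent event and one new event.
new_rows = [
    {"unique_event_id": "b", "date_day": "2024-02-02", "event_type": "stop_song"},
    {"unique_event_id": "d", "date_day": "2024-02-02", "event_type": "play_song"},
]

overwritten = insert_overwrite(existing, new_rows)
merged = delete_insert(existing, new_rows)
```

In this sketch `insert_overwrite` drops row `c`, because the whole `2024-02-02` partition is replaced by the incoming batch, while `delete_insert` keeps it. This suggests why an incremental run under `insert_overwrite` should reselect every row of each partition it touches.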

## Step 2: Install the package
Include the following mixpanel package version in your `packages.yml` file:
@@ -69,7 +71,7 @@ Include the following mixpanel package version in your `packages.yml` file:
```yaml
packages:
- package: fivetran/mixpanel
version: [">=0.8.0", "<0.9.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=0.9.0", "<0.10.0"] # we recommend using ranges to capture non-breaking changes automatically
```

## Step 3: Define database and schema variables
@@ -82,7 +84,6 @@ vars:
```

## (Optional) Step 4: Additional configurations
<details><summary>Expand for configurations</summary>

## Macros
### analyze_funnel [(source)](https://github.com/fivetran/dbt_mixpanel/blob/master/macros/analyze_funnel.sql)
@@ -98,7 +99,7 @@ The macro takes the following as arguments:
- `event_funnel`: List of event types (not case sensitive).
- Example: `['play_song', 'stop_song', 'exit']`
- `group_by_column`: (Optional) A column by which you want to segment the funnel (this macro pulls data from the `mixpanel__event` model). The default value is `None`.
- Examaple: `group_by_column = 'country_code'`.
- Example: `group_by_column = 'country_code'`.
- `conversion_criteria`: (Optional) A `WHERE` clause that will be applied when selecting from `mixpanel__event`.
- Example: To limit all events in the funnel to the United States, you'd provide `conversion_criteria = 'country_code = "US"'`. To apply the US restriction only to song play events, you'd input `conversion_criteria = 'country_code = "US" OR event_type != "play_song"'`.

@@ -199,15 +200,13 @@ vars:
session_event_criteria: 'event_type in ("play_song", "stop_song", "create_playlist")'
```

#### Session Trailing Window
Events can sometimes come late. For example, events triggered on a mobile device that is offline will be sent to Mixpanel once the device reconnects to wifi or a cell network. This makes sessionizing a bit trickier/costlier, as the sessions model (and all final models in this package) is materialized as an incremental table.

Therefore, to avoid requiring a full refresh to incorporate these delayed events into sessions, the package by default re-sessionizes the most recent 3 hours of events on each run. To change this, add the following variable to your `dbt_project.yml` file:
#### Lookback Window
Events can sometimes arrive late. For example, events triggered on a mobile device that is offline will be sent to Mixpanel once the device reconnects to wifi or a cell network. Since many of the models in this package are incremental, by default we look back 7 days to ensure late arrivals are captured without requiring a full refresh. To change the default lookback window, add the following variable to your `dbt_project.yml` file:

```yml
vars:
mixpanel:
sessionization_trailing_window: number_of_hours # ex: 12
lookback_window: number_of_days # default is 7
```
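
To make the lookback idea concrete, here is a minimal Python sketch of how a cutoff date could be derived from the window. It is not the package's SQL. The function names, the row layout, the inclusive `>=` comparison, and measuring the window back from the latest loaded date are all assumptions made for illustration.

```python
from datetime import date, timedelta


def incremental_cutoff(max_loaded_date, lookback_window=7):
    """Return the earliest date an incremental run should reprocess."""
    return max_loaded_date - timedelta(days=lookback_window)


def rows_to_reprocess(source_rows, max_loaded_date, lookback_window=7):
    """Select source rows dated on or after the cutoff."""
    cutoff = incremental_cutoff(max_loaded_date, lookback_window)
    return [row for row in source_rows if row["date_day"] >= cutoff]


# Hypothetical source rows; event 2 stands in for a late-arriving event.
source_rows = [
    {"event_id": 1, "date_day": date(2024, 2, 10)},
    {"event_id": 2, "date_day": date(2024, 2, 14)},
    {"event_id": 3, "date_day": date(2024, 2, 20)},
]

selected = rows_to_reprocess(source_rows, max_loaded_date=date(2024, 2, 20))
```

A larger `lookback_window` picks up events that arrive later, at the cost of reprocessing more days on every run.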

### Changing the Build Schema
@@ -224,7 +223,7 @@ models:
### Change the source table references
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:

> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_mixpanel_source/blob/main/dbt_project.yml) variable declarations to see the expected names.
> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_mixpanel/blob/main/dbt_project.yml) variable declarations to see the expected names.

```yml
vars:
@@ -241,8 +240,6 @@ Events are considered duplicates and consolidated by the package if they contain

This is performed in line with Mixpanel's internal de-duplication process, in which events are de-duped at the end of each day. This means that if an event was triggered during an offline session at 11:59 PM and _resent_ when the user came online at 12:01 AM, these records would _not_ be de-duplicated. This is the case in both Mixpanel and the Mixpanel dbt package.

</details>

## (Optional) Step 5: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for details</summary>
<br>
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,6 +1,6 @@
config-version: 2
name: 'mixpanel'
version: '0.8.0'
version: '0.9.0'
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
mixpanel:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

24 changes: 12 additions & 12 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.
