v2.0.0 Performance updates and other QoL changes
The primary focus of this release is performance optimization, with an eye towards reducing table scans and, ultimately, BigQuery (BQ) query costs. There are also a number of new features and quality-of-life improvements.
With regard to query optimization, the following major changes were made:
- Removed window functions from stg_ga4__events and other models to avoid unnecessary table scans
- Moved event deduplication to base_ga4__events so that events are deduped on the way in to the base model
- Created an incremental fct_ga4__sessions_daily partitioned table that offers an efficient (cheap) method of calculating session-level metrics and conversion counts
- Created an incremental fct_ga4__pages partitioned table containing page-level metrics
Note that mart tables with records that contain information spanning multiple days cannot be partitioned on date and will incur run costs commensurate with the size of your implementation. You can disable any specific models you like by updating the models section of your dbt_project.yml file.
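As a rough illustration of disabling a model through the models section, the fragment below turns off fct_ga4__pages. It is a minimal sketch, not taken from the package documentation: it assumes the package is installed under the project name ga4 and that the model sits in a marts/core folder, so both names should be checked against your installed copy of the package.

```yaml
# dbt_project.yml (your own project, not the package's)
models:
  ga4:                      # assumed package project name
    marts:
      core:                 # assumed folder path inside the package
        fct_ga4__pages:
          +enabled: false   # dbt skips this model on run and build
```

The nesting has to mirror the folder path of the model inside the package. If the model lives somewhere else, adjust the intermediate keys to match. Any models or tests that reference a disabled model need to be disabled as well.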
Breaking changes
- The inputs to the surrogate keys have changed, which means that new event, session, and user keys will not match previous keys
- The mart tables have changed significantly, which may introduce errors for any connected BI tools
- Removed user_key in many cases in favor of user_pseudo_id. A mapping between user_id and user_pseudo_id will be provided in the future.
Full List of Changes
- adding recommended events - share & generate_lead by @vibhorj in #47
- login, sign_up, and search recommended events by @3v-dgudaitis in #45
- Fixing 2 broken unit tests by @adamribaudo-velir in #50
- Allow renaming of custom event parameters (#48) by @willbryant in #49
- Flattened records by @dgitis in #53
- fix: query_parameter_exclusions applies to page_referrer too (#55) by @willbryant in #56
- Support derived_session_properties (#54) by @willbryant in #58
- Fix inconsistent indentation in README yaml by @willbryant in #57
- Support defining custom parameters that apply to all events by @willbryant in #51
- Base model partition optimization fixes by @adamribaudo-velir in #59
- Fix Google Ads Attributed to Organic by @dgitis in #70
- Add stream_id and platform to the sessions dimension table by @willbryant in #75
- Fix integration tests for detect_gclid change (#78) by @willbryant in #79
- Add landing_page_referrer to the sessions dimensions table by @willbryant in #77
- Use user_pseudo_id rather than user_key to construct session_key (#61) by @willbryant in #76
- Fix dim_ga4__sessions unique test by @adamribaudo-velir in #73
- Fix to make event_key unique by @adamribaudo-velir in #74
- Simplify derived user properties & session properties queries by @willbryant in #82
- De-dup events in the base events queries (#74) by @willbryant in #83
- Channel grouping fixes by @willbryant in #85
- Utm content and term by @willbryant in #84
- refactor: Simplify sessions conversions queries by @willbryant in #88
- Support deriving session properties from user_properties too by @willbryant in #87
- Incremental session fact table that includes conversion metrics by @adamribaudo-velir in #64
New Contributors
- @3v-dgudaitis made their first contribution in #45
- @willbryant made their first contribution in #49
Full Changelog: 1.0.0...2.0.0