v2.0.0 Performance updates and other QoL changes

@adamribaudo-velir adamribaudo-velir released this 18 Nov 17:59

The primary focus of this release is performance optimization, with an eye towards reducing table scans and, ultimately, BigQuery (BQ) query costs. There are also a number of new features and quality-of-life improvements.

With regard to query optimization, the following major changes were made:

  • Removed window functions from stg_ga4__events and other models to avoid unnecessary table scans
  • Moved event deduplication to base_ga4__events so that events are deduplicated as they enter the base model
  • Created an incremental fct_ga4__sessions_daily partitioned table that offers an efficient (cheap) method of calculating session-level metrics and conversion counts
  • Created an incremental fct_ga4__pages partitioned table containing page-level metrics

Note that mart tables with records that contain information spanning multiple days cannot be partitioned on date and will incur run costs commensurate with the size of your implementation. You can disable any specific models you like by updating the models section of your dbt_project.yml file.
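
As a rough illustration of disabling a model, the fragment below is a minimal sketch of a dbt_project.yml entry. It assumes the package's project name is `ga4` and that the model lives under a `marts/core` folder; both are assumptions, so check the installed package's own dbt_project.yml and directory layout before copying this. The model name fct_ga4__pages is used only as an example.

```yaml
# dbt_project.yml (in your own project, not in the package)
models:
  ga4:                      # assumed package project name
    marts:                  # assumed folder path inside the package
      core:
        fct_ga4__pages:     # example model to turn off
          +enabled: false   # dbt skips disabled models at build time
```

Any models or BI queries that select from a disabled model will need to be disabled or repointed as well.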

Breaking changes

  • The inputs to the surrogate keys have changed, which means that new event, session, and user keys will not match keys generated by previous versions
  • The mart tables have changed significantly, which may introduce errors in any connected BI tools
  • Removed user_key in many cases in favor of user_pseudo_id. A mapping between user_id and user_pseudo_id will be provided in the future.

Full List of Changes

New Contributors

Full Changelog: 1.0.0...2.0.0