This dbt package generates queries based on metrics, introduced to dbt Core in v1.0.
Check dbt Hub for the latest installation instructions, or read the docs for more information on installing packages.
To enable the dynamic referencing of models necessary for macro queries through the dbt Server, queries generated by this package do not participate in the DAG and ref
'd nodes will not necessarily be built before they are accessed. Refer to the docs on forcing dependencies for more details.
Access metrics like any other macro:
select *
from {{ metrics.metric(
metric_name='new_customers',
grain='week',
dimensions=['plan', 'country'],
secondary_calculations=[
metrics.period_over_period(comparison_strategy="ratio", interval=1, alias="pop_1wk"),
metrics.period_over_period(comparison_strategy="difference", interval=1),
metrics.period_to_date(aggregate="average", period="month", alias="this_month_average"),
metrics.period_to_date(aggregate="sum", period="year"),
metrics.rolling(aggregate="average", interval=4, alias="avg_past_4wks"),
metrics.rolling(aggregate="min", interval=4)
],
start_date='2020-01-01',
end_date="date_trunc('day', getdate())"
) }}
start_date
and end_date
are optional. When not provided, the spine will span all dates from oldest to newest in the metric's dataset. This default is likely to be correct in most cases, but you can use the arguments to either narrow the resulting table or expand it (e.g. if there was no new customers until 3 January but you want to include the first two days as well). Both values are inclusive.
You may want to materialize the results as a fixed table for querying. This is not the way we expect the dbt Metrics layer to add the most value, but is a way to experiment with the project without needing access to the interactive server.
When the dbt server is released later in 2022, you will be able to access these macros interactively, without needing to build each variant as a single dbt model. For more information, check out the keynote presentation from Coalesce 2021.
Secondary calculations are window functions which act on the primary metric. You can use them to compare a metric's value to an earlier period and calculate year-to-date sums or rolling averages.
Create secondary calculations using the convenience constructor macros. Alternatively, you can manually create a list of dictionary entries (not recommended).
Example of manual dictionary creation (not recommended)
Creating a calculation this way has no input validation.
[
{"calculation": "period_over_period", "interval": 1, "comparison_strategy": "difference", "alias": "pop_1mth"},
{"calculation": "rolling", "interval": 3, "aggregate": "sum"}
]
Column aliases are automatically generated, but you can override them by setting alias
.
Period over Period (source)
Constructor: metrics.period_over_period(comparison_strategy, interval [, alias])
comparison_strategy
: How to calculate the delta between the two periods. One of ["ratio"
,"difference"
]. Requiredinterval
: The number of periods to look back. Requiredalias
: The column alias for the resulting calculation. Optional
Period to Date (source)
Constructor: metrics.period_to_date(aggregate, period [, alias])
aggregate
: The aggregation to use in the window function. Options vary based on the primary aggregation and are enforced in validate_aggregate_coherence(). Requiredperiod
: The time grain to aggregate to. One of ["day"
,"week"
,"month"
,"quarter"
,"year"
]. Must be at equal or lesser granularity than the metric's grain (see Time Grains below). Requiredalias
: The column alias for the resulting calculation. Optional
Rolling (source)
Constructor: metrics.rolling(aggregate, interval [, alias])
aggregate
: The aggregation to use in the window function. Options vary based on the primary aggregation and are enforced in validate_aggregate_coherence(). Requiredinterval
: The number of periods to look back. Requiredalias
: The column alias for the resulting calculation. Optional
Most behaviour in the package can be overridden or customised.
The package comes with a basic calendar table, running between 2010-01-01 and 2029-12-31 inclusive. You can replace it with any custom calendar table which meets the following requirements:
- Contains a
date_day
column. - Contains the following columns:
date_week
,date_month
,date_quarter
,date_year
, or equivalents. - Additional date columns need to be prefixed with
date_
, e.g.date_4_5_4_month
for a 4-5-4 retail calendar date set. Dimensions can have any name (see dimensions on calendar tables).
To do this, set the value of the dbt_metrics_calendar_model
variable in your dbt_project.yml
file:
#dbt_project.yml
config-version: 2
[...]
vars:
dbt_metrics_calendar_model: ref('my_custom_table')
The package protects against nonsensical secondary calculations, such as a month-to-date aggregate of data which has been rolled up to the quarter. If you customise your calendar (for example by adding a 4-5-4 retail calendar month), you will need to override the get_grain_order()
macro. In that case, you might remove month
and replace it with month_4_5_4
. All date columns must be prefixed with date_
in the table. Do not include the prefix when defining your metric, it will be added automatically.
To create a custom primary aggregation (as exposed through the type
config of a metric), create a macro of the form metric_my_aggregate(expression)
, then override the aggregate_primary_metric()
macro to add it to the dispatch list. The package also protects against nonsensical secondary calculations such as an average of an average; you will need to override the get_metric_allowlist()
macro to both add your new aggregate to to the existing aggregations' allowlists, and to make an allowlist for your new aggregation:
{% do return ({
"average": ['max', 'min'],
"count": ['max', 'min', 'average', 'my_new_aggregate'],
[...]
"my_new_aggregate": ['max', 'min', 'sum', 'average', 'my_new_aggregate']
}) %}
To create a custom secondary aggregation (as exposed through the secondary_calculations
parameter in the metric
macro), create a macro of the form secondary_calculation_my_calculation(metric_name, dimensions, calc_config)
, then override the perform_secondary_calculations()
macro.
Aliases can be set for a secondary calculation. If no alias is provided, one will be automatically generated. To modify the existing alias logic, or add support for a custom secondary calculation, override generate_secondary_calculation_alias()
.
You may want to aggregate metrics by a dimension in your custom calendar table, for example is_weekend
. In addition to the primary dimensions
list, add the following meta
properties to your metric:
version: 2
metrics:
- name: new_customers
[...]
dimensions:
- plan
- country
meta:
dimensions:
- type: model
columns:
- plan
- country
- type: calendar
columns:
- is_weekend
You can then access the additional dimensions as normal:
select *
from {{ metrics.metric(
metric_name='new_customers',
grain='week',
dimensions=['plan', 'country', 'is_weekend'],
secondary_calcs=[]
) }}