From e8e084dc11b4e808594b0be748d799513193825e Mon Sep 17 00:00:00 2001
From: Luca Wintergerst
Date: Wed, 31 Jul 2024 10:07:55 +0200
Subject: [PATCH] Update sampling.asciidoc

provide more accurate description on how metrics are calculated when
sampling
---
 docs/en/observability/apm/sampling.asciidoc | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/docs/en/observability/apm/sampling.asciidoc b/docs/en/observability/apm/sampling.asciidoc
index f1e3c49b77..03fb7e4abd 100644
--- a/docs/en/observability/apm/sampling.asciidoc
+++ b/docs/en/observability/apm/sampling.asciidoc
@@ -105,10 +105,20 @@
 A sampled trace retains all data associated with it.
 A non-sampled trace drops all <> and <> data^1^.
 Regardless of the sampling decision, all traces retain <> data.
-Some visualizations in the {apm-app}, like latency, are powered by aggregated transaction and span <>.
-Metrics are based on sampled traces and weighted by the inverse sampling rate.
-For example, if you sample at 5%, each trace is counted as 20.
-As a result, as the variance of latency increases, or the sampling rate decreases, your level of error will increase.
+Some visualizations in the {apm-app}, like latency, are powered by aggregated transaction and span <>.
+The way these metrics are calculated depends on the sampling method used:
+
+* **Head-based sampling**: Metrics are calculated based on all sampled events.
+
+* **Tail-based sampling**: Metrics are calculated based on all events, regardless of whether they are ultimately sampled or not.
+
+* **Both head and tail-based sampling**: When both methods are used together, metrics are calculated based on all events that were sampled by the head-based sampling policy.
+
+For all sampling methods, metrics are weighted by the inverse sampling rate of the head-based sampling policy to provide an estimate of the total population.
+For example, if your head-based sampling rate is 5%, each sampled trace is counted as 20.
+As the variance of latency increases or the head-based sampling rate decreases, the level of error in these calculations may increase.
+
+These calculation methods ensure that the APM app provides the most accurate metrics possible given the sampling strategy in use, while also accounting for the head-based sampling rate to estimate the full population of traces.
 
 ^1^ Real User Monitoring (RUM) traces are an exception to this rule.
 The {kib} apps that utilize RUM data depend on transaction events,
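For illustration, the inverse-sampling-rate weighting this patch describes can be sketched as follows. This is not Elastic code; the function name and signature are hypothetical, and it only demonstrates the arithmetic: each sampled event is weighted by `1 / rate` to estimate the full population.

```python
# Hypothetical sketch (not Elastic APM code) of the inverse-sampling-rate
# weighting described in the patch: with head-based sampling at rate r,
# each sampled event stands in for 1/r events in the full population.

def estimated_population(sampled_count: int, head_sampling_rate: float) -> float:
    """Estimate the total number of events from a head-based sample."""
    if not 0 < head_sampling_rate <= 1:
        raise ValueError("sampling rate must be in (0, 1]")
    # Each sampled trace is counted with weight 1 / head_sampling_rate.
    return sampled_count / head_sampling_rate

# At a 5% head-based sampling rate, each sampled trace counts as 20,
# so 50 sampled traces estimate a population of roughly 1000 traces.
print(estimated_population(50, 0.05))
```

As the patch notes, this is only an estimate: the smaller the sampling rate (or the larger the variance of the underlying latency), the larger the error in the extrapolated metrics.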