Update sampling.asciidoc #4101

Merged
merged 1 commit into from
Jul 31, 2024
18 changes: 14 additions & 4 deletions docs/en/observability/apm/sampling.asciidoc
@@ -105,10 +105,20 @@ A sampled trace retains all data associated with it.
A non-sampled trace drops all <<apm-data-model-spans,span>> and <<apm-data-model-transactions,transaction>> data^1^.
Regardless of the sampling decision, all traces retain <<apm-data-model-errors,error>> data.

-Some visualizations in the {apm-app}, like latency, are powered by aggregated transaction and span <<apm-data-model-metrics,metrics>>.
-Metrics are based on sampled traces and weighted by the inverse sampling rate.
-For example, if you sample at 5%, each trace is counted as 20.
-As a result, as the variance of latency increases, or the sampling rate decreases, your level of error will increase.
+Some visualizations in the {apm-app}, like latency, are powered by aggregated transaction and span <<apm-data-model-metrics,metrics>>.
+The way these metrics are calculated depends on the sampling method used:
+
+* **Head-based sampling**: Metrics are calculated based on all sampled events.
+
+* **Tail-based sampling**: Metrics are calculated based on all events, regardless of whether they are ultimately sampled or not.
+
+* **Both head and tail-based sampling**: When both methods are used together, metrics are calculated based on all events that were sampled by the head-based sampling policy.
+
+For all sampling methods, metrics are weighted by the inverse sampling rate of the head-based sampling policy to provide an estimate of the total population.
+For example, if your head-based sampling rate is 5%, each sampled trace is counted as 20.
+As the variance of latency increases or the head-based sampling rate decreases, the level of error in these calculations may increase.
+
+These calculation methods ensure that the APM app provides the most accurate metrics possible given the sampling strategy in use, while also accounting for the head-based sampling rate to estimate the full population of traces.

^1^ Real User Monitoring (RUM) traces are an exception to this rule.
The {kib} apps that utilize RUM data depend on transaction events,
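The inverse-rate weighting described in the added text can be illustrated with a short sketch. This is not APM Server's implementation; the function name `estimate_population` and its parameters are hypothetical, chosen only to show the arithmetic: each sampled trace counts as `1 / head_sampling_rate` traces, so a 5% rate gives a weight of 20.

```python
def estimate_population(sampled_latencies_ms, head_sampling_rate):
    """Estimate the total trace count and mean latency from sampled traces.

    Hypothetical helper for illustration only. Each sampled trace is
    weighted by the inverse of the head-based sampling rate.
    """
    if not 0 < head_sampling_rate <= 1:
        raise ValueError("head_sampling_rate must be in (0, 1]")

    # A 5% sampling rate means each sampled trace stands in for 20 traces.
    weight = 1.0 / head_sampling_rate

    estimated_count = len(sampled_latencies_ms) * weight
    weighted_total = sum(sampled_latencies_ms) * weight

    # With a uniform weight, the weighted mean equals the plain sample mean;
    # only the estimated count is scaled up.
    mean_latency = weighted_total / estimated_count if estimated_count else 0.0
    return estimated_count, mean_latency


if __name__ == "__main__":
    count, mean = estimate_population([100.0, 200.0, 300.0], 0.05)
    print(f"estimated traces: {count:.0f}, mean latency: {mean:.1f} ms")
```

Because the estimate is built from fewer observations as the rate drops, a small sample of highly variable latencies can land far from the true mean, which is the source of the error the text mentions.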