Skip to content

Commit

Permalink
Add docs about timeslice SLI (#3604) (#3632)
Browse files Browse the repository at this point in the history
(cherry picked from commit 2311c5e)

Co-authored-by: DeDe Morton <[email protected]>
  • Loading branch information
mergify[bot] and dedemorton authored Feb 22, 2024
1 parent 9a6c1a7 commit b4df72d
Showing 1 changed file with 34 additions and 3 deletions.
37 changes: 34 additions & 3 deletions docs/en/observability/slo-create.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ The type of SLI to use depends on the location of your data:

* <<custom-kql-sli, Custom KQL>> — create an SLI based on raw logs coming from your services.
* <<custom-metric-sli, Custom metric>> — create an SLI to define custom equations from metric fields in your indices.
* <<timeslice-metric-sli, Timeslice metric>> — create an SLI based on a custom equation that uses multiple aggregations.
* <<histogram-metric-sli, Histogram metric>> — create an SLI based on histogram metrics.
* <<apm-latency-and-availability-sli, APM latency and APM availability>> — create an SLI based on services using application performance monitoring (APM).

Expand All @@ -44,7 +45,7 @@ When defining a custom KQL SLI, set the following fields:
* *Query filter* — A KQL filter to specify relevant criteria by which to filter the index documents.
* *Good query* — The query yielding events that are considered good or successful. For example, `nested.field.response.latency <= 100 and nested.field.env : “production”`
* *Total query* — The query yielding all events to take into account for computing the SLI. For example, `nested.field.env : “production”`.
* *Partition by* — The field used to partition the data based on the values of the specific field. For example, you could partition by the `url.domain` field, which would create individual SLOs for each value of the selected field.
* *Group by* — The field used to group the data based on the values of the specific field. For example, you could group by the `url.domain` field, which would create individual SLOs for each value of the selected field.

[discrete]
[[custom-metric-sli]]
Expand All @@ -68,7 +69,37 @@ When defining a custom metric SLI, set the following fields:
** *Metric [A-Z]* — The field that is aggregated using the `sum` aggregation for total events. For example, `processor.processed`
** *Filter [A-Z]* — The filter to apply to the metric for total events. For example, `"processor.outcome: *"`
** *Equation* — The equation that calculates the total metric. For example, `A`.
* *Partition by* — The field used to partition the data based on the values of the specific field. For example, you could partition by the `url.domain` field, which would create individual SLOs for each value of the selected field.
* *Group by* — The field used to group the data based on the values of the specific field. For example, you could group by the `url.domain` field, which would create individual SLOs for each value of the selected field.

[discrete]
[[timeslice-metric-sli]]
== Timeslice metric

Create an indicator based on a custom equation that uses statistical aggregations and a threshold to determine whether a slice is good or bad.
Supported aggregations include `Average`, `Max`, `Min`, `Sum`, `Cardinality`, `Last value`, `Std. deviation`, `Doc count`, and `Percentile`.
The equation supports basic math and logic.

NOTE: This indicator requires you to use the `Timeslices` budgeting method.

*Example:* You can define an indicator to determine whether a Kubernetes StatefulSet is healthy.
First you set the query filter to `orchestrator.cluster.name: "elastic-k8s" AND kubernetes.namespace: "my-ns" AND data_stream.dataset: "kubernetes.state_statefulset"`.
Then you define an equation that compares the number of ready (healthy) replicas to the number of observed replicas:
`A == B ? 1 : 0`, where `A` retrieves the last value of `kubernetes.statefulset.replicas.ready` and `B` retrieves the last value of `kubernetes.statefulset.replicas.observed`.
The equation returns `1` if the condition `A == B` is true (indicating the same number of replicas) or `0` if it's false. If the value is less than 1, you can determine that the Kubernetes StatefulSet is unhealthy.

When defining a timeslice metric SLI, set the following fields:

* *Source*
** *Index* — The data view or index pattern you want to base the SLI on. For example, `metrics-*:metrics-*`.
** *Timestamp field* — The timestamp field used by the index.
** *Query filter* — A KQL filter to specify relevant criteria by which to filter the index documents. For example, `orchestrator.cluster.name: "elastic-k8s" AND kubernetes.namespace: "my-ns" AND data_stream.dataset: "kubernetes.state_statefulset"`.
* *Metric definition*
** *Aggregation [A-Z]* — The type of aggregation to use.
** *Field [A-Z]* — The field to use in the aggregation. For example, `kubernetes.statefulset.replicas.ready`.
** *Filter [A-Z]* — The filter to apply to the metric.
** *Equation* — The equation that calculates the total metric. For example, `A == B ? 1 : 0`.
** *Comparator* - The type of comparison to perform.
** *Threshold* - The value to use along with the comparator to determine if the slice is good or bad.

[discrete]
[[histogram-metric-sli]]
Expand Down Expand Up @@ -98,7 +129,7 @@ When defining a histogram metric SLI, set the following fields:
** *From* — (`range` aggregation only) The starting value of the range for total events. For example, `0`.
** *To* — (`range` aggregation only) The ending value of the range for total events. For example, `100`.
** *KQL filter* — The filter for total events. For example, `"processor.outcome : *"`.
* *Partition by* — The field used to partition the data based on the values of the specific field. For example, you could partition by the `url.domain` field, which would create individual SLOs for each value of the selected field.
* *Group by* — The field used to group the data based on the values of the specific field. For example, you could group by the `url.domain` field, which would create individual SLOs for each value of the selected field.

[discrete]
[[apm-latency-and-availability-sli]]
Expand Down

0 comments on commit b4df72d

Please sign in to comment.