Enhancements to alerting docs #4709

Merged: 5 commits, Jan 7, 2025
10 changes: 10 additions & 0 deletions docs/en/observability/apm-anomaly-rule.asciidoc
@@ -1,6 +1,16 @@
[[apm-anomaly-rule]]
= APM Anomaly rule

++++
<titleabbrev>APM Anomaly</titleabbrev>
++++

[IMPORTANT]
====
To use the APM Anomaly rule, you have to enable <<create-ml-integration,machine learning>>,
which requires an {subscriptions}[appropriate license].
====

APM Anomaly rules trigger when the latency, throughput, or failed transaction rate of a service is abnormal.

[discrete]
@@ -1,6 +1,10 @@
[[apm-error-count-threshold-rule]]
= Error count threshold rule

++++
<titleabbrev>Error count threshold</titleabbrev>
++++

Alert when the number of errors in a service exceeds a defined threshold. Error count rules can be set at the
environment level, service level, and error group level.

@@ -1,6 +1,10 @@
[[apm-failed-transaction-rate-threshold-rule]]
= Failed transaction rate threshold rule

++++
<titleabbrev>Failed transaction rate threshold</titleabbrev>
++++

Alert when the rate of transaction errors in a service exceeds a defined threshold.

[discrete]
4 changes: 4 additions & 0 deletions docs/en/observability/apm-latency-threshold-rule.asciidoc
@@ -1,6 +1,10 @@
[[apm-latency-threshold-rule]]
= Latency threshold rule

++++
<titleabbrev>Latency threshold</titleabbrev>
++++

Alert when the latency or failed transaction rate is abnormal.
Threshold rules can be as broad or as granular as you'd like, enabling you to define exactly when you want to be alerted--whether that's at the environment level, service name level, transaction type level, and/or transaction name level.

@@ -5,6 +5,11 @@
<titleabbrev>Integrate with machine learning</titleabbrev>
++++

[IMPORTANT]
====
Using machine learning requires an {subscriptions}[appropriate license].
====

The Machine learning integration starts a new, predefined job that calculates anomaly scores on APM transaction durations.
With this integration, you can quickly pinpoint anomalous transactions and see the health of
any upstream and downstream services.
31 changes: 24 additions & 7 deletions docs/en/observability/create-alerts.asciidoc
@@ -60,18 +60,35 @@ tie into other third-party systems. Connectors allow actions to talk to these se

Learn how to create specific types of rules:

-* <<apm-anomaly-rule,APM Anomaly rule>>
-* <<custom-threshold-alert,Custom threshold rule>>
+[cols="1,1"]
+|===
+| *All of Observability*
+a| * <<custom-threshold-alert,Custom threshold rule>>
+* <<slo-burn-rate-alert,SLO burn rate rule>>
+
+| *APM*
+a| * <<apm-anomaly-rule,APM Anomaly rule>>
* <<apm-error-count-threshold-rule,Error count threshold rule>>
* <<apm-failed-transaction-rate-threshold-rule,Failed transaction rate threshold rule>>
-* <<infrastructure-threshold-alert,Infrastructure threshold rule>>
* <<apm-latency-threshold-rule,Latency threshold rule>>
-* <<logs-threshold-alert,Log threshold rule>>
+
+| *Infrastructure*
+a| * <<infrastructure-threshold-alert,Inventory rule>>
* <<metrics-threshold-alert,Metric threshold rule>>
-* <<monitor-status-alert,Monitor status rule>>
-* <<tls-certificate-alert,TLS certificate rule>>
+
+| *Logs*
+a| * <<logs-threshold-alert,Log threshold rule>>
+
+| *Synthetics*
+a| * <<monitor-status-alert-synthetics,Synthetics monitor status rule>>
+// * <<tls-certificate-alert,Synthetics TLS certificate >> rule
+
+| *Uptime* (deprecated:[8.15.0])
+a| * <<monitor-status-alert-uptime,Uptime monitor status rule>>
+* <<tls-certificate-alert,Uptime TLS rule>>
* <<duration-anomaly-alert,Uptime duration anomaly rule>>
-* <<slo-burn-rate-alert,SLO burn rate rule>>
+
+|===

[discrete]
[[create-alerts-rules-details]]
3 changes: 2 additions & 1 deletion docs/en/observability/inventory-threshold-alert.asciidoc
@@ -1,7 +1,8 @@
[[infrastructure-threshold-alert]]
= Create an inventory threshold rule

++++
-<titleabbrev>Inventory threshold</titleabbrev>
+<titleabbrev>Inventory</titleabbrev>
++++

Based on the resources listed on the *Infrastructure inventory* page within the {infrastructure-app},
7 changes: 5 additions & 2 deletions docs/en/observability/monitor-status-alert.asciidoc
@@ -234,8 +234,11 @@ image::images/monitor-status-alert-recovery.png[Default recovery message for mon
If you are currently using the Uptime monitor status rule with a monitor created with Elastic Synthetics,
you should migrate the Uptime monitor status rule to:

-* The *Synthetics monitor rule* for <<migrate-monitor-rule-synthetics-rule,synthetic monitor _status_ checks>>.
-* The *Synthetics availability SLI* for <<migrate-monitor-rule-synthetics-sli,synthetic monitor _availability_ checks>>.
+* If you were using the Uptime rule for *synthetic monitor _status_ checks*,
+you can recreate similar functionality using the <<migrate-monitor-rule-synthetics-rule,Synthetics monitor rule>>.
+* If you were using the Uptime rule for *synthetic monitor _availability_ checks*,
+there is no equivalent in the Synthetics monitor rule. Instead, you can use the
+<<migrate-monitor-rule-synthetics-sli,Synthetics availability SLI>> to create similar functionality.

[discrete]
[[migrate-monitor-rule-synthetics-rule]]
1 change: 1 addition & 0 deletions docs/en/observability/slo-burn-rate-alert.asciidoc
@@ -1,5 +1,6 @@
[[slo-burn-rate-alert]]
= Create a service-level objective (SLO) burn rate rule

++++
<titleabbrev>SLO burn rate</titleabbrev>
++++
1 change: 1 addition & 0 deletions docs/en/observability/threshold-alert.asciidoc
@@ -1,5 +1,6 @@
[[custom-threshold-alert]]
= Create a custom threshold rule

++++
<titleabbrev>Custom threshold</titleabbrev>
++++
1 change: 1 addition & 0 deletions docs/en/observability/uptime-tls-alert.asciidoc
@@ -1,5 +1,6 @@
[[tls-certificate-alert]]
= Create a TLS certificate rule

++++
<titleabbrev>TLS certificate</titleabbrev>
++++