Moved out the observability tutorial to the examples #18420

Merged
merged 2 commits into from
Nov 17, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -19,7 +19,7 @@ Out of the box, Kyma offers various functionalities, such as:
 - [Istio](https://kyma-project.io/#/istio/user/00-overview/README) for service-to-service communication and proxying
 - [Service Management](https://kyma-project.io/#/01-overview/service-management/README) to use the built-in cloud services from such cloud providers as GCP, Azure, and AWS
 - Secure API exposure
-- In-cluster observability
+- Collection and shipment of telemetry data to observability backends using the [Telemetry module](https://kyma-project.io/#/telemetry-manager/user/README)
 - CLI supported by the intuitive UI through which you can connect your application to a Kubernetes cluster

 <p align="center">
20 changes: 6 additions & 14 deletions docs/03-tutorials/00-observability.md
@@ -2,22 +2,14 @@
 title: Observability
 ---

-## Purpose
+If you're interested in learning more about the Telemetry module and how to integrate backends, check out these links:

-The following instructions describe the complete monitoring flow for your services in Kyma. You get the gist of monitoring applications, such as Prometheus, Grafana, and Alertmanager. You learn how and where you can observe and visualize your service metrics to monitor them for any alerting values.
+- Install the [OpenTelemetry Demo App](https://github.com/kyma-project/examples/tree/main/trace-demo) and see it in action to learn more about distributed tracing.

-All the tutorials use the [`monitoring-custom-metrics`](https://github.com/kyma-project/examples/tree/main/prometheus/monitoring-custom-metrics) example and one of its services called `sample-metrics-8081`. This service exposes the `cpu_temperature_celsius` custom metric on the `/metrics` endpoint. This custom metric is the central element of the whole tutorial set. The metric value simulates the current processor temperature and changes randomly from 60 to 90 degrees Celsius. The alerting threshold in these tutorials is 75 degrees Celsius. If the temperature exceeds this value, the Grafana dashboard, PrometheusRule, and Alertmanager notifications you create inform you about this.
+- Push application logs to a [custom Loki stack](https://github.com/kyma-project/examples/tree/main/loki).

-## Sequence of tasks
+- Push traces based on OpenTelemetry Protocol (OTLP) to a [custom Jaeger stack](https://github.com/kyma-project/examples/tree/main/jaeger).

-The instructions cover the following tasks:
+- Push metrics based on OTLP to any [OTLP based metric backend](https://github.com/kyma-project/examples/tree/main/metrics-otlp).

-![Monitoring tutorials](./assets/monitoring-tutorials.svg)
-
-1. [**Deploy a custom Prometheus stack**](https://github.com/kyma-project/examples/blob/main/prometheus/README.md), in which you deploy the [kube-prometheus-stack](https://github.com/prometheus-operator/kube-prometheus) from the upstream Helm chart.
-
-2. [**Observe application metrics**](https://github.com/kyma-project/examples/blob/main/prometheus/monitoring-custom-metrics/README.md), in which you redirect the `cpu_temperature_celsius` metric to the localhost and the Prometheus UI. You later observe how the metric value changes in the predefined 10 seconds interval in which Prometheus scrapes the metric values from the service's `/metrics` endpoint.
-
-3. [**Create a Grafana dashboard**](https://github.com/kyma-project/examples/blob/main/prometheus/monitoring-grafana-dashboard/README.md), in which you create a Grafana dashboard of a Gauge type for the `cpu_temperature_celsius` metric. This dashboard shows explicitly when the CPU temperature is equal to or higher than the predefined threshold of 75 degrees Celsius, at which point the dashboard turns red.
-
-4. [**Define alerting rules**](https://github.com/kyma-project/examples/blob/main/prometheus/monitoring-alert-rules/README.md), in which you define the `CPUTempHigh` alerting rule by creating a PrometheusRule resource. Prometheus accesses the `/metrics` endpoint every 10 seconds and validates the current value of the `cpu_temperature_celsius` metric. If the value is equal to or higher than 75 degrees Celsius, Prometheus waits for 10 seconds to recheck it. If the value still exceeds the threshold, Prometheus triggers the rule. You can observe both the rule and the alert it generates on the Prometheus dashboard.
+For a tutorial based on a typical prometheus stack, please read [Monitoring in Kyma using a custom kube-prometheus-stack](https://github.com/kyma-project/examples/edit/main/prometheus/README.md)
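
For readers following the removed step 4, the `CPUTempHigh` behavior it describes (check the metric every 10 seconds, wait 10 seconds once the value reaches 75 degrees Celsius, then fire if the breach persists) can be sketched as a small state machine. This is a simplified model, not how Prometheus is implemented: it assumes the rule is evaluated exactly once per scrape, and the function name `alert_states` and the three constants are hypothetical names introduced only for this illustration.

```python
from typing import Iterable, List

THRESHOLD_CELSIUS = 75.0  # alerting threshold quoted in the tutorial text
SCRAPE_INTERVAL_S = 10    # assumed: one rule evaluation per 10-second scrape
FOR_DURATION_S = 10       # how long the breach must persist before firing


def alert_states(samples: Iterable[float]) -> List[str]:
    """Return the alert state after each scrape: inactive, pending, or firing."""
    states: List[str] = []
    breached_for = None  # seconds the condition has held; None while not breached
    for value in samples:
        if value >= THRESHOLD_CELSIUS:
            # First breach starts the timer at 0; later breaches extend it.
            if breached_for is None:
                breached_for = 0
            else:
                breached_for += SCRAPE_INTERVAL_S
            if breached_for >= FOR_DURATION_S:
                states.append("firing")
            else:
                states.append("pending")
        else:
            # Any sample below the threshold resets the alert.
            breached_for = None
            states.append("inactive")
    return states
```

In this model a single sample at or above the threshold only moves the alert to pending; it takes a second consecutive breach, one scrape later, for the alert to fire.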
4 changes: 0 additions & 4 deletions docs/03-tutorials/assets/monitoring-tutorials.svg

This file was deleted.

17 changes: 0 additions & 17 deletions resources/README.md
@@ -23,20 +23,3 @@ path-to-referenced-charts:

 The version of the actual component image is located under the **global.{name_of_component}.version** property.
 **{name_of_component}** is a directory name of the component where dashes are replaced by underscores.
-
-### Add monitoring to components
-
-Include configuration files for ServiceMonitors, alert rules, and dashboards under your component's chart to ensure proper health check monitoring of your component.
-
-When creating a ServiceMonitor resource, follow this naming convention:
-
-| Resource | Name/Pattern | Value/Example | Description |
-|-----------|-------------|---------------| --------|
-| Service monitor| `service-monitor.yaml` | `service-monitor.yaml` | Name of the file which contains the service monitor's specification.|
-| Service monitor| `{chart_name}-{name_of_monitored_chart_component}` | `monitoring-grafana`, where the name of the main chart is **monitoring**, and the monitored component is **grafana**.| Name of the resource in the **metadata** section of the file. |
-| Alert rule| `prometheus-rules.yaml` | `prometheus-rules.yaml` | Name of the file which contains the alert rule's specification.|
-| Alert rule | `{chart_name}` if the resource contains rules for all chart elements, `{name_of_main_chart}-{name_of_sub-chart}` if every sub-chart has its own set of rules | `monitoring` if the resource contains all alert rules for the monitoring chart, `monitoring-grafana`, if it contains the rules for Grafana sub-chart only. | Name of the resource in the **metadata** section of the file.|
-| Dashboard |`dashboard-configmap.yaml`|`dashboard-configmap.yaml`|Name of the file which contains the dashboard's specification.|
-| Dashboard| `{chart_name}-dashboard` for the main chart dashboard,`{chart_name}-{sub_chart_name}-dashboard` for the sub-chart | `backup-dashboard`, `eventing-nats-dashboard` | Name of the resource in the **metadata** section of the file.|
-
-For details on observing metrics, creating dashboards, and setting alerting rules, see [these tutorials](https://kyma-project.io/#/03-tutorials/00-observability).
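
The resource-name patterns in the removed naming-convention table can be expressed as a few string helpers. This is a minimal sketch for illustration only: the function names are hypothetical and are not part of Kyma or any of its charts, and the sketch covers only the **metadata** names, not the fixed file names such as `service-monitor.yaml`.

```python
from typing import Optional


def service_monitor_name(chart_name: str, component: str) -> str:
    """{chart_name}-{name_of_monitored_chart_component}"""
    return f"{chart_name}-{component}"


def alert_rule_name(chart_name: str, sub_chart: Optional[str] = None) -> str:
    """{chart_name} for chart-wide rules, {main_chart}-{sub_chart} per sub-chart."""
    return f"{chart_name}-{sub_chart}" if sub_chart else chart_name


def dashboard_name(chart_name: str, sub_chart: Optional[str] = None) -> str:
    """{chart_name}-dashboard, or {chart_name}-{sub_chart_name}-dashboard."""
    parts = [chart_name]
    if sub_chart:
        parts.append(sub_chart)
    parts.append("dashboard")
    return "-".join(parts)


if __name__ == "__main__":
    # Reproduces the examples given in the table.
    print(service_monitor_name("monitoring", "grafana"))
    print(alert_rule_name("monitoring"))
    print(dashboard_name("eventing", "nats"))
```

The helpers reproduce the table's own examples: `monitoring-grafana` for a ServiceMonitor or sub-chart alert rule, `monitoring` for chart-wide rules, and `backup-dashboard` or `eventing-nats-dashboard` for dashboards.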