Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add cluster monitoring #1397

Merged
merged 1 commit into from
Dec 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,218 @@
---
keywords: [Kubernetes deployment, cluster, monitoring]
description: Guide to deploying monitoring for GreptimeDB clusters on Kubernetes, including self-monitoring and Prometheus monitoring steps.
---

# Cluster Monitoring Deployment

After deploying a GreptimeDB cluster using GreptimeDB Operator, by default, its components (Metasrv / Datanode / Frontend) expose a `/metrics` endpoint on their HTTP port (default `4000`) for Prometheus metrics.

We provide two approaches to monitor the GreptimeDB cluster:

1. **Enable GreptimeDB Self-Monitoring**: The GreptimeDB Operator will launch an additional GreptimeDB Standalone instance and Vector Sidecar container to collect and store metrics and logs from the GreptimeDB cluster.
2. **Use Prometheus Operator to Configure Prometheus Metrics Monitoring**: Users need first to deploy Prometheus Operator and create Prometheus instance, then use Prometheus Operator's `PodMonitor` to write GreptimeDB cluster metrics into Prometheus.

Users can choose the appropriate monitoring approach based on their needs.

## Enable GreptimeDB Self-Monitoring

In self-monitoring mode, GreptimeDB Operator will launch an additional GreptimeDB Standalone instance to collect metrics and logs from the GreptimeDB cluster, including cluster logs and slow query logs. To collect log data, GreptimeDB Operator will start a [Vector](https://vector.dev/) sidecar container in each Pod. When this mode is enabled, JSON format logging will be automatically enabled for the cluster.
zyy17 marked this conversation as resolved.
Show resolved Hide resolved

If you deploy the GreptimeDB cluster using Helm Chart (refer to [Getting Started](../getting-started.md)), you can configure the `values.yaml` file as follows:

```yaml
monitoring:
enabled: true
```

This will deploy a GreptimeDB Standalone instance named `${cluster}-monitoring` to collect metrics and logs. You can check it with:

```
kubectl get greptimedbstandalones.greptime.io ${cluster}-monitoring -n ${namespace}
```

By default, this GreptimeDB Standalone instance will store monitoring data using the Kubernetes default StorageClass in local storage. You can adjust this based on your needs.

The GreptimeDB Standalone instance can be configured via the `monitoring.standalone` field in `values.yaml`, for example:

```yaml
monitoring:
enabled: true
standalone:
base:
main:
# Configure GreptimeDB Standalone instance image
image: "greptime/greptimedb:latest"

# Configure GreptimeDB Standalone instance resources
resources:
requests:
cpu: "2"
memory: "4Gi"
limits:
cpu: "2"
memory: "4Gi"

# Configure object storage for GreptimeDB Standalone instance
objectStorage:
s3:
# Configure bucket
bucket: "monitoring"
# Configure region
region: "ap-southeast-1"
# Configure secret name
secretName: "s3-credentials"
# Configure root path
root: "standalone-with-s3-data"
```

The GreptimeDB Standalone instance will expose services using `${cluster}-monitoring-standalone` as the Kubernetes Service name. You can use the following addresses to read monitoring data:

- **Prometheus metrics**: `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus`
- **SQL logs**: `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002`. By default, cluster logs are stored in `public._gt_logs` table and slow query logs are stored in `public._gt_slow_queries` table.

The Vector sidecar configuration for log collection can be customized via the `monitoring.vector` field:

```yaml
monitoring:
enabled: true
vector:
# Configure Vector image registry
registry: docker.io
# Configure Vector image repository
repository: timberio/vector
# Configure Vector image tag
tag: nightly-alpine

# Configure Vector resources
resources:
requests:
cpu: "50m"
memory: "64Mi"
limits:
cpu: "50m"
memory: "64Mi"
```

:::note
If you're not using Helm Chart, you can manually configure self-monitoring mode in the `GreptimeDBCluster` YAML:

```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
name: basic
spec:
base:
main:
image: greptime/greptimedb:latest
frontend:
replicas: 1
meta:
replicas: 1
etcdEndpoints:
- "etcd.etcd-cluster.svc.cluster.local:2379"
datanode:
replicas: 1
monitoring:
enabled: true
```

The `monitoring` field configures self-monitoring mode. See [`GreptimeDBCluster` API docs](https://github.com/GreptimeTeam/greptimedb-operator/blob/main/docs/api-references/docs.md#monitoringspec) for details.
:::

## Use Prometheus Operator to Configure Prometheus Metrics Monitoring

Users need to first deploy Prometheus Operator and create Prometheus instance. For example, you can use [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) to deploy the Prometheus stack. You can refer to its [official documentation](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) for more details.

After deploying Prometheus Operator and instances, you can configure Prometheus monitoring via the `prometheusMonitor` field in `values.yaml`:

```yaml
prometheusMonitor:
# Enable Prometheus monitoring - this will create PodMonitor resources
enabled: true
# Configure scrape interval
interval: "30s"
# Configure labels
labels:
release: prometheus
```

:::note
The `labels` field must match the `matchLabels` field used to create the Prometheus instance, otherwise metrics collection won't work properly.
:::

After configuring `prometheusMonitor`, GreptimeDB Operator will automatically create `PodMonitor` resources and import metrics into Prometheus. You can check the `PodMonitor` resources with:

```
kubectl get podmonitors.monitoring.coreos.com -n ${namespace}
```

:::note
If not using Helm Chart, you can manually configure Prometheus monitoring in the `GreptimeDBCluster` YAML:

```yaml
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
name: basic
spec:
base:
main:
image: greptime/greptimedb:latest
frontend:
replicas: 1
meta:
replicas: 1
etcdEndpoints:
- "etcd.etcd-cluster.svc.cluster.local:2379"
datanode:
replicas: 1
prometheusMonitor:
enabled: true
interval: "30s"
labels:
release: prometheus
```

The `prometheusMonitor` field configures Prometheus monitoring.
:::

## Import Grafana Dashboards

GreptimeDB cluster currently provides 3 Grafana dashboards:

- [Cluster Metrics Dashboard](https://github.com/GreptimeTeam/greptimedb/blob/main/grafana/greptimedb-cluster.json)
- [Cluster Logs Dashboard](https://github.com/GreptimeTeam/helm-charts/blob/main/charts/greptimedb-cluster/dashboards/greptimedb-cluster-logs.json)
- [Slow Query Logs Dashboard](https://github.com/GreptimeTeam/helm-charts/blob/main/charts/greptimedb-cluster/dashboards/greptimedb-cluster-slow-queries.json)

**Note**: The Cluster Logs Dashboard and Slow Query Logs Dashboard are only for self-monitoring mode, while the Cluster Metrics Dashboard works for both self-monitoring and Prometheus monitoring modes.

If using Helm Chart, you can enable `grafana.enabled` to deploy Grafana and import dashboards automatically (see [Getting Started](../getting-started.md)):

```yaml
grafana:
enabled: true
```

If you already have Grafana deployed, follow these steps to import the dashboards:

1. **Add Data Sources**

You can refer to Grafana's [datasources](https://grafana.com/docs/grafana/latest/datasources/) docs to add the following 3 data sources:

- **`metrics` data source**

For importing Prometheus metrics, works with both monitoring modes. For self-monitoring mode, use `http://${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4000/v1/prometheus` as the URL. For your own Prometheus instance, use your Prometheus instance URL.

- **`information-schema` data source**

For importing cluster metadata via SQL, works with both monitoring modes. Use `${cluster}-frontend.${namespace}.svc.cluster.local:4002` as the SQL address with database `information_schema`.

- **`logs` data source**

For importing cluster and slow query logs via SQL, **only works with self-monitoring mode**. Use `${cluster}-monitor-standalone.${namespace}.svc.cluster.local:4002` as the SQL address with database `public`.

2. **Import Dashboards**

You can refer to Grafana's [Import dashboards](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) docs.
13 changes: 8 additions & 5 deletions docs/user-guide/deployments/deploy-on-kubernetes/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,17 +5,20 @@ description: Overview of deploying GreptimeDB on Kubernetes using the GreptimeDB

# Overview

## GreptimeDB Operator
## GreptimeDB on Kubernetes

The [GreptimeDB Operator](https://github.com/GrepTimeTeam/greptimedb-operator) uses the [Operator pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) to manage GreptimeDB on Kubernetes, automating the setup, provisioning, and management of GreptimeDB cluster and standalone instances.
This makes it easy to quickly deploy and scale GreptimeDB in any Kubernetes environment, whether on-premises or in the cloud.
GreptimeDB is a time-series database built for cloud-native environments and can be deployed on Kubernetes since day one. We provide a [GreptimeDB Operator](https://github.com/GrepTimeTeam/greptimedb-operator) to manage GreptimeDB on Kubernetes, automating the setup, provisioning, and management of GreptimeDB cluster and standalone instances. This makes it easy to quickly deploy and scale GreptimeDB in any Kubernetes environment, whether on-premises or in the cloud.

We **highly recommend** using the GreptimeDB Operator to deploy GreptimeDB on Kubernetes.

## Manage GreptimeDB with the GreptimeDB Operator
## Getting Started

You can take [Getting Started](./getting-started.md) as your first guide to understand the whole picture. This guide provides the complete process of deploying the GreptimeDB cluster on Kubernetes.

After getting started, you can refer to the following documents for more details about the production deployment.
## GreptimeDB Operator

- [GreptimeDB Operator Management](./greptimedb-operator-management.md)

## Monitoring

- [Cluster Monitoring Deployment](./monitoring/cluster-monitoring-deployment.md)
Loading
Loading