From 73a825ffdcf7153153e055ba98614824a253c236 Mon Sep 17 00:00:00 2001
From: liyang
Date: Mon, 6 Jan 2025 21:16:46 +0800
Subject: [PATCH] feat: Add Chinese docs

---
 docs/user-guide/administration/manage-etcd.md |   4 +-
 .../user-guide/administration/manage-etcd.md  | 228 ++++++++++++++++++
 2 files changed, 230 insertions(+), 2 deletions(-)
 create mode 100644 i18n/zh/docusaurus-plugin-content-docs/version-0.11/user-guide/administration/manage-etcd.md

diff --git a/docs/user-guide/administration/manage-etcd.md b/docs/user-guide/administration/manage-etcd.md
index 35fc8e061..2104026b8 100644
--- a/docs/user-guide/administration/manage-etcd.md
+++ b/docs/user-guide/administration/manage-etcd.md
@@ -1,6 +1,6 @@
 ---
-keywords: []
-description: a.
+keywords: [etcd]
+description: a etcd management documentation.
 ---
 
 # Manage ETCD
diff --git a/i18n/zh/docusaurus-plugin-content-docs/version-0.11/user-guide/administration/manage-etcd.md b/i18n/zh/docusaurus-plugin-content-docs/version-0.11/user-guide/administration/manage-etcd.md
new file mode 100644
index 000000000..a67517bdb
--- /dev/null
+++ b/i18n/zh/docusaurus-plugin-content-docs/version-0.11/user-guide/administration/manage-etcd.md
@@ -0,0 +1,228 @@
+---
+keywords: [etcd]
+description: etcd management documentation.
+---
+
+# Manage ETCD
+
+## Prerequisites
+
+- [Kubernetes](https://kubernetes.io/docs/setup/) >= v1.23
+- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) >= v1.18.0
+- [Helm](https://helm.sh/docs/intro/install/) >= v3.0.0
+
+## Install
+
+A GreptimeDB cluster requires an etcd cluster for metadata storage. Install the etcd cluster with Bitnami's etcd Helm [chart](https://github.com/bitnami/charts/tree/main/bitnami/etcd):
+
+```bash
+helm upgrade --install etcd \
+  oci://registry-1.docker.io/bitnamicharts/etcd \
+  --version 10.2.12 \
+  --set replicaCount=3 \
+  --set auth.rbac.create=false \
+  --set auth.rbac.token.enabled=false \
+  --create-namespace \
+  -n etcd-cluster
+```
+
+Wait for the etcd cluster to be running:
+
+```bash
+kubectl get po -n etcd-cluster
+```
+
+Expected Output
+```bash
+NAME     READY   STATUS    RESTARTS   AGE
+etcd-0   1/1     Running   0          64s
+etcd-1   1/1     Running   0          65s
+etcd-2   1/1     Running   0          72s
+```
+
+The etcd [initialClusterState](https://etcd.io/docs/v3.5/op-guide/configuration/) parameter specifies the initial state of the etcd cluster when an etcd node starts. It determines how the node joins the cluster. The parameter accepts one of two values:
+
+- **new**: The etcd cluster is new. All nodes start as part of a new cluster, and no previous state is used.
+- **existing**: The node joins an etcd cluster that already exists. In this case, the initialCluster parameter must list every node of the current cluster.
+
+Once the etcd cluster is running, set the initialClusterState parameter to **existing**:
+
+```bash
+helm upgrade --install etcd \
+  oci://registry-1.docker.io/bitnamicharts/etcd \
+  --version 10.2.12 \
+  --set initialClusterState="existing" \
+  --set removeMemberOnContainerTermination=false \
+  --set replicaCount=3 \
+  --set auth.rbac.create=false \
+  --set auth.rbac.token.enabled=false \
+  --create-namespace \
+  -n etcd-cluster
+```
+
+Wait until the etcd cluster is running again, then check its health with the following command:
+
+```bash
+kubectl -n etcd-cluster \
+  exec etcd-0 -- etcdctl \
+  --endpoints etcd-0.etcd-headless.etcd-cluster:2379,etcd-1.etcd-headless.etcd-cluster:2379,etcd-2.etcd-headless.etcd-cluster:2379 \
+  endpoint status -w table
+```
+
+Expected Output
+```bash
++----------------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+|                ENDPOINT                |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
++----------------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+| etcd-0.etcd-headless.etcd-cluster:2379 | 680910587385ae31 |  3.5.15 |   20 kB |     false |      false |         4 |      73991 |              73991 |        |
+| etcd-1.etcd-headless.etcd-cluster:2379 | d6980d56f5e3d817 |  3.5.15 |   20 kB |     false |      false |         4 |      73991 |              73991 |        |
+| etcd-2.etcd-headless.etcd-cluster:2379 | 12664fc67659db0a |  3.5.15 |   20 kB |      true |      false |         4 |      73991 |              73991 |        |
++----------------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+```
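
The health check above spells out one endpoint per replica by hand. As a minimal sketch, assuming the pod and headless-service naming used in this guide (`etcd-<n>.etcd-headless.<namespace>:2379`), a small shell helper can build that list for any replica count. The function name `etcd_endpoints` is hypothetical and is not part of etcd or the Bitnami chart.

```shell
# Hypothetical helper: print a comma-separated etcd endpoint list.
# Assumes pods are named etcd-0..etcd-(N-1) behind the etcd-headless service.
# The body runs in a subshell, so its variables do not leak into the caller.
etcd_endpoints() (
  replicas="$1"
  namespace="$2"
  i=0
  out=""
  while [ "$i" -lt "$replicas" ]; do
    # Prepend a comma only when the list already has an entry.
    out="${out:+$out,}etcd-$i.etcd-headless.$namespace:2379"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
)

# Possible usage against a live cluster (not run here):
#   kubectl -n etcd-cluster exec etcd-0 -- etcdctl \
#     --endpoints "$(etcd_endpoints 3 etcd-cluster)" \
#     endpoint status -w table
```

Keeping the replica count as a parameter means the same check still works if `replicaCount` is changed later.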
+
+## Backup
+
+Save the following configuration as `etcd-backup.yaml`. Be sure to change **existingClaim** to the name of your NFS PVC:
+
+```yaml
+replicaCount: 3
+
+auth:
+  rbac:
+    create: false
+    token:
+      enabled: false
+
+initialClusterState: "existing"
+removeMemberOnContainerTermination: false
+
+disasterRecovery:
+  enabled: true
+  cronjob:
+    schedule: "*/30 * * * *"
+    historyLimit: 2
+    snapshotHistoryLimit: 2
+  pvc:
+    existingClaim: "${YOUR_NFS_PVC_NAME_HERE}"
+```
+
+Redeploy the etcd cluster:
+
+```bash
+helm upgrade --install etcd \
+  oci://registry-1.docker.io/bitnamicharts/etcd \
+  --version 10.2.12 \
+  --create-namespace \
+  -n etcd-cluster --values etcd-backup.yaml
+```
+
+You can now see the scheduled etcd backup job:
+
+```bash
+kubectl get cronjob -n etcd-cluster
+```
+
+Expected Output
+```bash
+NAME               SCHEDULE       TIMEZONE   SUSPEND   ACTIVE   LAST SCHEDULE   AGE
+etcd-snapshotter   */30 * * * *              False     0                        36s
+```
+
+```bash
+kubectl get pod -n etcd-cluster
+```
+
+Expected Output
+```bash
+NAME                              READY   STATUS      RESTARTS   AGE
+etcd-0                            1/1     Running     0          35m
+etcd-1                            1/1     Running     0          36m
+etcd-2                            0/1     Running     0          6m28s
+etcd-snapshotter-28936038-tsck8   0/1     Completed   0          4m49s
+```
+
+```bash
+kubectl logs etcd-snapshotter-28936038-tsck8 -n etcd-cluster
+```
+
+Expected Output
+```log
+etcd-0.etcd-headless.etcd-cluster.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 2.698457ms
+etcd 11:18:07.47 INFO ==> Snapshotting the keyspace
+{"level":"info","ts":"2025-01-06T11:18:07.579095Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/snapshots/db-2025-01-06_11-18.part"}
+{"level":"info","ts":"2025-01-06T11:18:07.580335Z","logger":"client","caller":"v3@v3.5.15/maintenance.go:212","msg":"opened snapshot stream; downloading"}
+{"level":"info","ts":"2025-01-06T11:18:07.580359Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"etcd-0.etcd-headless.etcd-cluster.svc.cluster.local:2379"}
+{"level":"info","ts":"2025-01-06T11:18:07.582124Z","logger":"client","caller":"v3@v3.5.15/maintenance.go:220","msg":"completed snapshot read; closing"}
+{"level":"info","ts":"2025-01-06T11:18:07.582688Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"etcd-0.etcd-headless.etcd-cluster.svc.cluster.local:2379","size":"20 kB","took":"now"}
+{"level":"info","ts":"2025-01-06T11:18:07.583008Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/snapshots/db-2025-01-06_11-18"}
+Snapshot saved at /snapshots/db-2025-01-06_11-18
+```
+
+Next, you can see the etcd backup snapshots on the NFS server:
+
+```bash
+ls ${NFS_SERVER_DIRECTORY}
+```
+
+Expected Output
+```bash
+db-2025-01-06_11-18  db-2025-01-06_11-20  db-2025-01-06_11-22
+```
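
A later restore needs the name of exactly one of these files. The following is a minimal sketch for picking the most recent one, under two assumptions: snapshots keep the zero-padded `db-YYYY-MM-DD_HH-MM` naming shown in the listing (so a plain sort is chronological), and in-progress files carry the `.part` suffix seen in the snapshotter log. `latest_snapshot` is a hypothetical helper name.

```shell
# Hypothetical helper: print the newest completed snapshot file name in a directory.
# Assumes db-YYYY-MM-DD_HH-MM names, which sort chronologically as plain text.
latest_snapshot() {
  dir="$1"
  for f in "$dir"/db-*; do
    # Skip the unexpanded glob when the directory has no snapshots.
    [ -e "$f" ] || continue
    # Skip temporary files that are still being written.
    case "$f" in
      *.part) continue ;;
    esac
    basename "$f"
  done | sort | tail -n 1
}

# Possible usage on the NFS server (not run here):
#   latest_snapshot "${NFS_SERVER_DIRECTORY}"
```

The function prints nothing when no completed snapshot exists, so a caller can test for an empty result before starting a restore.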
+
+## Restore
+
+Save the following configuration as `etcd-restore.yaml`. Note that **existingClaim** is the name of your NFS PVC and **snapshotFilename** is the name of the etcd snapshot file:
+
+```yaml
+replicaCount: 3
+
+auth:
+  rbac:
+    create: false
+    token:
+      enabled: false
+
+startFromSnapshot:
+  enabled: true
+  existingClaim: "${YOUR_NFS_PVC_NAME_HERE}"
+  snapshotFilename: "${YOUR_ETCD_SNAPSHOT_FILE_NAME}"
+```
+
+Deploy the etcd recovery cluster:
+
+```bash
+helm upgrade --install etcd-recover \
+  oci://registry-1.docker.io/bitnamicharts/etcd \
+  --version 10.2.12 \
+  --create-namespace \
+  -n etcd-cluster --values etcd-restore.yaml
+```
+
+Once the etcd recovery cluster is running, redeploy it:
+
+```bash
+helm upgrade --install etcd-recover \
+  oci://registry-1.docker.io/bitnamicharts/etcd \
+  --version 10.2.12 \
+  --set initialClusterState="existing" \
+  --set removeMemberOnContainerTermination=false \
+  --set replicaCount=3 \
+  --set auth.rbac.create=false \
+  --set auth.rbac.token.enabled=false \
+  --create-namespace \
+  -n etcd-cluster
+```
+
+The etcd recovery is now complete.
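
If the snapshot name changes between restores, the values file can be generated instead of edited by hand. This is a minimal sketch that assumes the same keys as `etcd-restore.yaml`, with `token` nested under `rbac` to match the `--set auth.rbac.token.enabled=false` flag used in this guide. `render_restore_values` is a hypothetical helper name, not a Helm or chart feature.

```shell
# Hypothetical helper: print restore values for the given PVC and snapshot file.
# The key layout mirrors etcd-restore.yaml from this guide (an assumption of this sketch).
render_restore_values() {
  pvc_name="$1"
  snapshot_file="$2"
  cat <<EOF
replicaCount: 3

auth:
  rbac:
    create: false
    token:
      enabled: false

startFromSnapshot:
  enabled: true
  existingClaim: "${pvc_name}"
  snapshotFilename: "${snapshot_file}"
EOF
}

# Possible usage (not run here):
#   render_restore_values my-nfs-pvc db-2025-01-06_11-22 > etcd-restore.yaml
```

Writing the file through a function keeps the PVC name and the snapshot name as the only two inputs that vary from one restore to the next.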