chore: apply suggestions from CR
WenyXu committed Jul 15, 2024
1 parent f3f7d62 commit 23ed88b
Showing 3 changed files with 35 additions and 10 deletions.
6 changes: 6 additions & 0 deletions docs/nightly/en/user-guide/operations/configuration.md
@@ -565,6 +565,11 @@ store_addr = "127.0.0.1:2379"
selector = "LeaseBased"
# Store data in memory, false by default.
use_memory_store = false
## Whether to enable region failover.
## This feature is only available on GreptimeDB running in cluster mode and
## - Using Remote WAL
## - Using shared storage (e.g., S3).
enable_region_failover = false

[wal]
# Available wal providers:
@@ -617,6 +622,7 @@ backoff_deadline = "5mins"
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running in cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., S3). |
| `wal` | -- | -- | -- |
| `wal.provider` | String | `raft_engine` | -- |
| `wal.broker_endpoints` | Array | -- | The broker endpoints of the Kafka cluster. |
39 changes: 29 additions & 10 deletions docs/nightly/en/user-guide/operations/region-failover.md
@@ -1,36 +1,55 @@
# Region Failover

Region Failover provides the ability to recover regions from crashed Datanodes without losing data. This is implemented via [Region Migration](/user-guide/operations/region-migration).

## Enable Region Failover

:::warning Warning
This feature is only available on GreptimeDB running in cluster mode and

- Using Kafka WAL
- Using [shared storage](/user-guide/operations/configuration.md#storage-options) (e.g., AWS S3)

:::

### Via configuration file

Set `enable_region_failover=true` in the [metasrv](/user-guide/operations/configuration.md#metasrv-only-configuration) configuration file.
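For reference, a minimal configuration fragment might look like this (the surrounding key is taken from the `configuration.md` snippet shown earlier in this change; placement within the file is otherwise an assumption):

```toml
# Store data in memory, false by default.
use_memory_store = false
## Whether to enable region failover.
enable_region_failover = true
```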

### Via GreptimeDB Operator

Set `meta.enableRegionFailover=true`, e.g.:
```bash
helm install greptimedb greptime/greptimedb-cluster \
--set meta.enableRegionFailover=true \
...
```

## The recovery time of Region Failover

The recovery time of Region Failover depends on:

- The number of regions per topic.
- The read throughput of the Kafka cluster.

:::tip Note

In best practices, the number of topics/partitions supported by a Kafka cluster is limited (exceeding this number can degrade Kafka cluster performance). Therefore, we allow multiple regions to share a single topic as the WAL.
:::

### The read amplification

The data belonging to a specific region consists of data files plus data in the WAL (typically `WAL[LastCheckpoint...Latest]`). If multiple regions share a single topic, replaying data for a specific region from the topic requires filtering out unrelated data (i.e., data from other regions). **This means reading more data than the actual data size of the region, a phenomenon known as read amplification**.

Although multiple regions share the same topic, allowing the Datanode to support more regions, the cost of this approach is read amplification during WAL replay.

For example, if 128 topics are configured for [metasrv](/user-guide/operations/configuration.md#metasrv-only-configuration) and the whole cluster holds 1024 physical regions, every 8 regions will share one topic.
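The arithmetic above can be sketched as a quick calculation (a hedged illustration; the function and variable names are not from the docs):

```python
def regions_per_topic(total_regions: int, total_topics: int) -> int:
    """How many regions share one WAL topic, assuming an even distribution."""
    return total_regions // total_topics

# 1024 physical regions over 128 topics -> 8 regions per topic,
# so replaying one region also reads 7 other regions' data (see Figure 1).
shared = regions_per_topic(1024, 128)
redundant = shared - 1
```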

![Read Amplification](/remote-wal-read-amplification.png)

<p style="text-align: center;"><b>(Figure 1: Recovering Region 3 requires reading 7 times the redundant data)</b></p>

:::warning Note
In actual scenarios, the read amplification may be larger than this model predicts.
:::

A simple model to estimate the read amplification factor:

@@ -47,7 +66,7 @@ A simple model to estimate the read amplification factor:

**If the Kafka cluster can provide 300MB/s read throughput, recovering 100 regions requires approximately 10 minutes (182.5GB/0.3GB = 10m).**
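The estimate above can be reproduced with a small calculation (a sketch; only the 182.5 GB replay size and the throughput figures come from the text):

```python
def recovery_time_secs(replay_size_gb: float, throughput_gb_per_s: float) -> float:
    """Estimated WAL replay time: total data to read divided by read throughput."""
    return replay_size_gb / throughput_gb_per_s

# 182.5 GB at 0.3 GB/s (300 MB/s) is roughly 608 seconds, i.e. about 10 minutes.
t = recovery_time_secs(182.5, 0.3)
```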

### Examples
### More examples

| Number of regions per topic | Replay size (GB) | Kafka throughput 300MB/s - Recovery time (secs) | Kafka throughput 1000MB/s - Recovery time (secs) |
| --------------------------- | ---------------- | --------------------------------------------- | ---------------------------------------------- |
Binary file modified docs/public/remote-wal-read-amplification.png
