Minor update
g-despot committed Feb 6, 2025
1 parent 5882607 commit f04341f
Showing 1 changed file with 12 additions and 12 deletions.
24 changes: 12 additions & 12 deletions developers/weaviate/configuration/replication.md
@@ -51,31 +51,31 @@ import ReplicationConfigWithAsyncRepair from '/_includes/code/configuration/repl

<ReplicationConfigWithAsyncRepair />

### Configuring Async Replication
### Configuring async replication

:::info Added in `v1.29`
Async replication support was added in `v1.26`, while the environment variables for configuring async replication (`ASYNC_*`) were introduced in `v1.29`.
Async replication support was added in `v1.26`, while the [environment variables](/developers/weaviate/config-refs/env-vars#multi-node-instances) for configuring async replication (`ASYNC_*`) were introduced in `v1.29`.
:::

Async Replication is a background synchronization process in Weaviate that ensures eventual consistency across nodes storing the same shard. When a collection is partitioned into multiple shards, each shard is replicated across several nodes (as defined by the replication factor `REPLICATION_MINIMUM_FACTOR`). Async replication guarantees that all nodes holding the same shard remain in sync by periodically comparing and propagating data.
Async replication is a background synchronization process in Weaviate that ensures eventual consistency across nodes storing the same shard. When a collection is partitioned into multiple shards, each shard is replicated across several nodes (as defined by the replication factor `REPLICATION_MINIMUM_FACTOR`). Async replication guarantees that all nodes holding the same shard remain in sync by periodically comparing and propagating data.

#### 1. Periodic data comparison

Each node runs a background process that periodically compares its locally stored data with other nodes holding the same shard. This comparison is triggered either:
- At regular intervals (`ASYNC_REPLICATION_FREQUENCY`).
- When a change in availability of a node is detected (`ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY`).

To efficiently detect differences, Weaviate uses a **hashtree**. Instead of checking entire datasets, it compares hash digests at multiple levels, narrowing down differences to specific objects. The size of this hashtree can be defined via `ASYNC_REPLICATION_HASHTREE_HEIGHT`.
- at **regular intervals** (`ASYNC_REPLICATION_FREQUENCY`) or
- when a **change in the availability of a node** is detected (`ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY`).

If a node is unresponsive, Weaviate applies a timeout (`ASYNC_REPLICATION_DIFF_PER_NODE_TIMEOUT`) to avoid delays in the replication process.

Weaviate uses a **hashtree** data structure to efficiently detect differences. Instead of checking entire datasets, it compares hash digests at multiple levels, narrowing down differences to specific objects. The size of this hashtree can be defined via `ASYNC_REPLICATION_HASHTREE_HEIGHT`.
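
The multi-level comparison described above can be illustrated with a minimal Python sketch. This is not Weaviate's implementation: the function names (`bucket_index`, `build_hashtree`, `diff_leaves`), the use of SHA-256, and the way objects are assigned to leaves are all assumptions made for illustration. The `height` argument only loosely mirrors the role of `ASYNC_REPLICATION_HASHTREE_HEIGHT`.

```python
import hashlib


def bucket_index(obj_id: str, n_leaves: int) -> int:
    """Map an object ID to a leaf bucket (hypothetical assignment scheme)."""
    digest = hashlib.sha256(obj_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_leaves


def build_hashtree(objects: dict[str, str], height: int) -> list[list[bytes]]:
    """Return tree levels: levels[0] is the root, levels[height] holds 2**height leaves."""
    n_leaves = 2 ** height
    buckets = [hashlib.sha256() for _ in range(n_leaves)]
    # Feed objects in sorted order so both nodes hash each bucket identically.
    for obj_id in sorted(objects):
        idx = bucket_index(obj_id, n_leaves)
        buckets[idx].update(f"{obj_id}={objects[obj_id]};".encode())
    levels = [[b.digest() for b in buckets]]
    # Each parent digest is the hash of its two children.
    while len(levels[0]) > 1:
        below = levels[0]
        above = [
            hashlib.sha256(below[i] + below[i + 1]).digest()
            for i in range(0, len(below), 2)
        ]
        levels.insert(0, above)
    return levels


def diff_leaves(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Walk both trees top-down, descending only into subtrees whose digests differ."""
    differing = [0] if a[0][0] != b[0][0] else []
    for level in range(1, len(a)):
        next_differing = []
        for parent in differing:
            for child in (2 * parent, 2 * parent + 1):
                if a[level][child] != b[level][child]:
                    next_differing.append(child)
        differing = next_differing
    return differing


# Usage: two replicas that disagree on a single object.
local = {"a": "1", "b": "2", "c": "3"}
remote = dict(local, b="2-new")
print(diff_leaves(build_hashtree(local, 3), build_hashtree(remote, 3)))
```

Only the leaf buckets returned by `diff_leaves` need an object-by-object comparison. A taller tree narrows each bucket, at the cost of more digests to store and exchange.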

#### 2. Data synchronization

When differences are detected, the outdated or missing data is propagated to the affected nodes. This propagation process:
- Sends data in batches of a custom size (`ASYNC_REPLICATION_BATCH_SIZE`).
- Limits each propagation step object limit (`ASYNC_REPLICATION_PROPAGATION_LIMIT`).
- Enforces a time-bound for updates (`ASYNC_REPLICATION_PROPAGATION_TIMEOUT`).
- Uses a different comparison frequency right after completing synchronization on a node (`ASYNC_REPLICATION_FREQUENCY_WHILE_PROPAGATING`).
When differences are detected, the outdated or missing data is propagated to the affected nodes. This process:
- Sends data in batches of a defined size (`ASYNC_REPLICATION_BATCH_SIZE`).
- Enforces an object limit for each propagation iteration (`ASYNC_REPLICATION_PROPAGATION_LIMIT`).
- Enforces a time limit for the propagation (`ASYNC_REPLICATION_PROPAGATION_TIMEOUT`).
- Sets a different data comparison frequency right after completing synchronization on a node (`ASYNC_REPLICATION_FREQUENCY_WHILE_PROPAGATING`).
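
As a rough illustration of how a batch size, an object limit, and a time limit could interact during propagation, here is a small Python sketch. It is not Weaviate's code: `propagate`, `send_batch`, and the parameter names are hypothetical, chosen only to echo `ASYNC_REPLICATION_BATCH_SIZE`, `ASYNC_REPLICATION_PROPAGATION_LIMIT`, and `ASYNC_REPLICATION_PROPAGATION_TIMEOUT`.

```python
import time
from typing import Callable


def propagate(
    stale_ids: list[int],
    send_batch: Callable[[list[int]], None],
    batch_size: int,
    propagation_limit: int,
    timeout_s: float,
) -> int:
    """Send up to propagation_limit objects in batches; stop once the time limit elapses."""
    deadline = time.monotonic() + timeout_s
    # Cap the work done in this iteration; the rest waits for the next comparison.
    to_send = stale_ids[:propagation_limit]
    sent = 0
    for start in range(0, len(to_send), batch_size):
        if time.monotonic() >= deadline:
            break
        batch = to_send[start:start + batch_size]
        send_batch(batch)
        sent += len(batch)
    return sent


# Usage: record the batches instead of sending them over the network.
batches: list[list[int]] = []
count = propagate(list(range(10)), batches.append, batch_size=3,
                  propagation_limit=7, timeout_s=5.0)
print(count, batches)
```

In this sketch, objects left over after an iteration are simply picked up the next time a comparison finds them still out of sync.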

:::tip Replication settings
You can find a complete list of the environment variables related to async replication on the page [Reference: Environment variables](/developers/weaviate/config-refs/env-vars#multi-node-instances).