chore: polish sentences
nicecui committed May 16, 2024
1 parent a25707f commit 009a027
Showing 5 changed files with 32 additions and 26 deletions.
12 changes: 8 additions & 4 deletions docs/nightly/en/contributor-guide/flownode/arrangement.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,15 @@
# Arrangement

Arrangement stores the state in the dataflow's process. Streams of update flows are sent to an arrangement, and the arrangement stores them for further querying and updating.
Arrangement stores the state in the dataflow's process. It stores the streams of update flows for further querying and updating.

The arrangement essentially stores key-value pairs with timestamps to mark their change time.

Internally, the arrangement receives tuples like
`((Key Row, Value Row), timestamp, diff)` and stores them in memory. One can query key-value pairs at a certain time using the `get(now: Timestamp, key: Row)` method, and retrieve the value for the given key at the specified time `now`.
The arrangement also assumes that everything older than a certain time (also known as the low watermark) has already been ingested and does not keep a history for them.
`((Key Row, Value Row), timestamp, diff)` and stores them in memory. One can query key-value pairs at a certain time using the `get(now: Timestamp, key: Row)` method.
The arrangement also assumes that everything older than a certain time (also known as the low watermark) has already been ingested to the sink tables and does not keep a history for them.

NOTE: The arrangement allows for the removal of keys by setting the `diff` to -1 in incoming tuples. Moreover, if a row has been previously added to the arrangement and the same key is inserted with a different value, the original value is overwritten with the new value.
:::tip NOTE

The arrangement allows for the removal of keys by setting the `diff` to -1 in incoming tuples. Moreover, if a row has been previously added to the arrangement and the same key is inserted with a different value, the original value is overwritten with the new value.

:::
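
As an illustration of the behavior described in this file (not part of the commit), an arrangement could be sketched in Python roughly as follows. Only the `get(now, key)` signature and the `((Key Row, Value Row), timestamp, diff)` tuple shape come from the text; the `Arrangement` class layout, the `apply` and `compact` methods, integer timestamps, and tuples standing in for `Row` are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[object, ...]  # assumed stand-in for the real Row type
Timestamp = int           # assumed: timestamps are plain integers

# ((key, value), timestamp, diff), the tuple shape named in the text
Update = Tuple[Tuple[Row, Row], Timestamp, int]


class Arrangement:
    """Toy in-memory arrangement; every name except `get` is hypothetical."""

    def __init__(self) -> None:
        # key -> versions sorted by timestamp; a value of None marks a deletion
        self._versions: Dict[Row, List[Tuple[Timestamp, Optional[Row]]]] = {}

    def apply(self, update: Update) -> None:
        """Store one update: diff > 0 sets the value, diff < 0 removes the key."""
        (key, value), ts, diff = update
        if diff == 0:
            return  # a zero diff changes nothing
        versions = self._versions.setdefault(key, [])
        versions.append((ts, value if diff > 0 else None))
        # Stable sort: updates with equal timestamps keep their arrival order.
        versions.sort(key=lambda v: v[0])

    def get(self, now: Timestamp, key: Row) -> Optional[Row]:
        """Return the value of `key` as of time `now`, or None if absent."""
        current = None
        for ts, value in self._versions.get(key, []):
            if ts > now:
                break
            current = value  # a later value overwrites an earlier one
        return current

    def compact(self, low_watermark: Timestamp) -> None:
        """Drop history at or below the low watermark, keeping the latest value."""
        for key in list(self._versions):
            versions = self._versions[key]
            old = [v for v in versions if v[0] <= low_watermark]
            kept = [v for v in versions if v[0] > low_watermark]
            if old and old[-1][1] is not None:
                kept.insert(0, old[-1])  # the surviving value at the watermark
            if kept:
                self._versions[key] = kept
            else:
                del self._versions[key]


arr = Arrangement()
arr.apply(((("host1",), (10,)), 1, 1))   # insert at time 1
arr.apply(((("host1",), (20,)), 5, 1))   # same key, new value at time 5
arr.apply(((("host1",), (20,)), 8, -1))  # delete at time 8
print(arr.get(4, ("host1",)), arr.get(5, ("host1",)), arr.get(9, ("host1",)))
```

Keeping a sorted version list per key makes point-in-time lookups straightforward, and `compact` shows one way the low watermark could bound memory use. A real implementation may organize its storage quite differently.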
4 changes: 2 additions & 2 deletions docs/nightly/en/contributor-guide/flownode/overview.md
@@ -3,8 +3,8 @@
## Introduction


`Flownode` provides a simple streaming process(known as `flow`) ability to the database.
`Flownode` manages `flows` which are tasks that actively receive data from the `source` and send data to the `sink`.
`Flownode` provides a simple streaming process (known as `flow`) ability to the database.
`Flownode` manages `flows` which are tasks that receive data from the `source` and send data to the `sink`.

In current version, `Flownode` only supports standalone mode. In the future, we will support distributed mode.

16 changes: 9 additions & 7 deletions docs/nightly/zh/contributor-guide/flownode/arrangement.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
# Arrangement

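Arrangement stores the state in the dataflow's process. Update streams are sent to the Arrangement, which stores them for further querying and updating
Arrangement stores the state in the dataflow's process; it stores the flow's update streams for further querying and updating

Arrangement mainly stores key-value pairs with timestamps that mark their change time.
Arrangement essentially stores key-value pairs with timestamps.
Internally, Arrangement receives tuples like `((Key Row, Value Row), timestamp, diff)` and stores them in memory.
You can use `get(now: Timestamp, key: Row)` to query key-value pairs at a certain time.
Arrangement assumes that everything older than a certain time (also known as the Low Watermark) has already been written to the sink tables and keeps no history for it.

Internally, the tuples that Arrangement receives include
tuples like `((Key Row, Value Row), timestamp, diff)` and stores them in memory. One can query key-value pairs at a certain time using the `get(now: Timestamp, key: Row)` method, and retrieve the value of the given key at the specified time `now`.
The arrangement also assumes that everything older than a certain time (also known as the Low Watermark) has already been ingested, so no history is kept for it.

Note: Arrangement allows keys to be removed by setting `diff` to -1 in incoming tuples. In addition, if a row was previously added to the Arrangement and the same key is inserted with a different value, the original value is overwritten with the new value.
:::tip NOTE
Arrangement allows keys to be removed by setting the `diff` of an incoming tuple to -1.
In addition, if row data has already been added to the Arrangement and the same key is inserted with a different value, the original value is overwritten by the new value.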
:::
9 changes: 5 additions & 4 deletions docs/nightly/zh/contributor-guide/flownode/dataflow.md
@@ -5,8 +5,9 @@ Dataflow 模块(参见 `flow::compute` 模块)是 `flow` 的核心计算模
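This execution plan is then converted into an actual dataflow, which is essentially a directed acyclic graph (DAG) of functions with input and output ports.
The dataflow is triggered to run when needed.

Currently, this dataflow only supports `map` and `reduce` operations; support for operations such as `join` will be added in the future.
Currently this dataflow only supports `map` and `reduce` operations. Support for operations such as `join` will be added in the future.

Internally, the dataflow processes data in row format using the tuple `(row, time, diff)`. Here, `row` represents the actual data being passed and may contain multiple `Value` objects.
`time` is the system time that tracks the dataflow's progress, and `diff` usually indicates the insertion or deletion of a row (+1 or -1).
Therefore, the tuple represents an insert/delete operation on `row` at the given system time (`time`).
Internally, the dataflow uses `tuple(row, time, diff)` to process data in row format.
Here `row` represents the actual data being passed and may contain multiple `value` objects.
`time` is the system time, used to track the dataflow's progress, and `diff` usually indicates the insertion or deletion of a row (+1 or -1).
Therefore, the `tuple` represents an insert/delete operation on `row` at a given system time.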
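
To make the `(row, time, diff)` convention concrete (this is an illustration, not part of the commit), here is a minimal Python sketch of a `map` step and a simple counting `reduce` over such tuples. The function names `map_updates` and `count_rows`, integer timestamps, and tuples standing in for rows are assumptions made for the example; they are not the `flow::compute` API.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[int, ...]          # assumed stand-in for a row of values
Update = Tuple[Row, int, int]  # (row, time, diff), as described in the text


def map_updates(updates: Iterable[Update], f: Callable[[Row], Row]) -> List[Update]:
    """Apply `f` to every row; `time` and `diff` pass through unchanged."""
    return [(f(row), time, diff) for row, time, diff in updates]


def count_rows(updates: Iterable[Update], now: int) -> Dict[Row, int]:
    """Reduce to the number of copies of each row present as of time `now`."""
    counts: Dict[Row, int] = defaultdict(int)
    for row, time, diff in updates:
        if time <= now:
            counts[row] += diff  # +1 inserts a copy, -1 deletes one
    return {row: n for row, n in counts.items() if n != 0}


updates = [((1,), 1, 1), ((2,), 2, 1), ((1,), 3, -1)]
doubled = map_updates(updates, lambda row: (row[0] * 2,))
print(count_rows(doubled, now=2))
print(count_rows(doubled, now=3))
```

Because a deletion is just a tuple with `diff` of -1, the same `map` and `reduce` code handles inserts and deletes without special cases.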
17 changes: 8 additions & 9 deletions docs/nightly/zh/contributor-guide/flownode/overview.md
@@ -1,17 +1,16 @@
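# Overview (概览)
# Overview (概述)

## Introduction

`Flownode` provides the database with a simple stream-processing capability (known as `flow`).
`Flownode` manages `flow`s, which are tasks that receive data from the `source` and send data to the `sink`.

`Flownode` provides the database with a simple streaming capability (known as `flow`).
`Flownode` manages `flow`s; a `flow` is a task that actively receives data from the tables serving as data sources and sends the computed results to the result table.

In the current version, `Flownode` only supports Standalone mode. In the future, we will support distributed mode.
In the current version, `Flownode` is supported only in standalone mode; distributed mode will be supported in the future.

## Components

A `Flownode` contains all the components needed for a flow's stream processing. Here we list the important parts
`Flownode` contains all the components for flow stream processing; the key parts are listed below

- `FlownodeManager`, which receives insert messages forwarded from the "frontend" and sends results back to the flow's sink table
- A certain number of `FlowWorker` instances, each running in a separate thread. Currently there is only one `FlowWorker` in Standalone mode, but this may change in the future
- A `flow` is a task that actively receives data from the tables serving as data sources and sends data to the result table. It is managed by `FlownodeManager` and run by a `FlowWorker`.
- `FlownodeManager`, which receives inserted data forwarded from the `Frontend` and sends results back to the flow's sink table
- A certain number of `FlowWorker` instances, each running in a separate thread. Currently there is only one flow worker in standalone mode, but this may change in the future
- A `Flow` is a task that actively receives data from the `source` and sends data to the `sink`. It is managed by `FlownodeManager` and run by a `FlowWorker`.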
