From 4ff7b7a2085d8a74875d56db8401987b03d58629 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Wed, 12 Jun 2024 11:06:17 -0400 Subject: [PATCH 1/7] waku sync first draft --- standards/core/sync.md | 93 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 standards/core/sync.md diff --git a/standards/core/sync.md b/standards/core/sync.md new file mode 100644 index 0000000..e02f606 --- /dev/null +++ b/standards/core/sync.md @@ -0,0 +1,93 @@ +--- +title: WAKU-SYNC +name: Waku Sync +editor: Simon-Pierre Vivier +contributors: + - Prem Chaitanya Prathi + - Hanno Cornelius +--- + +# Abstract +This specification explains the `WAKU-SYNC` protocol +which enables the reconciliation of two sets of message hashes +in the context of keeping syncronized two Waku Store nodes. +Waku Sync is a wrapper around +[Negentropy](https://github.com/hoytech/negentropy) a [range-based set reconciliation protocol](https://logperiodic.com/rbsr.html). + +# Specification + +**Protocol identifier**: `/vac/waku/sync/1.0.0` + +## Terminology +The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, +“RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119](https://www.ietf.org/rfc/rfc2119.txt). + +The term Negentropy refers to the protocol of the same name. +Negentropy payload refers to +the messages created by the Negentropy protocol. +Client always refers to the initiator +and the server the receiver of the first payload. + +## Design Requirements +Nodes enabling Waku Sync SHOULD have the relay and store protocols enabled and +keep messages for at least the last hour. TODO do we need to say this, sounds like an impl. detail + +After each sync session, nodes SHOULD use Store queries to acquire missing messages. + +## Payload + +```protobuf +syntax = "proto3"; + +package waku.sync.v1; + +message SyncPayload { + optional bytes negentropy = 1; + + repeated bytes hashes = 20; +} +``` + +## Session Flow +A client initiate a session with a server +by sending a `SyncPayload` with only the `negentropy` field set. +This field MUST contain the first negentropy payload created by the client for this session. + +The server receives a `SyncPayload`. +A new negentropy payload is computed from the received one. +The server sends back a `SyncPayload` to the client. + +The client receives a `SyncPayload`. +A new negentropy payload OR an empty one is computed. +If a new payload is computed then +the exchanges between client and server continues until +the client computes an empty payload. +This client computation also outputs any hash differences found, +those MUST be stored. +In the case of an empty payload, +the client MUST send back a `SyncPayload` +with all the hashes previoudly found in the `hashes` field and +an empty `engentropy` field. + +### Storage Pruning +TODO? do we need to talk about that, seams like an implementation detail + +### Message transfer +TODO? do we need to talk about that, seams like an implementation detail + +# Attack Vectors +Nodes using `WAKU-SYNC` are fully trusted. +Message hashes are assumed to be of valid messages received via Waku Relay. + +Further refinements to the protocol are planned +to reduce the trust level required to operate. +Notably by verifing message RLN proofs at reception. + +# Copyright + +Copyright and related rights waived via +[CC0](https://creativecommons.org/publicdomain/zero/1.0/). + +# References + - https://logperiodic.com/rbsr.html + - https://github.com/hoytech/negentropy \ No newline at end of file From 7d7cbc033885d3ccc1e434f80475c03a475e0091 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Tue, 2 Jul 2024 16:04:38 -0400 Subject: [PATCH 2/7] endless rephrasing --- standards/core/sync.md | 40 ++++++++++++++++++++++++---------------- 1 file changed, 24 insertions(+), 16 deletions(-) diff --git a/standards/core/sync.md b/standards/core/sync.md index e02f606..851fd7b 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -10,7 +10,7 @@ contributors: # Abstract This specification explains the `WAKU-SYNC` protocol which enables the reconciliation of two sets of message hashes -in the context of keeping syncronized two Waku Store nodes. +in the context of keeping multiple nodes syncronized. Waku Sync is a wrapper around [Negentropy](https://github.com/hoytech/negentropy) a [range-based set reconciliation protocol](https://logperiodic.com/rbsr.html). @@ -29,10 +29,20 @@ Client always refers to the initiator and the server the receiver of the first payload. ## Design Requirements -Nodes enabling Waku Sync SHOULD have the relay and store protocols enabled and -keep messages for at least the last hour. TODO do we need to say this, sounds like an impl. detail - -After each sync session, nodes SHOULD use Store queries to acquire missing messages. +Nodes enabling Waku Sync SHOULD +manage and keep message hashes +for the range of time +during which syncronization is required. +Nodes SHOULD use the same time range, +for Waku we chose one hour as the global default. +Waku Relay or Filter protocol MAY be enabled +and used in conjuction with Sync +as a source of new message hashes +for the time range. + +Nodes MAY use the Store protocol +to request missing messages +or to provide messages to requesting clients. ## Payload @@ -50,8 +60,12 @@ message SyncPayload { ## Session Flow A client initiate a session with a server -by sending a `SyncPayload` with only the `negentropy` field set. -This field MUST contain the first negentropy payload created by the client for this session. +by sending a `SyncPayload` with +only the `negentropy` field set to it. +This field MUST contain +the first negentropy payload +created by the client +for this session. The server receives a `SyncPayload`. A new negentropy payload is computed from the received one. @@ -67,21 +81,15 @@ those MUST be stored. In the case of an empty payload, the client MUST send back a `SyncPayload` with all the hashes previoudly found in the `hashes` field and -an empty `engentropy` field. - -### Storage Pruning -TODO? do we need to talk about that, seams like an implementation detail - -### Message transfer -TODO? do we need to talk about that, seams like an implementation detail +an empty `nengentropy` field. # Attack Vectors Nodes using `WAKU-SYNC` are fully trusted. -Message hashes are assumed to be of valid messages received via Waku Relay. +Message hashes are assumed to be of valid messages received via Waku Relay or Filter. Further refinements to the protocol are planned to reduce the trust level required to operate. -Notably by verifing message RLN proofs at reception. +Notably by verifing messages RLN proof at reception. # Copyright From bf5564851fbe6c7cc8314d622c48ea963c9b91f6 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Tue, 3 Sep 2024 11:52:48 -0400 Subject: [PATCH 3/7] new impl. section --- standards/core/sync.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/standards/core/sync.md b/standards/core/sync.md index 851fd7b..0bde983 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -91,6 +91,30 @@ Further refinements to the protocol are planned to reduce the trust level required to operate. Notably by verifing messages RLN proof at reception. +# Implementation +The following is not part of the specifications but good to know implementation details. + +### Interval +Ad-hoc syncing can be useful in some cases but continueous periodic sync +minimise the differences in messages stored across the network. +Syncing early and often is the best strategy. +The default used in nwaku is 5 minutes interval between sync with a range of 1 hour. + +### Range +We also offset the sync range by 20 seconds in the past. +The actual start of the sync range is T-01:00:20 and the end T-00:00:20 +This is to handle the inherent jitter of GossipSub. +In other words, it is the amount of time needed to confirm if a message is missing or not. + +### Storage +The storage implementation should reflect the Waku context. +Most messages that will be added will be recent and +all removed messages will be older ones. +When differences are found some messages will have to be inserted randomly. +It is expected to be a less likely case than time based insertion and removal. +Last but not least it must be optimized for sequential read +as it is the most often used operation. + # Copyright Copyright and related rights waived via From 3721b54cc1c0acd343bca4e13bfc7d4c8d5405b0 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Thu, 5 Sep 2024 08:33:30 -0400 Subject: [PATCH 4/7] refinements --- standards/core/sync.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/standards/core/sync.md b/standards/core/sync.md index 0bde983..7a86bc3 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -10,7 +10,7 @@ contributors: # Abstract This specification explains the `WAKU-SYNC` protocol which enables the reconciliation of two sets of message hashes -in the context of keeping multiple nodes syncronized. +in the context of keeping multiple Store nodes syncronized. Waku Sync is a wrapper around [Negentropy](https://github.com/hoytech/negentropy) a [range-based set reconciliation protocol](https://logperiodic.com/rbsr.html). @@ -30,18 +30,18 @@ and the server the receiver of the first payload. ## Design Requirements Nodes enabling Waku Sync SHOULD -manage and keep message hashes +manage and keep message hashes in a local cache for the range of time during which syncronization is required. Nodes SHOULD use the same time range, for Waku we chose one hour as the global default. -Waku Relay or Filter protocol MAY be enabled +Waku Relay or Light Push protocol MAY be enabled and used in conjuction with Sync as a source of new message hashes -for the time range. +for the cache. Nodes MAY use the Store protocol -to request missing messages +to request missing messages once reconciliation is complete or to provide messages to requesting clients. ## Payload @@ -85,7 +85,7 @@ an empty `nengentropy` field. # Attack Vectors Nodes using `WAKU-SYNC` are fully trusted. -Message hashes are assumed to be of valid messages received via Waku Relay or Filter. +Message hashes are assumed to be of valid messages received via Waku Relay or Light push. Further refinements to the protocol are planned to reduce the trust level required to operate. From 048ee89d29280df884bb02d1e5cbfcd2ac257568 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Fri, 6 Sep 2024 11:18:14 -0400 Subject: [PATCH 5/7] typos --- standards/core/sync.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/standards/core/sync.md b/standards/core/sync.md index 7a86bc3..262883e 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -7,18 +7,18 @@ contributors: - Hanno Cornelius --- -# Abstract +## Abstract This specification explains the `WAKU-SYNC` protocol which enables the reconciliation of two sets of message hashes -in the context of keeping multiple Store nodes syncronized. +in the context of keeping multiple Store nodes synchronized. Waku Sync is a wrapper around [Negentropy](https://github.com/hoytech/negentropy) a [range-based set reconciliation protocol](https://logperiodic.com/rbsr.html). -# Specification +## Specification **Protocol identifier**: `/vac/waku/sync/1.0.0` -## Terminology +### Terminology The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119](https://www.ietf.org/rfc/rfc2119.txt). @@ -28,15 +28,15 @@ the messages created by the Negentropy protocol. Client always refers to the initiator and the server the receiver of the first payload. -## Design Requirements +### Design Requirements Nodes enabling Waku Sync SHOULD manage and keep message hashes in a local cache for the range of time -during which syncronization is required. +during which synchronization is required. Nodes SHOULD use the same time range, for Waku we chose one hour as the global default. Waku Relay or Light Push protocol MAY be enabled -and used in conjuction with Sync +and used in conjunction with Sync as a source of new message hashes for the cache. @@ -44,7 +44,7 @@ Nodes MAY use the Store protocol to request missing messages once reconciliation is complete or to provide messages to requesting clients. -## Payload +### Payload ```protobuf syntax = "proto3"; @@ -58,7 +58,7 @@ message SyncPayload { } ``` -## Session Flow +### Session Flow A client initiate a session with a server by sending a `SyncPayload` with only the `negentropy` field set to it. @@ -80,30 +80,30 @@ This client computation also outputs any hash differences found, those MUST be stored. In the case of an empty payload, the client MUST send back a `SyncPayload` -with all the hashes previoudly found in the `hashes` field and +with all the hashes previously found in the `hashes` field and an empty `nengentropy` field. -# Attack Vectors +## Attack Vectors Nodes using `WAKU-SYNC` are fully trusted. Message hashes are assumed to be of valid messages received via Waku Relay or Light push. Further refinements to the protocol are planned to reduce the trust level required to operate. -Notably by verifing messages RLN proof at reception. +Notably by verifying messages RLN proof at reception. -# Implementation +## Implementation The following is not part of the specifications but good to know implementation details. ### Interval -Ad-hoc syncing can be useful in some cases but continueous periodic sync -minimise the differences in messages stored across the network. +Ad-hoc syncing can be useful in some cases but continuous periodic sync +minimize the differences in messages stored across the network. Syncing early and often is the best strategy. The default used in nwaku is 5 minutes interval between sync with a range of 1 hour. ### Range We also offset the sync range by 20 seconds in the past. The actual start of the sync range is T-01:00:20 and the end T-00:00:20 -This is to handle the inherent jitter of GossipSub. +This is to handle the inherent jitters of GossipSub. In other words, it is the amount of time needed to confirm if a message is missing or not. ### Storage @@ -115,11 +115,11 @@ It is expected to be a less likely case than time based insertion and removal. Last but not least it must be optimized for sequential read as it is the most often used operation. -# Copyright +## Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). -# References +## References - https://logperiodic.com/rbsr.html - https://github.com/hoytech/negentropy \ No newline at end of file From 0b70a112d58d9473ab2abcf2f2096c6f454cbe51 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Mon, 7 Oct 2024 11:41:55 -0400 Subject: [PATCH 6/7] added peer choice section --- standards/core/sync.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/standards/core/sync.md b/standards/core/sync.md index 262883e..28735be 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -94,6 +94,12 @@ Notably by verifying messages RLN proof at reception. ## Implementation The following is not part of the specifications but good to know implementation details. +### Peer Choice +Peering strategies can lead to inadvertently segregating peers and reduce sampling diversity. +We randomly select peers to sync with for simplicity and robustness. + +A good strategy can be devised but we chose not to. + ### Interval Ad-hoc syncing can be useful in some cases but continuous periodic sync minimize the differences in messages stored across the network. From 7897ad62a06cfc068483026eecfa2ddd1909dc11 Mon Sep 17 00:00:00 2001 From: SionoiS Date: Tue, 22 Oct 2024 10:22:25 -0400 Subject: [PATCH 7/7] clarification --- standards/core/sync.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/standards/core/sync.md b/standards/core/sync.md index 28735be..644df0c 100644 --- a/standards/core/sync.md +++ b/standards/core/sync.md @@ -59,9 +59,9 @@ message SyncPayload { ``` ### Session Flow -A client initiate a session with a server +A client initiates a session with a server by sending a `SyncPayload` with -only the `negentropy` field set to it. +only the `negentropy` field set. This field MUST contain the first negentropy payload created by the client @@ -79,8 +79,9 @@ the client computes an empty payload. This client computation also outputs any hash differences found, those MUST be stored. In the case of an empty payload, +the reconciliation is done, the client MUST send back a `SyncPayload` -with all the hashes previously found in the `hashes` field and +with all the missing server hashes in the `hashes` field and an empty `nengentropy` field. ## Attack Vectors