Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: spec for high throughput recovery #4285

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

evan-forbes
Copy link
Member

@evan-forbes evan-forbes commented Feb 3, 2025

Overview

spec for high throughput recovery. this can be considered part I of vacuum!.

@evan-forbes evan-forbes added the WS: Big Blonks 🔭 Improving consensus critical gossiping protocols label Feb 3, 2025
@evan-forbes evan-forbes self-assigned this Feb 3, 2025
@evan-forbes evan-forbes requested a review from a team as a code owner February 3, 2025 13:56
@evan-forbes evan-forbes requested review from cmwaters and ninabarbakadze and removed request for a team February 3, 2025 13:56
@evan-forbes evan-forbes marked this pull request as draft February 3, 2025 13:56
Copy link

github-actions bot commented Feb 3, 2025

PR Preview Action v1.6.0

🚀 View preview at
https://celestiaorg.github.io/celestia-app/pr-preview/pr-4285/

Built to branch gh-pages at 2025-02-14 13:00 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@evan-forbes evan-forbes changed the title WIP: spec for PBBT docs: spec for high throughput recovery Feb 7, 2025
@evan-forbes evan-forbes marked this pull request as ready for review February 10, 2025 14:01
Copy link
Contributor

coderabbitai bot commented Feb 10, 2025

📝 Walkthrough

Walkthrough

The update introduces a new high throughput recovery mechanism for the Celestia protocol called the Pull Based Broadcast Tree (PBBT). The document in specs/src/recovery.md is expanded with multiple new sections that describe the recovery process, including pipelining of 'Have' and 'Want' messages, message propagation rules, and security measures. Additionally, several new message types are defined to support block propagation, along with detailed considerations for mitigating denial-of-service and sybil attacks.

Changes

File Change Summary
specs/src/recovery.md Added new sections outlining high throughput recovery: "Vacuum! Part I: High Throughput Recovery", "Summary", "Intro", "Pull Based Block Propagation", "Distribution Rules", "Disconnection Rules", "Pipelining 'Have' and 'Want' Messages", "Broadcast Tree", "Optimal Routing", "Security", "Denial of Service", "Sybil Attacks", "Defending Against Attacks", "Erasure Encoding Recovery Data", "Overlay Networks", "Preparation", "Synthesis". Also defined new message types: BlobMetaData, CompactBlock, HavePart, WantParts, and Part.

Sequence Diagram(s)

sequenceDiagram
    participant Broadcaster
    participant Overlay
    participant Peer

    Broadcaster->>Overlay: Send Commitment and 'Have' messages
    Overlay->>Peer: Propagate 'Have' messages
    Peer->>Overlay: Respond with 'Want' message
    Overlay->>Broadcaster: Forward 'Want' request
    Broadcaster->>Overlay: Send Data/Part messages
    Overlay->>Peer: Deliver Data/Part messages
Loading

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (26)
specs/src/recovery.md (26)

1-10: Title and Summary Overview: Ensure Consistent Terminology and Spelling
The title and summary provide a clear overview of the new high throughput recovery mechanism. However, please ensure consistency in naming—for example, the title “Vacuum! Part I: High Throughput Recovery” (line 1) should match the image alt text in line 15 (which currently uses “vacumm! map”). Also, consider using hyphenated adjectives (e.g., “pull‐based” instead of “pull based”, “push‐based” instead of “push based”) throughout the document for improved clarity.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...propagation) could be achieved using a "Pull Based Broadcast Tree" (PBBT). PBBT works by e...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...d Want messages in otherwise standard pull based gossip, and by distributing the burden ...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ... across the broadcaster's peers. Unlike push based broadcast trees, the route to distribut...

(BASED_HYPHEN)


[uncategorized] ~7-~7: This expression is usually spelled with a hyphen.
Context: ...n sybil attack both broadcast trees and pull based gossip in time restricted applications ...

(BASED_HYPHEN)


11-18: Intro Section: Clarify Two-Phase Propagation
The introduction clearly distinguishes between the "Preparation" and "Recovery" phases. For additional clarity, consider using consistent quotation marks and terminology (for example, “pull‐based gossip” instead of “pull based gossip”) throughout this section.

🧰 Tools
🪛 LanguageTool

[style] ~16-~16: Consider a shorter alternative to avoid wordiness.
Context: ...p](./figures/recovery/vacuum!_map.png) In order to use the data propagated during the prep...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~17-~17: This expression is usually spelled with a hyphen.
Context: ... recovery phase has to use some form of pull based gossip. Recovery has the following con...

(BASED_HYPHEN)


19-30: Recovery Constraints: Improve Readability
The listed constraints are comprehensive and well-organized. To enhance readability, consider rephrasing expressions such as “the overhead % of the block” to “the percentage overhead of the block” and ensure uniform styling of technical terms.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~30-~30: This expression is usually spelled with a hyphen.
Context: ...ock the proposer must upload Enter the Pull Based Broadcast Tree (PBBT). PBBT addresses:...

(BASED_HYPHEN)


30-38: Introducing PBBT: Clarify and Standardize Terminology
The introductory lines for the Pull Based Broadcast Tree (PBBT) set the context nicely. It would be beneficial to use the hyphenated form (e.g., “pull‐based broadcast tree”) for consistency. Also, ensure that the transition into how PBBT addresses throughput bottlenecks is smooth.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~30-~30: This expression is usually spelled with a hyphen.
Context: ...ock the proposer must upload Enter the Pull Based Broadcast Tree (PBBT). PBBT addresses:...

(BASED_HYPHEN)


[uncategorized] ~34-~34: This expression is usually spelled with a hyphen.
Context: ...resses: - efficiency: by utilizing pull based logic - speed: by pipelining Have...

(BASED_HYPHEN)


40-50: Pull Based Block Propagation: Consistency and Detail
This section describes the propagation mechanism effectively by listing the four message types. To improve clarity, consider revising the sentence on line 49 (e.g., remove extraneous commas and redundant phrasing) and standardize terms like “pull‐based” throughout.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~40-~40: This expression is usually spelled with a hyphen.
Context: ... components in an abstract fashion. ## Pull Based Block Propagation PBBT relies heavily ...

(BASED_HYPHEN)


[uncategorized] ~42-~42: This expression is usually spelled with a hyphen.
Context: ...BBT relies heavily on slightly modified pull based gossiping mechanisms. There are four ge...

(BASED_HYPHEN)


[uncategorized] ~49-~49: This expression is usually spelled with a hyphen.
Context: ...ta The only difference between typical pull based block propagation and what is used here...

(BASED_HYPHEN)


64-79: Distribution Rules: Clarify Rules and Correct Minor Wording
The distribution rules are thorough. Consider rephrasing “The rules below goal is to facilitate…” (line 68) to “The rules below aim to facilitate…” and review the section for consistent use of capitalization and technical terminology.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~64-~64: This expression is usually spelled with a hyphen.
Context: ...->>NodeB: DATA ``` Figure 0: General pull based gossip steps. ### Distribution Rules...

(BASED_HYPHEN)


[uncategorized] ~68-~68: This expression is usually spelled with a hyphen.
Context: ... rules below goal is to facilitate safe pull based propagation. - **Nodes MUST only propa...

(BASED_HYPHEN)


[duplication] ~70-~70: Possible typo: you repeated a word.
Context: ...- Nodes MUST only propagate validated messages - Commitment messages MUST originate from the Proposer - `H...

(ENGLISH_WORD_REPEAT_RULE)


[grammar] ~72-~72: You’ve repeated a verb. Did you mean to only write one of them?
Context: ... those committed to in the Commitment message - Data messages MUST match their corresponding Have a...

(REPEATED_VERBS)


86-95: Pipelining ‘Have’ and ‘Want’ Messages: Enhance Clarity
This section effectively outlines the pipelining approach. A suggestion would be to tighten the sentence “Instead of waiting until data is downloaded to gossip Have messages, in PBT, they are gossiped after verifying them” for conciseness, and to standardize terms such as “pull‐based” and “push‐based” for consistency.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~87-~87: This expression is usually spelled with a hyphen.
Context: ... Pipelining Have and Want Messages Pull based mechanisms are very efficient, but sacr...

(BASED_HYPHEN)


[uncategorized] ~88-~88: This expression is usually spelled with a hyphen.
Context: ... the Have and Want messages so that push based mechanisms can get close to the speed o...

(BASED_HYPHEN)


[uncategorized] ~94-~94: This expression is usually spelled with a hyphen.
Context: ...ready been downloaded** Keep the other pull based rules in context as well, where nodes c...

(BASED_HYPHEN)


116-118: Figure 1 Caption: Clarify Caption Consistency
The caption for the pipelining diagram is informative. Consider a slight rewording such as “Note the simultaneous transfer of Want and Have messages” to improve clarity.


119-130: Broadcast Tree Overview: Ensure Conceptual Clarity
The introduction to the broadcast tree concept is well done, and the definition of throughput parameters (T, B, n) is clear. It might help to introduce these variables with a brief contextual explanation before the formula appears.

🧰 Tools
🪛 LanguageTool

[style] ~120-~120: Consider a shorter alternative to avoid wordiness.
Context: ...ally the same time. ## Broadcast Tree In order to increase the "recovery" throughput subs...

(IN_ORDER_TO_PREMIUM)


135-138: Upload Bandwidth Explanation: Provide Additional Detail
The explanation on reducing the broadcaster's upload burden by distributing block data among peers is clear. A brief elaboration on how this distribution mechanism enhances overall network performance could further benefit the reader.


165-166: Figure 3 Caption: Minor Text Revision
The caption describing the broadcast tree diagram is succinct and informative. A minor revision such as “Note how portions of the block data are transferred between nodes” might further enhance clarity.


167-176: Optimal Routing Section: Clarify Language
This section explains the benefits of on-the-fly route generation well. Consider revising phrases like “on fly” to “on the fly” and “real time data” to “real-time data” to ensure both grammatical correctness and clarity.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~169-~169: Possible missing article found.
Context: ...fits of generating the route of data on fly instead of pre-planning the route is th...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~169-~169: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... route is that the plan can incorporate real time data. This inherently includes: - ne...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🪛 markdownlint-cli2 (0.17.2)

169-169: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


177-182: FIFO Messaging and Congestion: Correct Terminology
The discussion on potential congestion when Have messages are enqueued in a FIFO manner is valuable. Please correct typographical errors such as changing “haves” (line 177) to “Have messages” and “recieve” (line 177) to “receive.” A slight rephrasing for improved readability is recommended.

🧰 Tools
🪛 LanguageTool

[grammar] ~177-~177: The verb form ‘take’ does not seem to be suitable in this context.
Context: ...on as Data. This ensures that a haves take longer to be delivered to a congested c...

(HAVE_VB)


[style] ~182-~182: As an alternative to the over-used intensifier ‘extremely’, consider replacing this phrase.
Context: ... send them to many nodes in the network extremely quickly. Without additional rules, this would c...

(EN_WEAK_ADJECTIVE)


[typographical] ~182-~182: The word “therefore” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...use that node to become a bottleneck in distribution, therefore we add a new rule to ensure this doesn'...

(THUS_SENTENCE)

🪛 markdownlint-cli2 (0.17.2)

177-177: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


183-186: Concurrent Request Cap: Adequate Addition
The rule stating that nodes must have fewer than PerPeerConcurrentRequestCap concurrent requests per peer is a practical measure. Consider providing a brief rationale or an example to further illustrate its importance.


187-203: Security Section: Ensure Completeness and Clarity
The security section effectively highlights the need for a robust broadcasting mechanism that can withstand adversarial conditions. Minor adjustments such as rephrasing “held to same standard” to “held to the same standard” would improve readability.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~190-~190: Possible missing article found.
Context: ... broadcasting mechanism must be held to same standard. Meaning that if 2/3 of the vo...

(AI_HYDRA_LEO_MISSING_THE)


[misspelling] ~197-~197: This word is normally spelled with a hyphen.
Context: ... every other honest node. The second is self explanatory. Recall the constraints for recovery: ...

(EN_COMPOUNDS_SELF_EXPLANATORY)


211-222: Sybil Attacks (Pull Based Gossip): Minor Corrections
This section provides a good explanation of how sybil attacks can affect pull-based protocols. Please update “its easy” (line 220) to “it’s easy” and consider a slight rephrasing to further enhance clarity.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~214-~214: This expression is usually spelled with a hyphen.
Context: ... you find one. ### Sybil Attacks #### Pull Based Gossip Upon requesting data in a stand...

(BASED_HYPHEN)


[uncategorized] ~216-~216: This expression is usually spelled with a hyphen.
Context: ...sip Upon requesting data in a standard pull based protocol, faulty peers can simply not s...

(BASED_HYPHEN)


[uncategorized] ~220-~220: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...icious peers are removed. > Side note: its easy to confuse a congested peer as a m...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[uncategorized] ~220-~220: This expression is usually spelled with a hyphen.
Context: ...s peer scoring incredibly difficult for pull based systems. If timeouts for scoring or add...

(BASED_HYPHEN)

🪛 markdownlint-cli2 (0.17.2)

220-220: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


227-234: Broadcast Trees Conclusions: Highlight Key Insights
The conclusions regarding broadcast trees are insightful. Ensure that the formatting and numbering remain consistent and consider emphasizing the importance of differentiating between validators and non-validators more explicitly.


235-238: Defending Against Attacks: Introduction Clarity
The introduction to the defensive measures is clear. It might be beneficial to further structure this section with subheadings or a numbered list to delineate each strategy more clearly.


239-253: Erasure Encoding Example: Code Clarity and Context
The Go code snippet is an effective illustration of erasure encoding. Although the use of ellipses (...) as placeholders is acceptable, adding brief comments to explain that these represent actual data would improve clarity. Also, ensure that the inline comment on line 251 is fully clear.


260-263: Erasure Encoding Conclusion: Emphasize Security Benefits
This section concisely conveys that adding parity data can mitigate attacks. Consider rephrasing “a powerful tool to get to our goal” to a more formal tone.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~263-~263: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...s a powerful tool to get to our goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


264-273: Overlay Networks: Clarify Benefits and Implementation
The description of overlay networks highlights their role in mitigating sybil attacks effectively. A minor grammatical adjustment is needed—replace “can avoiding sybil nodes” (line 271) with “can avoid sybil nodes.”

🧰 Tools
🪛 LanguageTool

[grammar] ~271-~271: Modal verbs like ‘can’ or ‘will’ require the following verb to be in its base form.
Context: ...y network is simply that validators can avoiding sybil nodes to the best of their knowle...

(MD_BASEFORM)


274-276: Critical Note Reiteration: Confirm Emphasis
The reiterated note emphasizes that new broadcasting mechanisms must not lower the barrier to network halting. Please correct “Its important” (line 275) to “It’s important” for consistency and clarity.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~275-~275: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...y we have achieved our original goal. >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


277-283: Preparation Section: Reuse and Advantage
This section effectively explains how validators can leverage pre-distributed transactions to combat sybil attacks. A minor suggestion is to change “combatting” (line 279) to “combating” for correct spelling.

🧰 Tools
🪛 LanguageTool

[style] ~279-~279: The phrase ‘have the ability to’ might be wordy. Consider using “can”.
Context: ... already does. #### Preparation If we have the ability to know what portion of the voting power a...

(HAS_THE_ABILITY_TO)


[style] ~279-~279: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...as a given transaction, and the network is able to reuse the transactions they have alread...

(BE_ABLE_TO)

🪛 markdownlint-cli2 (0.17.2)

279-279: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


284-293: Synthesis Section: Summarize Key Mechanisms
The synthesis succinctly combines the various mechanisms to meet the required security standards. Please correct recurring typographical errors (e.g., “Its important” should be “It’s important”) and consider streamlining the bullet points for enhanced clarity.

🧰 Tools
🪛 LanguageTool

[typographical] ~285-~285: There might be a comma missing.
Context: ...sybil attacks. However, when we combine them we can meet our original goal: >Its im...

(IF_PRP_PRP_COMMA)


[uncategorized] ~287-~287: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...e them we can meet our original goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[style] ~290-~290: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...erlay network is added, then validators are able to avoid sybil attacks from non-validators...

(BE_ABLE_TO)


[style] ~291-~291: ‘a majority of’ might be wordy. Consider a shorter alternative.
Context: ...roposed block is already distributed to a majority of validators, then it has already been su...

(EN_WORDINESS_PREMIUM_A_MAJORITY_OF)


294-317: Commitment and CompactBlock Definitions: Ensure Accuracy
The protobuf definitions for BlobMetaData and CompactBlock are well presented. In the accompanying explanation (around line 313), please correct “siganture” to “signature” and ensure that the relationship between BlobMetaData, pbbt_root, and the other fields is described clearly.


349-358: Data Message Definition: Check for Completeness
The Part message is straightforward and well-defined. Please confirm that the chosen field types, particularly uint32 for the index, are sufficient for the intended use case and that this structure matches downstream message processing.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d43ec72 and 4add30d.

⛔ Files ignored due to path filters (1)
  • specs/src/figures/recovery/vacuum!_map.png is excluded by !**/*.png
📒 Files selected for processing (1)
  • specs/src/recovery.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
specs/src/recovery.md

[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...propagation) could be achieved using a "Pull Based Broadcast Tree" (PBBT). PBBT works by e...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...d Want messages in otherwise standard pull based gossip, and by distributing the burden ...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ... across the broadcaster's peers. Unlike push based broadcast trees, the route to distribut...

(BASED_HYPHEN)


[uncategorized] ~7-~7: This expression is usually spelled with a hyphen.
Context: ...n sybil attack both broadcast trees and pull based gossip in time restricted applications ...

(BASED_HYPHEN)


[style] ~16-~16: Consider a shorter alternative to avoid wordiness.
Context: ...p](./figures/recovery/vacuum!_map.png) In order to use the data propagated during the prep...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~17-~17: This expression is usually spelled with a hyphen.
Context: ... recovery phase has to use some form of pull based gossip. Recovery has the following con...

(BASED_HYPHEN)


[uncategorized] ~30-~30: This expression is usually spelled with a hyphen.
Context: ...ock the proposer must upload Enter the Pull Based Broadcast Tree (PBBT). PBBT addresses:...

(BASED_HYPHEN)


[uncategorized] ~34-~34: This expression is usually spelled with a hyphen.
Context: ...resses: - efficiency: by utilizing pull based logic - speed: by pipelining Have...

(BASED_HYPHEN)


[uncategorized] ~40-~40: This expression is usually spelled with a hyphen.
Context: ... components in an abstract fashion. ## Pull Based Block Propagation PBBT relies heavily ...

(BASED_HYPHEN)


[uncategorized] ~42-~42: This expression is usually spelled with a hyphen.
Context: ...BBT relies heavily on slightly modified pull based gossiping mechanisms. There are four ge...

(BASED_HYPHEN)


[uncategorized] ~49-~49: This expression is usually spelled with a hyphen.
Context: ...ta The only difference between typical pull based block propagation and what is used here...

(BASED_HYPHEN)


[uncategorized] ~64-~64: This expression is usually spelled with a hyphen.
Context: ...->>NodeB: DATA ``` Figure 0: General pull based gossip steps. ### Distribution Rules...

(BASED_HYPHEN)


[uncategorized] ~68-~68: This expression is usually spelled with a hyphen.
Context: ... rules below goal is to facilitate safe pull based propagation. - **Nodes MUST only propa...

(BASED_HYPHEN)


[duplication] ~70-~70: Possible typo: you repeated a word.
Context: ...- Nodes MUST only propagate validated messages - Commitment messages MUST originate from the Proposer - `H...

(ENGLISH_WORD_REPEAT_RULE)


[grammar] ~72-~72: You’ve repeated a verb. Did you mean to only write one of them?
Context: ... those committed to in the Commitment message - Data messages MUST match their corresponding Have a...

(REPEATED_VERBS)


[uncategorized] ~87-~87: This expression is usually spelled with a hyphen.
Context: ... Pipelining Have and Want Messages Pull based mechanisms are very efficient, but sacr...

(BASED_HYPHEN)


[uncategorized] ~88-~88: This expression is usually spelled with a hyphen.
Context: ... the Have and Want messages so that push based mechanisms can get close to the speed o...

(BASED_HYPHEN)


[uncategorized] ~94-~94: This expression is usually spelled with a hyphen.
Context: ...ready been downloaded** Keep the other pull based rules in context as well, where nodes c...

(BASED_HYPHEN)


[style] ~120-~120: Consider a shorter alternative to avoid wordiness.
Context: ...ally the same time. ## Broadcast Tree In order to increase the "recovery" throughput subs...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~169-~169: Possible missing article found.
Context: ...fits of generating the route of data on fly instead of pre-planning the route is th...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~169-~169: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... route is that the plan can incorporate real time data. This inherently includes: - ne...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[grammar] ~177-~177: The verb form ‘take’ does not seem to be suitable in this context.
Context: ...on as Data. This ensures that a haves take longer to be delivered to a congested c...

(HAVE_VB)


[style] ~182-~182: As an alternative to the over-used intensifier ‘extremely’, consider replacing this phrase.
Context: ... send them to many nodes in the network extremely quickly. Without additional rules, this would c...

(EN_WEAK_ADJECTIVE)


[typographical] ~182-~182: The word “therefore” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...use that node to become a bottleneck in distribution, therefore we add a new rule to ensure this doesn'...

(THUS_SENTENCE)


[uncategorized] ~190-~190: Possible missing article found.
Context: ... broadcasting mechanism must be held to same standard. Meaning that if 2/3 of the vo...

(AI_HYDRA_LEO_MISSING_THE)


[misspelling] ~197-~197: This word is normally spelled with a hyphen.
Context: ... every other honest node. The second is self explanatory. Recall the constraints for recovery: ...

(EN_COMPOUNDS_SELF_EXPLANATORY)


[uncategorized] ~204-~204: This expression is usually spelled with a hyphen.
Context: ...ST be <= the ProposalTimeout** While pull based gossip and broadcast trees are incredib...

(BASED_HYPHEN)


[uncategorized] ~214-~214: This expression is usually spelled with a hyphen.
Context: ... you find one. ### Sybil Attacks #### Pull Based Gossip Upon requesting data in a stand...

(BASED_HYPHEN)


[uncategorized] ~216-~216: This expression is usually spelled with a hyphen.
Context: ...sip Upon requesting data in a standard pull based protocol, faulty peers can simply not s...

(BASED_HYPHEN)


[uncategorized] ~220-~220: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...icious peers are removed. > Side note: its easy to confuse a congested peer as a m...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[uncategorized] ~220-~220: This expression is usually spelled with a hyphen.
Context: ...s peer scoring incredibly difficult for pull based systems. If timeouts for scoring or add...

(BASED_HYPHEN)


[style] ~224-~224: Consider a shorter alternative to avoid wordiness.
Context: ...lications require additional mechanisms in order to use pull based gossip. #### Broadcast ...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~224-~224: This expression is usually spelled with a hyphen.
Context: ...e additional mechanisms in order to use pull based gossip. #### Broadcast Trees Using th...

(BASED_HYPHEN)


[uncategorized] ~263-~263: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...s a powerful tool to get to our goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[grammar] ~271-~271: Modal verbs like ‘can’ or ‘will’ require the following verb to be in its base form.
Context: ...y network is simply that validators can avoiding sybil nodes to the best of their knowle...

(MD_BASEFORM)


[uncategorized] ~275-~275: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...y we have achieved our original goal. >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[style] ~279-~279: The phrase ‘have the ability to’ might be wordy. Consider using “can”.
Context: ... already does. #### Preparation If we have the ability to know what portion of the voting power a...

(HAS_THE_ABILITY_TO)


[style] ~279-~279: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...as a given transaction, and the network is able to reuse the transactions they have alread...

(BE_ABLE_TO)


[typographical] ~285-~285: There might be a comma missing.
Context: ...sybil attacks. However, when we combine them we can meet our original goal: >Its im...

(IF_PRP_PRP_COMMA)


[uncategorized] ~287-~287: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...e them we can meet our original goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[style] ~290-~290: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...erlay network is added, then validators are able to avoid sybil attacks from non-validators...

(BE_ABLE_TO)


[style] ~291-~291: ‘a majority of’ might be wordy. Consider a shorter alternative.
Context: ...roposed block is already distributed to a majority of validators, then it has already been su...

(EN_WORDINESS_PREMIUM_A_MAJORITY_OF)

🪛 markdownlint-cli2 (0.17.2)
specs/src/recovery.md

169-169: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


177-177: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


220-220: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


279-279: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (10)
specs/src/recovery.md (10)

51-63: Mermaid Diagram for Gossip Steps: Verify Rendering
The Mermaid diagram clearly illustrates the gossip steps. Please ensure that the diagram syntax is fully compatible with our documentation renderer and that the sequence of messages is correct.


80-85: Disconnection Rules: Clear and Comprehensive
The disconnection rules are well-defined and clearly describe when and why a node should disconnect from peers. No major issues are noted here.


96-115: Mermaid Diagram for Pipelining: Confirm Accuracy
The sequence diagram illustrating the pipelining process is very helpful. Please double-check that the timing indicators (e.g., “time 0”, “time 1”, etc.) accurately represent the intended flow of messages.


131-134: Throughput Calculation Snippet: Confirm Clarity
The Python snippet showing T = B / n is simple and clear. If intended for educational purposes, consider adding inline comments to explain what each variable represents.


139-164: Broadcast Tree Diagram: Check Diagram Accuracy
The Mermaid diagram for the broadcast tree effectively demonstrates how portions of the block data are transferred between nodes. Please verify that all message labels (such as HAVE_0, WANT_0, etc.) are consistent with the narrative description.


204-210: Denial of Service: Adequate Coverage
The discussion on how DoS attacks may manifest and the corresponding disconnection policies is concise and clear. No changes are needed here.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~204-~204: This expression is usually spelled with a hyphen.
Context: ...ST be <= the ProposalTimeout** While pull based gossip and broadcast trees are incredib...

(BASED_HYPHEN)


254-259: Erasure Encoding Throughput: Explain Trade-offs
The explanation that the maximum throughput is divided by e and the revised delivery constraints are clearly stated. This section is solid as is.


323-333: Have Message Definition: Clear and Concise
The protobuf definition for HavePart is clear and the verification note is helpful. No significant changes are needed here.


339-347: Want Message Definition: Verify Consistency
The WantParts message is defined clearly. Ensure that the use of tendermint.libs.bits.BitArray aligns with the rest of the protocol specifications and documentation.


359-363: Data Verification: Ensure Correctness
The verification rule—that the hash of the bytes in the data field must match the corresponding Have message—is clearly stated and necessary. This section is well done with no further changes required.

Copy link
Member

@rach-id rach-id left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎉

Awesome work 🚀

Left some comments to understand more and some cosmetics


Given these constraints, we can isolate the different bottlenecks to throughput:

- **efficiency:** the overhead % of the block the validators must download (0% == downloading the block exactly once)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it make sense to make a difference between block download efficiency, and block upload efficiency?

In this case, we're optimising to only download the block once. But nothing holds us from uploading it 100 times.

Or by efficiency you mean both?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

determining upload efficiency is impossible to determine without either setting hard caps, or hardcoding the topology including fixed latency and bandwidth. If we make assumptions around even bandwidth and network latency throughput a random network, if all nodes only download the block once, and the proposer only uploads the block once, then nodes should upload the block ~once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So we're optimizing for the proposer to only upload the block once, but for the other nodes, they can upload as much as they can?

a separate idea, if the proposer has perfect internet, is well-connected, and has very low latency, why not let it upload more than once? that would speed up propagation.

Maybe we could do something like upload as much as you can, or even use streamed erasure encoding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So we're optimizing for the proposer to only upload the block once, but for the other nodes, they can upload as much as they can?

we're letting the optimal routing mechanisms handle the rest yeah exactly. Since the broadcaster is different each height and round, the optimal route of data is different each time. Sometimes one node could end up utilizing more bandwidth than others simply due to the topology, other times they could not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. maybe an optimization would be not to limit uploading the block only once but upload as much as possible.

In this case, if the node has low connectivity to the network, it can upload even 1/4 (or whichever the erasure encoding coefficient chosen), and be good since the rest of the network can reconstruct the block.

However, if it's very well-connected, it can upload as much as possible.

The idea is not to put any upload limitations to the proposer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the proposer isn't actually capped at uploading the block once. I agree that this is a really good point to bring up though as its not specified. There could be scenarios where it would be more optimal to download more data directly from the proposer if both the proposer and the downloading peer have available bandwidth.

I think @nashqueue brought up something similar. There likely exists a rule we can add that pairs well with the existing optimal routing rules

  • Nodes SHOULD send Have and Data messages from a FIFO queue
  • Nodes SHOULD request Data from the first peer that sends them a Have message
  • Nodes MUST have <= PerPeerConcurrentRequestCap concurrent requests to a single peer

I think @nashqueue suggested that the proposer could include another Have that is appended to the FIFO queue. This Have could simply be the entire rest of the block.

This also improves the fault tolerance of broadcast trees and pull based logic significantly, as if a peer is faulty or slow, then the data would naturally route through the next highest node

@celestia-bot celestia-bot requested a review from rach-id February 14, 2025 13:00
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
specs/src/recovery.md (3)

1-1: Standardize hyphenation in terminology.

For consistency and readability, the term "Pull Based" should be hyphenated as "Pull-Based" throughout the document.

-# Vacuum! Part I: High Throughput Recovery
+# Vacuum! Part I: High-Throughput Recovery

-High throughput recovery (the second half of block propagation) could be achieved using a "Pull Based Broadcast Tree" (PBBT).
+High-throughput recovery (the second half of block propagation) could be achieved using a "Pull-Based Broadcast Tree" (PBBT).

-Enter the Pull Based Broadcast Tree (PBBT).
+Enter the Pull-Based Broadcast Tree (PBBT).

-## Pull Based Block Propagation
+## Pull-Based Block Propagation

Also applies to: 5-5, 30-30, 40-40


26-26: Clarify the efficiency metric definition.

The efficiency metric definition could be clearer. Consider rephrasing to explicitly state that 0% overhead means optimal efficiency.

-- **efficiency:** the overhead % of the block the validators must download (0% == downloading the block exactly once)
+- **efficiency:** the percentage overhead of block data that validators must download beyond the original block size (0% overhead means each validator downloads the block exactly once, representing optimal efficiency)

177-177: Enhance FIFO queue implementation guidance.

The FIFO queue recommendation needs more specific implementation guidance to ensure consistent behavior across implementations.

-However, it can also include congestion if the `Have` messages and entered into the send queue in a FIFO fashion as `Data`. This ensures that a `Have` take longer to be delivered to a congested connection, making it more likely peers will recieve that same `Have` message first from a different peer. 
+However, it can also handle congestion by ensuring `Have` messages and `Data` messages share the same FIFO send queue. In a congested connection, this naturally delays `Have` messages, increasing the likelihood that peers receive these messages first from less congested peers.

-- **Nodes SHOULD send `Have` and `Data` messages from a FIFO queue**
+- **Nodes SHOULD implement a single FIFO queue for both `Have` and `Data` messages with the following characteristics:**
+  - Queue should be bounded to prevent memory exhaustion
+  - Messages should be processed strictly in order of insertion
+  - Queue size should be monitored as a congestion indicator

Also applies to: 179-179

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

177-177: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4add30d and 5e4afa0.

📒 Files selected for processing (1)
  • specs/src/recovery.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
specs/src/recovery.md

[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...propagation) could be achieved using a "Pull Based Broadcast Tree" (PBBT). PBBT works by e...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ...d Want messages in otherwise standard pull based gossip, and by distributing the burden ...

(BASED_HYPHEN)


[uncategorized] ~5-~5: This expression is usually spelled with a hyphen.
Context: ... across the broadcaster's peers. Unlike push based broadcast trees, the route to distribut...

(BASED_HYPHEN)


[uncategorized] ~7-~7: This expression is usually spelled with a hyphen.
Context: ...n sybil attack both broadcast trees and pull based gossip in time restricted applications ...

(BASED_HYPHEN)


[style] ~16-~16: Consider a shorter alternative to avoid wordiness.
Context: ...p](./figures/recovery/vacuum!_map.png) In order to use the data propagated during the prep...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~17-~17: This expression is usually spelled with a hyphen.
Context: ... recovery phase has to use some form of pull based gossip. Recovery has the following con...

(BASED_HYPHEN)


[uncategorized] ~30-~30: This expression is usually spelled with a hyphen.
Context: ...ock the proposer must upload Enter the Pull Based Broadcast Tree (PBBT). PBBT addresses:...

(BASED_HYPHEN)


[uncategorized] ~34-~34: This expression is usually spelled with a hyphen.
Context: ...resses: - efficiency: by utilizing pull based logic - speed: by pipelining Have...

(BASED_HYPHEN)


[uncategorized] ~40-~40: This expression is usually spelled with a hyphen.
Context: ... components in an abstract fashion. ## Pull Based Block Propagation PBBT relies heavily ...

(BASED_HYPHEN)


[uncategorized] ~42-~42: This expression is usually spelled with a hyphen.
Context: ...BBT relies heavily on slightly modified pull based gossiping mechanisms. There are four ge...

(BASED_HYPHEN)


[uncategorized] ~49-~49: This expression is usually spelled with a hyphen.
Context: ...ta The only difference between typical pull based block propagation and what is used here...

(BASED_HYPHEN)


[uncategorized] ~64-~64: This expression is usually spelled with a hyphen.
Context: ...->>NodeB: DATA ``` Figure 0: General pull based gossip steps. ### Distribution Rules...

(BASED_HYPHEN)


[uncategorized] ~68-~68: This expression is usually spelled with a hyphen.
Context: ... rules below goal is to facilitate safe pull based propagation. - **Nodes MUST only propa...

(BASED_HYPHEN)


[duplication] ~70-~70: Possible typo: you repeated a word.
Context: ...- Nodes MUST only propagate validated messages - Commitment messages MUST originate from the Proposer - `H...

(ENGLISH_WORD_REPEAT_RULE)


[grammar] ~72-~72: You’ve repeated a verb. Did you mean to only write one of them?
Context: ... those committed to in the Commitment message - Data messages MUST match their corresponding Have a...

(REPEATED_VERBS)


[uncategorized] ~87-~87: This expression is usually spelled with a hyphen.
Context: ... Pipelining Have and Want Messages Pull based mechanisms are very efficient, but sacr...

(BASED_HYPHEN)


[uncategorized] ~88-~88: Possible missing comma found.
Context: ...re very efficient, but sacrifice speed. Here we describe how to pipeline the Have ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~88-~88: This expression is usually spelled with a hyphen.
Context: ... the Have and Want messages so that pull based mechanisms can get close to the speed o...

(BASED_HYPHEN)


[uncategorized] ~88-~88: This expression is usually spelled with a hyphen.
Context: ...echanisms can get close to the speed of push based ones. Instead of waiting until data is...

(BASED_HYPHEN)


[uncategorized] ~94-~94: This expression is usually spelled with a hyphen.
Context: ...ready been downloaded** Keep the other pull based rules in context as well, where nodes c...

(BASED_HYPHEN)


[style] ~120-~120: Consider a shorter alternative to avoid wordiness.
Context: ...ally the same time. ## Broadcast Tree In order to increase the "recovery" throughput subs...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~169-~169: Possible missing article found.
Context: ...fits of generating the route of data on fly instead of pre-planning the route is th...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~169-~169: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... route is that the plan can incorporate real time data. This inherently includes: - ne...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~182-~182: As an alternative to the over-used intensifier ‘extremely’, consider replacing this phrase.
Context: ... send them to many nodes in the network extremely quickly. Without additional rules, this would c...

(EN_WEAK_ADJECTIVE)


[typographical] ~182-~182: The word “therefore” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...use that node to become a bottleneck in distribution, therefore we add a new rule to ensure this doesn'...

(THUS_SENTENCE)


[misspelling] ~197-~197: This word is normally spelled with a hyphen.
Context: ... every other honest node. The second is self explanatory. Recall the constraints for recovery: ...

(EN_COMPOUNDS_SELF_EXPLANATORY)


[uncategorized] ~204-~204: This expression is usually spelled with a hyphen.
Context: ...ST be <= the ProposalTimeout** While pull based gossip and broadcast trees are incredib...

(BASED_HYPHEN)


[uncategorized] ~214-~214: This expression is usually spelled with a hyphen.
Context: ... you find one. ### Sybil Attacks #### Pull Based Gossip Upon requesting data in a stand...

(BASED_HYPHEN)


[uncategorized] ~216-~216: This expression is usually spelled with a hyphen.
Context: ...sip Upon requesting data in a standard pull based protocol, faulty peers can simply not s...

(BASED_HYPHEN)


[grammar] ~220-~220: Did you mean “it’s” (contraction of “it is/has”)?
Context: ...icious peers are removed. > Side note: its easy to confuse a congested peer as a m...

(ITS_TO_IT_S)


[uncategorized] ~220-~220: This expression is usually spelled with a hyphen.
Context: ...s peer scoring incredibly difficult for pull based systems. If timeouts for scoring or add...

(BASED_HYPHEN)


[style] ~224-~224: Consider a shorter alternative to avoid wordiness.
Context: ...lications require additional mechanisms in order to use pull based gossip. #### Broadcast ...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~224-~224: This expression is usually spelled with a hyphen.
Context: ...e additional mechanisms in order to use pull based gossip. #### Broadcast Trees Using th...

(BASED_HYPHEN)


[uncategorized] ~263-~263: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...s a powerful tool to get to our goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[grammar] ~271-~271: Modal verbs like ‘can’ or ‘will’ require the following verb to be in its base form.
Context: ...y network is simply that validators can avoiding sybil nodes to the best of their knowle...

(MD_BASEFORM)


[uncategorized] ~275-~275: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...y we have achieved our original goal. >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[style] ~279-~279: The phrase ‘have the ability to’ might be wordy. Consider using “can”.
Context: ... already does. #### Preparation If we have the ability to know what portion of the voting power a...

(HAS_THE_ABILITY_TO)


[style] ~279-~279: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...as a given transaction, and the network is able to reuse the transactions they have alread...

(BE_ABLE_TO)


[typographical] ~285-~285: There might be a comma missing.
Context: ...sybil attacks. However, when we combine them we can meet our original goal: >Its im...

(IF_PRP_PRP_COMMA)


[uncategorized] ~287-~287: “its” (belonging to it) seems less likely than “it’s” (it is)
Context: ...e them we can meet our original goal: >Its important that new broadcasting mechani...

(AI_HYDRA_LEO_CPT_ITS_ITIS)


[style] ~290-~290: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...erlay network is added, then validators are able to avoid sybil attacks from non-validators...

(BE_ABLE_TO)


[style] ~291-~291: ‘a majority of’ might be wordy. Consider a shorter alternative.
Context: ...roposed block is already distributed to a majority of validators, then it has already been su...

(EN_WORDINESS_PREMIUM_A_MAJORITY_OF)

🪛 markdownlint-cli2 (0.17.2)
specs/src/recovery.md

169-169: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


177-177: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


220-220: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


279-279: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: docker-security-build / docker-build (GHCR; ghcr.io/celestiaorg/celestia-app)
  • GitHub Check: docker-txsim-build / docker-build (GHCR; ghcr.io/celestiaorg/txsim)
  • GitHub Check: test / test-fuzz
  • GitHub Check: test / test-coverage
  • GitHub Check: test / test-race
  • GitHub Check: test / test
  • GitHub Check: test / test-short
  • GitHub Check: markdown-linter / markdown-link-check
  • GitHub Check: Summary
🔇 Additional comments (1)
specs/src/recovery.md (1)

184-186: Specify recommended value range for PerPeerConcurrentRequestCap.

The document should provide guidance on recommended values for PerPeerConcurrentRequestCap based on network conditions and typical peer bandwidth capabilities.

Could you provide recommended value ranges for PerPeerConcurrentRequestCap based on typical network conditions and peer capabilities?

Comment on lines +251 to +253
// e is the erasure encoding factor that specifies the ratio of parity data
e := len(totalData) / len(originalData)
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Elaborate on erasure encoding factor tradeoffs.

The impact of the erasure encoding factor on throughput and security needs more detailed explanation.

-// e is the erasure encoding factor that specifies the ratio of parity data
-e := len(totalData) / len(originalData) 
+// e is the erasure encoding factor that specifies the ratio of total data to original data
+// For example, e = 2.0 means 100% parity data (double the original size)
+// Higher values of e provide better protection against data loss but reduce throughput
+e := len(totalData) / len(originalData)

-- **(100% / e) of the Block data MUST be delivered to >2/3 of the validators**
+- **At minimum, (100% / e) of the Block data MUST be delivered to >2/3 of the validators**
+- **Example: With e = 2.0, only 50% of the total data needs to reach each validator**
+- **Note: Higher e values reduce the required data delivery percentage but increase bandwidth usage**

Also applies to: 255-257

Comment on lines +320 to +323
Verification:

- The signature MUST be valid and from the expected proposer for that height and round.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance CompactBlock verification rules.

The verification rules for CompactBlock should be more comprehensive.

 Verification:
 
-- The signature MUST be valid and from the expected proposer for that height and round.
+- The signature MUST be valid and from the expected proposer for that height and round
+- The version MUST be supported by the implementation
+- The height MUST be greater than 0
+- The round MUST be greater than or equal to 0
+- The bp_hash MUST NOT be empty
+- The txs array MUST NOT be empty
+- Each TxMetaData entry MUST have valid start and end values (start < end)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Verification:
- The signature MUST be valid and from the expected proposer for that height and round.
Verification:
- The signature MUST be valid and from the expected proposer for that height and round
- The version MUST be supported by the implementation
- The height MUST be greater than 0
- The round MUST be greater than or equal to 0
- The bp_hash MUST NOT be empty
- The txs array MUST NOT be empty
- Each TxMetaData entry MUST have valid start and end values (start < end)

Comment on lines +305 to +311
message CompactBlock {
int64 height = 1;
int32 round = 2;
bytes bp_hash = 3;
repeated TxMetaData txs = 4;
bytes signature = 5;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add version field to CompactBlock message.

The CompactBlock message should include a version field to support future protocol upgrades.

 message CompactBlock {
+  uint32              version   = 1;
-  int64               height    = 1;
-  int32               round     = 2;
-  bytes               bp_hash   = 3;
-  repeated TxMetaData txs       = 4;
-  bytes               signature = 5;
+  int64               height    = 2;
+  int32               round     = 3;
+  bytes               bp_hash   = 4;
+  repeated TxMetaData txs       = 5;
+  bytes               signature = 6;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
message CompactBlock {
int64 height = 1;
int32 round = 2;
bytes bp_hash = 3;
repeated TxMetaData txs = 4;
bytes signature = 5;
}
message CompactBlock {
uint32 version = 1;
int64 height = 2;
int32 round = 3;
bytes bp_hash = 4;
repeated TxMetaData txs = 5;
bytes signature = 6;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WS: Big Blonks 🔭 Improving consensus critical gossiping protocols
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants