ADR 006: Non-interactive Defaults, Wrapped Txs, and Subtree Root Message Inclusion Checks #673
I am SO happy about these diagrams b/c they're really helping me develop an intuition for non-interactive defaults. Thanks for creating them!
- Messages begin at a location aligned with the largest power of 2 that is not larger than the message length or k.
- If the largest power of 2 of a given message spans multiple rows it must begin at the start of a row (this can occur if a message is longer than k shares or if the block producer decides to start a message partway through a row and it cannot fit).
[question] I'm confused by this rule's application in example-full-block.png for ns=2. ns=2 is a message that spans 7 shares. The largest power of 2 less than the message length (7) or k (8) is 4. The message is aligned with index 4.
> If the largest power of 2 of a given message spans multiple rows

The largest power of 2 for this message (4) doesn't span multiple rows, but this message does, because the block producer decided to start it partway through the row and it cannot fit. If ns=2 is indeed supposed to be split across rows, then should this read:
- If the largest power of 2 of a given message spans multiple rows it must begin at the start of a row (this can occur if a message is longer than k shares or if the block producer decides to start a message partway through a row and it cannot fit).
- If the largest power of 2 of a given message is larger than k then it must begin at the start of a row.
If ns=2 is supposed to start at a new row, I think the current language makes sense.
example-full-block
Still thinking about this, but I think the first rule might be the only one we need. Then we can just go into more detail about what an "aligned power of two" actually means.
> **Messages must begin at a location aligned with the largest power of 2 that is not larger than the message length or k.**

![Subtree root commitment](./assets/subtree-root.png "Subtree Root based commitments")

> **If the messages are larger than k, then they must start on a new row.**
[optional] I took a stab at rephrasing these rules so that I could grok them better. Defer to author if this is clearer to others
> **Messages must begin at a location aligned with the largest power of 2 that is not larger than the message length or k.**
![Subtree root commitment](./assets/subtree-root.png "Subtree Root based commitments")
> **If the messages are larger than k, then they must start on a new row.**
> **If a message length is less than or equal to k, then it must begin at a location aligned with the largest power of 2 that is less than or equal to the message length.**
![Subtree root commitment](./assets/subtree-root.png "Subtree Root based commitments")
> **If the message is larger than k, then it must start on a new row.**
hmmm I'll defer to @adlerjohn as they were the original author. I tried to stick as close as possible to what is written in the specs
> Messages begin at a location aligned with the largest power of 2 that is not larger than the message length or k.

the rule was mainly just modified to be first, as I felt it was more important since we're only following a portion of the other rule now
> Messages that span multiple rows must begin at the start of a row (this can occur if a message is longer than k shares or if the block producer decides to start a message partway through a row and it cannot fit).

we're not following the "if the block producer decides to start a message partway through a row and it cannot fit" part, as it doesn't make a difference to the commitment provided that it follows the other rule
but we should probably be more explicit about that in the ADR come to think of it
If the following diagram is correct: https://github.com/celestiaorg/celestia-app/raw/27ee1dac22be7498ebfb87be85c7cd4718a4d56e/docs/architecture/assets/example-full-block.png
Then, it is strictly less than the largest power of 2 that is less than or equal to the message length: ns = 4.
Edit: the ns = 4 is put straight after ns = 3. If we go by the largest power of 2 that is lower than 2, then it's 1. I guess the diagrams will need to be updated and we need to decide on:
- Case of 2
- Whether the power is strictly less than the length, or less than or equal to it.
> Whether the power is strictly less than the length, or, less or equal.

> that is not larger than the message length or k.

I interpret this to mean less than or equal to.
Does that clarify this?
Aaaah, now I get it. When calculating the largest power of 2 that is less than or equal to the message length:
- We start the count from the share after the last share of the previous namespace
- We start the new namespace at the last position of the count. For example, if the message length is 7, the largest power of 2 that is less than the message length is 4. Then, we start the new namespace at the fourth share after the last share of the previous namespace.
Thanks a lot for your help, now I get this more.
> Then, we start the new namespace at the fourth share after the last share of the previous namespace.

just tbc, it doesn't really have that much to do with where the last namespace ended. We want to start at the next aligned power of two, because we want a single subroot to contain the whole set of 4 shares. In a square size of 8, there are exactly two spots for this: share 0 and share 4. If the largest power of two of a message cannot fit at the next aligned position in the current row, then it must start on the next row.
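To make the "exactly two spots" point concrete, here is a minimal sketch that lists the column indexes in a single row where a message may begin under the rule as quoted in this thread. The function names are hypothetical and are not taken from celestia-app; the only assumption is that the alignment width is the largest power of 2 not larger than the message length or k.

```python
def largest_power_of_two_at_most(n: int) -> int:
    """Return the largest power of 2 that is <= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n.bit_length() - 1)


def aligned_positions_in_row(message_len: int, k: int) -> list[int]:
    """Column indexes in a row of width k where the message may start."""
    # Alignment width: largest power of 2 not larger than message_len or k.
    width = largest_power_of_two_at_most(min(message_len, k))
    return list(range(0, k, width))


print(aligned_positions_in_row(7, 8))   # [0, 4]
print(aligned_positions_in_row(12, 8))  # [0]
```

A 7-share message in a square of width 8 has an alignment width of 4, so it can only begin at column 0 or column 4. A message longer than k has width k, so the only option is the start of a row.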
will do more research and come back to you. thanks
I have other questions, please, when you have time 😸 :
- In your previous response:
  > In a square size of 8, there are exactly two spots for this, share 0, and share 4

  You mean share 0 and share 3, right? Because the message will be aligned with the power of 2?
- If ns = 1 had 7 shares, where would it start?
  - The beginning of a new row
  - The 4th position in a new row
  - The same position it is starting now
- If k = 8, then the largest powers of 2 that are possible are: 1, 2, 4, and 8.
  If a message length is 4, then the possible starts of the message are 1, 2 and 4. Then the block producer checks whether that collides with a previous namespace, and if so it places the message on a new row, right?

In the above diagram, ns = 4 is still a mystery to me.
And, for the clients, if I understand right, all they need to do is decompose the message to a merkle mountain range structure, that is bound by the row size (which is always a power of 2), and sign all the subtree roots commitments, for all row sizes.
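The decomposition described in this question can be sketched as follows. This is only an illustration of the idea as the commenter states it, not a confirmed description of the client code: it assumes the message starts at a properly aligned position, and `subtree_sizes` is a hypothetical name. It greedily splits a message into power-of-two chunks that never exceed the row width k.

```python
def subtree_sizes(message_len: int, k: int) -> list[int]:
    """Split a message into power-of-two chunk sizes bounded by the row width k."""
    sizes = []
    remaining = message_len
    while remaining > 0:
        # Largest power of 2 not larger than the remaining length or k.
        size = 1 << (min(remaining, k).bit_length() - 1)
        sizes.append(size)
        remaining -= size
    return sizes


print(subtree_sizes(7, 8))   # [4, 2, 1]
print(subtree_sizes(20, 8))  # [8, 8, 4]
```

Each chunk size would correspond to one subtree root. For the 7-share message discussed above, that gives three subtrees covering 4, 2, and 1 shares.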
> You mean share 0 and share 3, right? Because the message will be aligned with the power of 2?

we actually want 4 in this case, as we want to start at the next aligned power of two, and we would want to start on the next row (5th position).
> If ns = 1 had 7 shares. Where would it start?

if there was room on the row for 4 shares, the largest power of two less than or equal to the size of the message, then it would start there. If not, then it would start on the next row.
> if the k = 8, then the largest power of 2 that are possible are: 1, 2, 4, and 8.

yeah!
> If a message length is 4. Then, the possible start of message are 1, 2 and 4, or the start of a new row

then we have to use the largest, 4. It has to start at a share index that is divisible by 4.
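The "share index divisible by the largest power of two" rule can be sketched as a small placement routine. This is a simplified illustration under assumptions: shares are numbered with a flat, row-major index (row = index // k), messages are placed in order, and the names `next_aligned_start` and `layout` are hypothetical rather than functions from celestia-app.

```python
def next_aligned_start(cursor: int, message_len: int, k: int) -> int:
    """First flat share index >= cursor that satisfies the alignment rule."""
    # Alignment width: largest power of 2 not larger than message_len or k.
    width = 1 << (min(message_len, k).bit_length() - 1)
    # Round cursor up to the next multiple of width.
    return ((cursor + width - 1) // width) * width


def layout(message_lens: list[int], k: int) -> list[int]:
    """Place messages one after another and return each start index."""
    starts = []
    cursor = 0
    for n in message_lens:
        start = next_aligned_start(cursor, n, k)
        starts.append(start)
        cursor = start + n  # next free share after this message
    return starts


print(next_aligned_start(1, 7, 8))  # 4
print(next_aligned_start(5, 7, 8))  # 8
print(layout([1, 7, 2], 8))         # [0, 4, 12]
```

Because k is itself a power of two, rounding up to a multiple of the alignment width never lets the first subtree straddle a row boundary: when there is no aligned slot left in the current row, the result is the first share of the next row (index 8 in the second call above).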
recall our non-interactive default rule:

> **Messages must begin at a location aligned with the largest power of 2 that is not larger than the message length or k.**
[optional] This should only be accepted if the suggestion above is accepted.
> **Messages must begin at a location aligned with the largest power of 2 that is not larger than the message length or k.**
> **If a message length is less than or equal to k, then it must begin at a location aligned with the largest power of 2 that is less than or equal to the message length.**
Co-authored-by: Rootul Patel <[email protected]>
…estia-app into evan/adr-003-NID
Incredible ADR, thanks for writing this! Again, the diagrams are insanely helpful.
Awesome stuff 🚀 👍. Mainly adding questions that I have.
🚀 👍 🎉 🎉 Awesome work. This should also be called: Demystifying non-interactive defaults.
Co-authored-by: CHAMI Rachid <[email protected]>
After reading #382, I still don't understand the importance of the following statement: "This works for ensuring that each message was included, but this doesn't actually check that the commitment is actually a subtree root to one of the data availability header row or column roots." If we include the commitment somewhere (the block header, for example), then it's even possible (with some additional work) for a light node to validate the message, so do we need the commitment to be a row/column root at all? As long as the message is in a share (or some contiguous shares) and the row/column root (or contiguous roots) includes these shares (I know this is additional work), it should be OK? You mentioned the fraud proof at the end of the post. Then why not rely on it (and maybe remove the need to include commitments for light nodes and to download the message), rather than treating it as just a "good to have"? The fraud proof seems important to Celestia's basic assumptions, so couldn't we trust this approach?
@HoytRen I think the confusion here lies with the term "Message Inclusion". As you stated, we could provide merkle proofs to prove that a message was included in the square. That's not the issue though. The issue is that we have to prove that each payment transaction (PFD) has a message included in the same block. We have to have a commitment that can be signed before the order of the messages in the block is known. That's the reason for the additional complexity. If you haven't already, I would recommend reading the original specs.
I see 3 cases in which a message could be missing or wrong:
- Case 1: the sequencer submits a tx with wrong data. In this case the validator should keep the tx in the pool until the data arrives or a timeout occurs, at which point the tx fails.
- Case 2: the validator picks a tx into a block but doesn't release the data. This is what DAS is for. A full node should always get all block data or generate a DA fraud proof. When a full node has all the data but the message is missing, it should generate a fraud proof too.
- Case 3: the full node doesn't send all messages to the light node. This is what the NMT is for. I explained it earlier.

Is there something I forgot? I believe Celestia relies on fraud proofs to protect light nodes, and DAS makes sure the fraud proof can be generated. If we need another proof to allow light nodes to check the message by themselves, is there a case where DAS doesn't work well?

Before a light node retrieves the messages of its own namespace (i.e. when doing DAS), I believe it doesn't need to have the concept of a message. The validators should care about this, not the light node (the light node also helps the validator gain confidence that it isn't under a split-view attack). Even if a light node really cares about this, it could eventually retrieve all messages of its namespace and then validate them.

Let's get back to the original specs, which state: "otherwise every node that validates the sanctity of the Celestia coin would need to download all message data". This means we are talking about light nodes, so we should accept a probabilistic result, and the whole network should bear the cost. In fact, even if a node gets into a wrong state, it should be corrected when the next block comes, because the pre-state roots conflict. If Celestia can't trust DAS when handling its own txs and coin, how could we convince people that Celestia light nodes could work for them? Then DAS would have no meaning... (No, I don't think so.)
@HoytRen I'm not sure I see how this is related to this ADR. This ADR is strictly related to the mechanism that allows users to sign over a commitment that links to subtree roots of the message without knowing the square layout ahead of time. I could be wrong, but I'm not sure this specific PR is the best place to answer your questions. Perhaps it would be easier via a different medium?
Yes, you are right. I ended up going outside the scope of this thread. We could discuss related ideas somewhere else later (Discord DM may be a better place, when you are not that busy). Since most of the logic related to ADR003 is implemented and both of us agree that it clearly works (although I feel it has some drawbacks), the design problem isn't urgent. I see there are a lot of "good first issue" items for the celestia-node project that I haven't evaluated. It's better that I check there first.
Not fully ready with the review but submitting my first comments.
The main issue with that requirement is that users must know the relevant subtree roots before they sign, which is problematic considering that if the block is not organized perfectly, the subtree roots will include data unknown to the user at the time of signing.

To fix this, the spec outlines the “non-interactive default rules”. These involve a few additional **default but optional** message layout rules that enable the user to follow the above block validity rule, while also not interacting with a block producer. They allow commitments to messages to consist entirely of subtree roots of the data hash, and allow those subtree roots to be generated only from the message itself (so that the user can sign something “non-interactively”).
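As a rough sketch of why such a commitment can be produced without talking to a block producer, the example below derives subtree roots from the message shares alone. Everything here is an assumption for illustration: a plain SHA-256 binary Merkle tree stands in for the namespaced Merkle tree, hashing the concatenated roots is a placeholder for however the real share commitment combines them, and the function names are hypothetical.

```python
import hashlib


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary Merkle tree; len(leaves) must be a power of two."""
    # Distinct prefixes for leaves (0x00) and inner nodes (0x01).
    level = [hashlib.sha256(b"\x00" + leaf).digest() for leaf in leaves]
    while len(level) > 1:
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def subtree_roots(shares: list[bytes], k: int) -> list[bytes]:
    """Roots over power-of-two chunks of the message, each at most k shares."""
    roots = []
    pos = 0
    while pos < len(shares):
        size = 1 << (min(len(shares) - pos, k).bit_length() - 1)
        roots.append(merkle_root(shares[pos:pos + size]))
        pos += size
    return roots


def share_commitment(shares: list[bytes], k: int) -> bytes:
    """Placeholder commitment: hash of the concatenated subtree roots."""
    return hashlib.sha256(b"".join(subtree_roots(shares, k))).digest()


# A 7-share message in a square of width 8 yields 3 subtree roots (4 + 2 + 1).
message = [bytes([i]) * 4 for i in range(7)]
print(len(subtree_roots(message, 8)))
print(share_commitment(message, 8).hex())
```

Nothing in `share_commitment` depends on where the message ends up in the square, which is the property the user needs in order to sign ahead of time. The layout rules then ensure that these same chunks line up with subtrees of the row trees.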
> These involve a few additional default but optional message layout rules

This is something that always confuses me. The way I understood this on calls in the past is that this message layout will be verified during ProcessProposal. How is it only a default and, even more so, how is it optional?
It's optional because when other validators are verifying the square, they don't pick where the message starts. That is encoded in the wrapped transactions as a share index.
This way, it's possible for the block validity rule to still be followed, provided that the user signs over a share commitment that matches the subtree roots of a message at an arbitrary starting point in the square.
> All share commitments included in MsgPayForData must consist only of subtree roots of the data square.

Thinking about this, we don't have an integration test for this scenario (although we do have many unit tests for the code that finds subtree roots), so I'll add an issue.
- Refactor `PrepareProposal` to arrange the shares such that each message has the appropriate subtree roots, and so that the metadata connection between transactions and messages is correct.
- Refactor `ProcessProposal` to check for message inclusion using only subtree roots to row roots.
What happens if the defaults are not followed during ProcessProposal? Does the block get rejected?
Not if they don't follow the non-interactive defaults, but yes if they don't follow the block validity rule
> All share commitments included in MsgPayForData must consist only of subtree roots of the data square.

also see the above #673 (comment)
there's no way to do that atm, and it would be quite complicated. So a simpler answer would probably just be, yes.
Co-authored-by: John Adler <[email protected]>
Co-authored-by: Ismail Khoffi <[email protected]>
…age Inclusion Checks (celestiaorg#673)

# ADR 006

#### Non-interactive Defaults, Wrapped Txs, and Subtree Root Message Inclusion Checks

[rendered](https://github.com/celestiaorg/celestia-app/blob/63b0ff3671e20e397b951f3ff82a93d3e36e84c6/docs/architecture/ADR-003-Non-interactive-defaults.md)

closes celestiaorg#598

Co-authored-by: Rootul Patel <[email protected]>
Co-authored-by: CHAMI Rachid <[email protected]>
Co-authored-by: John Adler <[email protected]>
Co-authored-by: Ismail Khoffi <[email protected]>
ADR 006
Non-interactive Defaults, Wrapped Txs, and Subtree Root Message Inclusion Checks
rendered
closes #598