
proof splitter #2 — Merged (2 commits, Dec 22, 2023)
src/starkex/proof-splitter.md (new file, +65 lines)

# Proof splitter

Due to gas limitations, an original STARK proof generated by the Stone Prover cannot be verified in a single EVM transaction, so the proof is split into three types: the Main Proof, Trace Merkle Proofs, and FRI Proofs. Typically, this means one Main Proof accompanied by several Trace Merkle Proofs and FRI Proofs.
**Contributor:**

two things here:

  • you're missing an h1 header
  • you're also missing the registration of memory pages no?

**Contributor:**

oh, also this is not going to show up in the book if you don't add it to the summary file :o

**Contributor Author:**

> you're also missing the registration of memory pages no?

Indeed, registration of memory pages is part of the splitter, although the empty bootloader doesn't require submitting main or continuous memory.

I guess I can add this part once I have a better understanding of how the bootloader works and have implemented the registration of continuous memory.
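
To fix ideas, here is a minimal sketch of how the three kinds of split artifacts could be modeled (a toy Rust data model; all type and field names are hypothetical, and the real stark-evm-adapter types and contract calldata are richer):

```rust
/// A Merkle decommitment for one of the trace trees
/// (execution, interaction, or composition trace).
struct TraceMerkleProof {
    commitment: [u8; 32],
    nodes: Vec<(u64, [u8; 32])>, // (queue index, node hash) pairs
}

/// A decommitment for one FRI layer.
struct FriProof {
    layer_commitment: [u8; 32],
    evaluations: Vec<u64>,
}

/// The Main Proof: the part the main verifier contract consumes directly.
struct MainProof {
    trace_commitments: Vec<[u8; 32]>,
    fri_commitments: Vec<[u8; 32]>,
    oods_values: Vec<u64>, // trace mask values and h_i evaluations
}

/// One Stone proof splits into one Main Proof plus several of the others.
struct SplitProofs {
    main: MainProof,
    trace_merkle_proofs: Vec<TraceMerkleProof>,
    fri_proofs: Vec<FriProof>,
}

fn main() {
    let split = SplitProofs {
        main: MainProof {
            trace_commitments: vec![],
            fri_commitments: vec![],
            oods_values: vec![],
        },
        trace_merkle_proofs: vec![],
        fri_proofs: vec![],
    };
    println!(
        "1 main proof, {} trace merkle proofs, {} FRI proofs",
        split.trace_merkle_proofs.len(),
        split.fri_proofs.len()
    );
}
```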


This process depends on the [fact registry](https://zksecurity.github.io/stark-book/starkex/facts.html), which serves as a store of verification snapshots. It allows the Main Proof to be verified against earlier snapshots obtained from the Merkle Proofs and FRI Proofs, thereby circumventing the gas bottleneck of a single EVM transaction.
**Contributor:**

I think you should be able to link to this by just writing something like ./starkex/facts.html instead of an absolute URL (which might change if we change domain name at some point)

**Contributor:**

maybe a good way to explain these could be that they are subproofs that have been verified already :D
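
Building on that framing, a fact registry is essentially a set of hashes of subproofs that have already been verified. A minimal sketch of the pattern, assuming a plain hash set (names hypothetical; the StarkEx contracts expose this as `isValid(bytes32)` on a fact registry):

```rust
use std::collections::HashSet;

/// Toy fact registry: a fact is a 32-byte hash of a verified claim.
struct FactRegistry {
    facts: HashSet<[u8; 32]>,
}

impl FactRegistry {
    fn new() -> Self {
        Self { facts: HashSet::new() }
    }

    /// Called by a statement contract after it has fully verified a Merkle
    /// or FRI subproof: the result is persisted as a snapshot.
    fn register_fact(&mut self, fact: [u8; 32]) {
        self.facts.insert(fact);
    }

    /// Called later by the main-proof verifier: instead of re-verifying the
    /// subproof, it only checks that the snapshot exists.
    fn is_valid(&self, fact: &[u8; 32]) -> bool {
        self.facts.contains(fact)
    }
}

fn main() {
    let mut registry = FactRegistry::new();
    let fact = [7u8; 32]; // stand-in for hash(commitment, queries, ...)
    registry.register_fact(fact);
    assert!(registry.is_valid(&fact));
}
```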


This section aims to clarify the roles of the various proof types, showing how the split proof works with references to the source code in [Cairo verifier contracts](https://github.com/starkware-libs/starkex-contracts), [Stone Prover](https://github.com/starkware-libs/stone-prover) and [Stark EVM Adapter](https://github.com/zksecurity/stark-evm-adapter).

## Main proof
**Contributor:**

this is an h3 header instead of an h2


The Main Proof performs two key functions:
**Contributor:**

you haven't defined what "main proof" means here at this point.

1. Deriving a deep composition polynomial $p_0$
**Contributor:**

this is the first time you talk about the deep composition polynomial but you haven't defined it. I guess we should either have a short description here, or a link to some other section that talks more about that :o

**Contributor Author (@katat, Dec 23, 2023):**

created a backlog :)
#3

2. Checking FRI layers for $p_0$

> This explanation reuses the notations from the [ethSTARK paper](https://eprint.iacr.org/2021/582.pdf).

The Main Proof contains the commitments [for](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout6/StarkVerifier.sol#L513) [the](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout6/StarkVerifier.sol#L524) [traces](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout6/StarkVerifier.sol#L534) and [FRI](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout6/StarkVerifier.sol#L553), as well as the trace values for deriving the polynomial $p_0$.
**Contributor:**

all these links lead to the same page for me:

*(screenshot: 2023-12-22 at 12:31:50 PM)*


These trace values, also referred to as mask values, can be categorized into two types:

1. Trace values involved in expressing $C_j(x)$ in the following composition polynomial:

$$h(x) = \sum_{j=1}^{k} C_j(x) (\alpha_j x^{D-D_j^{-1}} + \beta_j)$$

2. Evaluations of the polynomials $h_i(x)$ decomposed from $h(x)$:

$$h(x) = \sum_{i=0}^{M_2-1} x^i h_i(x^{M_2})$$
**Contributor (@mimoo, Dec 22, 2023):**

so the verifier wants to check that a commitment to $h = p_0$ is computed correctly based on the constraints and the registers. To do that they check it at a random point $z$. In other words, they check

$$h(z) == \sum_{j=1}^{k} C_j(z) (\alpha_j z^{D-D_j^{-1}} + \beta_j)$$

Since $h$ is actually given as chunks $h_i$ instead (such that $h(x) = \sum_{i=0}^{M_2-1} x^i h_i(x^{M_2})$) the check is actually the following:

$$\sum_{i=0}^{M_2-1} z^i h_i(z^{M_2}) == \sum_{j=1}^{k} C_j(z) (\alpha_j z^{D-D_j^{-1}} + \beta_j)$$

To perform this check, the verifier computes the left-hand side and the right-hand side separately, then checks that they match.

To compute the left-hand side, they use FRI as a PCS to obtain an evaluation proof of the different $h_i(z^{M_2})$.

To compute the right-hand side, they do the same but with the $C_j(z)$.

Is that it?

**Contributor Author:**

I think your reasoning on the equations is correct. Thanks for providing the detailed math. I will add it to that part.

IIUC it is mainly for checking the consistency between the mask values of the execution trace and the evaluations of the composition trace. Passing the consistency check implies the prover holds the correct trace values.

But it wants to improve the soundness using FRI. That is why $p_0$ is derived from these OODS values and gets tested by FRI. So I think $h \neq p_0$.

**Contributor Author:**

The polynomial for the execution trace, which involves two rows of mask values, is

$$h(x) = \sum_{j=1}^{k} C_j(x) (\alpha_j x^{D-D_j^{-1}} + \beta_j)$$

while the one for the composition trace, which involves a single row, is

$$h(x) = \sum_{i=0}^{M_2-1} x^i h_i(x^{M_2})$$

I think the OODS is mainly testing whether these two traces are consistent.

But for the FRI low-degree test, it needs to use induced mask values to come up with a quotient polynomial $p_0$ of degree $d(h) - 1$. If FRI fails, then the quotient polynomial $p_0$ is incorrect, implying the OODS mask values are incorrect and thus the consistency check is actually unsound.

**Contributor:**

ah yeah, the first OODS test only checks that $h$ (or rather the chunks $h_i$) is correctly computed, meaning that the chunks correctly encode the values you get by aggregating all the constraint polynomials (which are quotients) with random challenges.

But I don't think it's enough, because originally you wanted to prove that all these quotient polynomials exist and are of some maximum degree, so you still need to do that, but on the aggregated polynomials $h_i$ this time.

In other words, thanks to the OODS check, we reduced the problem of checking that each constraint is correct to checking that a single polynomial (called the composition polynomial) is correct.


By invoking the [CpuConstraintPoly](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout5/CpuVerifier.sol#L364) contract to evaluate $h(x)$ with the trace mask values, the verifier checks the out-of-domain-sampling (OODS) consistency between these provided trace values and the evaluations of the decomposed composition polynomial.
**Contributor:**

first mention of OODS which needs to be defined (or point to a link that explains what that is :D)

**Contributor Author:**

#4
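
To make the consistency check concrete, here is a toy version of the equality discussed in the thread above, over a small illustrative prime field (the real check runs in the STARK field, with the $C_j(z)$ evaluated by the CpuConstraintPoly contract; all values below are made up):

```rust
// Toy OODS consistency check: LHS recombines the h_i chunks at z,
// RHS aggregates the constraint evaluations C_j(z).
const P: u128 = (1 << 61) - 1; // a Mersenne prime, not the real STARK field

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn mul(a: u128, b: u128) -> u128 { (a * b) % P }
fn pow(mut b: u128, mut e: u128) -> u128 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 { acc = mul(acc, b); }
        b = mul(b, b);
        e >>= 1;
    }
    acc
}

/// Left-hand side: sum_i z^i * h_i(z^{M_2}).
fn lhs(z: u128, h_at_z_m2: &[u128]) -> u128 {
    h_at_z_m2.iter().enumerate()
        .fold(0, |acc, (i, &hi)| add(acc, mul(pow(z, i as u128), hi)))
}

/// Right-hand side: aggregate the constraint evaluations C_j(z) with the
/// random coefficients (alpha_j, beta_j) and degree-adjustment exponents.
fn rhs(z: u128, c_at_z: &[u128], coeffs: &[(u128, u128)], deg_adj: &[u128]) -> u128 {
    (0..c_at_z.len()).fold(0, |acc, j| {
        let (alpha, beta) = coeffs[j];
        let adjust = add(mul(alpha, pow(z, deg_adj[j])), beta);
        add(acc, mul(c_at_z[j], adjust))
    })
}

fn main() {
    let z = 123_456_789;
    let right = rhs(z, &[42, 99], &[(3, 5), (7, 11)], &[4, 2]);
    // With M_2 = 1 the decomposition is trivial: h_0(z) must equal h(z),
    // so build a consistent instance and check it.
    let left = lhs(z, &[right]);
    assert_eq!(left, right, "OODS consistency check failed");
    println!("OODS consistency check passed");
}
```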


After completing the OODS consistency check, it proceeds to prepare and verify the FRI layers for $p_0$.
**Contributor:**

to prepare* and verify*


Conducting the FRI test requires the evaluations of $p_0$ to come from the expected low-degree-extension domain rather than from out-of-domain points (todo: maybe it is more appropriate to say field extension for the out of domain here?). The FRI protocol cannot answer queries at out-of-domain evaluation points: the prover only commits to evaluations over the LDE domain, since committing to evaluations over the entire (very large) field would be computationally infeasible.
**Contributor:**

> requires the evaluation domain of $p_0$ to be within the expected domain for low degree linear extension

not sure what that means :D

**Contributor:**

(at this point we don't care if p_0 is an LDE or not, most likely it isn't)

**Contributor:**

> The FRI protocol cannot process queries for out-of-domain evaluation points due to computational feasibility constraints for the prover

mmm, I'm not sure what this means either. What does "process a query" mean here? Do you mean that during the query phase of FRI the points have to be points that are committed to in the merkle tree, and since the field has a very large domain it would take too much work to commit to all evaluations (what is the size of the STARK field?)

**Contributor Author:**

> Do you mean during the query phase of FRI the points have to be points that are committed to in the merkle tree, and since the field has a very large domain it would take too much work to commit to all evaluations (what is the size of the STARK field?)

That is what I am trying to convey. I am trying to discuss the out-of-domain issue when applying FRI, as explained in the Thaler lecture: https://www.youtube.com/watch?v=A3edAQDPnDY&t=5255s
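
A toy illustration of that point: an opening can only be produced for positions the prover actually committed to, and the commitment covers only the small LDE domain rather than the whole field (all types and values here are illustrative):

```rust
use std::collections::HashMap;

/// Toy commitment: the prover commits only to evaluations over the small
/// LDE domain, never over the whole field (which is far too large).
struct CommittedEvaluations {
    evals: HashMap<u64, u64>, // x -> p(x), only for x in the LDE domain
}

impl CommittedEvaluations {
    /// A FRI query can only be answered (with a Merkle path, in the real
    /// protocol) if the point was committed to in the first place.
    fn open(&self, x: u64) -> Option<u64> {
        self.evals.get(&x).copied()
    }
}

fn main() {
    // A small multiplicative coset standing in for the LDE domain.
    let domain: Vec<u64> = (0u32..8).map(|i| 3u64.pow(i) % 17).collect();
    let committed = CommittedEvaluations {
        evals: domain.iter().map(|&x| (x, (x * x + 1) % 17)).collect(),
    };

    assert!(committed.open(domain[2]).is_some()); // in-domain query: answerable
    assert!(committed.open(1234).is_none()); // out-of-domain: no opening exists
}
```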


Therefore, the trace values and the evaluations of the decomposed composition polynomial should be induced from a domain aligned with the low-degree extension. These newly induced values, as opposed to those used for the OODS consistency check, are used to derive the [deep composition polynomial](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout5/StarkVerifier.sol#L429) $p_0$ through the [CpuOods contract](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout5/CpuOods.sol#L36):
**Contributor:**

I admit I don't understand this paragraph either, I think there is a lot to unpack here:

  1. the trace values and evaluations of the decomposed composition polynomial should be induced from a domain aligned with a low-degree extension -> not sure what you mean here, I think I understand that the random evaluation point should be sampled from the main multiplicative subgroup used in the protocol?
  2. These newly induced values, as opposed to those used for the OODS consistency check, are utilized to derive the deep composition polynomial -> what is the OODS consistency check? how are they used to derive the composition polynomial?

**Contributor Author:**

Maybe my comments above help with this part? Let me know if not.
I guess I really need to dive into the field extension and subgroup stuff to explain this part better.


$$p_0(x) = \sum_{\ell=0}^{M_1-1} \frac{\gamma_\ell \cdot (f_{j_\ell}(x) - y_\ell)}{x - z g^{e_\ell}} + \sum_{i=0}^{M_2-1} \frac{\gamma_{M_1+i} \cdot (h_i(x) - \hat{y}_i)}{x - z^{M_2}}$$

Here $f_{j_\ell}$ is a trace polynomial and $h_i$ is a decomposed composition polynomial; these are the same polynomials that produced the OODS values, but evaluated over a different domain.
**Contributor:**

I'm not sure I understand how p_0 is computed here :D

**Contributor:**

what is f_{jl} ?

**Contributor (@mimoo, Dec 22, 2023):**

oh OK, I think there's a typo here, it should be $f_{\ell}$.

Now I think I understand this specific $p_0$: it is FRI used as a PCS. You're aggregating $M_1 + M_2$ evaluation proofs, proving that $f_i(zg^e_i) = y_i$ for all $i$ and $h_j(z^{M_2}) = \hat{y}_j$ for all $j$.

(what is $zg^e_l$)

**Contributor:**

I think you should explain more clearly what is happening here. You need to have evaluations of:

  • the different registers $f_i$ at the random point $z$, so that you can evaluate all the constraints at $z$, so that you can evaluate $h$ at $z$ (LHS in my other comment)
  • the different $h_i$ at the random point $z$ so that you can evaluate $h$ at $z$ (RHS in my other comment)

so you use FRI as a PCS, and in addition aggregate all of these evaluation proofs together as an optimization.

Doing this means that the prover has to prove that $p_0$ exists and is of degree $d$ (for some $d$). During FRI the verifier will obtain different evaluations of $p_0$. For example, if they want an evaluation of $p_0(3)$, the prover would have to produce merkle proofs for $f_i(3)$ for all $i$ and $h_j(3)$ for all $j$.
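
Putting the formula into code, here is a toy evaluation of $p_0$ at a single query point $x$, given the decommitted values $f_{j_\ell}(x)$ and $h_i(x)$ together with the OODS values $y_\ell$ and $\hat{y}_i$ (small illustrative field; made-up inputs):

```rust
// Toy evaluation of the deep composition polynomial p_0 at one query point x.
const P: u128 = (1 << 61) - 1; // a Mersenne prime, not the real STARK field

fn mul(a: u128, b: u128) -> u128 { (a * b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b) % P } // assumes a, b < P
fn pow(mut b: u128, mut e: u128) -> u128 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 { acc = mul(acc, b); }
        b = mul(b, b);
        e >>= 1;
    }
    acc
}
fn inv(a: u128) -> u128 { pow(a, P - 2) } // Fermat inverse, a != 0

/// gammas: aggregation challenges (length M_1 + M_2).
/// trace_terms: (f_{j_l}(x), y_l, z * g^{e_l}) per trace-mask item.
/// comp_terms: (h_i(x), y_hat_i) per decomposed composition column.
/// z_m2: z^{M_2}.
fn deep_p0(
    x: u128,
    gammas: &[u128],
    trace_terms: &[(u128, u128, u128)],
    comp_terms: &[(u128, u128)],
    z_m2: u128,
) -> u128 {
    let m1 = trace_terms.len();
    let mut acc = 0;
    // First sum: one quotient per trace-mask item (column j_l, shift g^{e_l}).
    for (l, &(f_x, y, zg)) in trace_terms.iter().enumerate() {
        acc = (acc + mul(gammas[l], mul(sub(f_x, y), inv(sub(x, zg))))) % P;
    }
    // Second sum: one quotient per decomposed composition polynomial h_i.
    for (i, &(h_x, y_hat)) in comp_terms.iter().enumerate() {
        acc = (acc + mul(gammas[m1 + i], mul(sub(h_x, y_hat), inv(sub(x, z_m2))))) % P;
    }
    acc
}

fn main() {
    let p0 = deep_p0(5, &[2, 3, 7], &[(10, 4, 9), (20, 6, 11)], &[(30, 8)], 13);
    println!("p_0(5) = {p0}");
}
```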


In the first FRI layer computation stage, the contract reads these induced values based on FRI queries and decommits them from the main proof. Three traces are required for the Cairo verifier: [execution trace](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/StarkVerifier.sol#L395), [interaction trace](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/StarkVerifier.sol#L405) and [composition trace](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/StarkVerifier.sol#L415). Each decommitment checks against a [trace commitment](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/StarkVerifier.sol#L400), which should have been verified and registered as [a fact](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/MerkleStatementVerifier.sol#L47) in the Merkle fact registry.

These facts serve as verification [snapshots](https://zksecurity.github.io/stark-book/starkex/facts.html), which are crucial when a verification needs to be split into multiple stages. Splitting is necessary for various reasons, such as circumventing gas-cost limitations or saving gas on transaction retries by resuming from a verified snapshot.

Similarly, the process of checking the FRI layers depends on the FRI Merkle verification snapshots obtained earlier. It involves verifying whether the [fact is registered](https://github.com/starkware-libs/starkex-contracts/blob/f4ed79bb04b56d587618c24312e87d81e4efc56b/evm-verifier/solidity/contracts/cpu/layout5/Fri.sol#L93) with the FRI layer's commitment.

## Other split proofs
In order to provide the verification snapshots for the Main Proof, it is essential to first complete both the Trace Merkle Proofs and the FRI Merkle Proofs; these preliminary steps generate the snapshots the Main Proof relies on.

The [MerkleStatementContract](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/MerkleStatementContract.sol#L15) and [FriStatementContract](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/FriStatementContract.sol#L23) are specifically designed for verifying the Trace Merkle Proofs and FRI Merkle Proofs, respectively. Upon successful verification, [these](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/FriStatementContract.sol#L85) [facts](https://github.com/starkware-libs/starkex-contracts/blob/f81ba5fdbd68516db50ea9679de9d0ac2f8049d8/evm-verifier/solidity/contracts/MerkleStatementContract.sol#L104) will be registered as verification snapshots.
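
The resulting submission order might look as follows (a sketch only; the function names are hypothetical stand-ins for calls into MerkleStatementContract, FriStatementContract, and the main verifier):

```rust
// Sketch of the submission order for a split proof.
struct Chain; // stand-in for a connection to the EVM contracts

impl Chain {
    fn verify_merkle_statement(&self, _proof: &[u8]) { /* registers a fact */ }
    fn verify_fri_statement(&self, _proof: &[u8]) { /* registers a fact */ }
    fn verify_main_proof(&self, _proof: &[u8]) { /* looks up registered facts */ }
}

fn main() {
    let chain = Chain;
    let trace_merkle_proofs: Vec<Vec<u8>> = vec![vec![1], vec![2], vec![3]];
    let fri_proofs: Vec<Vec<u8>> = vec![vec![4], vec![5]];
    let main_proof: Vec<u8> = vec![6];

    // 1. Each Trace Merkle Proof is verified in its own transaction and
    //    registers a verification snapshot (a fact).
    for p in &trace_merkle_proofs { chain.verify_merkle_statement(p); }
    // 2. Same for each FRI Merkle Proof.
    for p in &fri_proofs { chain.verify_fri_statement(p); }
    // 3. Only then does the Main Proof go through: it checks the registered
    //    facts instead of redoing the Merkle/FRI work, staying under the gas limit.
    chain.verify_main_proof(&main_proof);
}
```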

## Proof annotations
With the original proof file obtained from the Stone Prover, it is necessary to run its CPU verifier to [generate the annotations](https://github.com/starkware-libs/stone-prover/blob/a78ff37c1402dc9c3e3050a1090cd98b7ff123b3/src/starkware/main/verifier_main_helper.cc#L36-L45). This process is akin to running an offline simulator: it verifies the proof and emits the proof data required by downstream verifiers as annotations, which are then consumed elsewhere, such as by the EVM Cairo verifier.

The primary reason for generating these annotations, rather than extracting the data directly from the original proof, is to restructure the original proof into a clearer format. This restructuring is particularly beneficial for other verification procedures, such as those employed by EVM Cairo verifiers.

[Here](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L2) is an example of an annotated proof. It showcases how annotations are combined with the original proof into a single JSON file for further processing. The annotations themselves represent various data points, including [different](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L5) [trace](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L12) [commitments](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L14), the [challenges](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L6-L11) [from](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L13) the [verifier](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L15), the [mask values](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L16-L30) for out-of-domain-sampling (OODS) consistency checks, and FRI [commitments](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L288-L301) with corresponding [trace decommitments](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L335-L349) for [queries](https://github.com/zksecurity/stark-evm-adapter/blob/f4f2c88bd30c157423f67564d9ea3481b70c0a3c/tests/fixtures/annotated_proof.json#L302-L334).
**Contributor:**

I'm not a fan of these massive list of links :D I think having snippets and names of functions (for example what I did here https://zksecurity.github.io/stark-book/starkex/facts.html and here https://zksecurity.github.io/stark-book/starkex/starkex.html with the different links and snippets) is cleaner
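
As a rough picture, an annotated proof simply bundles the raw Stone proof with the annotation lines (a hypothetical shape with illustrative field names and abridged annotation strings; see the linked annotated_proof.json for the real layout):

```rust
/// Hypothetical shape of an annotated proof file; field names are
/// illustrative, see annotated_proof.json in stark-evm-adapter for the
/// real layout.
struct AnnotatedProof {
    /// The raw proof bytes produced by the Stone Prover, hex-encoded.
    proof_hex: String,
    /// One line per annotation from Stone's CPU verifier: trace commitments,
    /// verifier challenges, OODS mask values, FRI commitments, and the
    /// decommitments for the FRI queries.
    annotations: Vec<String>,
}

fn main() {
    let annotated = AnnotatedProof {
        proof_hex: "0x01ab...".into(),
        // Annotation lines are abridged and purely illustrative here.
        annotations: vec![
            "/STARK/Original/Commit on Trace: Hash(0x...)".into(),
            "/STARK/Out Of Domain Sampling/Trace Elements: ...".into(),
        ],
    };
    println!("proof {} with {} annotation lines",
        annotated.proof_hex, annotated.annotations.len());
}
```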



## Open source library
**Contributor:**

same comment on h3 that should be h2 (and previous header as well)

The [Stark EVM Adapter](https://github.com/zksecurity/stark-evm-adapter) provides an API to facilitate the splitting of original proofs from the Stone Prover into these segmented proofs. This library is particularly useful for developers working with STARK proofs in EVM environments. Included in the repository is a demo script that illustrates the complete process of [splitting a Stone proof](https://github.com/zksecurity/stark-evm-adapter/blob/20cb1a83ddcbbd092f8aa6cf3382383b1c0e9814/examples/verify_stone_proof.rs#L68). This script demonstrates not only how to split the proof but also how to [submit the resultant proofs to the Cairo verifier contracts](https://github.com/zksecurity/stark-evm-adapter/blob/20cb1a83ddcbbd092f8aa6cf3382383b1c0e9814/examples/verify_stone_proof.rs#L71-L96).
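
A condensed version of that flow might look like this (the function names are illustrative stand-ins rather than the real stark-evm-adapter API; see examples/verify_stone_proof.rs for the actual calls):

```rust
// Illustrative end-to-end flow; the real entry points live in
// stark-evm-adapter's examples/verify_stone_proof.rs.
fn main() {
    // 1. Load the annotated proof JSON (original Stone proof + annotations).
    let json = std::fs::read_to_string("annotated_proof.json").expect("read proof file");

    // 2. Split it into the Main Proof, Trace Merkle Proofs, and FRI Proofs.
    let (main_proof, trace_merkle_proofs, fri_proofs) = split_proof(&json);

    // 3. Submit the Merkle and FRI statements first (registering facts),
    //    then the Main Proof, mirroring the order described above.
    for p in trace_merkle_proofs { submit_merkle_statement(p); }
    for p in fri_proofs { submit_fri_statement(p); }
    submit_main_proof(main_proof);
}

// Hypothetical stand-ins for the adapter's splitting and submission helpers.
fn split_proof(_annotated_json: &str) -> (Vec<u8>, Vec<Vec<u8>>, Vec<Vec<u8>>) {
    (vec![], vec![], vec![])
}
fn submit_merkle_statement(_p: Vec<u8>) {}
fn submit_fri_statement(_p: Vec<u8>) {}
fn submit_main_proof(_p: Vec<u8>) {}
```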