feat: add dummy circuit to final extraction for indexing Data with no provable extraction #415
Conversation
feat: allow for either aggregation-only or tabular-only queries (776bf1a to e110090)
Circuit logic looks good; I think the API needs some changes, and I left some comments about it. The integration test should use a generic off-chain table instead of blockchain data, as we discussed offline. The data to fill this table can be randomly generated, so there is no need to use blockchain data.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct DummyWires {
    #[serde(serialize_with = "serialize", deserialize_with = "deserialize")]
    is_merge: BoolTarget,
I think this flag doesn't need to be an input of the circuit: it's not meaningful in this case, so we can just hardcode it to false in the circuit.
mp2-v1/src/values_extraction/mod.rs
/// Compute the row value digest of one table.
/// The parameter `digests` are computed by `compute_leaf_single_values_digest` or
/// `compute_leaf_mapping_values_digest`.
pub fn compute_table_row_digest(dimension: TableDimension, digests: &[Digest]) -> Digest {
Since this method should be used in cases where we don't prove the provenance of the data, I think it should be more generic and work for any table we might define, mimicking the source-agnostic computation that is done inside the rows tree circuits. More specifically, I think this method should take as input a table description (i.e., the set of columns in the table) and the table itself, and then:
- Compute the digest of each row as D(D(id1||v1)+D(id2||v2)+...)
- Compute the cumulative digest of the table by summing up the digests of each row

Indeed, in the case where we don't prove provenance, we might combine variables from a smart contract in more ways than what we support with provable extraction, so the computation of the digest should work for a generic table. Wdyt?
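A minimal sketch of this generic computation, with hypothetical names (`row_digest`, `table_row_digest`) that are not the actual mp2 API. The stand-ins are assumptions: `Digest` is modeled as a `u64`, `d` uses the standard library's `DefaultHasher` in place of the real Poseidon-based digest D, and wrapping addition stands in for digest addition:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the real `Digest` type (a curve point in mp2); a plain u64
// here so the sketch has no external dependencies.
type Digest = u64;

// Stand-in for the Poseidon-based digest function D.
fn d(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Digest of one row: D(D(id1||v1) + D(id2||v2) + ...), where the ids come
/// from the table description and the values from the row itself.
fn row_digest(column_ids: &[u64], row: &[u64]) -> Digest {
    let sum = column_ids
        .iter()
        .zip(row)
        .map(|(id, v)| {
            let mut buf = id.to_be_bytes().to_vec();
            buf.extend_from_slice(&v.to_be_bytes());
            d(&buf)
        })
        // Wrapping addition stands in for the actual digest addition.
        .fold(0u64, u64::wrapping_add);
    d(&sum.to_be_bytes())
}

/// Cumulative digest of the table: the sum of the per-row digests.
fn table_row_digest(column_ids: &[u64], rows: &[Vec<u64>]) -> Digest {
    rows.iter()
        .map(|r| row_digest(column_ids, r))
        .fold(0u64, u64::wrapping_add)
}
```

Note that the sum-based table digest is independent of row order, matching the source-agnostic accumulation done in the rows tree circuits.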
mp2-v1/src/values_extraction/mod.rs
}

/// Merge the row value digests of multiple tables.
pub fn merge_table_row_digests(
What's the purpose of this function? There should no longer be a merge case when we don't prove the provenance, so there shouldn't be a need to compute the rows digest according to the merge case.
mp2-v1/src/api.rs
/// Compute the metadata hash for a table including no provable extraction data.
/// The input is a metadata digest set by the caller, then generate the metadata hash
/// same as the DB computation.
pub fn no_provable_metadata_hash(metadata_digest: &Digest) -> MetadataHash {
I wouldn't require the caller to compute the metadata digest: I think we can just define a method to compute such a `metadata_digest`. Indeed, in this case the provenance is not proven, so the metadata hash is kind of pointless. For instance, we might simply compute it as D(id1||id2||...||idN), where id1, id2, ..., idN are the identifiers of the columns of the table. So we could define a method like `no_provenance_metadata_digest` which takes as input the set of column identifiers and computes the metadata digest. This method would then just call `no_provenance_metadata_digest`, providing as input the set of column identifiers, which would be requested as input in place of `metadata_digest`. Does it make sense?
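A sketch of the suggested helper. As above, the names and types are assumptions, not the actual API: `Digest` is a `u64` stand-in and `d` (using the standard library's `DefaultHasher`) stands in for the Poseidon-based digest D:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the real `Digest` type.
type Digest = u64;

// Stand-in for the Poseidon-based digest function D.
fn d(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Hypothetical `no_provenance_metadata_digest`: D(id1||id2||...||idN) over
/// the identifiers of the columns of the table.
fn no_provenance_metadata_digest(column_ids: &[u64]) -> Digest {
    let mut buf = Vec::with_capacity(column_ids.len() * 8);
    for id in column_ids {
        buf.extend_from_slice(&id.to_be_bytes());
    }
    d(&buf)
}
```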
mp2-v1/src/final_extraction/api.rs
block_hash: HashOutput,
prev_block_hash: HashOutput,
metadata_digest: Digest,
row_digest: Digest,
I would avoid having the caller compute `metadata_digest` and `row_digest`. We could instead ask the caller to provide the inputs necessary to compute those digests (i.e., the set of column identifiers and the table itself), and then call inside this method the API functions we define to compute these digests (e.g. `no_provenance_metadata_digest` and `compute_row_digest`). Wdyt?
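To illustrate, here is a self-contained sketch of an input type that derives both digests internally from the raw data. Every name here is hypothetical (`NoProvableInput`, the helper functions, the `u64` `Digest` stand-in with `DefaultHasher` in place of the Poseidon-based digest D); it shows only the shape of the proposed API, not the actual mp2 implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-ins: `Digest` as u64, `d` as the digest function D.
type Digest = u64;

fn d(bytes: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

// D(id1||id2||...||idN) over the column identifiers.
fn no_provenance_metadata_digest(column_ids: &[u64]) -> Digest {
    let mut buf = Vec::new();
    for id in column_ids {
        buf.extend_from_slice(&id.to_be_bytes());
    }
    d(&buf)
}

// Per-row digest D(D(id1||v1)+D(id2||v2)+...), wrapping add as a stand-in
// for digest addition.
fn row_digest(column_ids: &[u64], row: &[u64]) -> Digest {
    let sum = column_ids
        .iter()
        .zip(row)
        .map(|(id, v)| {
            let mut buf = id.to_be_bytes().to_vec();
            buf.extend_from_slice(&v.to_be_bytes());
            d(&buf)
        })
        .fold(0u64, u64::wrapping_add);
    d(&sum.to_be_bytes())
}

// Cumulative table digest: sum of the per-row digests.
fn table_row_digest(column_ids: &[u64], rows: &[Vec<u64>]) -> Digest {
    rows.iter()
        .map(|r| row_digest(column_ids, r))
        .fold(0u64, u64::wrapping_add)
}

/// Hypothetical revised input: the caller supplies block hashes plus the raw
/// table data, and both digests are computed internally rather than passed in.
pub struct NoProvableInput {
    pub block_hash: [u8; 32],
    pub prev_block_hash: [u8; 32],
    pub metadata_digest: Digest,
    pub row_digest: Digest,
}

impl NoProvableInput {
    pub fn new(
        block_hash: [u8; 32],
        prev_block_hash: [u8; 32],
        column_ids: &[u64],
        rows: &[Vec<u64>],
    ) -> Self {
        Self {
            block_hash,
            prev_block_hash,
            metadata_digest: no_provenance_metadata_digest(column_ids),
            row_digest: table_row_digest(column_ids, rows),
        }
    }
}
```

With this shape, the caller never handles digests directly, so the digest computation stays consistent with whatever the circuits expect.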
The related document: Indexing Data with No Provable Extraction
Summary
- `compute_table_row_digest` and `merge_table_row_digests` for table row digest computation.
- `run_no_provable_test_cases` to wrap the original test cases to run the final extraction and check with the DB result directly (without values extraction).