-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add single and mapping struct to integration test for generic extraction (Part 4) #397
base: generic-extraction-row-id-update
Are you sure you want to change the base?
fix: add single and mapping struct to integration test for generic extraction (Part 4) #397
Conversation
f453df0
to
a38724c
Compare
da7559c
to
1f979a4
Compare
7c95a5b
to
869a1ab
Compare
869a1ab
to
93c80de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall very nice work ! My comments are mostly related to the
- external API, how should it be, and I will probably make more comments down the line after this first review
- The genericity of the description of what are we extracting. We can not customize for each case that we support now we have to think about all the use case we want to support later as well and all the combination possible.
@@ -338,7 +339,7 @@ fn value_metadata<const MAX_COLUMNS: usize, const MAX_FIELD_PER_EVM: usize>( | |||
} | |||
|
|||
/// Compute the table information for the value columns. | |||
fn compute_table_info( | |||
pub fn compute_table_info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are doing big changes, let's take a shot at having a better unified API.
For example, the address
chain_id
fields could be directly put inside extra
no ? Since we don't prove the computation of these identifiers, we dont need to have special ordering or whatever.
Then we could add if we want another API endpoint that call this one with explicit chain_id
address
genesis_block
for example since those are the information we put. But we can do that on DQ side it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, in commit c21fe13 I added the similar functions (*_raw
) for this and other identifiers computation which could only pass the extra
argument, the original functions (have contract_address
and chain_id
arguments) call these functions. If we set extra = (contract_address || chain_id || extra)
, the result of two similar functions should be same.
@@ -209,7 +210,7 @@ pub enum SlotInputs { | |||
MappingWithLength(Vec<SlotInput>, u8), | |||
} | |||
|
|||
#[derive(Debug)] | |||
#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize)] | |||
pub struct SlotInput { | |||
/// Slot information of the variable | |||
pub(crate) slot: u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant information, can we extract it and put it separately ?
i.e. a table column's set is represented by (slot, Vec<SlotInput>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but I make this SlotInput
Struct corresponding to ColumnInfo, and seems it makes the conversion complicated. Please correct me if I was wrong, thanks.
mp2-v1/src/api.rs
Outdated
@@ -209,7 +210,7 @@ pub enum SlotInputs { | |||
MappingWithLength(Vec<SlotInput>, u8), | |||
} | |||
|
|||
#[derive(Debug)] | |||
#[derive(Clone, Debug, Eq, PartialEq, Hash, Serialize, Deserialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SlotInput::new()
<-- now that we now there is no bitpacking level, then we can get rid of the bit_offset
argument and just set it to 0 for the moment, until we refactor the circuit to be simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove bit_offset
in the SlotInput
and the new
function in commit a2c4cfc.
MappingValues((MappingValuesExtractionArgs, Option<LengthExtractionArgs>)), | ||
/// Test arguments for single struct extraction | ||
SingleStruct(SingleStructExtractionArgs), | ||
/// Test arguments for mapping struct extraction | ||
MappingStruct((MappingStructExtractionArgs, Option<LengthExtractionArgs>)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to bind MappingValues
and MappingStruct
together, otherwise it will lead to an cambrian explosion of combination down the line.
From what I see the only difference is that the latter has the additional information about the metadata, but we could make it as well for the single value, there is no reason not to, or am i missing stg ?
Same for the SingleValues
and SIngleStruct
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to fix in commit b33cd73:
- Combine both single test cases into one, its table includes
4
single value slots and1
Struct slot now. - Combine the mapping arguments to
MappingExtractionArgs<V>
(V could be Address or LargeStruct). - Update the merge case for the above single columns (4 single values and
1
Struct) andMappingExtractArgs<LargeStruct>
.
/// Test arguments for single struct extraction | ||
SingleStruct(SingleStructExtractionArgs), | ||
/// Test arguments for mapping struct extraction | ||
MappingStruct((MappingStructExtractionArgs, Option<LengthExtractionArgs>)), | ||
Merge(MergeSource), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MergeSource
should be adapted to be able to use the combined enum above, not just SingleValue
and MappingValue
(well it will be after you merge the values andstruct together as i suggest before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this merge case to a more complicated case, it includes the single columns of 4
single values and 1
Struct, and the mapping columns of Struct now. Please correct me if I was wrong.
mp2-v1/src/values_extraction/mod.rs
Outdated
outer_key_id: Option<u64>, | ||
inner_key_id: Option<u64>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bit hard for me to realize what should the API look like before this PR, but this definitely doesn't sound good. We need an API that is generic enough to cover
myVar uint256
myStruct MyStruct
myMapping mapping(uint256 => Address)
myStructMapping mapping(uint256 => MyStruct)
myArraySingle []uint256
myArrayStruct []MyStruct
myMappingArray mapping(uint256 => []MyStruct)
myDoubleMapping mapping(uint256 => mapping(Address => MyStruct))
The last two are the most complicated ones but you see the point.
We should not have to ahve these two outer and inner keys if we don't need them here. It looks to me the StorageSlot
structure because it's recursive could already support giving all the details for that. Is this correct or is it missing some things ?
It's ok we don't support the array and mapping of arrays use case now but we should eventually, so thinking about this in the most generic way should be the goal here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the StorageSlotInfo
is a collection which fields should be used during the values extraction proving process in the integration test storage trie. I think it may be different in DQ (let me think more details in PR #404).
Maybe I should replace this outer_key_id
and inner_key_id
to a vector of mapping key IDs. What do you think?
- Single slot has no mapping key ID (it only has value IDs).
- Mapping slot has one mapping key ID.
- Mapping of mappings slot has both the outer and inner mapping key IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove outer_key_id
and inner_key_id
from StorageSlotInfo in commit 50a58d9.
…' into generic-extraction-integration-test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave some comments but after a while I skipped table_source
and indexing
because I agree with @nikkolasg that the main change is to avoid having specific data structures for each variant: ideally, we should have a data structure for simple variables (where each could be a scalar or a struct), and one for mappings (where the mapping value can be either a simple variable or a struct). So I think it makes sense to review deeply these files only after these changes
…' into generic-extraction-integration-test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good ! Left a few comments tho that I think could make a cleaner approach as suggestion, but let's discuss, it's also LGTM as current code !
} | ||
|
||
/// Common functions for a specific type to interact with the test contract | ||
pub trait SimpleContractValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get rid of the Simple
here ? simple starts to get overloaded.
Maybe ContractManipulator
or ContractController
or stg ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to ContractController
in commit 911aadb.
async fn update_contract_single_values(&self, ctx: &TestContext, contract: &Contract); | ||
|
||
/// Update the mapping values to the test contract. | ||
async fn update_contract_mapping_values(&self, ctx: &TestContext, contract: &Contract); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why two methods different with the same inputs ? From what I see, only one or the other is implemented ?
So we could make the trait simpler no ?
stg like
trait ContractController {
async fn current_values(...) -> Self;
async fn update_contract(&self, ....);
}
wdyt ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the trait functions to current_values
and update_contract
in commit 911aadb.
metadata(&table_info) | ||
} | ||
|
||
fn storage_slots(&self, metadata: &[MetadataGadget]) -> Vec<StorageSlot> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's not related to this PR but it's frankly confusing to see an Array of metadata gadget.
- Gadget name refers usually to circuit, here we're not dealing with any circuits
- The reason there's an array is just to separate the columns that corresponds to a certain EVM word. But a regular
HashMap
works as well. We could have a single structMetadata { words: HashMap<EvmWord, MetadataSubset>
for example.
and that would allow us to avoid doing weird shenanigans with re-ordering the columns according to the EVM words being used (here) etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rename the previous MetadataGadget
to ColumnsMetadata
, but leave build
and assign
functions to MetadataGadget
(empty struct now) in commit 051bf96.
Metadata { words: HashMap<EvmWord, MetadataSubset>
Seems the ColumnsMetadata
(or MetadataSubset) needs to include the all information to extract the EVM value, since one ColumnsMetadata belongs to one leaf node of values extraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't find this super intuitive for the end user. Maybe we could change mp2-v1 APIs (i.e., like these methods) to take as input, in place of ColumnsMetadata
, the description of the table (i.e., Vec<ColumnInfo>
) and the set of ids of columns to be extracted, so that MetadataGadget
(which is a circuit specific data structure, and so should not be directly exposed if possible) can always be constructed inside mp2-v1
? Alternatively, in place of the set of ids of columns to be extracted, we might only provide the evm_word
as input and then the columns to be extracted are the ones with ColumnInfo.evm_word == evm_word
?
This would also allow to avoid having both MetadataGagdet
and the "alias" struct ColumnsMetadata
being publicly exposed, which looks a bit weird to me. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I fix to pass the evm_word
and table_info
to these APIs, and extract the column info of same slot and evm word to build the metadata in internal. Please help review this commit 50a58d9, thanks.
panic!("Wrong slot number"); | ||
} | ||
current_values | ||
.update_contract_single_values(ctx, contract) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the updates are deterministic then couldn't we put that inside the trait implementation directly ? Instead of computing the updates outside of the trait (which requires us to be very specific about which struct / value etc it is) and then just ask the trait to update. Wdyt ? Maybe i'm missing stg obvious here, not sure..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the update value is random (as +1
or -2
), and we don't exactly know if slot (or field of a Struct) is the secondary index (when it's a secondary index update), I combine the both 4
single slots and 1
Struct into this SingleExtractionArgs
and set a random slot_input
as the secondary index (it could be one single slot or a field of Struct), so seems only SingleExtractionArgs
knows which is the secondary index (not the Value which implements the trait).
…assign` functions to `MetadataGadget`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better, thanks for the significant refactoring. Still some smaller changes, but it's getting in a good shape ;)
); | ||
old_table_values.compute_update(&new_table_values[0]) | ||
let input = ExtractionProofInput::Single(ExtractionTableProof { | ||
dimension: TableDimension::Compound, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is TableDimension
still necessary or can we remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I delete the TableDimension
in commit 29fd77b, re-run the integration test and it could work for me.
}; | ||
debug!("MAPPING ZK COLUMNS -> {:?}", columns); | ||
let index_genesis_block = ctx.block_number().await; | ||
let row_unique_id = TableRowUniqueID::Mapping(columns.secondary.identifier()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The row unique id, in case of mappings, should be always computed from the mapping key, independently from whether the mapping key is the secondary index or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry, I fix to the key ID in commit 2d1aa8a.
metadata(&table_info) | ||
} | ||
|
||
fn storage_slots(&self, metadata: &[MetadataGadget]) -> Vec<StorageSlot> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't find this super intuitive for the end user. Maybe we could change mp2-v1 APIs (i.e., like these methods) to take as input, in place of ColumnsMetadata
, the description of the table (i.e., Vec<ColumnInfo>
) and the set of ids of columns to be extracted, so that MetadataGadget
(which is a circuit specific data structure, and so should not be directly exposed if possible) can always be constructed inside mp2-v1
? Alternatively, in place of the set of ids of columns to be extracted, we might only provide the evm_word
as input and then the columns to be extracted are the ones with ColumnInfo.evm_word == evm_word
?
This would also allow to avoid having both MetadataGagdet
and the "alias" struct ColumnsMetadata
being publicly exposed, which looks a bit weird to me. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now! APIs for column gadget looks much better now, thanks for all the hard work on this PR
…' into generic-extraction-integration-test
…' into generic-extraction-integration-test
…on-integration-test
No description provided.