-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add format validation to era2-stats binary #1555
Conversation
Checking for sequentiality and if the content is valid is 2 different things no. I guess you could say we are checking if the files our format specification which includes sequentially. But I would say just saying "validation" is misleading. As we are not checking if the data itself is valid, this PR is checking if our the file follows the format requirements we laid out. Even if the accounts are out of order, the data can still be valid and generate the canonical state trie, so I think the wording is misleading. |
How would you phrase it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are just 2 idea's maybe there is a better one? I just would like it to be more specific so there isn't the implication this verifies we can generate the expected trie. As just because a file isn't sequentially ordered doesn't mean it isn't a valid input to generate the canonical trie
order doesn't matter for that.
We want to ensure sequentially as that is the most optimized case for importing the file era2 file. In the importer we should throw a warning if a file doesn't have it's keys linear etc due to that.
So any phase which is more specific and doesn't imply this is validating the data itself is correct or not I think would be fine.
Updated as suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/// | ||
/// It can be run with following command: | ||
/// | ||
/// ```bash | ||
/// cargo run -p e2store --bin era2-stats --features era2-stats-binary -- <path> | ||
/// cargo run --release -p e2store --bin era2-stats --features era2-stats-binary -- <path> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is just because the operation is slow, so --release
helps speed it up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct.
e2store/src/bin/era2-stats.rs
Outdated
if index + 1 < account.storage_count { | ||
assert_eq!( | ||
storage.len(), | ||
10_000_000, | ||
"Non-final storage entry doesn't have 10M items", | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a minute to grok this (granted, I was unfamiliar with the era format). Maybe it's the double-negative in the error message? A quick comment might help speed up comprehension for era noobs. Something like:
if index + 1 < account.storage_count { | |
assert_eq!( | |
storage.len(), | |
10_000_000, | |
"Non-final storage entry doesn't have 10M items", | |
); | |
} | |
// Validate that every chunk of storage is full, with 10M items (skipping the final chunk) | |
if index + 1 < account.storage_count { | |
assert_eq!( | |
storage.len(), | |
10_000_000, | |
"Non-final storage entry doesn't have 10M items", | |
); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment, removed double-negative and added more details to the message.
What was wrong?
The "era2-stats" binary doesn't check validity of the file.
How was it fixed?
Added validation. The only thing missing is checking that state data actually results in the canonical state trie, but that is covered by
import-state
subcommand of the trin-execution.To-Do