Optimize migration by adding ability to read or extract payloads from state #5386
Conversation
Nice! Thank you for adding this 🙏
Added two flags to execution state extraction program:
- `--extract-payloads-by-address` produces a file of payloads for specified accounts or for all accounts, instead of checkpoint files
- `--use-payload-as-input` uses a payload file as input, instead of checkpoint files

The two new flags don't affect migration and other existing functionality of the state extraction program. These two options only affect the input and output of the state extraction program. In other words, this can be used to extract migrated payloads or to extract as-is payloads for specified accounts.
Force-pushed from 2383569 to 85527b8
Co-authored-by: Bastian Müller <[email protected]>
Codecov Report

```
@@            Coverage Diff             @@
##           master    #5386      +/-   ##
==========================================
+ Coverage   55.97%   56.02%   +0.04%
==========================================
  Files        1024     1026       +2
  Lines       99327    99859     +532
==========================================
+ Hits        55602    55943     +341
- Misses      39483    39617     +134
- Partials     4242     4299      +57
```

View full report in Codecov by Sentry.
Refactored payload file related functionality to be more reusable. Added flags:
- `--input-payload-filename`
- `--output-payload-filename`
This utility can be used to create a subset of execution state which can save time during development, testing, and support/troubleshooting.
Add utility to extract payloads by addresses
Nice optimization. I left a few comments about data validation to discuss.
```go
	return fmt.Errorf("cannot generate payloads file: %w", err)
}

log.Info().Msgf("Exported %d payloads out of %d payloads", exportedPayloadCount, len(payloads))
```
We break the state extraction into 2 steps: the first step outputs all payloads into a file after the first migration, and the second step reads all the payloads from that file and continues running the remaining migrations and state extraction.

I wonder how we can make the process verifiable in case someone is trying to reproduce the same process and wants to validate each intermediate step. Basically, we need a way to guarantee and prove that the payloads did not change when reading them back and continuing with the remaining migrations and state extraction.

One idea I had is to compute some hash (or checksum) of all the payloads before writing them to disk, and validate it after reading and decoding the payloads.

Thoughts?
> I wonder how we can make the process verifiable in case someone is trying to reproduce the same process and wants to validate each intermediate step. Basically, we need a way to guarantee and prove that the payloads did not change when reading them back and continuing with the remaining migrations and state extraction.
Making this verifiable "in case someone is trying to reproduce the same process" is not the primary use case now (this PR is to help speed up development/testing) but it makes sense to add a hash for other use cases. 👍
> One idea I had is to compute some hash (or checksum) of all the payloads before writing them to disk, and validate it after reading and decoding the payloads.
>
> Thoughts?
Due to parallelism, the payloads within the output file may not be in the same sequence each time this program is used. In practice, the payload file would still produce the same checkpoint when used as input (as long as the sequence of payloads inside is the only thing that changed and the contents of all payloads are unchanged).
I'll merge this PR and open a separate PR to address the "reproduce the same process" use case by (see the sketch after this list):
- making the payloads have a deterministic sequence in the output payload file
- computing and adding a hash to the payload file representing all the payloads
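For illustration, here is a minimal sketch of those two follow-ups: sorting payloads into a deterministic order before hashing, so the checksum is independent of the order produced by parallel migration workers. The `Payload` type and its fields are simplified stand-ins, not flow-go's actual ledger payload API.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// Payload is a simplified stand-in for a ledger payload (key + value).
type Payload struct {
	Key   []byte
	Value []byte
}

// payloadsChecksum sorts payloads by key to get a deterministic sequence,
// then hashes length-prefixed key/value pairs.
func payloadsChecksum(payloads []Payload) [32]byte {
	sorted := make([]Payload, len(payloads))
	copy(sorted, payloads)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].Key, sorted[j].Key) < 0
	})

	h := sha256.New()
	for _, p := range sorted {
		// Length-prefix each field so (key, value) boundaries are unambiguous.
		h.Write([]byte(fmt.Sprintf("%d:", len(p.Key))))
		h.Write(p.Key)
		h.Write([]byte(fmt.Sprintf("%d:", len(p.Value))))
		h.Write(p.Value)
	}

	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

func main() {
	a := []Payload{{Key: []byte("k1"), Value: []byte("v1")}, {Key: []byte("k2"), Value: []byte("v2")}}
	b := []Payload{a[1], a[0]} // same payloads, different order

	fmt.Println(payloadsChecksum(a) == payloadsChecksum(b)) // true: order-independent
}
```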
```go
require.NoError(t, err)
require.Equal(t, len(payloads), numOfPayloadWritten)

payloadsFromFile, err := util.ReadPayloadFile(zerolog.Nop(), payloadFileName)
```
I think this test case assumes the util that reads the payload file is the same as the util that produces it.

This test case is necessary; however, in the real use case we will be using two utils with different versions. I think it's necessary to verify that too.

One idea is to commit a payload file fixture produced by the previous version, and add a test case to read the payload file and verify it with the current version. Thoughts?
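A rough sketch of what such a fixture-based test could look like. Only `util.ReadPayloadFile` and its signature are taken from this PR's tests; the import path, fixture path, and expected count are assumptions.

```go
package extract_test

import (
	"testing"

	"github.com/rs/zerolog"
	"github.com/stretchr/testify/require"

	// Import path is an assumption, based on where payload file utils appear to live.
	"github.com/onflow/flow-go/cmd/util/ledger/util"
)

// TestReadPayloadFileFixture reads a payload file committed to the repo that
// was produced by a previous version of the writer, and verifies the current
// reader can still decode it.
func TestReadPayloadFileFixture(t *testing.T) {
	const fixturePath = "testdata/payloads_v1.bin" // hypothetical committed fixture
	const expectedPayloadCount = 10                // hypothetical

	payloads, err := util.ReadPayloadFile(zerolog.Nop(), fixturePath)
	require.NoError(t, err)
	require.Equal(t, expectedPayloadCount, len(payloads))
}
```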
> One idea is to commit a payload file fixture produced by the previous version, and add a test case to read the payload file and verify it with the current version. Thoughts?
There is only one version of the payload file, so I'm not sure what is meant by "a payload file fixture produced by previous version". If you are referring to the payload version `payloadEncodingVersion`, we only use `payloadEncodingVersion = 1`; no other version is supported.
```
@@ -78,6 +109,18 @@ func run(*cobra.Command, []string) {
		return
	}

	if len(flagBlockHash) == 0 && len(flagStateCommitment) == 0 && len(flagInputPayloadFileName) == 0 {
```
I think we should remove this check; the block hash and state commitment should always be verified.

Even though the check was performed when the payload file was generated, the payload file itself does not include any information about the previously performed checks, and we are skipping them here.
> I think we should remove this check; the block hash and state commitment should always be verified.
Maybe we should keep this check because it ensures that the user provides one of these:
- either `flagBlockHash` or `flagStateCommitment` for checkpoint file input
- `flagInputPayloadFileName` for payload file input

If none of these are provided, the program doesn't have input to extract state, so it should exit early (see the sketch after this list).
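In other words, the check is an input-presence guard rather than a data-integrity check. A standalone sketch of that logic follows; the CLI flag names in the error message are inferred from the variable names in the diff and are not confirmed against the actual source.

```go
package main

import (
	"fmt"
	"os"
)

// validateInput ensures at least one input source was specified: a checkpoint
// (identified by block hash or state commitment) or a payload file.
func validateInput(flagBlockHash, flagStateCommitment, flagInputPayloadFileName string) error {
	if len(flagBlockHash) == 0 && len(flagStateCommitment) == 0 && len(flagInputPayloadFileName) == 0 {
		return fmt.Errorf("no input specified: provide --block-hash, --state-commitment, or --input-payload-filename")
	}
	return nil
}

func main() {
	if err := validateInput("", "", ""); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // exit early: nothing to extract state from
	}
}
```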
```go
if len(exportedAddresses) == 0 {
	outputMsg = fmt.Sprintf("exporting all payloads to %s", flagOutputPayloadFileName)
} else {
	outputMsg = fmt.Sprintf(
```
Now an output payload file can possibly include only a subset of all the addresses, which means a payload file can possibly include a subset of the world state. This is useful for debugging, but it also introduces a chance of a wrong payload file, one that includes only a subset of the world state, being used for state extraction.

Yes, we have the payload count encoded in the payload file, but it's only used for validating file integrity, not for ensuring the payloads include the full set of the world state.

Should we add something to the payload file to ensure that a payload file generated from a subset of world state will never be used for state extraction?

Thoughts?
> This is useful for debugging, but it also introduces a chance of a wrong payload file, one that includes only a subset of the world state, being used for state extraction.
Great point! 👍
I'll modify the payload file format to include a flag indicating whether payloads represent partial/full state. I'll also add a flag to the state extraction program to explicitly allow partial state as input, so we reduce the chance of user error (see the sketch below).

These changes will be in a separate PR, combined with the hashed payload file mentioned earlier.
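A minimal sketch of what that follow-up could look like, assuming a simple header layout and a hypothetical `--allow-partial-state-from-payloads` flag name; the actual file format and flag may differ.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// header is a hypothetical payload file header; the real format may differ.
type header struct {
	Version      uint16
	PartialState bool   // true if the file holds only a subset of world state
	PayloadCount uint64 // payload count, already used for integrity checks
}

func encodeHeader(h header) []byte {
	buf := make([]byte, 11)
	binary.BigEndian.PutUint16(buf[0:2], h.Version)
	if h.PartialState {
		buf[2] = 1
	}
	binary.BigEndian.PutUint64(buf[3:11], h.PayloadCount)
	return buf
}

func decodeHeader(buf []byte) (header, error) {
	if len(buf) < 11 {
		return header{}, errors.New("payload file header too short")
	}
	return header{
		Version:      binary.BigEndian.Uint16(buf[0:2]),
		PartialState: buf[2] == 1,
		PayloadCount: binary.BigEndian.Uint64(buf[3:11]),
	}, nil
}

// checkInput mirrors the proposed guard: state extraction rejects
// partial-state files unless the user explicitly opts in via the
// (hypothetical) --allow-partial-state-from-payloads flag.
func checkInput(h header, allowPartialState bool) error {
	if h.PartialState && !allowPartialState {
		return errors.New("payload file contains partial state; pass --allow-partial-state-from-payloads to use it")
	}
	return nil
}

func main() {
	h, _ := decodeHeader(encodeHeader(header{Version: 1, PartialState: true, PayloadCount: 42}))
	fmt.Println(checkInput(h, false)) // rejected: partial state not allowed
	fmt.Println(checkInput(h, true))  // <nil>: explicitly allowed
}
```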
Co-authored-by: Leo Zhang <[email protected]>
👏👌
Closes #5231
Added flags to execution state extraction program:
- `--input-payload-filename`
- `--output-payload-filename` (if specified without `--extract-payloads-by-address`, all payloads are exported)
- `--extract-payloads-by-address` filters payloads to export for specified addresses (accounts)

The new flags don't affect migration and other existing functionality of the state extraction program. These options only affect the input and output of the state extraction program. In other words, this can be used to extract migrated payloads or to extract as-is payloads for specified accounts. A sketch of how these flags might be wired up is shown below.
There are 2 use cases:
- High priority (for Cadence 1.0 migration): save time during development, testing, or troubleshooting by focusing on a small subset of accounts. This PR was opened on Feb 13 after a Feb 12 meeting made this use case a high priority.
- Low priority (for reducing migration duration): produce an intermediate file containing all payloads migrated with atree inlining and Cadence v0.42, then use that file as input to the Cadence 1.0 migration to reduce the total duration of migrations. This use case was confirmed to be low priority; see the Feb 14 Discord chat.
The 2nd use case became low priority because enabling atree inlining reduces migration duration in the first place. Migrations with atree inlining enabled are faster than do-nothing migrations given the same input files. Atree inlining reduces node counts, payload counts, total data size, etc., making the later stages of migrations faster and less memory-intensive.
NOTE: we want all payloads for all accounts to use atree inlining during migration because atree inlining reduces payload counts and total data size, which makes the later migration stages faster and use less memory.