-
Notifications
You must be signed in to change notification settings - Fork 11.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[move] Added bytecode map generation during bytecode disassembly #20797
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Skipped Deployments
|
) | ||
.map_err(|e| Error::Internal(format!("Error creating disassembler: {e}"))) | ||
.extend()? | ||
.disassemble() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how I feel about this. I'd probably prefer something like adding a new function, e.g., disasssemble_with_source_map
(or something like that) that would also generate the source map during disassembly, and then keeping this function the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
if bytecode_map { | ||
d.generate_bytecode_map(); | ||
} | ||
let (disassemble_string, bcode_map) = d.disassemble()?; | ||
if bytecode_map { | ||
println!("{}", serialize_to_json_string(&bcode_map)?); | ||
} | ||
println!("{}", disassemble_string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think places like this will be nicer with the two function approach as this will also allow removing the generate_bytecode_map
method as well. E.g., something like this
if bytecode_map { | |
d.generate_bytecode_map(); | |
} | |
let (disassemble_string, bcode_map) = d.disassemble()?; | |
if bytecode_map { | |
println!("{}", serialize_to_json_string(&bcode_map)?); | |
} | |
println!("{}", disassemble_string); | |
let diss = if bytecode_map { | |
let (disassemble_string, bcode_map) = d.disassemble_with_source_map()?; | |
println!("{}", serialize_to_json_string(&bcode_map); | |
disassemble_string | |
} else { | |
d.disassemble()?; | |
}; | |
println!("{}", disassemble_string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it allowed me to remove this function but it also means that having a global flag is no longer necessary (and somewhat awkward even)
@@ -253,9 +272,59 @@ pub fn run_move_unit_tests<W: Write + Send>( | |||
let coverage_map = CoverageMap::from_trace_file(trace_path); | |||
output_map_to_file(coverage_map_path, &coverage_map).unwrap(); | |||
} | |||
if save_disassembly { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why you put it here, since for the immediate use case it's only needed during testing, but producing disassembly here feels wrong to me.
IMO it feels like an option that should be able to be fed into the build process, and should be done when saving a compiled package out on that side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally followed my principle to avoid modifications to other crates but it certainly makes sense in this case. I think it also made the code cleaner as I didn't have to repeat some of the boilerplate.
@@ -360,7 +448,20 @@ impl<'a> Disassembler<'a> { | |||
"", | |||
buffer, | |||
|buffer, (name, ty)| { | |||
any_write!(buffer, "\t{name}: ")?; | |||
any_write!(buffer, "\t")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lot of these places could be simplified if any_write
/any_writeln
returned back start and end offsets. e.g., you'd then have something like
let (fname_start, fname_end) = any_write!(buffer, "{name}")?;
I'd also see about abstracting the source map generation and location generation as well. I think just a method would be fine for this. e.g.,
fn map_location(&self, start: u32, end: u32, action: FnMut(Loc) -> ()) {
if self.bytecode_map {
let loc = Loc::new(FileHash::empty(), start, end);
action(loc)
}
}
With that this part could look something like
let (start, end) = any_write!(buffer, "{name}")?;
self.map_location(start, end, |loc| bcode_map.add_struct_field_mapping(struct_def_idx, loc))
Alternately, if map_location
took a tuple this could be
self.map_location(any_write!(buffer, "{name}")?, |loc| bcode_map.add_struct_field_mapping(struct_def_idx, loc));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did something similar, but I actually pulled the Loc
generation into macros. I think it's probably simpler and thus possibly more readable than the callback solution
any_write!(buffer, " }}") | ||
any_write!(buffer, " }}")?; | ||
// perhaps surprisingly, but the location is of the whole variant | ||
// and not just of its name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed surprising. Maybe we should make it just be of its name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't want to sneak it in here, but can certainly do it in a separate PR if need be
unit: &CompiledUnitWithSource, | ||
) -> Result<()> { | ||
let root_package = self.package.compiled_package_info.package_name; | ||
let bytecode_modules_dir = CompiledPackageLayout::CompiledModules.path(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm of two minds on this. I think I'm fine with this, but part of me thinks this might be nice to be in its own separate top-level dir under the build/<pkg_name>
(i.e., a sibling of bytecode_modules
/sources
(Etc)).
Thoughts? I'm pretty open to either option I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's a great point. Putting them in a separate directory for trace-debugger to pick up will make it easier down the line - we can have other methods of generating disassembly (e.g., from the replay tool) to keep the same directory structure. I am putting them into build/disassembly
directory then
bcode_map_gen: bool, | ||
bcode_map: &mut SourceMap, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to be so nitty, but I only mention it as I think it will clean up the logic...
What about passing bcode_map: &mut Option<SourceMap>
here. Then below, instead of things like
if bcode_map_gen {
bcode_map.add_top_level_struct_mapping(...)
}
This could be rewritten as
bcode_map.map(|m| m.add_top_level_struct_mapping(..));
or even as
if let Some(m) = bcode_map {
m.add_top_level_struct_mapping(...);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea - will also let me shave off a parameter in a few places
// NB: The order in which these are called is important as each function is effectful. | ||
self.print_header(buffer)?; | ||
let mut bcode_map = self.print_header(buffer)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment below but I think turning this into an option may make things even cleaner.
SignatureToken::U256 => any_write!(buffer, "u256"), | ||
SignatureToken::Address => any_write!(buffer, "address"), | ||
SignatureToken::Signer => any_write!(buffer, "signer"), | ||
SignatureToken::Bool => any_write!(buffer, "bool").map(|_| ()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO would be cleaner to do something like
match sig_tok {
....
}.map(|_| ())
or to ?
all of these and then have an Ok(())
at the bottom of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that some match arms return Result<()>
instead of Result<Loc>
, and I had to unify them which made this look uglier...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small nit to consider, but otherwise LGTM!
@@ -279,13 +279,14 @@ impl<'a> Disassembler<'a> { | |||
bcode_map_gen: bool, | |||
) -> Result<SourceMap> { | |||
// NB: The order in which these are called is important as each function is effectful. | |||
let mut bcode_map = self.print_header(buffer)?; | |||
let mut res = Some(self.print_header(buffer)?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: consider if this needs to make the source map at all, but I think this is fine as-is if you'd prefer to keep it like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about it decided to let it simply return largely empty SourceMap
rather than None
is that on the client side we then don't have to do another check (and possibly user-facing message) when unwrapping (e.g. for printing)
Description
This PR adds support for generating "bytecode maps" (source maps for disassembled bytecode)
Test plan
A new test has been added and all other test have to pass