Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wmemcheck support for additional allocation functions and granular memory #9641

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

cobbal
Copy link

@cobbal cobbal commented Nov 20, 2024

I was recently tracking down a memory corruption bug in swift and found the wmemcheck tool to be very helpful. Since the runtime of swift is fairly large, it required some extra features to remove false positives. I was able to modify wmemcheck to be useful enough for the job.

I'd like to contribute the changes, many of which overlap with parts of issue #7037 . In particular, I needed to add the following:

The last 2 possibly should be hidden behind options, but I'm not familiar enough with the code base or how others use this tool to know what the right approach would be.

@cfallin cfallin self-requested a review November 21, 2024 00:00
@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Nov 21, 2024
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this! Some initial thoughts below.

I'm really happy wmemcheck was actually useful for you, and I think most of this can be upstreamed with no problem. Just a few thoughts below.

} else if func_name == Some("free") {
self.check_free_start(builder);
match self.current_func_name(builder) {
Some("__wrap_malloc") | Some("malloc") =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here noting where the __wrap_* variants come from? That's useful both for its own sake (readers' understanding) but also so that if things change in the future with the external thing causing this need, we can re-evaluate.

// Invoked when wasm stack pointer is updated.
#[cfg(feature = "wmemcheck")]
update_stack_pointer(vmctx: vmctx, value: i32);
// Invoked before memory.grow is called.
#[cfg(feature = "wmemcheck")]
update_mem_size(vmctx: vmctx, num_bytes: i32);

// Invoked before stuff is called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we be more specific than "stuff" here?

// Hook for validating calloc using wmemcheck_state.
#[cfg(feature = "wmemcheck")]
unsafe fn check_calloc(store: &mut dyn VMStore, instance: &mut Instance, addr: u32, count: u32, size: u32) -> Result<u32> {
check_malloc(store, instance, addr, count * size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a note here that since we aren't tracking undefined-ness, we don't have to care about the other semantic difference between calloc and malloc, namely that memory is zeroed?

#[cfg(feature = "wmemcheck")]
unsafe fn check_malloc_usable_size(store: &mut dyn VMStore, instance: &mut Instance, len: u32, addr: u32) -> Result<u32> {
check_free(store, instance, addr)?;
check_malloc(store, instance, addr, len)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a correct implementation of malloc_usable_size's semantics? The man page on my system says that it returns the size of a memory block; unclear to me whether we need a hook for that, even, to track alloc'd/free state?

for (_, entry) in instance.exports() {
if let wasmtime_environ::EntityIndex::Memory(mem_index) = entry {
let mem = instance.get_memory(*mem_index);
let out = *(mem.base.offset(outptr as isize) as *mut u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need to do a bounds-check here -- could you use the safe APIs for the memory to access the out-pointer?

This is also little-endian-specific code; we'll want to get a [u8; 4] and do an explicit conversion with u32::from_le_bytes. (Wasmtime supports one big-endian architecture, s390x!)

}
unsafe {
print!("{:02x} ", *mem.base.offset((prev_malloc + i) as isize));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like code that could be useful for debugging, but as above with the posix_memalign out-pointer fetch, the memory accesses are unsafe, and also dumping the contents of previously allocated blocks might be too verbose. I think I'd prefer to omit this code for now; ultimately the right answer for any problem that needs examination of memory state like this is probably using a debugger (we have better support planned!) possibly with breakpoints on allocation events or trap-to-debugger on wmemcheck errors.

@@ -140,6 +158,9 @@ impl Wmemcheck {

/// Updates memory checker memory state metadata when free is called.
pub fn free(&mut self, addr: usize) -> Result<(), AccessError> {
if addr == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find! Can we add a comment here noting the implemented semantics (free of a null pointer is a no-op)?

@@ -101,12 +119,12 @@ impl Wmemcheck {
len: len,
});
}
MemState::ValidToWrite => {
/* MemState::ValidToWrite => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out? (If the checks were causing issues, could we put the exclusion under an appropriately named and documented option?)

pub fn malloc(&mut self, addr: usize, len: usize) -> Result<(), AccessError> {
pub fn malloc(&mut self, addr: usize, start_len: usize) -> Result<(), AccessError> {
// round up to multiple of 4
let len = (start_len + 3) & !3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this under an option? Perhaps "wmemcheck granularity" or similar?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants