[Bug] Expanding memory balloon causes VM to freeze #4990
Comments
Hi @maggie-lou, Thanks for raising the issue, with the helpful reproducer steps. I'll investigate myself and let you know what I find. Thanks |
In case this is helpful, when testing the balloon in a more production environment, I've noticed Kernel RCU stalls. I wonder if the balloon has a bad interaction with the RCU.
The UFFD handler might receive events out of order compared to how they actually happened. For example, if the guest first frees a page to the balloon device, and then immediately faults it in again, the UFFD handler might see the page fault before the freeing. This is a problem, as any pending `Remove` events in the queue will "block" the userfault FD (all ioctls return -EAGAIN). Fix this by always draining all events from the fd's queue, and gracefully handling -EAGAIN. Please see the code comment for in-depth analysis of the flow. Fixes firecracker-microvm#4990 Signed-off-by: Patrick Roy <[email protected]>
Hey! I've had a look into this today, and I think what you're seeing is a combination of two issues: The first one is indeed that Firecracker incorrectly handles balloon inflation events on restored VMs, and the fix from your PR for that is the right one. However, the behavior you see after that (the VM freezing) is not actually a bug in Firecracker, but rather a shortcoming of the simplistic UFFD handler used in our integration tests. Essentially, the problem is with the handling of |
The crux of the issue was that UFFD gets blocked (all ioctls return -EAGAIN) when there's any `remove` events pending in the queue, which means during processing we not only need to look at the "head" of the queue, but also make sure there's no `remove` events in the "tail". Deal with these scenarios correctly by always greedily reading the entire queue, to ensure there's nothing pending, and only then processing things one-by-one. Please see the new code comments for intricacies with this approach. Fixes firecracker-microvm#4990 Signed-off-by: Patrick Roy <[email protected]>
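The "drain everything first, then process" approach described in this commit message can be sketched roughly as follows. This is only an illustration against a mock queue, not the actual Firecracker handler or the real userfaultfd API: `MockUffd`, `Event`, `CopyError`, and `handle_events` are hypothetical names, and the mock's `copy` simply returns an `Again` error (standing in for -EAGAIN) while any `remove` event is still unread.

```rust
use std::collections::VecDeque;

/// Simplified stand-ins for userfaultfd events (hypothetical, not the real API).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    PageFault { addr: u64 },
    Remove { start: u64, end: u64 },
}

#[derive(Debug, PartialEq)]
enum CopyError {
    /// Models the -EAGAIN return value.
    Again,
}

/// Mock of the kernel side: `pending` holds events the handler has not read yet.
struct MockUffd {
    pending: VecDeque<Event>,
    populated: Vec<u64>,
}

impl MockUffd {
    fn read_event(&mut self) -> Option<Event> {
        self.pending.pop_front()
    }

    /// Models a page copy: blocked while any remove event is still unread.
    fn copy(&mut self, addr: u64) -> Result<(), CopyError> {
        if self.pending.iter().any(|e| matches!(e, Event::Remove { .. })) {
            return Err(CopyError::Again);
        }
        self.populated.push(addr);
        Ok(())
    }
}

/// Greedily read the whole queue, record removes, and only then serve faults.
/// Faults that still hit EAGAIN are deferred and retried after the next drain.
fn handle_events(uffd: &mut MockUffd, removed: &mut Vec<(u64, u64)>) {
    let mut deferred: Vec<u64> = Vec::new();
    loop {
        // Start with faults left over from the previous round.
        let mut faults = std::mem::take(&mut deferred);

        // Step 1: drain the queue so no remove event is left in the "tail".
        while let Some(event) = uffd.read_event() {
            match event {
                Event::Remove { start, end } => removed.push((start, end)),
                Event::PageFault { addr } => faults.push(addr),
            }
        }

        // Step 2: serve the collected faults one by one.
        for addr in faults {
            if uffd.copy(addr) == Err(CopyError::Again) {
                // A new remove event arrived in the meantime; retry later.
                deferred.push(addr);
            }
        }

        if deferred.is_empty() {
            break;
        }
    }
}
```

With a queue holding a page fault followed by a `remove` event, a handler that reads only the head and then copies gets `Again` from this mock, whereas `handle_events` records the removal first and then serves the fault.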
Thanks so much @roypat ! After applying a similar fix to our UFFD handler, it's resolved our issues. In case anyone else hits this, I had to implement your suggestion |
@roypat Would you mind sharing how you debugged this? It would be helpful to have some strategies to debug similar issues in the future. Thanks! |
Admittedly, there wasn't much finesse involved. I had the uffd handler print out all events it received (which showed that the guest didn't actually completely freeze, since page fault events still came in after inflating the balloon), and then I started looking at the EAGAIN return from uffdio_copy, because that was the only change done in the uffd handler. After reading the kernel code a bit to figure out why EAGAIN was being returned, the connection with pending |
Fair enough - thanks again! |
Describe the bug
Even when there should be enough free memory in the VM, expanding the balloon sometimes causes the VM to freeze.
During a sample run (using the scripts linked below), after restoring the VM from a snapshot, `free -h` returned:

Originally, the balloon was initialized to 5MB. When I inflated it to 20MB, it inflated successfully. When I inflated it to 30MB, the VM froze and there were a bunch of "Failed to update balloon stats, missing descriptor." errors.
To Reproduce
You can use the scripts in this branch: #4989
Expected behavior
I expected the balloon to be able to expand to 30MB because there is 72Mi of memory available.
Environment
Additional context
We are using UFFD to restore snapshots. The memory snapshots are quite large, so we're looking into using memory balloons with the goal of having the UFFD handler process removed memory ranges, so we don't have to save those memory ranges in the snapshot files. We've noticed that the VM will sometimes freeze when expanding the balloon, even when there should be sufficient memory.
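To make the goal above concrete, here is a rough sketch of how a handler might track removed ranges so that those pages need not come from the snapshot file. Everything here is an assumption for illustration: `RemovedPages`, `PageSource`, the 4 KiB `PAGE_SIZE`, and page-aligned, half-open `[start, end)` ranges are hypothetical and not taken from our handler or from Firecracker.

```rust
use std::collections::BTreeSet;

/// Assumed page size for this sketch.
const PAGE_SIZE: u64 = 4096;

/// Where the handler would get a faulting page's contents from.
#[derive(Debug, PartialEq)]
enum PageSource {
    /// Copy the page from the memory snapshot file.
    Snapshot,
    /// The guest gave this page to the balloon; a zero page is enough.
    Zero,
}

/// Tracks page indices covered by remove events (hypothetical helper).
#[derive(Default)]
struct RemovedPages {
    pages: BTreeSet<u64>,
}

impl RemovedPages {
    /// Record a remove event covering the half-open range [start, end).
    /// Assumes `start` is page-aligned.
    fn on_remove(&mut self, start: u64, end: u64) {
        let mut page = start / PAGE_SIZE;
        while page * PAGE_SIZE < end {
            self.pages.insert(page);
            page += 1;
        }
    }

    /// Decide how a fault at `addr` should be served.
    fn source_for(&self, addr: u64) -> PageSource {
        if self.pages.contains(&(addr / PAGE_SIZE)) {
            PageSource::Zero
        } else {
            PageSource::Snapshot
        }
    }
}
```

A real handler would also need to decide when a page leaves the removed set again (for example once the guest has written to it), which this sketch does not model.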
Around the same time as the freeze, we always see the "Failed to update balloon stats, missing descriptor." errors as well as vsock connection errors `VIRTIO_VSOCK_OP_RST`.

I've tried disabling async page faults, in case the freezing was related to some sort of race condition in the kernel, but the problem persists.
Checks
No - It could be a Linux bug as well. Though we've read cases where people seem to be successfully using UFFD + the balloon, so this use case seems like it should be possible now.