-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add Steal Time support in ARM #5139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try making the proposed change to pass the right address and see if it works.
I haven't left any comment regarding style or design to focus on functionality first. Once we get something working we can see how to make it "look better".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got confused by the KVM API as well! Passing a user pointer to the IPA to kvm is ok, but it needs to be 64B aligned.
The hint was in the KVM documentation:
-EINVAL Base address not 64 byte aligned
https://www.kernel.org/doc/html/v5.8/virt/kvm/devices/vcpu.html#attribute-kvm-arm-vcpu-pvtime-ipa
Thank you so much for your fast response! Totally missed the docs for 64 byte alignment, but that ended up fixing the bugs and I can successfully boot a microVM manually. If you think its a good time now, we can look to flesh out the implementation and "make it better". Other than that, we believe the next step is adding unit/integration tests to make sure our device is working properly, as well as implementing persistence across snapshots. Any leads on how to do both of these things would be really helpful. The lead we have been looking at for snapshotting is the
|
Well done! I'll take a look.
I think extending it is ok, I didn't find any other place where we could fit it.
Correct. Remember to use the resource allocator with exact policy to ensure the system memory gets reserved in our internal allocator.
I see the kernel prints a message "using stolen time PV" in the console when it boots, so we could probably check for that. A more in-depth test to actually check that it's working would be to start 2 cpu hogging workloads on a 2vcpu guest and pin both vcpus to the same pcpu. That should achieve 50% steal time, and increase the relative counter in /proc/stat. We could assert the value actually increased. For the snapshot, we could check that the same (non-zero) value is present after resume and that it continues to increase if we hog it again. |
Hi @Manciukic, I was just about to comment :), thanks for your response. We've started a bare bones implementation of snapshot persistence in the previous commit. So if you also have the time to take a look at it, that would be awesome. We are now trying to change the existing tests to include PVTime. Any input on our design choices we made would be appreciated again, specifically:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. I'd just move most of the pv_time specific logic into the new module to avoid cluttering the builder.rs even more.
We also need to bump the snapshot version and add a changelog entry, other than the integration tests.
I tried to run the branch through our CI and there are a couple of issues: https://buildkite.com/firecracker/firecracker-pr/builds/13103#
But the good news is functional and performance tests passed 🎉 To run these tests locally you can run:
|
Managed to fix the errors reported in the build as well as clippy errors, would be awesome if it could be re-run. We started looking at unit/integration tests, as per your suggestions, and were hoping to get a small lead as to going about it.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5139 +/- ##
==========================================
+ Coverage 83.06% 83.12% +0.06%
==========================================
Files 250 250
Lines 26897 26946 +49
==========================================
+ Hits 22342 22399 +57
+ Misses 4555 4547 -8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There are still a few style and compilation errors. You can run the tests from your development machine as:
snippet:
https://buildkite.com/firecracker/firecracker-pr/builds/13116#_ |
It's difficult to unit test code that mostly performs syscalls (ioctls on vcpu fd in this case), so the policy we adopted to not over-complicate the codebase with mocks is to test these features through end-to-end integration tests only. |
Thanks for the clarification on that. We've added some very basic unit tests to PVTime and also 2 integration tests for basic checking and snapshotting. If there's any changes/next steps we need to take, please let us know. A re-run of the test suite would also be very helpful so we can nail down any bugs, as it was a bit buggy locally. |
Can you also add a basic test that checks that the guest has enabled steal PV by looking at the kernel logs?
Make sure that only runs on aarch64.
Done. x86 has the same issues as above: https://buildkite.com/firecracker/firecracker-pr/builds/13138#0196241a-77dd-4fe4-9023-e053f8004c2d |
Added that extra test and should've completely fixed the x86 errors, should be good for test suite re-run. |
Thanks for the revisions @Manciukic. I've gone ahead and:
Please let me know if there is anything else on that end, and looking towards a merge, what would need to be done such as:
|
It's the assumption we already make in other parts of the code. I think it's fine.
Thanks!
Yup, all three are required for merging. I can't think of anything else at the moment, other than keeping the branch up-to-date with main. I will take a deeper look next week after the Easter holidays. |
Hi @Manciukic, Just updated the snapshot version by minor version, added changelog, and attempted to rebase the commits but git freaked out so hopefully I did it right. Also apologies it seemed to auto-request reviews after I force pushed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly lgtm except a few minor things. Another small nit: please clean up the commit messages a little (make sure to mention what, why, and how).
update: It also looks like the rebase on top of main didn't go as expected. Make sure to git pull --rebase upstream main
and resolve the conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks reasonable. Couple minor fixes and it should be good to go.
Thanks for the PR review earlier. I have gone ahead and:
I've double checked that style, clippy, steal time tests pass, etc. Any feedback on this new design would be welcome, as well as a run of the test-suite if possible. I believe (hopefully) that it should build fine on x86 as well. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for doing that change! Everything looks a lot simpler now :D
Just a small comment about the restore code, but then this looks good to go from me too!
One more issue on x86 :(
|
@Manciukic Just pushed that fix for x86, sorry about that. Gonna try getting the cross-compilation working locally so I can test that soon. Edit: Apologies, the unused import seems to show up as an error, let me fix that :( Edit 2: Should be good now |
Adds functionality for pvtime, which displays steal time to guest on ARM machines. PVTime is persisted across snapshots as well (snapshot ver updated). - Added ipa per vCPU for mem region storing steal time info. - Persists this ipa per vCPU across snapshots. - Shared steal time mem region is setup on boot and restore from snapshot in builder.rs. Signed-off-by: Dakshin Devanand <dakshind2005@gmail.com>
Added integration tests checking: - steal time increase - steal time persistence across snapshots - pvtime existence on ARM Motivated by addition of PVTime functionality for ARM. Signed-off-by: Dakshin Devanand <dakshind2005@gmail.com>
Add a changelog entry to inform about addition of PVTime (steal time) functionality on ARM. Signed-off-by: Dakshin Devanand <dakshind2005@gmail.com>
You should be able to fairly easily validate warnings / compiler errors locally with just |
Thanks for pointing that out. It was telling me the |
Opening this PR as a draft as we continue working on this issue to add steal time support for ARM. Please see below for the changes we have implemented so far.
Changes
Added PVTime struct in
src/vmm/src/arch/aarch64/pvtime/mod.rs
that allocates and stores per-vCPU stolen_time regionsnew
constructor to create the device (allocating system memory and storing regions per vcpu), andregister_vcpu
which attempts to usekvm_set_device_attr
to register the corresponding IPA for a vcpuCreated a function
setup_pv_time
insrc/vmm/src/builder.rs
which is invoked inbuild_microvm_for_boot
kvm_has_device_attr
Started using debug/info logs while booting microvm for testing
Current problems:
Future Progress
setup_pv_time
Reason
Attempting to resolve issue #4866
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
tools/devtool checkstyle
to verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md
.Runbook for Firecracker API changes.
integration tests.
TODO
.rust-vmm
.