Skip to content

RFC: refactor VirtIO devices, preparing for a PCI transport layer #5160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: feature/pcie
Choose a base branch
from

Conversation

bchalios
Copy link
Contributor

Changes

This does a few changes around interrupts for VirtIO devices and code organization for being able to handle multiple transports for VirtIO devices:

  1. Creates a transport module within vmm::devices::virtio and moves mmio module inside it
  2. Adds a VirtioInterrupt trait that abstracts away the functionality of interrupts used by VirtIO devices. At the moment, only IrqTrigger implements VirtioInterrupt.
  3. Changes the moment in which an IrqTrigger object is assigned to a VirtIO device. Up until now we created this object when creating the device itself. This PR changes this logic so that it is the transport layer that creates this interrupt and passes it over to the device during activation time. This is how things are done with PCIe in the Cloud Hypervisor implementation. It also, doesn't change anything for us, semantically; devices only trigger interrupts after having been activated.

Reason

Preparing for adding PCIe support

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

Copy link

codecov bot commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 77.33711% with 80 lines in your changes missing coverage. Please review.

Project coverage is 83.01%. Comparing base (ae078ee) to head (e21d2f7).

Files with missing lines Patch % Lines
src/vmm/src/devices/virtio/block/device.rs 0.00% 20 Missing ⚠️
src/vmm/src/devices/virtio/vsock/event_handler.rs 40.00% 15 Missing ⚠️
src/vmm/src/devices/virtio/vsock/device.rs 71.42% 8 Missing ⚠️
src/vmm/src/devices/virtio/rng/device.rs 71.42% 6 Missing ⚠️
src/vmm/src/devices/virtio/transport/mmio.rs 90.00% 6 Missing ⚠️
src/vmm/src/devices/virtio/balloon/persist.rs 0.00% 4 Missing ⚠️
src/vmm/src/devices/virtio/block/virtio/persist.rs 0.00% 4 Missing ⚠️
src/vmm/src/devices/virtio/net/persist.rs 0.00% 4 Missing ⚠️
src/vmm/src/devices/virtio/vsock/persist.rs 0.00% 4 Missing ⚠️
src/vmm/src/devices/virtio/transport/mod.rs 0.00% 3 Missing ⚠️
... and 5 more
Additional details and impacted files
@@               Coverage Diff                @@
##           feature/pcie    #5160      +/-   ##
================================================
- Coverage         83.06%   83.01%   -0.06%     
================================================
  Files               250      251       +1     
  Lines             26897    27063     +166     
================================================
+ Hits              22342    22466     +124     
- Misses             4555     4597      +42     
Flag Coverage Δ
5.10-c5n.metal 83.50% <77.33%> (-0.07%) ⬇️
5.10-m5n.metal 83.50% <77.33%> (-0.07%) ⬇️
5.10-m6a.metal 82.73% <77.33%> (-0.06%) ⬇️
5.10-m6g.metal 79.31% <77.27%> (-0.04%) ⬇️
5.10-m6i.metal 83.50% <77.33%> (-0.06%) ⬇️
5.10-m7a.metal-48xl 82.71% <77.33%> (-0.07%) ⬇️
5.10-m7g.metal 79.31% <77.27%> (-0.04%) ⬇️
5.10-m7i.metal-24xl 83.45% <77.33%> (-0.07%) ⬇️
5.10-m7i.metal-48xl 83.45% <77.33%> (-0.07%) ⬇️
5.10-m8g.metal-24xl 79.30% <77.27%> (-0.04%) ⬇️
5.10-m8g.metal-48xl 79.30% <77.27%> (-0.04%) ⬇️
6.1-c5n.metal 83.55% <77.33%> (-0.07%) ⬇️
6.1-m5n.metal 83.54% <77.33%> (-0.07%) ⬇️
6.1-m6a.metal 82.78% <77.33%> (-0.06%) ⬇️
6.1-m6g.metal 79.31% <77.27%> (-0.04%) ⬇️
6.1-m6i.metal 83.54% <77.33%> (-0.06%) ⬇️
6.1-m7a.metal-48xl 82.76% <77.33%> (-0.06%) ⬇️
6.1-m7g.metal 79.30% <77.27%> (-0.05%) ⬇️
6.1-m7i.metal-24xl 83.55% <77.33%> (-0.07%) ⬇️
6.1-m7i.metal-48xl 83.55% <77.33%> (-0.07%) ⬇️
6.1-m8g.metal-24xl 79.30% <77.27%> (-0.04%) ⬇️
6.1-m8g.metal-48xl 79.30% <77.27%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bchalios bchalios force-pushed the feature/pcie branch 7 times, most recently from 7ab92d6 to 5dd5953 Compare April 22, 2025 12:30
Copy link
Contributor

@Manciukic Manciukic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had just a quick look but it looks good to me overall.

@@ -548,7 +552,8 @@ mod tests {
cmdline: &mut kernel_cmdline::Cmdline,
dev_id: &str,
) -> Result<u64, MmioError> {
let mmio_device = MmioTransport::new(guest_mem, device, false);
let interrupt = Arc::new(IrqTrigger::new().unwrap());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it made sense to create this directly in the MmioTransport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that. But restoring from a snapshot was a nightmare, because we first restore the device state and then we restore the transport. So, I needed the IrqTrigger before restoring the device itself.

@bchalios bchalios force-pushed the feature/pcie branch 2 times, most recently from 040a2e2 to 9880eac Compare April 23, 2025 09:07
use crate::vstate::memory::GuestMemoryMmap;

/// Enum that indicates if a VirtioDevice is inactive or has been activated
/// and memory attached to it.
#[derive(Debug)]
pub enum DeviceState {
Inactive,
Activated(GuestMemoryMmap),
Activated((GuestMemoryMmap, Arc<dyn VirtioInterrupt>)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Activated((GuestMemoryMmap, Arc<dyn VirtioInterrupt>)),
Activated(GuestMemoryMmap, Arc<dyn VirtioInterrupt>),

at the very least :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a struct type that includes them

@@ -36,7 +36,7 @@ impl DeviceState {
}

/// Gets the memory and interrupt attached to the device if it is activated.
pub fn active_state(&self) -> Option<(&GuestMemoryMmap, Arc<IrqTrigger>)> {
pub fn active_state(&self) -> Option<(&GuestMemoryMmap, Arc<dyn VirtioInterrupt>)> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we necessarily can, because then we'd have to decide at compile time whether we use MMIO or PCIe, right?

&balloon_state.device_info,
constructor_args.event_manager,
)?;
}

for block_state in &state.block_devices {
let interrupt =
Arc::new(IrqTrigger::new().expect("Could not create interrupt for MMIO device"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We unwrap on IrqTrigger::new() literally everywhere. Can we instead just unwrap() inside new() and have it not return Result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +411 to +426
#[cfg(test)]
fn has_pending_interrupt(&self, interrupt_type: VirtioInterruptType) -> bool {
if let Ok(num_irqs) = self.irq_evt.read() {
if num_irqs == 0 {
return false;
}

let irq_status = self.irq_status.load(Ordering::SeqCst);
return matches!(
(irq_status, interrupt_type.into()),
(VIRTIO_MMIO_INT_CONFIG, IrqType::Config) | (VIRTIO_MMIO_INT_VRING, IrqType::Vring)
);
}

false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be a free-standing function has_pending_interrupt<T: VirtioInterrupt>(&T, VirtioInterruptType) -> bool somewhere inside a test module, so we avoid the #[cfg(test)]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I can somehow make this easily, but I think it should be doable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm. The problem is that for as long as we support both MMIO and PCIe we will not have an easy way to create an type independent implementation. WDYT?

}

/// API of interrupt types used by VirtIO devices
pub trait VirtioInterrupt: std::fmt::Debug + Send + Sync {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh, what is the advantage of this being a trait over it being an enum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't necessarily see an advantage. That's how it's defined on the CH side, so got it as is.
I do have a sort of personal preference to dynamic dispatch when it's not on a performance critical path, because I (visually) hate this:

impl VirtioInterrupt {
    pub fn foo(&self) -> Bar {
        match self {
             Irq(irq) => irq.foo()
             Msi(msi) => msi.foo()
        }
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah fair, I personally actually prefer the enum version, but that's indeed personal preference. I mainly left the comment because I found the explicit Debug bound on a trait that functionally has nothing to do with debugging a bit ugly, but can also stay like this if you prefer it

Cargo.lock Outdated
Comment on lines 1627 to 1635
[[package]]
name = "vm-device"
version = "0.1.0"
dependencies = [
"anyhow",
"serde",
"thiserror 2.0.12",
"vmm-sys-util",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftovers from earlier experimentation? :o

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, yes. sorry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm. Rebuilding seems to bring this in. I've no idea who's pulling it in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, figured it.

@@ -311,7 +311,7 @@ impl<T: VhostUserHandleBackend + Send + 'static> VirtioDevice for VhostUserBlock
&self.queue_evts
}

fn interrupt_trigger(&self) -> Arc<IrqTrigger> {
fn interrupt_trigger(&self) -> Arc<dyn VirtioInterrupt> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose all of these could just return &dyn VirtioInterrupt, as none of the callers make use of the Arcs as far as I can see. But also out of scope here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, shall we do this when we bring the MSI code in?

}
};
if let Some(qidx) = raise_irq {
self.signal_used_queue(qidx as usize).unwrap_or_default();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this call to unwrap_or_default() is needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the _or_default() variant was used already, I assume, to ignore errors. I am happy to change this to something like expect(). Is that what you mean?

Copy link
Contributor

@roypat roypat Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no I was thinking of just doing

Suggested change
self.signal_used_queue(qidx as usize).unwrap_or_default();
self.signal_used_queue(qidx as usize);

but you're right, clippy would complain about the unused results, so it'd have to be

Suggested change
self.signal_used_queue(qidx as usize).unwrap_or_default();
_ = self.signal_used_queue(qidx as usize);

but iirc we do unwrap() (without _or_default) on these in other devices, so that should be fine too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to expect().

@bchalios bchalios force-pushed the feature/pcie branch 4 times, most recently from 68d197d to c59d619 Compare April 23, 2025 15:00
This is just code organization changes. Create a new module under
`virtio`, called `transport`. For the time being the only transport
supported is `mmio`. Also, move `IrqInterrupt` type within the MMIO
transport code, as it is MMIO specific.

Signed-off-by: Babis Chalios <[email protected]>
The MMIO transport for VirtIO devices uses an `IrqTrigger` object as the
object that models the logic for sending interrupts from the device to
the guest. We create one such object for every VirtIO device when
creating it. The MMIO device manager associates this object with an IRQ
number and registers it with KVM.

This commit changes the timing of association of an `IrqTrigger` with a
VirtIO-mmio device. It only assigns such an object to the device during
its activation. We do this to prepare for supporting a PCI transport for
VirtIO devices. The cloud hypervisor implementation for these passes the
interrupt objects used by the device during activation, so we make this
change to have a uniform way to handle interrupts for both transports.
Functionally, nothing changes for MMIO devices, as before activation we
don't trigger any interrupts.

Signed-off-by: Babis Chalios <[email protected]>
Describing the APIs that need to implement types that are used as
interrupts for VirtIO devices. Currently, we only use `IrqInterrupt`
interrupts, but this will change once we have MSI-X with PCIe devices.

Signed-off-by: Babis Chalios <[email protected]>
VirtIO devices assume they're operating under an MMIO transport and as a
consequence they use IrqTrigger as interrupts. Switch that to using
VirtioInterrupt for all VirtIO device objects. Only assume a
VirtioInterrupt is an IrqTrigger in MMIO specific code.

Signed-off-by: Babis Chalios <[email protected]>
`IrqTrigger::new()` returns a `Result` because creating an `EventFd`
might fail with an `std::io::Error` error. All users of `IrqTrigger`
create the object and directly unwrap the error.

To avoid unwraps all over the place, change `IrqTrigger::new()` to
unwrap a potential error while creating the EventFd internally and just
return `Self`.

Signed-off-by: Babis Chalios <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants