Skip to content

Conversation

@edigaryev
Copy link
Contributor

Currently, Cloud Hypervisor does not set a VIRTIO_IOMMU_F_INPUT_RANGE feature bit for the VirtIO IOMMU device, which, according to spec, means that the guest may use the whole 64-bit address space is for IOMMU purposes:

If the feature is not offered, virtual mappings span over the whole 64-bit address space (start = 0, end = 0xffffffff ffffffff)

As far as I am aware, there are currently no host platforms on the market capable of addressing the whole 64-bit address space.

For example, I am currently working with a host platform that reports 39-bit address space for IOMMU purposes:

DMAR: Host address width 39

When running a VFIO pass-through guest on such a platform, NVIDIA driver in guest gets DMA mapping failures when working with large data, and this results in Cloud Hypervisor exiting with the following error:

cloud-hypervisor: 1501.220535s: <__iommu> ERROR:virtio-devices/src/thread_helper.rs:53 -- Error running worker: HandleEvent(Failed to process request queue : ExternalMapping(Custom { kind: Other, error: "failed to map memory for VFIO container, iova 0x7fff00000000, gpa 0x24ce25000, size 0x1000: IommuDmaMap(Error(22))" }))

Passing --platform iommu_address_width=39 to Cloud Hypervisor built with this change fixes this.

@edigaryev edigaryev requested a review from a team as a code owner January 4, 2025 10:32
@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from 84ad1ae to fb89231 Compare January 4, 2025 10:34
.args(["--memory", "size=4G"])
.args(["--kernel", fw_path(FwType::RustHypervisorFirmware).as_str()])
.args(["--device", format!("path={NVIDIA_VFIO_DEVICE}").as_str()])
.args(["--platform", "iommu_address_width=42"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this test will have a virtio-iommu device as that is only added if any device has iommu=on

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Should I keep the test, modifying it to use iommu=on, or have it somewhere else where it's more appropriate in the test suite?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added iommu=on, but not sure if the host running this test is actually configured to support this option.

@edigaryev edigaryev force-pushed the iommu-address-width-bits branch 2 times, most recently from 7d9bb66 to d6d5c69 Compare January 4, 2025 14:43
@edigaryev edigaryev requested a review from rbradford January 4, 2025 14:44
@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from d6d5c69 to 4e05c8f Compare January 4, 2025 14:52
@edigaryev
Copy link
Contributor Author

Force-pushed to fix the formatting.

@russell-islam
Copy link
Contributor

russell-islam commented Jan 7, 2025

Is it possible to break down the single commit into multiple like below?

  1. Main changes (vmm, if possible make virtio changes in different commit)
  2. fuzz changes
  3. test scenario

@edigaryev
Copy link
Contributor Author

Is it possible to break down the single commit into multiple like below?

@russell-islam would you mind sharing your rationale behind this?

The PR itself is pretty small (+102 -5), so I'm curious about your goal in splitting it further 🤔

@russell-islam
Copy link
Contributor

Is it possible to break down the single commit into multiple like below?

@russell-islam would you mind sharing your rationale behind this?

The PR itself is pretty small (+102 -5), so I'm curious about your goal in splitting it further 🤔

Typically we try to separate the changes to each crate/components if it is in dependent. It's fine if the commit is small. In the commit we always try to follow the format crate: commit message. At least I got these type of impression before for my PRs.

@edigaryev
Copy link
Contributor Author

Typically we try to separate the changes to each crate/components if it is in dependent.

@russell-islam please take a look at change for fuzz/fuzz_targets/iommu.rs again.

It's inherently related to the "main changes", because virtio_devices::Iommu::new()'s signature was changed, and so we account for that by providing a missing argument.

What would be the point of splitting this into a separate commit?

@russell-islam
Copy link
Contributor

russell-islam commented Jan 7, 2025

Is it possible to break down the single commit into multiple like below?

@russell-islam would you mind sharing your rationale behind this?
The PR itself is pretty small (+102 -5), so I'm curious about your goal in splitting it further 🤔

Typically we try to separate the changes to each crate/components if it is in dependent. It's fine if the commit is small. In the commit we always try to follow the format crate: commit message. At least I got these type of impression before for my PRs.

Just add a tiny commit for this. That's okay. Then make another commit for test related changes.

Please try to see if the following split is possible.

  1. vmm changes(just config related)
  2. virtio-devices changes along with vmm(device_manager)
  3. fuzz
  4. tests

It's okay to have small commits with independent changes.

@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from 4e05c8f to 7d6efea Compare January 8, 2025 08:29
@edigaryev
Copy link
Contributor Author

  1. vmm changes(just config related)
  2. virtio-devices changes along with vmm(device_manager)
  3. fuzz
  4. tests

I've performed the split and force pushed the changes, please take a look.

It's okay to have small commits with independent changes.

I agree, but fuzz changes, for example, are dependent on virtio-devices changes (signature change), yet you ask to split them 🤷

@rbradford
Copy link
Member

  1. vmm changes(just config related)
  2. virtio-devices changes along with vmm(device_manager)
  3. fuzz
  4. tests

I've performed the split and force pushed the changes, please take a look.

It's okay to have small commits with independent changes.

I agree, but fuzz changes, for example, are dependent on virtio-devices changes (signature change), yet you ask to split them 🤷

If the fuzz changes are tightly dependent on the virtio device changes (due to function signature changes) then you were correct to merge them. The CI checks that each commit is independently compilable (however I don't think it spans across the the main and separate fuzz workspace.)

@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from 7d6efea to 38b38e1 Compare January 8, 2025 12:00
@edigaryev
Copy link
Contributor Author

If the fuzz changes are tightly dependent on the virtio device changes (due to function signature changes) then you were correct to merge them.

Force pushed a new set of commits.

@russell-islam
Copy link
Contributor

  1. vmm changes(just config related)
  2. virtio-devices changes along with vmm(device_manager)
  3. fuzz
  4. tests

I've performed the split and force pushed the changes, please take a look.

It's okay to have small commits with independent changes.

I agree, but fuzz changes, for example, are dependent on virtio-devices changes (signature change), yet you ask to split them 🤷

Thats fine.

Copy link
Contributor

@russell-islam russell-islam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

SeccompAction::Allow,
EventFd::new(EFD_NONBLOCK).unwrap(),
((MEM_SIZE - IOVA_SPACE_SIZE) as u64, (MEM_SIZE - 1) as u64),
64,
Copy link
Contributor

@russell-islam russell-islam Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use something like, const MAX_IOMMU_ADDRESS_WIDTH_BITS: u8 = 64;

Copy link
Contributor Author

@edigaryev edigaryev Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you've asked me to split the commits above, and @rbradford stated that commits should be independently compilable (atomic), this would not be possible to achieve because:

  • virtio-devices: iommu: allow limiting maximum address width in bits contains this line
  • vmm: introduce platform option to limit maximum IOMMU address width contains the MAX_IOMMU_ADDRESS_WIDTH_BITS constant

@rbradford
Copy link
Member

Please rebase these changes and force push - I want to ensure we're running atop the latest tree as there have been some changes related to the kernel CI.

Currently, Cloud Hypervisor does not set a VIRTIO_IOMMU_F_INPUT_RANGE
feature bit for the VirtIO IOMMU device, which, according to spec[1],
means that the guest may use the whole 64-bit address space is for
IOMMU purposes:

>If the feature is not offered, virtual mappings span over the whole
>64-bit address space (start = 0, end = 0xffffffff ffffffff)

As far as I am aware, there are currently no host platforms on
the market capable of addressing the whole 64-bit address space.

For example, I am currently working with a host platform that reports
39-bit address space for IOMMU purposes:

>DMAR: Host address width 39

When running a VFIO pass-through guest on such a platform, NVIDIA
driver in guest gets DMA mapping failures when working with large data,
and this results in Cloud Hypervisor exiting with the following error:

>cloud-hypervisor: 1501.220535s: <__iommu>
>ERROR:virtio-devices/src/thread_helper.rs:53 -- Error running worker:
>HandleEvent(Failed to process request queue : ExternalMapping(Custom
>{ kind: Other, error: "failed to map memory for VFIO container, iova
>0x7fff00000000, gpa 0x24ce25000, size 0x1000: IommuDmaMap(Error(22))"
>}))

Passing "--platform iommu_address_width=39" to Cloud Hypervisor built
with this change fixes this.

[1]: https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/
virtio-v1.3-csd01.html#x1-5420006

Signed-off-by: Nikolay Edigaryev <[email protected]>
@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from 38b38e1 to 31c3780 Compare January 14, 2025 13:00
@edigaryev
Copy link
Contributor Author

edigaryev commented Jan 14, 2025

Please rebase these changes and force push - I want to ensure we're running atop the latest tree as there have been some changes related to the kernel CI.

@rbradford done 👌

Copy link
Member

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask if 64 bits is a sensible default - and what QEMU does. I can see that QEMU defaults to 64-bits too.

@rbradford rbradford enabled auto-merge January 14, 2025 14:08
@rbradford rbradford added this pull request to the merge queue Jan 14, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 14, 2025
@rbradford
Copy link
Member

@edigaryev Your newly added test failed - I think it's because there is no virtio-iommu device added as no devices are placed behind an iommu (i.e. with iommu=on or with an iommu segment --platform num_pci_segments=2,iommu_segments=1 - the latter might be better than placing the NIVIDIA behind another layer of IOMMU.)

@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from 31c3780 to f8aa0e5 Compare January 14, 2025 20:00
@edigaryev
Copy link
Contributor Author

@rbradford thanks for the heads up and a good catch!

I've updated the test in f8aa0e5 to additionally specify num_pci_segments=2,iommu_segments=1 in --platform's arguments.

@rbradford rbradford enabled auto-merge January 14, 2025 20:36
auto-merge was automatically disabled January 14, 2025 20:38

Head branch was pushed to by a user without write access

@edigaryev edigaryev force-pushed the iommu-address-width-bits branch from f8aa0e5 to 957f4ea Compare January 14, 2025 20:38
@edigaryev
Copy link
Contributor Author

Fixed code formatting for test.

@rbradford rbradford enabled auto-merge January 14, 2025 20:45
@rbradford
Copy link
Member

rbradford commented Jan 14, 2025

@rbradford any idea what's wrong with Clippy?

Yes - this is an issue with beta clippy - rust-lang/rust-clippy#13953 affecting all our PRs. I've not disabled beta as I think (hope?) this will be backporting soon. Failures of clippy beta does not block merging.

@rbradford rbradford added this pull request to the merge queue Jan 14, 2025
Merged via the queue into cloud-hypervisor:main with commit 5a8df36 Jan 14, 2025
34 of 38 checks passed
@edigaryev edigaryev deleted the iommu-address-width-bits branch January 16, 2025 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants