Skip to content

Conversation

@blitz
Copy link
Contributor

@blitz blitz commented Apr 8, 2025

When devices don't implement a capabilities list or just return 0xFF for the capabilities pointer sad things happen. It typically boils down to receiving 0xFF from the capabilities register. In debug builds, this will then crash the VMM, because cap_iter + 1 overflows:

thread 'vmm' panicked at pci/src/vfio.rs:934:63:
attempt to add with overflow

Fix by:

  • checking whether the device claims to have a capability list,
  • terminate the checking loop if the capability points to itself.

A very broken (malicious) device might still send this code into an endless loop, but I assume this is not part of the threat model and we don't want to overcomplicate this code.

It would be nice to implement a small iterator for the capability list, so we don't have to duplicate the logic (and quirks handling). If this is interesting, please say so. :)

While I was here, I also changed the capability pointer handling to follow the PCI spec and mask out the lowest two bits. See PCI Local Bus Specification 3.0 Chapter 6.7 "Capabilities List".

@blitz blitz requested a review from a team as a code owner April 8, 2025 15:35
Copy link
Member

@likebreath likebreath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blitz Thanks for reporting and fixing the bug. Since we are addressing multiple (but related) issues, would you please split the the first commit into three:

  1. Check whether the device claims to have a capability list;
  2. Mask the lower 2 bits of a capability pointer to align with PCI spec;
  3. Terminate the caps loop if the current capability points to itself.

Also, want to confirm, issue 3 would only happen for devices that are malicious or broken, right?

@likebreath likebreath added the bug-fix Bug fix to include in release notes label Apr 9, 2025
OSDev has cranked up its bot protection. The following link works for
me locally after clicking the "I'm a human" button. I guess the CI
fails this check...

Without this exception the CI fails the link check stage:

* [403] [https://wiki.osdev.org/IOAPIC](https://wiki.osdev.org/IOAPIC) | Network error: Forbidden

Signed-off-by: Julian Stecklina <[email protected]>
@blitz
Copy link
Contributor Author

blitz commented Apr 10, 2025

@blitz Thanks for reporting and fixing the bug. Since we are addressing multiple (but related) issues, would you please split the the first commit into three:

  1. Check whether the device claims to have a capability list;
  2. Mask the lower 2 bits of a capability pointer to align with PCI spec;
  3. Terminate the caps loop if the current capability points to itself.

Also, want to confirm, issue 3 would only happen for devices that are malicious or broken, right?

Done!

blitz added 4 commits April 10, 2025 14:07
Currently, the code tries to follow the PCI capabilities list in
offset 0x34 in the config space regardless of whether the status
registers says this is valid. Fix by adding the appropriate check.

Signed-off-by: Julian Stecklina <[email protected]>
The PCI standard mandates that the lower bits of the capability
pointer are masked out before using the pointer. See PCI Local Bus
Specification 3.0 Chapter 6.7 "Capabilities List".

Signed-off-by: Julian Stecklina <[email protected]>
If a device returns 0xff as a capability pointer bad things happen.
The code before the previous commits would crash in debug builds due
to integer overflow. With the two lowest bits masked out, it sends the
code into an endless loop.

Be more robust by at least handling the case where the capability
appears to point to itself.

Signed-off-by: Julian Stecklina <[email protected]>
There are a lot of internal functions that are not and probably should
not be called from other places.

Signed-off-by: Julian Stecklina <[email protected]>
// doesn't handle all cases where a device might send us in a loop here, but it
// handles case of a device returning 0xFF instead of implementing a real
// capabilities list.
if cap_next == 0 || cap_next == cap_iter {
Copy link
Member

@likebreath likebreath Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit makes sense to me that it handles a corner case where we should break the loop if a cap pointer points to itself. But I don't see how returning 0xff as the next cap pointer would be a problem, particularly with the previous commit where the last two bits are masked out. Basically, the loop would stop at the top level with condition while cap_iter != 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The top level condition check would catch the cap_next == 0 case, but not cap_iter == cap_next for values other than 0 (i.e. 0xfc for the masked 0xff example).

There is a subtle implementation detail in that this break check leaves cap_iter at the last offset instead of updating it. And it is still used after the loop to set up the nv_gpudirect_clique_cap. That logic would break. So the top level loop check only ever triggered if the initial offset is already 0, even before this change.

But I don't see how returning 0xff as the next cap pointer would be a problem

The problem is not with 0xff in general. This issue was triggered with a bus device emulation that returns -1 (0xff) on all reads as a "there is nothing here" marker. Any case that just returns the same fixed non-zero value for this situation is caught here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I misread the code for cap_next and cap_iter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @snue for explaining. I just got back from vacation and saw that you already took care of this. 🙏

Comment on lines +14 to +16

# OSDev has added bot protection and accesses my result in 403 Forbidden.
'^https://wiki.osdev.org',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to split for this time. Generally we want to have distinct changes from separate PRs. Please keep this in mind in the future. :)

@likebreath likebreath added this pull request to the merge queue Apr 18, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 18, 2025
@likebreath likebreath added this pull request to the merge queue Apr 18, 2025
Merged via the queue into cloud-hypervisor:main with commit de76445 Apr 18, 2025
37 of 39 checks passed
@blitz blitz deleted the vfio-user branch April 22, 2025 08:33
@likebreath likebreath moved this from 🆕 New to ✅ Done in Cloud Hypervisor Roadmap May 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix Bug fix to include in release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants