vmm: raise the (v)CPU limit on kvm/x86_64 #7299

posk-io · 2025-08-26T18:52:34Z

Raise the max number of supported (v)CPUs on x86_64 kvm hosts to 8192.

Other platforms (non-x86_64) keep their existing CPU limits pending further development and testing.

Signed-off-by: Barret Rhoden [email protected]
Signed-off-by: Neel Natu [email protected]
Signed-off-by: Ofir Weisse [email protected]
Signed-off-by: Peter Oskolkov [email protected]

posk-io · 2025-08-26T18:59:19Z

Note: I have a draft of several new integration tests here: posk-io@0a250e1, but they fail when run from a VM (i.e. they fail to create a nested VM with >= 254 vCPUs). I don't have a spare baremetal host to run the tests on at the moment.

Another reason could be that the kernel in the VM image that is used to run the tests has a smaller CONFIG_NR_CPUS. Anyway, manual testing indicates the PR/patch series works on AMD and Intel hosts with up to 1024 vCPUs (the limit that the host KVM imposes in my tests).

phip1611

Awesome, thanks for working on this! I left a few remarks.

vmm/src/config.rs

vmm/src/cpu.rs

vmm/src/vm.rs

phip1611 · 2025-08-27T10:27:58Z

Follow-up of #7299 (comment): I just noticed that using > 512 vCPUs on my AMD Ryzen 7 7840U w/ Radeon 780M Graphics causes:

Error: Cloud Hypervisor exited with the following chain of errors:
  0: Error booting VM
  1: The VM could not boot
  2: Error from CPU manager
  3: Error creating vCPU
  4: Failed to create Vcpu
  5: Invalid argument (os error 22)

This surprises me. I expected that at least 4096 vCPUs should be supported. Aye, @tpressure?

tpressure · 2025-08-27T11:57:08Z

General remark: we should not generate an MPTable if we have more than 255 vCPUs but solely rely on the MADT.

tpressure · 2025-08-27T14:59:22Z

I'm also missing the part where you expose KVM_FEATURE_MSI_EXT_DEST_ID. Is this missing on purpose?

tpressure · 2025-08-27T15:34:41Z

> Follow-up of #7299 (comment): I just noticed that using > 512 vCPUs on my AMD Ryzen 7 7840U w/ Radeon 780M Graphics causes:
>
> Error: Cloud Hypervisor exited with the following chain of errors:
>   0: Error booting VM
>   1: The VM could not boot
>   2: Error from CPU manager
>   3: Error creating vCPU
>   4: Failed to create Vcpu
>   5: Invalid argument (os error 22)
>
> This surprises me. I expected that at least 4096 vCPUs should be supported. Aye, @tpressure?

On my (Intel) system, it works fine with 768 vCPUs. I will check with my AMD system tomorrow (evening). Did you make sure to increase your ulimit correctly? I'm using ulimit -n 10000.
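
The ulimit question matters because every vCPU created through KVM is backed by at least one file descriptor, so a VM with many hundreds of vCPUs can exceed a default soft limit. As a minimal sketch (not part of this PR), the soft limit can be read from `/proc/self/limits` before booting. The function names and the idea of a pre-flight check are assumptions made for illustration.

```rust
use std::fs;

/// Parse the soft "Max open files" limit from the text of /proc/self/limits.
/// Returns None if the row is missing or the soft limit is "unlimited".
fn soft_nofile_limit(limits_text: &str) -> Option<u64> {
    const ROW: &str = "Max open files";
    limits_text
        .lines()
        .find(|line| line.starts_with(ROW))
        // Row layout: "Max open files  <soft>  <hard>  files"
        .and_then(|line| line[ROW.len()..].split_whitespace().next())
        .and_then(|soft| soft.parse().ok())
}

/// Hypothetical pre-flight check: is the soft limit at least `needed_fds`?
/// A missing or "unlimited" entry is treated as sufficient.
fn nofile_limit_is_sufficient(needed_fds: u64) -> std::io::Result<bool> {
    let text = fs::read_to_string("/proc/self/limits")?;
    Ok(soft_nofile_limit(&text).map_or(true, |soft| soft >= needed_fds))
}
```

How many descriptors a given vCPU count really needs depends on the VMM and the configured devices, so `needed_fds` is left to the caller.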

posk-io · 2025-08-27T23:55:28Z

> I'm also missing the part where you expose KVM_FEATURE_MSI_EXT_DEST_ID. Is this missing on purpose?

This was done in a previous PR:

cloud-hypervisor/arch/src/x86_64/mod.rs

Line 756 in 5357761

entry.eax |= 1 << KVM_FEATURE_MSI_EXT_DEST_ID;
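
For context, this feature bit tells the guest that it may use the extended destination ID bits in the MSI address, which is what lets interrupts target APIC IDs above 255 without interrupt remapping. Below is a minimal, self-contained sketch of setting the bit. The `CpuidEntry` struct and `advertise_msi_ext_dest_id` function are simplified stand-ins invented for this example, not the types used in `arch/src/x86_64/mod.rs`. The bit position (15) and the leaf number (0x4000_0001) follow the Linux KVM paravirt ABI.

```rust
/// Bit position of KVM_FEATURE_MSI_EXT_DEST_ID in the Linux KVM paravirt ABI.
const KVM_FEATURE_MSI_EXT_DEST_ID: u32 = 15;
/// CPUID leaf on which KVM reports its paravirtual feature bits.
const KVM_CPUID_FEATURES: u32 = 0x4000_0001;

/// Simplified stand-in for a CPUID entry (hypothetical type for this sketch).
#[derive(Debug, Default, Clone, Copy)]
struct CpuidEntry {
    function: u32,
    eax: u32,
}

/// Set the MSI extended destination ID bit on the KVM feature leaf only,
/// leaving every other leaf untouched.
fn advertise_msi_ext_dest_id(entries: &mut [CpuidEntry]) {
    for entry in entries
        .iter_mut()
        .filter(|e| e.function == KVM_CPUID_FEATURES)
    {
        entry.eax |= 1 << KVM_FEATURE_MSI_EXT_DEST_ID;
    }
}
```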

posk-io · 2025-08-27T23:56:28Z

> General remark: we should not generate an MPTable if we have more than 255 vCPUs but solely rely on the MADT.

This was done in a previous PR:

cloud-hypervisor/arch/src/x86_64/mptable.rs

Line 137 in 5357761

if x2apic_id_max >= MAX_SUPPORTED_CPUS_LEGACY {
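
The reason for this check is that MP table entries carry 8-bit APIC IDs, so large x2APIC IDs can only be described through the ACPI MADT. A minimal sketch of the decision follows. The enum and function are hypothetical, and the value 254 for `MAX_SUPPORTED_CPUS_LEGACY` is an assumption for this example rather than something stated in this thread.

```rust
/// Assumed value for this sketch; the real constant lives in the arch crate.
const MAX_SUPPORTED_CPUS_LEGACY: u32 = 254;

/// Which firmware tables describe the CPUs to the guest (hypothetical type).
#[derive(Debug, PartialEq)]
enum FirmwareTables {
    MpTableAndMadt,
    MadtOnly,
}

/// Skip the MP table once the largest x2APIC ID reaches the legacy limit,
/// mirroring the comparison quoted above.
fn tables_for(x2apic_id_max: u32) -> FirmwareTables {
    if x2apic_id_max >= MAX_SUPPORTED_CPUS_LEGACY {
        FirmwareTables::MadtOnly
    } else {
        FirmwareTables::MpTableAndMadt
    }
}
```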

posk-io · 2025-08-28T00:00:57Z

Update: I was able to boot VMs with 1024 vCPUs on both Intel and AMD hosts. For some reason max KVM host limit is 1024 on machines I have access to. Does anybody know if this is tunable?

I'll lower the max vCPU limit in the PR to 8192, as this seems to be the Linux kernel's max, and do the cosmetic fixes suggested tomorrow. Will also experiment with the integration test (although it may be flaky, as it is unclear how the kvm host nr_cpu limit is determined).

up2wing · 2025-09-01T02:19:19Z

> Update: I was able to boot VMs with 1024 vCPUs on both Intel and AMD hosts. For some reason max KVM host limit is 1024 on machines I have access to. Does anybody know if this is tunable?
>
> I'll lower the max vCPU limit in the PR to 8192, as this seems to be the Linux kernel's max, and do the cosmetic fixes suggested tomorrow. Will also experiment with the integration test (although it may be flaky, as it is unclear how the kvm host nr_cpu limit is determined).

KVM has a KVM_MAX_VCPUS macro, which can be set through CONFIG_KVM_MAX_NR_VCPUS. So I think the best approach is not to hard-code a max value in Cloud Hypervisor but to query the value from KVM instead.

For example, QEMU does it like this:
https://github.com/qemu/qemu/blob/master/accel/kvm/kvm-all.c#L2692

likebreath

LGTM. Two minor comments.

vmm/src/lib.rs

vmm/src/vm.rs

likebreath · 2025-09-02T21:25:59Z

> Update: I was able to boot VMs with 1024 vCPUs on both Intel and AMD hosts. For some reason max KVM host limit is 1024 on machines I have access to. Does anybody know if this is tunable?
>
> I'll lower the max vCPU limit in the PR to 8192, as this seems to be the Linux kernel's max, and do the cosmetic fixes suggested tomorrow. Will also experiment with the integration test (although it may be flaky, as it is unclear how the kvm host nr_cpu limit is determined).

> KVM has a KVM_MAX_VCPUS macro, which can be set through CONFIG_KVM_MAX_NR_VCPUS. So I think the best approach is not to hard-code a max value in Cloud Hypervisor but to query the value from KVM instead.
>
> For example, QEMU does it like this: https://github.com/qemu/qemu/blob/master/accel/kvm/kvm-all.c#L2692

The current implementation already checks with KVM_MAX_VCPUS - see Error::MaximumVcpusExceeded. We kind of need a constant limit (say MAX_SUPPORTED_CPUS) to clarify the different limits on different architectures.
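
The two limits discussed here can be combined in a single check: a compile-time, per-architecture ceiling in the VMM, and the limit the host KVM reports at runtime (for example through the `KVM_CAP_MAX_VCPUS` capability). The sketch below only illustrates that idea. `VcpuLimitError` and `check_vcpu_count` are hypothetical names and are not the code behind `Error::MaximumVcpusExceeded`.

```rust
/// Why a requested vCPU count was rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
enum VcpuLimitError {
    /// Above the VMM's compile-time ceiling for this architecture.
    ExceedsVmmLimit { requested: u32, limit: u32 },
    /// Above what the host KVM reports it can create.
    ExceedsHostLimit { requested: u32, limit: u32 },
}

/// Validate `requested` against both limits. `host_limit` is assumed to have
/// been queried from KVM by the caller; this sketch performs no ioctl itself.
fn check_vcpu_count(
    requested: u32,
    vmm_limit: u32,
    host_limit: u32,
) -> Result<u32, VcpuLimitError> {
    if requested > vmm_limit {
        return Err(VcpuLimitError::ExceedsVmmLimit { requested, limit: vmm_limit });
    }
    if requested > host_limit {
        return Err(VcpuLimitError::ExceedsHostLimit { requested, limit: host_limit });
    }
    Ok(requested)
}
```

With the numbers from this thread, a VMM ceiling of 8192 and a host limit of 1024 would accept a request for 1024 vCPUs and reject 2048 with the host-limit error.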

likebreath · 2025-09-02T21:33:08Z

> Will also experiment with the integration test (although it may be flaky, as it is unclear how the kvm host nr_cpu limit is determined).

That would be great to have. The challenge is that the vast majority of our integration tests run on fairly small Azure VMs (8/16 vCPUs), and I'd assume those can't launch a nested VM with 255+ vCPUs. We do have a couple of baremetal x86_64 self-hosted runners, and we may be able to experiment with these machines. Not a blocker for landing this PR.

likebreath · 2025-09-05T17:46:11Z

@posk-io I see we are very close to concluding this series by landing this PR (just two small comments above). It would be good to include the increased vCPU limits in our next release (scheduled for next Thursday). Thanks.

posk-io · 2025-09-05T18:49:16Z

> @posk-io I see we are very close to concluding this series by landing this PR (just two small comments above). It would be good to include the increased vCPU limits in our next release (scheduled for next Thursday). Thanks.

Ack. Give me a couple of hours (traveling - not in my usual habitat).

phip1611 · 2025-09-05T18:55:42Z

Very nit but please strip the Intel from the PR name, as this works for Intel and AMD. This will help developers in the future to look through old PRs without getting confused

posk-io · 2025-09-05T20:16:05Z

> Very nit but please strip the Intel from the PR name, as this works for Intel and AMD. This will help developers in the future to look through old PRs without getting confused

Done.

Signed-off-by: Peter Oskolkov <[email protected]>

Raise the max number of supported (v)CPUs on kvm x86_64 hosts to 8192 (the max allowed value of CONFIG_NR_CPUS in the Linux kernel).

Other platforms keep their existing CPU limits pending further development and testing.

The change has been tested on Intel and AMD hosts.

Signed-off-by: Barret Rhoden <[email protected]>
Signed-off-by: Neel Natu <[email protected]>
Signed-off-by: Ofir Weisse <[email protected]>
Signed-off-by: Peter Oskolkov <[email protected]>

posk-io requested a review from a team as a code owner August 26, 2025 18:52

phip1611 requested changes Aug 27, 2025

phip1611 reviewed Aug 27, 2025

vmm/src/vm.rs (outdated)

posk-io force-pushed the x2apic-config branch 2 times, most recently from b80ff51 to 28ee5ce Compare August 28, 2025 18:12

likebreath reviewed Sep 2, 2025

vmm/src/lib.rs (outdated)

vmm/src/vm.rs (outdated)

posk-io changed the title from "vmm: raise the (v)CPU limit on kvm/Intel/x86_64" to "vmm: raise the (v)CPU limit on kvm/x86_64" Sep 5, 2025

posk-io force-pushed the x2apic-config branch 2 times, most recently from b0a0e0a to cd04317 Compare September 5, 2025 20:45

posk-io added 2 commits September 5, 2025 20:45

arch: x86_64: make MAX_SUPPORTED_CPUS_LEGACY public

31a2ecd

Signed-off-by: Peter Oskolkov <[email protected]>

posk-io force-pushed the x2apic-config branch from cd04317 to a803cff Compare September 5, 2025 20:46

likebreath approved these changes Sep 8, 2025

likebreath added this pull request to the merge queue Sep 8, 2025

likebreath added this to Cloud Hypervisor Roadmap Sep 8, 2025

github-project-automation bot moved this to 🆕 New in Cloud Hypervisor Roadmap Sep 8, 2025

likebreath moved this from 🆕 New to ✅ Done in Cloud Hypervisor Roadmap Sep 8, 2025

Merged via the queue into cloud-hypervisor:main with commit 05d222f Sep 8, 2025
40 checks passed

posk-io deleted the x2apic-config branch September 9, 2025 18:19

likebreath mentioned this pull request Sep 9, 2025

Add an integration test for VMs using 255+ vCPUs #7341

Open

rbradford mentioned this pull request Dec 10, 2025

Max vcpu limit of 255 is too low #5345

Open

5 participants
