qemu: refactor maximum vcpus supported in aarch64#585
qemu: refactor maximum vcpus supported in aarch64#585devimc merged 2 commits intokata-containers:masterfrom
Conversation
|
Build failed (third-party-check pipeline) integration testing with
|
sboeuf
left a comment
There was a problem hiding this comment.
Looks good to me!
Only a few comments.
And also please extend the unit testing for the function MaxQemuVCPUs().
virtcontainers/qemu_arm64.go
Outdated
| func MaxQemuVCPUs() uint32 { | ||
| bytes, err := ioutil.ReadFile(interruptFile) | ||
| if err != nil { | ||
| qemuArmLogger().WithError(err).Error("Failed to read /proc/interrrupts") |
There was a problem hiding this comment.
s/interrrupts/interrupts/ You've added one extra r ;)
There was a problem hiding this comment.
Can you use the variable interruptFile here instead?
virtcontainers/qemu_arm64.go
Outdated
| qemuArmLogger().WithError(err).Error("Failed to read /proc/interrrupts") | ||
| } | ||
| for gicType, vCPUs := range gicList { | ||
| pattern := regexp.MustCompile(`\b` + gicType + `\b`) |
There was a problem hiding this comment.
The regex seems a bit too much here, maybe something like:
if strings.Contains(string(bytes), gicType) {
return vCPUs
}might be simpler and cost less to process.
There was a problem hiding this comment.
yep, that's better, updated asap.
There was a problem hiding this comment.
when I added support for arm64 I was unable to start a VM with maxcpus > actual number of physical CPUs. I think it's a limitation of KVM, having said that, I believe that this PR will fail on systems with gicV3 and actual number of physical CPUs < 123. @Pennyzct let me know when this PR is ready, I'd like to take a look
|
@Pennyzct Looks good. Please add a unit test as well. You can override |
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
Codecov Report
@@ Coverage Diff @@
## master #585 +/- ##
=======================================
Coverage 65.34% 65.34%
=======================================
Files 85 85
Lines 9846 9846
=======================================
Hits 6434 6434
Misses 2756 2756
Partials 656 656 |
bergwolf
left a comment
There was a problem hiding this comment.
LGTM! Just one comment about the source of the GIC version vs. vcpu number mapping.
virtcontainers/qemu_arm64.go
Outdated
| //on aarch64, we support different gic interrupt controllers | ||
| //maximum number of vCPUs depends on the GIC version, or on how | ||
| //many redistributors we can fit into the memory map. | ||
| var gicList = map[string]uint32{ |
There was a problem hiding this comment.
These are plain magic mapping. Is there any documentation that we can refer to in the public domain? If so, please add it to the above comments.
There was a problem hiding this comment.
OK.. That is really dark magic. Please refer to these places so that we know where to look at if they need change in future.
And qemu doesn't seem to have gic version 4 ( virt_get_gic_version ). Is it added to be future proof?
There was a problem hiding this comment.
virtcontainers/qemu_arm64.go
Outdated
| } | ||
| for gicType, vCPUs := range gicList { | ||
| if strings.Contains(string(bytes), gicType) { | ||
| return vCPUs |
There was a problem hiding this comment.
I tested this PR in a system with GICv3 and 96 cores. This function returns 123, unfortunately it won't be honoured since the maximum number of vCPUs can't be exceeded
Lines 215 to 232 in 14bcd69
My concern with this patch is that the actual number of physical cores will be exceeded and the memory footprint will be big (again) see 07db945
There was a problem hiding this comment.
@Pennyzct have you tested this change in a system with GIC >= 3 and physical CPUs > 123 ?
There was a problem hiding this comment.
Hi~ @devimc That's how I found the problem and pulled this pr😊. I was installing and running kata in new ThunderX II which contains 224 physical cpu cores.
:~# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 224
On-line CPU(s) list: 0-223
Thread(s) per core: 4
Core(s) per socket: 28
Socket(s): 2
NUMA node(s): 2
NUMA node0 CPU(s): 0-111
NUMA node1 CPU(s): 112-223
Err occurred and outputs were as follows:
docker: Error response from daemon: OCI runtime create failed: qemu-system-aarch64: Number of SMP CPUs requested (224) exceeds max CPUs supported by machine 'mach-virt' (123): unknown
when applying my patch, vCPUs will be reduced into 123 in func defaultMaxVCPUs as you mentioned above and kata will run successfully.
|
Based on upstream discussion, wei @Weichen81 and I issued #614 and #615 to discuss the missing gicv4 scenario and max vCPU varying according to qemu version. This pr may need them landed firstly. 😊 |
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
virtcontainers/qemu_arm64.go
Outdated
| return uint32(runtime.NumCPU()) | ||
| if hostGICVersion != 0 { | ||
| return gicList[hostGICVersion] | ||
| } else { |
virtcontainers/qemu_arm64_test.go
Outdated
|
|
||
| assert.Equal(d.expectedResult, vCPUs) | ||
| } | ||
|
|
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
|
updated. ptal. @devimc @jodh-intel 😊 |
|
Nasty 16.04 CI failure which seems to be hotplug-related. Using a Clear Linux image (kernel Note: There's a bit of corruption on the /cc @devimc, @grahamwhaley. |
|
lgtm |
|
@jodh-intel about the kernel trace. Considering that happen when CPU hotplug is happening and taking a look to the code. Seems that fits in the first case that is documented, so I think that is expected(?). @mcastelino @devimc @grahamwhaley * We should be running this queue from one of the CPUs that
* are mapped to it.
*
* There are at least two related races now between setting
* hctx->next_cpu from blk_mq_hctx_next_cpu() and running
* __blk_mq_run_hw_queue():
*
* - hctx->next_cpu is found offline in blk_mq_hctx_next_cpu(),
* but later it becomes online, then this warning is harmless
* at all
* |
|
@jcvenegas - good find! It seems like the kernel is dumping debug info to help the devs debug the races referred to there but atleast it's not a "real" crash :) |
|
Build succeeded (third-party-check pipeline).
|
on aarch64, we support different gic interrupt controllers. The maximum number of vCPUs depends on the GIC version, or on how many redistributors we can fit into the memory map. Fixes: kata-containers#584 Signed-off-by: Penny Zheng <[email protected]> Signed-off-by: Wei Chen <[email protected]>
we should add unit test for func MaxQemuVCPUS in qemu_amd64_test.go Signed-off-by: Penny Zheng <[email protected]> Signed-off-by: Wei Chen <[email protected]>
|
just re-based it on the latest master branch. 😊. @jodh-intel |
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
|
@devimc could you give it a last review to this PR? |
agent: sandbox_pause should not take arguments
on aarch64, we support different gic interrupt controllers.
The maximum number of vCPUs depends on the GIC version, or on how many redistributors we can fit into the memory map.
Fixes: #584
Signed-off-by: Penny Zheng [email protected]
Signed-off-by: Wei Chen [email protected]