Description
FYI: We (=@cyberus-technology) do have a very hacky solution. Will take time to upstream.
Short Bug Description
- we are not clearing immediate_exit on -EINTR
- we are sometimes exiting too late / missing signals
Long Bug Description
A common scenario for a VMM to regain control over the vCPU thread from
the hypervisor is to interrupt the vCPU. A use-case might be the pause
API call of CHV.
VMMs using KVM as hypervisor must use signals for this interruption, i.e., a
thread sends a signal to the vCPU thread. Sending and handling these signals
is inherently racy because the signal sender does not know whether the
receiving thread is currently in the KVM_RUN [0] call or executing userspace
VMM code.
If we are in kernel space in KVM_RUN, things are easy as KVM just exits with
-EINTR. For userspace this is more complicated. For example, it might
happen that we receive a signal while the vCPU thread is about to enter the
KVM_RUN system call as its next instruction. At that point there is no
further opportunity to check for a pending signal flag or similar.
KVM offers the immediate_exit flag [1] as part of the kvm_run structure
for that. The signal handler of a vCPU is supposed to set this flag to
ensure that we do not miss any events. If the flag is set, KVM_RUN
exits immediately with -EINTR [2].
We will miss signals to the vCPU if the vCPU thread is in userspace VMM
code and we do not use the immediate_exit flag.
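
To make the two cases concrete, here is a minimal model of the flow in plain Rust. It does not talk to KVM at all: FakeKvmRun, fake_kvm_run, on_kick_signal and vcpu_step are hypothetical names invented for this sketch, and the "signal handler" is an ordinary function call. It only illustrates why a kick that arrives just before KVM_RUN is lost without the flag, and why the flag has to be cleared again after -EINTR.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical stand-in for the `immediate_exit` byte of the kvm_run mapping.
pub struct FakeKvmRun {
    pub immediate_exit: AtomicU8,
}

impl FakeKvmRun {
    pub fn new() -> Self {
        Self { immediate_exit: AtomicU8::new(0) }
    }
}

/// Outcome of one simulated KVM_RUN call.
#[derive(Debug, PartialEq, Eq)]
pub enum RunResult {
    /// The vCPU entered the guest; a kick delivered earlier went unnoticed.
    EnteredGuest,
    /// Models KVM_RUN returning -EINTR without entering the guest.
    Interrupted,
}

/// Models the check KVM performs on entry to KVM_RUN.
pub fn fake_kvm_run(run: &FakeKvmRun) -> RunResult {
    if run.immediate_exit.load(Ordering::Acquire) != 0 {
        RunResult::Interrupted
    } else {
        RunResult::EnteredGuest
    }
}

/// Models the signal handler running while the vCPU thread is still in
/// userspace VMM code. `set_flag == false` models a handler that does nothing.
pub fn on_kick_signal(run: &FakeKvmRun, set_flag: bool) {
    if set_flag {
        run.immediate_exit.store(1, Ordering::Release);
    }
}

/// One iteration of the vCPU loop: run, and clear the flag after an
/// interrupted run so that later KVM_RUN calls are not cut short as well.
pub fn vcpu_step(run: &FakeKvmRun) -> RunResult {
    let result = fake_kvm_run(run);
    if result == RunResult::Interrupted {
        run.immediate_exit.store(0, Ordering::Release);
    }
    result
}

fn main() {
    let run = FakeKvmRun::new();

    // Handler without immediate_exit: the kick is lost, the guest runs.
    on_kick_signal(&run, false);
    assert_eq!(vcpu_step(&run), RunResult::EnteredGuest);

    // Handler sets immediate_exit: the next KVM_RUN returns right away.
    on_kick_signal(&run, true);
    assert_eq!(vcpu_step(&run), RunResult::Interrupted);

    // The flag was cleared, so the following run enters the guest again.
    assert_eq!(vcpu_step(&run), RunResult::EnteredGuest);
}
```

In a real VMM the store in on_kick_signal would target the mmap'ed kvm_run region of the interrupted vCPU, and the clear in vcpu_step corresponds to the first item of the short bug description.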
The signal handler executes in the vCPU thread's context, so it must have
access to that vCPU's kvm_run data structure in order to set the
immediate_exit [1] flag. This way, the next invocation of KVM_RUN
exits immediately and the userspace VMM code can do the normal event
handling.
We must not share any locks between the normal vCPU thread VMM code and
the signal handler, as otherwise we might end up in deadlocks: the handler
may interrupt the thread while it holds the lock. The signal handler
therefore needs its own dedicated mutable reference to the kvm_run structure.
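
One possible shape for such lock-free access, sketched with std only. This is not the hacky solution mentioned above and not Cloud Hypervisor's code: register_flag, handle_kick, IMMEDIATE_EXIT and DEMO_FLAG are hypothetical names, the handler body is called directly instead of being installed via sigaction, and DEMO_FLAG stands in for the immediate_exit byte of the real kvm_run mapping. The sketch assumes that reading a const-initialized thread-local and doing an atomic store are acceptable inside a signal handler.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, AtomicU8, Ordering};

thread_local! {
    /// Per-thread pointer to this vCPU's immediate_exit byte.
    /// Const-initialized to null, so reading it needs no lazy setup or lock.
    static IMMEDIATE_EXIT: AtomicPtr<AtomicU8> =
        const { AtomicPtr::new(ptr::null_mut()) };
}

/// Stand-in for the immediate_exit byte inside the kvm_run mapping.
pub static DEMO_FLAG: AtomicU8 = AtomicU8::new(0);

/// Called once by the vCPU thread before it enters its run loop.
/// Requiring `'static` keeps the stored pointer valid for the thread's life.
pub fn register_flag(flag: &'static AtomicU8) {
    let raw = flag as *const AtomicU8 as *mut AtomicU8;
    IMMEDIATE_EXIT.with(|slot| slot.store(raw, Ordering::Release));
}

/// Body of the hypothetical signal handler: no locks, no allocation,
/// only a thread-local read and one atomic store.
pub fn handle_kick() {
    IMMEDIATE_EXIT.with(|slot| {
        // SAFETY: the slot is either null or was filled by `register_flag`
        // from a `&'static AtomicU8`, so a non-null pointer is valid.
        if let Some(flag) = unsafe { slot.load(Ordering::Acquire).as_ref() } {
            flag.store(1, Ordering::Release);
        }
    });
}

fn main() {
    // A thread that never registered a flag ignores the kick.
    handle_kick();
    assert_eq!(DEMO_FLAG.load(Ordering::Acquire), 0);

    // After registration, the handler reaches the flag without any lock.
    register_flag(&DEMO_FLAG);
    handle_kick();
    assert_eq!(DEMO_FLAG.load(Ordering::Acquire), 1);
}
```

Because the pointer lives in thread-local storage, the handler always reaches the flag of the vCPU thread it interrupted, and the regular run loop never has to take a lock that the handler could also want.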
[0] https://docs.kernel.org/virt/kvm/api.html#kvm-run
[1] https://docs.kernel.org/virt/kvm/api.html#the-kvm-run-structure
[2] https://elixir.bootlin.com/linux/v6.12/source/arch/x86/kvm/x86.c#L11566
To Reproduce
Send a signal while the vCPU thread executes userspace VMM code and is just about to enter KVM_RUN. This is hard to reproduce. One can test it with thousands of pause()-resume() cycles.