-
Notifications
You must be signed in to change notification settings - Fork 2
vmm: fix kicking vCPU out of KVM_RUN vCPU from signal handler #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vmm: fix kicking vCPU out of KVM_RUN vCPU from signal handler #20
Conversation
88f62ce to
9cf2a8d
Compare
215c4c7 to
58691e2
Compare
|
Any idea how to abstract away the things we have to do here more nicely? I am thinking about a new function on the Vcpu trait that does something like |
b00df93 to
867af0a
Compare
olivereanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make this simpler and more efficient if we simply set kvm_run in a thread local as I suggest in my comments below:
867af0a to
5246c44
Compare
I think we should make the PoC "good enough" to merge first and then answer this question together with what to do with the single vCPU removal case when we start preparing this for production usage/upstream. |
perhaps we could add a |
cf011a3 to
24f2291
Compare
|
Please re-check and also review this please commit-by-commit and read the commit messages. |
24f2291 to
151a827
Compare
olivereanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review of the first commit.
olivereanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review of commit 2: LGTM
1a59d0e to
81897a2
Compare
be0303e to
dc79b66
Compare
olivereanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the code that sets KVM_RUN into the spawned thread. Otherwise each thread will have a null pointer for kvm_run. Apart from that this is good.
dc79b66 to
c8b6a7b
Compare
c41c34d to
39948f3
Compare
No need to grab the lock multiple times in this short period of time. The lock is anyway held for the duration of the long operation (KVM_RUN).
These are the prerequisites for the upcoming (quick and dirty) solution to the problem that we might miss some events.
A common scenario for a VMM to regain control over the vCPU thread from the hypervisor is to interrupt the vCPU. A use-case might be the `pause` API call of CHV. VMMs using KVM as hypervisor must use signals for this interception, i.e., a thread sends a signal to the vCPU thread. Sending and handling these signals is inherently racy because the signal sender does not know if the receiving thread is currently in the RUN_VCPU [0] call, or executing userspace VMM code. If we are in kernel space in KVM_RUN, things are easy as KVM just exits with -EINTR. For user-space this is more complicated. For example, it might happen that we receive a signal but the vCPU thread was about to go into the KVM_RUN system call as next instruction. There is no more opportunity to check for any pending signal flag or similar. KVM offers the `immediate_exit` flag [1] as part of the KVM_RUN structure for that. The signal handler of a vCPU is supposed to set this flag, to ensure that we do not miss any events. If the flag is set, KVM_RUN will exit immediately [2]. We will miss signals to the vCPU if the vCPU thread is in userspace VMM code and we do not use the `immediate_exit` flag. We must have access to the KVM_RUN data structure when the signal handler executes in a vCPU thread's context and set the `immediate_exit` [1] flag. This way, the next invocation of KVM_RUN exits immediately and the userspace VMM code can do the normal event handling. We must not use any shared locks between the normal vCPU thread VMM code and the signal handler, as otherwise we might end up in deadlocks. The signal handler therefore needs its dedicated mutable version of KVM_RUN. This commit introduces a (very hacky but good enough for a PoC) solution to this problem. [0] https://docs.kernel.org/virt/kvm/api.html#kvm-run [1] https://docs.kernel.org/virt/kvm/api.html#the-kvm-run-structure [2] https://elixir.bootlin.com/linux/v6.12/source/arch/x86/kvm/x86.c#L11566
3557949 to
5877ee0
Compare
olivereanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for the changes!
7586272
into
cyberus-technology:gardenlinux
About
vmm: fix kicking vCPU out of kvm run vCPU from signal handler
More details in the commit messages!
Checklist (for Author)
pipeline succeeded (currently this must be done manually locally)
Hints for Reviewers
Steps to Undraft (if draft)
None.