core: save/restore FPU registers in VM entry/exit #85
fxsave((mword *)hfx);
fxrstor((mword *)gfx);
}
fxsave((mword *)hfx);
To minimize the performance penalty of the unconditional FXSAVE/FXRSTOR, we should replace them with XSAVEOPT/XRSTOR when the host supports them. According to the description of XSAVEOPT in Intel SDM Vol. 2C, the instruction can skip writing state components that are in their initial configuration or have not been modified since they were last restored, which avoids unnecessary memory writes. In addition, only the XSAVE family of instructions can save/restore AVX context.
Of course, this optimization can and should be implemented in a future patch.
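For reference, a minimal sketch of how the "if the host supports them" check could look. The function and enum names are hypothetical and not part of this patch; the bit positions are the ones documented for CPUID leaf 01H (ECX bit 26 = XSAVE, bit 27 = OSXSAVE) and leaf 0DH, sub-leaf 1 (EAX bit 0 = XSAVEOPT). The caller is assumed to have executed CPUID already and to pass in the raw register values.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from the patch. */
enum fpu_save_method {
    FPU_SAVE_FXSAVE,    /* legacy x87/SSE state only */
    FPU_SAVE_XSAVE,     /* XSAVE available, XSAVEOPT not reported */
    FPU_SAVE_XSAVEOPT   /* XSAVEOPT available */
};

#define CPUID_01_ECX_XSAVE       (1u << 26) /* CPU supports XSAVE/XRSTOR */
#define CPUID_01_ECX_OSXSAVE     (1u << 27) /* OS has set CR4.OSXSAVE */
#define CPUID_0D_1_EAX_XSAVEOPT  (1u << 0)  /* CPU supports XSAVEOPT */

/*
 * cpuid_01_ecx:   ECX returned by CPUID with EAX=01H
 * cpuid_0d_1_eax: EAX returned by CPUID with EAX=0DH, ECX=1
 */
static enum fpu_save_method
pick_fpu_save_method(uint32_t cpuid_01_ecx, uint32_t cpuid_0d_1_eax)
{
    const uint32_t need = CPUID_01_ECX_XSAVE | CPUID_01_ECX_OSXSAVE;

    /* Without both bits, XSAVE-family instructions must not be used. */
    if ((cpuid_01_ecx & need) != need)
        return FPU_SAVE_FXSAVE;

    if (cpuid_0d_1_eax & CPUID_0D_1_EAX_XSAVEOPT)
        return FPU_SAVE_XSAVEOPT;

    return FPU_SAVE_XSAVE;
}
```

Keeping the decision separate from the CPUID instruction itself means it could be evaluated once at module load and cached, with FXSAVE/FXRSTOR remaining as the fallback path.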
As to the question of whether we should use CR0.TS and #NM (Device Not Available exception) to implement lazy FPU (including SSE, etc.) context switching, here's the official recommendation from Intel SDM Vol. 3A 13.4 (Designing OS Facilities for Saving x87 FPU, SSE and Extended States on Task or Context Switches). Basically the answer is no:
The operating system can take the responsibility for saving the states as part of the task switch process, but delay the restoring of the states until an instruction operating on the states is actually executed by the new task. See Section 13.4.1, “Using the TS Flag to Control the Saving of the x87 FPU and SSE State,” for more information. This approach is called lazy restore.
The use of lazy restore mechanism in context switches is not recommended when XSAVE feature set is used to save/restore states for the following reasons.
— With XSAVE feature set, Intel processors have optimizations in place to avoid saving the state components that are in their initial configurations or when they have not been modified since they were restored last. These optimizations eliminate the need for lazy restore. See section 13.5.4 in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.
— Intel processors have power optimizations when state components are in their initial configurations. Use of lazy restore retains the non-initial configuration of the last thread and is not power efficient.
— Not all extended states support lazy restore mechanisms. As such, when one or more such states are enabled it becomes very inefficient to use lazy restore as it results in two separate state restore, one in context switch for the states that does not support lazy restore and one in the #NM handler for states that support lazy restore.
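To make the difference concrete, here is a toy model of the two strategies described in the quote. It is not kernel code: the struct, field, and function names are hypothetical, CR0.TS is modeled as a plain boolean, and the #NM handler as an ordinary branch. It only counts how many state restores each strategy performs.

```c
#include <stdbool.h>

/* Hypothetical model; every name here is illustrative only. */
struct cpu_model {
    bool ts;           /* stands in for CR0.TS */
    int  current_task; /* task currently scheduled */
    int  loaded_task;  /* task whose FPU state is in the registers */
    int  restores;     /* number of state restores performed */
};

/* Lazy restore: the outgoing state is saved at switch time (not modeled),
 * but the incoming state is not loaded; TS is set instead. */
static void lazy_switch(struct cpu_model *c, int next)
{
    c->current_task = next;
    c->ts = true;
}

/* Models executing an FPU/SSE instruction. With TS set, real hardware
 * raises #NM and the handler clears TS and restores the state. */
static void fpu_instruction(struct cpu_model *c)
{
    if (c->ts) {
        c->ts = false; /* what CLTS does in a real #NM handler */
        if (c->loaded_task != c->current_task) {
            c->loaded_task = c->current_task;
            c->restores++;
        }
    }
}

/* Eager restore: the incoming state is loaded on every switch. */
static void eager_switch(struct cpu_model *c, int next)
{
    c->current_task = next;
    c->loaded_task = next;
    c->restores++;
}
```

In this model, lazy switching leaves the previous task's state in the registers until the next FPU instruction, which is the "retains the non-initial configuration of the last thread" behavior the SDM warns about.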
Force-pushed from 06fa035 to 63af3c2.
A new warning is generated.
The guest OS kernel or its applications may use SSE instructions and registers. On VM exit, these registers must be saved; otherwise the host OS or host applications may overwrite them, and the guest's SSE register context will be corrupted at the next VM entry. Guest application segfaults and kernel panics have been reported that are likely related to this issue. This change removes the is_fpu_used flag so that guest FPU registers are saved on VM exit and restored on VM entry unconditionally. Fixes #39, fixes #74.
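The unconditional swap can be illustrated with a small model. This is a sketch under stated assumptions, not the patch itself: the real code uses the FXSAVE/FXRSTOR instructions on 16-byte-aligned 512-byte areas, whereas here the live registers are a plain buffer and the save/restore helpers are `memcpy` stand-ins. Only the names `hfx` and `gfx` come from the diff; everything else is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* FXSAVE uses a 512-byte area; this model ignores alignment. */
struct fx_area { uint8_t bytes[512]; };

/* Stand-in for the CPU's live x87/SSE registers (model only). */
static struct fx_area cpu_regs;

/* memcpy stand-ins for the FXSAVE / FXRSTOR instructions. */
static void model_fxsave(struct fx_area *dst)
{
    memcpy(dst, &cpu_regs, sizeof(*dst));
}

static void model_fxrstor(const struct fx_area *src)
{
    memcpy(&cpu_regs, src, sizeof(cpu_regs));
}

/* hfx = host save area, gfx = guest save area (names from the diff). */
struct vcpu_model {
    struct fx_area hfx;
    struct fx_area gfx;
};

/* Before entering the guest: always save host state, load guest state. */
static void vm_entry(struct vcpu_model *v)
{
    model_fxsave(&v->hfx);
    model_fxrstor(&v->gfx);
}

/* After leaving the guest: always save guest state, load host state. */
static void vm_exit(struct vcpu_model *v)
{
    model_fxsave(&v->gfx);
    model_fxrstor(&v->hfx);
}
```

Because neither direction is gated on a flag, whatever the host does to the registers between an exit and the next entry cannot leak into the guest's saved context.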
Force-pushed from 63af3c2 to 6c2cd4d.