-
Notifications
You must be signed in to change notification settings - Fork 565
hypervisor: Fix live migration when amx features are enabled #7536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hypervisor: Fix live migration when amx features are enabled #7536
Conversation
|
I think you have to change your commit messages from |
In order for guests to use AMX it is necessary to ask the kernel to enable the related state components for guests. While cloud hypervisor already does this, we would prefer to extract the logic into a stand alone (reusable) function. In this commit we only introduce the error type that will later be part of the enable_amx_state_components function's signature. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
We introduce a static enable_amx_state_components method on the XSaveState struct that will be used in a follow up commit. We will also extend the logic of what this method does in the near future. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
We use the central amx_state_components enabling function in the CpuManager constructor. This way we can make changes to the AMX related state components functionality without needing to update the CpuManager's constructor. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP <[email protected]>
We will need to query KVM for the size of the xsave struct in a follow up commit. This commit introduces the necessary method on the hypervisor trait to do that. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
As we are going to replace all KVM_GET_XSAVE calls with KVM_GET_XSAVE2 we need to update the seccomp filters accordingly. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
AMX requires dynamically enabling certain large state components which leads to an increase in the size of kvm_xsave. This was not taken into account by Cloud hypervisor until now. We solve this by refactoring `XSaveState` to directly wrap `kvm::Xsave` and always ensure (via a OnceLocked static variable) that all operations on the wrapped xsave state obtain an instance of the right size. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
e94c360 to
a66dc72
Compare
|
I don't like the use of the static variable for the length - I think how #7534 handled that aspect was cleaner. |
| F: FnMut(&mut kvm_bindings::Xsave) -> Result<(), E>, | ||
| E: Into<Box<dyn std::error::Error + Send + Sync + 'static>>, | ||
| { | ||
| let fam_length = XSAVE_FAM_LENGTH.get().unwrap_or(&0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For safety we should probably use get_or_init here instead.
|
@rbradford @phip1611 @lisongqian what do you think about me moving the three first commits from this PR into a dedicated PR and then we go with #7534? |
|
IMO, I prefer to create a new struct like |
I mean we can have it as a free standing function, it does not have to be tied to any struct at all. All I am trying to say is that your changes are good (at least if I am understanding the kernel documentation correctly), but we would like to extract the preexisting state component enabling logic into a separate function. With my new understanding that is an orthogonal issue and can be done independently of your PR. If everyone agrees I think I will go ahead and close this (my) open PR and open another one with a standalone function (that will not have any effect on your PR). |
| if result != 0 || (mask & XFEATURE_XTILEDATA_MASK) != XFEATURE_XTILEDATA_MASK { | ||
| return Err(Error::AmxEnable(anyhow!("Guest AMX usage not supported"))); | ||
| } | ||
| hypervisor::arch::x86::XsaveState::enable_amx_state_components(hypervisor.as_ref()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh, I don't think should be on XsaveState - like @lisongqian said. That struct is really for storing the register state. Can we have this on hypervisor instead?
Closes #5800.
This PR was developed independently from #7534, but they both end up essentially solving the same problem. Thus issue #7533 will also be closed if this gets merged.
I would be quite happy to combine code and ideas from @lisongqian's PR (as already suggested by @lisongqian here) so we get the best of both the approaches. The most important thing for me is that we keep the standalone
enable_amx_state_components(static) method introduced here as we want to use it in the ongoing work we are doing on CPU profiles/templates (#7068 and #7111).Details about this PR itself
Variable length Xsave state
When XSAVE state components are dynamically enabled, the de-facto length of
kvm_xsavechanges and this needs to be accounted for. I do this by updating the pre-existingXSaveStatestruct to wrap the FAM-structXsavefrom thekvm_bindingscrate directly and I track the FAM length in a Oncelocked static variable.How this relates to the upcoming CPU profiles feature
This fix is necessary to properly test the upcoming CPU Profiles feature, but the
XSaveState::enable_amx_state_componentsmethod introduced here will also be called during CPU profile generation and also prior to checking CPU compatibility when receiving a live migration command.Risks
There are a few things we should keep in mind with regards to this fix:
kvm_bindingscrate is sound. I am not entirely convinced that this will be the case in the eyes of the Rust abstract machine, but I think we can investigate that more in the future.cargo nextestruns each test in a separate process hence as long as that is used for running the unit/integration tests this should not be a problem.XsaveStatewill now heap allocate whenever AMX is enabled. Each initialization will require roughly 1KB of memory on the heap. We have however determined that this will never be called in any hot paths.How this PR has been tested
This was tested by running an AMX workload (loading and mutiplying some matrices, then printing the result in a loop)
inside a guest on a Sapphire rapids server and then migrating it to a Granite rapids server. The migration was successful and the program kept running.
Hints for reviewers
I recommend reviewing one commit at a time.