-
Notifications
You must be signed in to change notification settings - Fork 565
devices: gracefully close preserved FDs on hot device remove #7371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
devices: gracefully close preserved FDs on hot device remove #7371
Conversation
014ccc5 to
99e72c1
Compare
vmm/src/config.rs
Outdated
| fds.append(&mut preserved_fds.clone()); | ||
| for fd in preserved_fds { | ||
| if fds.contains(fd) { | ||
| log::warn!("fd {fd} is already preserved, skipping"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, this should be an error I think. Otherwise, we have non-unique resources which is undefined behavior anyway
rbradford
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use the terminology of attached/detached - we use added/removed. Please update the commit message, comment, etc to be consistent with that (otherwise it might be confusing!.)
vmm/src/device_manager.rs
Outdated
| .unwrap() | ||
| .device_type(), | ||
| ); | ||
| // When the device is hot-detached, we close all file descriptors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We also use "hotplugged" as well as added.)
f415134 to
82070e1
Compare
82070e1 to
f9d6055
Compare
For a graceful resource management of externally provided FDs in Cloud Hypervisor, corresponding FDs need to be closed on a device removal. This is the case for virtio-net devices using external FDs, for example. With the fix introduced in this commit, we allow management software to properly clean up resources, e.g., libvirt can clean up tap devices. PS: CHV uses "added" and "removed", which has the same meaning as hot device attach/hotplug and hot device detach/unplug. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
f9d6055 to
e261ecf
Compare
a6426e3
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is dropped, the FD for the second device becomes invalid [0]. To avoid this misconfiguration, we now prevent multiple preservation attempts of the same FD. [0] cloud-hypervisor#7371
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. This however has some caveats: In the current model, external FDs are added to the list of preserved FDs after the corresponding device has been created successfully. Now assume the following: - VM boots - VM config has no preserved FDs - We add a device to the VM that uses an external FD for its Tap dev - The device is successfully initialized - VM config is extended with the preserved FDs - VM reboots - VM config already has preserved FDs - Same as above - CRASH Therefore, we need to know for every preserved FD whether it is "cold", i.e., only has been part of a former VmConfig without associated device, or "hot", i.e., it is currently also actively in use. This allows state transitions from cold->hot in which case we won't throw an error for a reused FD. Otherwise, we throw errors. This requires that on shutdown, all preserved FDs are marked as cold. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. This however has some caveats: In the current model, external FDs are added to the list of preserved FDs after the corresponding device has been created successfully. Now assume the following: - VM boots - VM config has no preserved FDs - We add a device to the VM that uses an external FD for its Tap dev - The device is successfully initialized - VM config is extended with the preserved FDs - VM reboots - VM config already has preserved FDs - Same as above - CRASH Therefore, we need to know for every preserved FD whether it is "cold", i.e., only has been part of a former VmConfig without associated device, or "hot", i.e., it is currently also actively in use. This allows state transitions from cold->hot in which case we won't throw an error for a reused FD. Otherwise, we throw errors. This requires that on shutdown, all preserved FDs are marked as cold. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. This however has some caveats: In the current model, external FDs are added to the list of preserved FDs after the corresponding device has been created successfully. Now assume the following: - VM boots - VM config has no preserved FDs - We add a device to the VM that uses an external FD for its Tap dev - The device is successfully initialized - VM config is extended with the preserved FDs - VM reboots - VM config already has preserved FDs - Same as above - CRASH Therefore, we need to know for every preserved FD whether it is "cold", i.e., only has been part of a former VmConfig without associated device, or "hot", i.e., it is currently also actively in use. This allows state transitions from cold->hot in which case we won't throw an error for a reused FD. Otherwise, we throw errors. This requires that on shutdown, all preserved FDs are marked as cold. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. This however has some caveats: In the current model, external FDs are added to the list of preserved FDs after the corresponding device has been created successfully. Now assume the following: - VM boots - VM config has no preserved FDs - We add a device to the VM that uses an external FD for its Tap dev - The device is successfully initialized - VM config is extended with the preserved FDs - VM reboots - VM config already has preserved FDs - Same as above - CRASH Therefore, we need to know for every preserved FD whether it is "cold", i.e., only has been part of a former VmConfig without associated device, or "hot", i.e., it is currently also actively in use. This allows state transitions from cold->hot in which case we won't throw an error for a reused FD. Otherwise, we throw errors. This requires that on shutdown, all preserved FDs are marked as cold. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Allowing the same file descriptor (FD) to be preserved more than once can lead to severe issues. Consider the case where management software adds two virtio-net devices at runtime, both backed by the same externally provided FD. When the first device is removed during runtime, the FD for the second device becomes invalid [0]. To avoid misconfigurations, we now prevent multiple preservation attempts of the same FD. This however has some caveats: In the current model, external FDs are added to the list of preserved FDs after the corresponding device has been created successfully. Now assume the following: - VM boots - VM config has no preserved FDs - We add a device to the VM that uses an external FD for its Tap dev - The device is successfully initialized - VM config is extended with the preserved FDs - VM reboots - VM config already has preserved FDs - Same as above - CRASH Therefore, we need to know for every preserved FD whether it is "cold", i.e., only has been part of a former VmConfig without associated device, or "hot", i.e., it is currently also actively in use. This allows state transitions from cold->hot in which case we won't throw an error for a reused FD. Otherwise, we throw errors. This requires that on shutdown, all preserved FDs are marked as cold. [0] cloud-hypervisor#7371 Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
For a graceful resource management of externally provided FDs in Cloud
Hypervisor, corresponding FDs need to be closed on a device removal.
This is the case for virtio-net devices using external FDs, for example.
With the fix introduced in this commit, we allow management software to
properly clean up resources, e.g., libvirt can clean up tap devices.
PS: CHV uses "added" and "removed", which has the same meaning as
hot device attach/hotplug and hot device detach/unplug.
Part of #7291.