Description
When using Incus with lvmcluster storage pools, Incus generally allows snapshot creation. To take a snapshot, it temporarily acquires the volume in exclusive mode, performs the snapshot and moves the volume back to shared mode.
This worked pretty well on older LVM versions, but unfortunately only because of a bug. LVM should never have allowed an LV with snapshots to be activated back in shared mode, as the copy-on-write handling of the snapshot isn't safe in a multi-writer environment.
This isn't really an issue with instance volumes as those effectively have a single writer. We just use the shared mode to allow for live-migration even when the current server is unreachable (and couldn't release its lock).
But it can be an actual issue with snapshots on block custom storage volumes that have the security.shared property enabled.
Given modern LVM doesn't let us do the kind of snapshots we want, we need to either stop supporting snapshots altogether (quite problematic) or change strategy around how we handle those. Given the majority of the problems affect virtual machines and block volumes, the easiest way to handle this is to use QCOW2 snapshots rather than LVM snapshots. This also happens to be the approach taken by recent Proxmox releases.
To do this, I believe we should:
- Completely disallow snapshot creation on lvmcluster volumes that have the `security.shared` property set to `true` as I don't currently see a safe way to support that, even with QCOW2 in the mix.
- Implement a new `volume.type` read-only volume option which would support `raw` (current default) or `qcow2` (new).
- Change `lvmcluster` so that all new instance VM volumes get the `qcow2` `volume.type`.
- Prevent the creation of snapshots on `lvmcluster` volumes of type VM or custom (block) that don't have the `qcow2` `volume.type`.
  - Container volumes and filesystem custom volumes will keep using raw volumes as we need those directly mountable and they don't need concurrent write access so can work with exclusive write access.
- Implement the creation of QCOW2 formatted volumes when on `lvmcluster`. For the image unpack, that should let us just copy the image (qcow2) directly into the block device without needing to unpack it. For empty volumes we'd need to use `qemu-img create` and for migrations, we'll need to also play with `qemu-img dd` to get the raw data from the migration into a qcow2 formatted volume.
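
To make the snapshot rules in the list above concrete, here is a minimal sketch of the check they seem to imply. It is not Incus code: the `Volume` struct, the `CanSnapshot` function and the two error values are hypothetical names invented for illustration. Only the `security.shared` and `volume.type` keys and the `raw`/`qcow2` values come from the proposal. It also assumes that an unset `volume.type` means `raw`, and that container and filesystem custom volumes stay snapshot-capable.

```go
package main

import (
	"errors"
	"fmt"
)

// Volume is a hypothetical, simplified view of a storage volume.
// It is not the real Incus type.
type Volume struct {
	Driver      string            // pool driver, e.g. "lvmcluster"
	Kind        string            // "vm", "container" or "custom"
	ContentType string            // "block" or "filesystem"
	Config      map[string]string // volume config keys
}

// Hypothetical error values for the two refusal cases in the proposal.
var (
	ErrSharedSnapshot   = errors.New("snapshots are not supported on shared lvmcluster volumes")
	ErrRawBlockSnapshot = errors.New("snapshots on lvmcluster block volumes require volume.type=qcow2")
)

// CanSnapshot returns nil when a new snapshot may be created under the
// proposed rules, or an error describing why it is refused.
func CanSnapshot(v Volume) error {
	// The proposal only changes behavior for lvmcluster pools.
	if v.Driver != "lvmcluster" {
		return nil
	}

	// Rule 1: never snapshot a volume with security.shared=true.
	if v.Config["security.shared"] == "true" {
		return ErrSharedSnapshot
	}

	// Assumption: an unset volume.type means "raw" (the current default).
	volType := v.Config["volume.type"]
	if volType == "" {
		volType = "raw"
	}

	// Rule 2: VM and custom block volumes need qcow2 to be snapshotted.
	isBlock := v.Kind == "vm" || (v.Kind == "custom" && v.ContentType == "block")
	if isBlock && volType != "qcow2" {
		return ErrRawBlockSnapshot
	}

	// Assumption: container and filesystem custom volumes stay raw and
	// remain snapshot-capable, as they only need exclusive access.
	return nil
}

func main() {
	legacyVM := Volume{Driver: "lvmcluster", Kind: "vm", ContentType: "block"}
	newVM := Volume{Driver: "lvmcluster", Kind: "vm", ContentType: "block",
		Config: map[string]string{"volume.type": "qcow2"}}

	fmt.Println("legacy VM:", CanSnapshot(legacyVM))
	fmt.Println("new VM:", CanSnapshot(newVM))
}
```

The `security.shared` check comes first so that a shared volume is refused even when it is QCOW2 formatted, matching the first item in the list.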
In general the goal is that for existing lvmcluster users, their existing VMs and volumes should keep working. They'll still see their snapshots and will be able to restore them (if they are on an old LVM version), but they won't be able to create new ones. Instead all new instances and block volumes will be using QCOW2 and those will support snapshots again. Converting from one format to another should be possible by copying to a local storage pool and then back onto the shared storage pool.