Skip to content

Conversation

@MggMuggins
Copy link
Contributor

This should have a check for all corner cases around VM root volume attachments (coverage for canonical/lxd#14532):

  • security.protection.start allows one other VM to attach the machine's root disk, and can only be removed if the disk is not attached
  • security.shared allows unchecked attachments of root disks
  • VM attachments are correctly reported in used_by
  • hotplug of VM root attachments works (as this is the method reccomended by the docs to avoid UUID/LABEL conflicts)

I ran into an interesting behavior of udev in Noble VMs while testing that slowed this down a bit. In order for udev to create a symlink for SCSI_IDENT_SERIAL (/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_virtual--machine-vm5), I had to hotplug the device in an ubuntu:24.04 VM. Starting the VM from cold didn't create the link, and ubuntu-minimal:24.04 VMs never created it at all.

Without hotplug or in an ubuntu-minimal:24.04:

# udevadm info /dev/sdb
...
S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machine
S: disk/by-path/pci-0000:02:00.0-scsi-0:0:1:1
S: disk/by-diskseq/10
E: DEVPATH=/devices/pci0000:00/0000:00:01.1/0000:02:00.0/virtio6/host0/target0:0:1/0:0:1:1/block/sdb
E: DEVNAME=/dev/sdb
...
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_lxd_virtual--machine
E: ID_SERIAL_SHORT=lxd_virtual--machine
E: ID_SCSI_SERIAL=lxd_virtual--machine-v2
E: ID_BUS=scsi
...
E: DEVLINKS=/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machine /dev/disk/by-path/pci-0000:02:00.0-scsi-0:0:1:1 /dev/disk/by-diskseq/10
...

With hotplug:

# udevadm info /dev/sdb
...
S: disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_virtual--machine-vm5
S: disk/by-path/pci-0000:02:00.0-scsi-0:0:1:1
S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machine
S: disk/by-diskseq/15
...
E: DEVPATH=/devices/pci0000:00/0000:00:01.1/0000:02:00.0/virtio6/host0/target0:0:1/0:0:1:1/block/sdb
E: DEVNAME=/dev/sdb
...
E: SCSI_IDENT_SERIAL=lxd_virtual--machine-vm5
E: SCSI_IDENT_LUN_VENDOR=lxd_virtual--machine
...
E: ID_SERIAL=0QEMU_QEMU_HARDDISK_lxd_virtual--machine
E: ID_SERIAL_SHORT=lxd_virtual--machine
E: ID_SCSI_SERIAL=lxd_virtual--machine-vm5
...
E: DEVLINKS=/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_virtual--machine-vm5 /dev/disk/by-path/pci-0000:02:00.0-scsi-0:0:1:1 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machin>
...

I worked around it by using a shorter LXD device name so that the symlink name wouldn't be truncated; not sure if this is worth chasing further.

Copy link
Member

@simondeziel simondeziel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I made some comments but they are mostly nit picks.

I haven't seen it run in GHA due to the extension not being in latest/edge just yet. I'm assuming it all works locally but wanted to check how much longer it made the test :)

@tomponline
Copy link
Member

I ran into an interesting behavior of udev in Noble VMs while testing that slowed this down a bit. In order for udev to create a symlink for SCSI_IDENT_SERIAL (/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_lxd_virtual--machine-vm5), I had to hotplug the device in an ubuntu:24.04 VM. Starting the VM from cold didn't create the link, and ubuntu-minimal:24.04 VMs never created it at all.

@hamistao any experience with this?

@hamistao
Copy link

hamistao commented Dec 6, 2024

Not particularly, but it is not the first time we have seen udev or other packages mangling with disk devices that LXD manages (canonical/lxd#13701 (comment)) and since it only happens on a specific distro version and not on minimal it is almost certainly a problem of the same nature and probably isn't worth digging too deep into.

On Without hotplug or in an ubuntu-minimal:24.04 we have S: disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machine and E: DEVLINKS=/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_virtual--machine ... so LXD seems to be doing its job just fine.

@MggMuggins MggMuggins force-pushed the attach-vm-root-volumes branch from 4ea7f64 to 2040505 Compare December 9, 2024 22:23
@MggMuggins
Copy link
Contributor Author

Thanks @tomponline @hamistao, I agree that this isn't worth chasing.

@simondeziel They do pass locally 😉 . It looks like it goes from 5 lxc starts to 6 lxc starts. So ~120% of the previous runtime for this test.

simondeziel
simondeziel previously approved these changes Dec 12, 2024
Comment on lines 196 to 197
lxc init --empty --vm v2 --storage "${poolName}" --device root,size=8KiB
lxc init --empty --vm v3 --storage "${poolName}" --device root,size=8KiB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to test with such small volumes. I'm pretty sure LVM will round them up but that should happen all transparently, hopefully!

Dunno how it would fare with PowerFlex's minimum of 8GiB though. I guess we'll see when we run it there :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

powerflex doesnt round up to 8gib afaik @roosterfish ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, it only allows multiples of 8GiB which is why it's special cased everywhere in tests/storage-vm like in https://github.com/canonical/lxd-ci/blob/main/tests/storage-vm#L89-L95

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a special case for powerflex.

@MggMuggins MggMuggins force-pushed the attach-vm-root-volumes branch from 2040505 to 533a96f Compare December 17, 2024 00:25
@MggMuggins
Copy link
Contributor Author

Added a check to make sure that security.shared can be unset when security.protection.start is set.

@MggMuggins MggMuggins force-pushed the attach-vm-root-volumes branch from 533a96f to ef82585 Compare December 19, 2024 03:43
@MggMuggins
Copy link
Contributor Author

Added a check for instance rename.

This should have a check for all corner cases around VM root volume
attachments:
- security.protection.start allows one other VM to attach the machine's
  root disk, and can only be removed if the disk is not attached
- security.shared allows unchecked attachments of root disks
- VM attachments are correctly reported in used_by
- hotplug of VM root attachments works (as this is the method reccomended
  by the docs to avoid UUID/LABEL conflicts)
- Rename/Delete of instances when their root disk is attached elsewhere
  is disallowed

Signed-off-by: Wesley Hershberger <[email protected]>
@MggMuggins MggMuggins force-pushed the attach-vm-root-volumes branch from 259b5b9 to 6abe711 Compare December 19, 2024 16:10
tomponline added a commit to canonical/lxd that referenced this pull request Jan 11, 2025
Includes commits from #14491 

This PR enables attaching a virtual machine's root storage volume to
another virtual machine via a disk device:
```
# vm2 yaml
...
devices:
  v1-root:
    type: disk
    pool: default
    source: virtual-machine/vm1
```

This has some constraints because simultaneous access to storage volumes
with content-type `block` is unsafe:
- `vm1`'s root volume can be attached to exactly one other instance if
`vm1` has `security.protection.start: true`
- `vm1`'s root volume can be attached to any number of other instances
if the storage volume `virtual-machine/vm1` has `security.shared: true`

`security.protection.start` is recommended for interactive use; e.g. a
user temporarily needs to access a bricked machine's root volume to fix
it or recover data. `security.shared` can be used if more than one
running instance must have access to the block volume.

## TODO
- [x] Docs
- [x] [lxd-ci](https://github.com/canonical/lxd-ci/) tests to validate
attachments to running VMs.
[Done](canonical/lxd-ci#366)
- [x] Bootorder bug (see comment) - current workaround is to hotplug
disk devices to avoid UUID/LABEL collisions at boot time.
@tomponline
Copy link
Member

@MggMuggins seems to be failing:

  • lxc storage volume attach vmpool14779 virtual-machine/v2 v1 v2-root
    Error: Failed to start device "v2-root": Failed to call monitor hook for block device: Failed adding block device for disk device "v2-root": Failed adding block device: error reading header from custom_test_v2.block: No such file or directory

@MggMuggins
Copy link
Contributor Author

The qemu driver infers the volume type based on the source path (here). It seems to me that the "RBDFormat" (see here) should really be sending over the volume name as it appears to Ceph, rather than requiring the driver to figure out project and volume type info.

I'm working on building a reproducer and should have a fix up by EOD Wed.

@MggMuggins
Copy link
Contributor Author

canonical/lxd#14807 fixes this.

tomponline added a commit to canonical/lxd that referenced this pull request Jan 18, 2025
Different calls of `DiskGetRBDFormat` and `DiskParseRBDFormat` are
currently using both senses of "volume":
- A Ceph block device (RBD Image)
- A LXD storage volume

Because a disk device can specify an RBD image with source
`ceph:<pool_name>/<volume_name>"`, these functions should use the first
meaning: the name passed should be the name of a RADOS block device (RBD
image), not a LXD storage volume.

These functions should (in future) be replaced with a more comprehensive
abstraction for what `MountInfo.DevPath` is currently doing. This
serialization doesn't seem to be written anywhere, it's just
deserialized when it gets to the instance driver.

This fixes the failures from
canonical/lxd-ci#366; the QEMU driver was
inferring a virtual-machine `VolumeType` only when `TargetPath == "/"`
Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tomponline tomponline merged commit b4a1e6e into canonical:main Jan 22, 2025
174 of 179 checks passed
@MggMuggins MggMuggins deleted the attach-vm-root-volumes branch January 23, 2025 13:28
@tomponline
Copy link
Member

@MggMuggins this will need to be updated now too since canonical/lxd#14886 right?

@MggMuggins
Copy link
Contributor Author

Sorry, just saw this. As we discussed last week, canonical/lxd#14886 only changes the device config keys used by the server; the cli interface didn't change, so this is alright without updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants