Skip to content

nixos/qemu-vm: introduce virtualisation.mountHostNixStore option#227881

Merged
RaitoBezarius merged 1 commit intomasterfrom
qemu-vm/mount-host-nix-store
Apr 24, 2023
Merged

nixos/qemu-vm: introduce virtualisation.mountHostNixStore option#227881
RaitoBezarius merged 1 commit intomasterfrom
qemu-vm/mount-host-nix-store

Conversation

@RaitoBezarius
Copy link
Member

Description of changes

Now that useBootLoader produces a full system image, moving disk images can be slow because they have a full Nix store in them.

It does not make sense to keep the 9p mountpoint to shadow the /nix/store of the VM.

We disable it if we have useBootLoader and introduce an option for easy overrides.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.05 Release Notes (or backporting 22.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Apr 23, 2023
Now that `useBootLoader` produces a full system image, moving disk
images can be slow because they have a full Nix store in them.

It does not make sense to keep the 9p mountpoint to shadow the
/nix/store of the VM.

We disable it if we have `useBootLoader` and introduce an option for
easy overrides.
@RaitoBezarius RaitoBezarius force-pushed the qemu-vm/mount-host-nix-store branch from 36933e4 to 0df5257 Compare April 23, 2023 21:58
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Apr 23, 2023
Copy link
Contributor

@nikstur nikstur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@RaitoBezarius RaitoBezarius merged commit 7812abd into master Apr 24, 2023
@RaitoBezarius RaitoBezarius deleted the qemu-vm/mount-host-nix-store branch May 8, 2023 20:00
@alyssais
Copy link
Member

This broke the initrd-luks-empty-passphrase NixOS test.

@alyssais alyssais mentioned this pull request May 25, 2023
12 tasks
@RaitoBezarius
Copy link
Member Author

This broke the initrd-luks-empty-passphrase NixOS test.

Root cause:

Overlay filesystem mounting of /nix/store happens too late, therefore, when the test tries to exercise "wrong keyfile", etc. It cannot boot the stage 2 init script as the store is not available.

Mounting the host Nix store was hiding this issue because well, you never had to worry about mounting the Nix store at the right moment.

I believe that by moving the /nix/store mounting in fileSystems and marking them neededForBoot = true;, we can fix all the tests like those.

@RaitoBezarius
Copy link
Member Author

I spent all the day working on that. The whole problem is that we had very weird semantics on how we did our LUKS testing.

Instead of booting on the real rootfs, we just had a "bootfs" which was used for preparating a LUKS rootfs.

This way, what we do is booting on bootfs, formatting our LUKS rootfs, use bootctl to select a specialization that boots on the LUKS rootfs and hop.

The problem here is that a large implicit assumption is done here: when booting the LUKS rootfs, we assume we have a proper Nix store to "boot" into our system closure even if the LUKS rootfs is… empty (of 512MiB).

It's not written anywhere and it is like some ad-hoc practice to do this kind of testing.

I think this is misleading test, what should be tested is booting a NixOS system in a LUKS rootfs, not a frankenstein of NixOS system in host Nix store and empty LUKS rootfs via specialization.

Therefore, I set to find a solution on how to get our framework to enable testing LUKS rootfs, which is annoyingly hard because make-disk-image is made for fsType which are supported by mkfs.${fsType} basically.

And even if I hacked a support for LUKS there (which does not really make sense), parameters would will expose an impedence mismatch.

So really, the end solution here is to provide custom partitioning scripts via disko or any structured tooling similar, but that's not realistic for the time being.

Other options include:

  • doing a bootfs + having a nix store image or a host nix store: make-disk-image it and go with it. → I am not in favor of this solution because we should not be testing with /dev/vda and /dev/vdb, but really with one disk tying everything together.
  • mounting the host nix store even with the bootloader: waste of the "inside nix store image", we could also not generate it, see the noNixStore draft PR. nixos/lib/make-disk-image: introduce noNixStore option #227883 — this indeed fixes the issue because I can reboot on the "empty" LUKS root via my host Nix store. → okay grade solution but probably not "long term" and "proper"
  • at make-disk-image time: generate a side-effect to re-encrypt the target disk in LUKS via a postDeviceCommands or something similar. → that's a not that bad solution, it's just not obvious how to wire up the side-effect and control it nicely.
  • use systemd-repart to create the good images → that's the optimal solution IMHO here, but @nikstur is the domain expert on this for now and I'm not so sure how to integrate it. It seems it cannot do re-encryption also unfortunately which would have been sweet.

I'm looking into the 2 last solutions at the moment, if you have any preference or input @alyssais let me know.

@ElvishJerricco
Copy link
Contributor

I think this is misleading test, what should be tested is booting a NixOS system in a LUKS rootfs, not a frankenstein of NixOS system in host Nix store and empty LUKS rootfs via specialization.

Isn't this what the installer tests are for? Testing LUKS doesn't necessarily mean testing the store being on the root fs, and the latter is significantly less lightweight as a test.

@RaitoBezarius
Copy link
Member Author

Isn't this what the installer tests are for? Testing LUKS doesn't necessarily mean testing the store being on the root fs, and the latter is significantly less lightweight as a test.

Now you are saying this, I feel relieved, I don't have to implement a NixOS test trick for that actually indeed. I can just try to unlock LUKS on another disk, do you agree with this?

@RaitoBezarius
Copy link
Member Author

(But I think we should test re-encryption, this is neat.)

@alyssais
Copy link
Member

cc @lheckemann @therishidesai who have worked on the initrd tests

@ElvishJerricco
Copy link
Contributor

Now you are saying this, I feel relieved, I don't have to implement a NixOS test trick for that actually indeed. I can just try to unlock LUKS on another disk, do you agree with this?

@RaitoBezarius Yea that's probably ok.

@ElvishJerricco
Copy link
Contributor

I guess you'll want to make sure to test that the device is actually unlocked and present in /dev/mapper if it's not going to be the root device.

@RaitoBezarius
Copy link
Member Author

RaitoBezarius commented May 25, 2023

Will open a "mass" fix PR for all failed tests due to this PR.

(thankfully, they are exactly the set of PRs which useBootLoader = true; (i.e. no direct boot) + switch to an empty rootfs, so ~10 of them).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux.

Projects

Development

Successfully merging this pull request may close these issues.

4 participants