nixos/qemu-vm: Allow building a Nix store image instead of relying on 9p + more#140792
nixos/qemu-vm: Allow building a Nix store image instead of relying on 9p + more#140792yu-re-ka merged 11 commits intoNixOS:masterfrom
Conversation
fe6016e to
ce971b7
Compare
|
I wonder if something like fscache could be a solution. Potentially it gives the best of both worlds: not building an entire image, yet avoiding 9p for repeated accesses. Could you compare performance with |
|
@roberth Tried setting |
|
I have no idea but could https://virtio-fs.gitlab.io/ be an alternative? |
|
@mohe2015 Yeah, I was considering it, but figured it wouldn't give quite the same performance boost, plus it needs a userspace daemon running on the host, so this seemed like the safer bet. It could certainly be interesting to try, though - if the benchmarks are correct, it should give a decent boost over 9p :) |
|
I'm curious about the downsides of this. Will this increase disk usage or network traffic on hydra or the binary-cache? Do we need to worry about that? |
That currently requires cooperation from the Host OS as virtiofs requires a daemon running as root (and a socket available to QEmu during execution). |
|
Yeah, that was already mentioned above. As far as I can tell this solution has the disadvantage of more disk usage but the performance is just so much better |
|
@knedlsepp AFAIK, this should only increase disk usage on the builder. The disk image doesn't end up in the test's closure and I don't see why any of it should end up in the binary cache. Maybe someone with more insight into hydra can chime in, though :) |
1bfc013 to
67a6de6
Compare
|
Anything more to do here? Any objections to merging it? |
roberth
left a comment
There was a problem hiding this comment.
Found a potential simplification. Otherwise lgtm!
67a6de6 to
5d7977e
Compare
|
Okay, now it's done! |
| shared = { source = ''"''${SHARED_DIR:-$TMPDIR/xchg}"''; target = "/tmp/shared"; }; | ||
| nix-store = mkIf (!cfg.useNixStoreImage) { | ||
| source = builtins.storeDir; | ||
| target = "/nix/store"; |
There was a problem hiding this comment.
| target = "/nix/store"; | |
| target = "/nix/store"; # should be storeDir, but is not supported in scripts yet |
There was a problem hiding this comment.
Well, technically, it should be set to whatever storeDir was set to in the nix derivation used in the vm, which currently isn't exposed, but could be added to its passthru attributes.
There was a problem hiding this comment.
If it was possible, that would be the solution, but you can't choose storeDir from inside the Nix language.
There was a problem hiding this comment.
Oh yeah, you're right. I guess the reasonable thing to do then is just un-hardcode all the values.
…n useNixStoreImage = true The involved test was nixosTests.nextcloud.basic21. It has a testScript that is strict in nodes.nextcloud.config.system.build.vm, in assertions about imagemagick being in the system closure. The recursion was introduced in 329a446 from NixOS#140792
|
While #144014 breaks the infinite recursion when
The first has no performance impact. The second does. |
|
@roberth It seems like hydra doesn't want to build the disk images due to their size. Raising the |
This may be a hard sell.
Not 100% sure, but I believe it caches everything it builds. You generally don't want to throw away your tools. |
A shame if it's true, since that would mean all tests using this will always fail :/
True, but I would have guessed that all tools we want to keep are primary inputs anyway. It would be interesting to know if there are any attributes that control this, like |
|
Yep, everything ends up in the cache, it seems. At least I'm able to fetch stuff that shouldn't be referenced by any build's final closure. Unless the outputs are GCd at some point, they would eat up quite a lot of disk space and be essentially useless. |
|
A potential solution is to override this setting in |
|
Yes, there's no GC unfortunately. I'm not sure if we should keep all this in the binary cache. As long as there's no way to tell hydra to not upload that, I'm afraid we might need to revert the tests :-/
|
|
Well, currently, hydra fails to build the disk images since they larger than It's all a bit ironic, since making this build better on hydra was one of my goals with this - the GitLab |
|
|
|
As for |
|
Another possibility is to create an image not for the entire system, but only for the paths that matter for performance. When the filesystem is smaller, it may also be feasible to generate it at the last minute, in the test runner, so that it doesn't have be stored in the store. If the filesystem generation is quick enough, that'd be worthwhile. This can be done with Instead of So our options for improving this are:
|
|
I've done some exploration with adding an overlay containing a subset of derivations, with squashfs and ext4. Conclusion so far: a non-stored squashfs is a viable replacement for a stored ext4 image in the gitlab test, but not quite as efficient as what could be achieved by patching Squashfs work in #181537. Findings
Open questions:
My data and some notes: ext4 store image (this PR baseline) 9p host store + reversed overlay (ineffective) (probably close to baseline before this PR) 9p host store + squashfs overlay 9p host store + uncompressed squashfs overlay 9p host store + erofs overlay (default options) erofs (writableStore = false) 9p host store + ext4 overlay 9p host store + uncompressed squashfs overlay with entire Hardware: AMD Ryzen 9 3900 12-Core |
…n useNixStoreImage = true The involved test was nixosTests.nextcloud.basic21. It has a testScript that is strict in nodes.nextcloud.config.system.build.vm, in assertions about imagemagick being in the system closure. The recursion was introduced in 329a446 from NixOS#140792
…n useNixStoreImage = true The involved test was nixosTests.nextcloud.basic21. It has a testScript that is strict in nodes.nextcloud.config.system.build.vm, in assertions about imagemagick being in the system closure. The recursion was introduced in 329a446 from NixOS#140792
Motivation for this change
The goal of this PR is to speed up some IO heavy tests by building a Nix store disk image instead of using 9p to access the host store.
To this end, the following things are done:
make-disk-image.nix:onlyNixStoreargument, which allows building an image containing only the Nix storeinstallBootLoaderargument, which seemed to have been accidentally removedcopyChannelargument, to control whether a default nix channel is copied to the imageadditionalPathsargument to allow additional paths to be copied to the image's nix storeqemu-vm.nix:useNixStoreImageoption, which controls whether an image is used rather than 9ppathsInNixDBoption to the more generaladditionalPaths, since its purpose is different when building an imagetrivial-builders.nix:writeStringReferencesToFile, a builder which extracts a string's references to derivations and paths and writes them to a text file, removing the input string itself from the dependency graphtesting-python.nix:additionalPaths, since they need to be explicitly made available when building an imageThese changes are then put to use in the
discourseandgitlabtests. With my laptop (AMD Ryzen 7 PRO 2700U) as the reference system, I see the following test run times:Discourse
No change:
25 mins, 49 secs
Building a Nix store image:
4 mins, 44 secs
Building a Nix store image and bumping the core count to
4:3 mins, 6 secs
GitLab
No change:
Times out after 28 mins
Building a Nix store image:
7 mins, 48 secs
Building a Nix store image and bumping the core count to
4:7 mins, 17 secs
The times include the time it takes to build the image (~40 secs for
discourseand ~1 min, 20 secs forgitlab).Things done
sandbox = trueset innix.conf? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"./result/bin/)