NixOS: Systemd unit-linking script rewrite for 60x speedups#479442
NixOS: Systemd unit-linking script rewrite for 60x speedups#479442Profpatsch wants to merge 4 commits intoNixOS:staging-nixosfrom
Conversation
A simple binary writer for go scripts. Follows the setup of the other writers. No support for third-party dependencies yet, only go stdlib.
The wrapper would have the same name as the actual thing to be built, which was confusing.
|
Note that I used golang because I personally think it’s the ideal language for nixpkgs build scripts (good runtime, awesome stdlib, fast compilation & binaries, lots of systems supported), but if we don’t want to have it in the system closure I could also be convinced to rewrite it to rust std. |
|
Just bear in mind that rust std by default does not even include a json parser, so we’d need cargo … crates … nix build dependencies … rustc bootstrap … linked libraries … |
|
I would prefer anything over bash, but if rust requires all that tooling and we can't have it without all this download/compile closure i am fine with go. Thank you for your work |
The systemd unit linking script has been a thorn in my side for a while now. It has to be rebuilt every single time any unit dependency in a system closure changes, and due to the multiple for-loops, we’d see multi-second build times that were essentially caused by spawning thousands of ln/cp/etc processes one after the other. An example from a server (small-ish deployment) configuration: building '/nix/store/2lz2v7dfqvc6azih4ppriv0b40c6gwif-system-units-bash-legacy.drv'... Built system-units with bash in 2788ms building '/nix/store/r4xqf8g4aikg2ihr0psvnwa6kj1nzci4-user-units-bash-legacy.drv'... Built user-units with bash in 193ms building '/nix/store/qir3gc3la4fndpi45c226q9ysl5qs28v-system-units-go-ng.drv'... Built system-units with Go in 79ms (consider enabling systemd.unitGenerator.useGoImpl) building '/nix/store/fah3ipi5iwd44fj5q1ypm3r8zd94mcc9-user-units-go-ng.drv'... Built user-units with Go in 18ms (consider enabling systemd.unitGenerator.useGoImpl) As you can see the new scripts take 30–60 times less time to run and are for all intents and purposes is instant, even without and paralellization. This speedup is purely from replacing `exec` with direct syscalls. The default configuration after this change is optimized for debugging the new implementation, it will run both the bash and the go version and compare their outputs. In case something is different, it displays a warning and asks the user to open a new issue with the verbose output. By default, the old bash script output is used. There is a new option `systemd.unitGenerator.useGoImpl` which, when enabled, does not run the old bash scripts anymore and will give the speedup. This should be flipped in future versions of nixos, and the bash implementation dropped. Further advantages of the new script is exhaustive error handling, and a verbose debugging mode that can be enabled with `systemd.unitGenerator.debug = true;`.
500fa1c to
c0e21b6
Compare
|
Imo rust wouldn't be that big of a deal, the linux kernel already requires rust. I also believe the motivation for this change shouldn't be the speedup during build. Downloading go is probably more expensive than the few seconds of bash execution... That said, bash is painful to debug, the motivation should be maintainability here imo. |
|
To be clear: this is a multi-second |
|
So, personally I think this should be written in Rust. Rust seems to be the preferred language among the systemd team, and that's who's ultimately going to be maintaining this long term. It's also good to minimize the number of languages that team has to be good at. And it would help with the future potential of sharing more code among the various in-house Rust programs in NixOS these days. The things mentioned with tooling and bootstrapping I think aren't an issue. Whether it's the go compiler or the Rust toolchain and a few Rust dependencies, these things are only needed when the program is built. The resulting program will be cached by Hydra, so users will not be compiling it themselves either way. As for bootstrapping, NixOS already requires Rust in the bootstrap path between That said, I don't think we should block this PR on requiring you to rewrite it in rust. I would prefer that we have this in Go than in Bash. But given that I'd prefer Rust even more, I think I will try my hand at doing this in Rust myself, if you'll give me the evening to try that. |
|
@ElvishJerricco Yeah sure, you can just rewrite the golang part and keep the interface, it should be reasonably straightforward. I marked all interesting bits with comments, maybe those should be preserved. Just make sure the semantics of e.g. the symlink function are the same, golang is a trivial wrapper around |
|
@ElvishJerricco feel free to push a new commit onto this PR :) |
A direct transliteration of the go implementation.
|
Here you go, I added a direct transliteration of the go implementation. |
|
oh, heh, I had also written a rust version already and hadn't gotten around to integrating it here yet 😅 master...ElvishJerricco:nixpkgs:push-wwrosnwuvopx (though, I am also looking into a couple other options regarding this that's more about getting rid of bash loops than getting rid of bash logic completely) |
I’m very uncomfortable about the fact that this tries to abstract the bash code instead of translating it 1:1, including all weird differences in the loops. We can’t really know which parts of this are intentional and which parts are accidentally doing the right thing, and there’s a good chance we are going to break downstream users this way. I’d rather take the speedup and not change any of the logic. So sorry, but if we create a rust version, it should be a line-by-line translation of the bash code. |
|
Huh? The code may not literally be a line for line translation of the original bash, but I assure you I took care to have it output the exact same results as the bash. e.g. My Anyway, I think I'm going to withdraw from this PR and leave it to others. |
Yeah, there’s a few instances of “copy the symlink file verbatim” to preserve the original (possibly relative) target and not introduce another indirection. Though I’m not sure this is what you are referencing. But that’s why I mean we shouldn’t change behavior because there’s so many combinations of upstream unit files that can happen. A counterpoint to this is that if you look at the git blame and ignore the formatting commit, most of the bash code was written 10 years ago by Eelco and has not been touched since, so there must be some improvements that can be made. |
|
I’m not sure how I should proceed now, I don’t think anybody really wants to review it, half the people want to RiiR but nobody except @ElvishJerricco has stepped up. So I’m tempted to just merge the go impl for now. Though who are the people that currently effectively maintain systemd? Maybe I should ping them on matrix. |
|
Mate. The systemd maintainers were already pinged by CI, and you did a great job pissing ElvishJerrico of, who does a lot of the maintainance work. Pinging on matrix won't fix that. I can understand the dislike for the bash implementation, but frankly saying "So sorry, but [...]" when presented with an alternative approach is a wildly demeaning attitude you have towards existing systemd maintainers.
This is honestly the sanest choice, and i think i will do the same. These systemd internals are intricate, and need proper thorough review. If all alternative approaches are met with such demeaning attitude, there is really not much motivation left to touch this PR at all. Nixpkgs is a volunteer-run project after all.
Yeah, no, i don't think that will happen in this current form. The PR doesn't even pass the very basic CI checks... |
That is exactly what I was talking about and it's exactly what my version took significant care not to change the behavior of.
I would recommend against merging this as is at least on the grounds that:
Regardless, one of the reasons I decided to avoid this PR is because this whole With all that in mind, I will leave it to others in the systemd team to decide if this should be iterated on and merged. To be completely clear: I'm not against it; there's just plenty that needs to be done before this is merge-worthy, and I have other, more substantial plans for this part of NixOS already. |
Okay, so you are against preventing downstream breakage by having a grace period in which people can report problems? I don’t think this is a good plan tbh. fwiw this is one of my major gripes with NixOS as it stands, lots of “rearchitecting” that breaks downstream without any migration periods apart from “someone reviewed it very closely and it worked on their machine”.
Cool, please do. I’d be interested in them. Were they pointing to bugs, inconsistencies?
This is another major gripe I have with nixpkgs, obvious improvements are not done for long stretches of time because people have big plans for redesigns that might or might not happen, meanwhile every single nixos-rebuild takes 5 seconds more than it needs to because the straightforward fix was never reviewed and merged. Sorry if I come off as a little salty, most people here do this in their free time after all, but I think the base criticism is valid. |
I did not say that my other ideas on how to improve this stuff were reasons not to merge your PR. You are the reason I am not reviewing this PR. Someone else is free to. |
sure |
The systemd unit linking script has been a thorn in my side
for a while now. It has to be rebuilt every single time any
unit dependency in a system closure changes, and due to the
multiple for-loops, we’d see multi-second build times that
were essentially caused by spawning thousands of ln/cp/etc
processes one after the other.
An example from a server (small-ish deployment) configuration:
building '/nix/store/2lz2v7dfqvc6azih4ppriv0b40c6gwif-system-units-bash-legacy.drv'...
Built system-units with bash in 2788ms
building '/nix/store/r4xqf8g4aikg2ihr0psvnwa6kj1nzci4-user-units-bash-legacy.drv'...
Built user-units with bash in 193ms
building '/nix/store/qir3gc3la4fndpi45c226q9ysl5qs28v-system-units-go-ng.drv'...
Built system-units with Go in 79ms (consider enabling systemd.unitGenerator.useGoImpl)
building '/nix/store/fah3ipi5iwd44fj5q1ypm3r8zd94mcc9-user-units-go-ng.drv'...
Built user-units with Go in 18ms (consider enabling systemd.unitGenerator.useGoImpl)
As you can see the new scripts take 30–60 times less time to run
and are for all intents and purposes is instant, even without
and paralellization. This speedup is purely from replacing
execwith direct syscalls.The default configuration after this change is optimized
for debugging the new implementation, it will run both
the bash and the go version and compare their outputs.
In case something is different, it displays a warning and
asks the user to open a new issue with the verbose output.
By default, the old bash script output is used.
There is a new option
systemd.unitGenerator.useGoImplwhich, when enabled, does not run the old bash scripts
anymore and will give the speedup. This should be flipped
in future versions of nixos, and the bash implementation
dropped.
Further advantages of the new script is exhaustive error
handling, and a verbose debugging mode that can be enabled
with
systemd.unitGenerator.debug = true;.Things done
passthru.tests.nixpkgs-reviewon this PR. See nixpkgs-review usage../result/bin/.Add a 👍 reaction to pull requests you find important.