nix-daemon.socket.in: fix condition unit properties#6282
nix-daemon.socket.in: fix condition unit properties#6282flokli wants to merge 1 commit intoNixOS:masterfrom
Conversation
|
But if the socket is bind-mounted (like in a NixOS container), then we shouldn't start the daemon right? |
|
We could probably skip Another option would be to have a NixOS option to disable |
|
Yup, On the host: In the container: I'll update this PR. |
nix-daemon.socket is used to socket-activate nix-daemon.service when /nix/var/nix/daemon-socket/socket is accessed. Having a ConditionPathIsReadWrite on the /nix/var/nix/daemon-socket directory will cause systemd to just skip if it's not present yet. > [ 237.187747] systemd[1]: Nix Daemon Socket was skipped because of a failed condition check (ConditionPathIsReadWrite=/nix/var/nix/daemon-socket). As it's the nix-daemon itself that creates this directory, we're in a chicken-and-egg problem - as long as the folder isn't created, nix-daemon won't start (as it's only socket-activated), and the socket unit will get skipped, as the directory doesn't exist yet. As elaborated in NixOS#6282 (comment), the initial goal of this statement was to detect the "/nix/var/nix/daemon-socket is bind-mounted into a container" situation. In these cases, we want to skip starting nix-daemon.socket, as something the host is providing the `socket` file in there. This can be better described by skipping `nix-daemon.socket` if `/nix/var/nix/daemon-socket/socket` already exists. This has surfaced in the systemd 250 bump (which probably did apply some more rigid checks), where tests with containers that bind-mount /nix/var/nix/daemon-socket started to fail.
925703b to
633fbe1
Compare
nix-daemon.socket is used to socket-activate nix-daemon.service when /nix/var/nix/daemon-socket/socket is accessed. Having a ConditionPathIsReadWrite on the /nix/var/nix/daemon-socket directory will cause systemd to just skip if it's not present yet. > [ 237.187747] systemd[1]: Nix Daemon Socket was skipped because of a failed condition check (ConditionPathIsReadWrite=/nix/var/nix/daemon-socket). As it's the nix-daemon itself that creates this directory, we're in a chicken-and-egg problem - as long as the folder isn't created, nix-daemon won't start (as it's only socket-activated), and the socket unit will get skipped, as the directory doesn't exist yet. As elaborated in NixOS#6282 (comment), the initial goal of this statement was to detect the "/nix/var/nix/daemon-socket is bind-mounted into a container" situation. In these cases, we want to skip starting nix-daemon.socket, as something the host is providing the `socket` file in there. This can be better described by skipping `nix-daemon.socket` if `/nix/var/nix/daemon-socket/socket` already exists. This has surfaced in the systemd 250 bump (which probably did apply some more rigid checks), where tests with containers that bind-mount /nix/var/nix/daemon-socket started to fail.
633fbe1 to
021753f
Compare
|
@edolstra I updated this PR, as well as the downstream nixpkgs workaround in NixOS/nixpkgs#164644. PTAL. |
Doesn't this mean that if the service is ever abnormally killed leaving behind the socket, that it can't be restarted by systemd anymore? |
|
^ Yes I was worried about that too. It seems like what we are doing is the common case, and systemd ought to support it. Maybe we can ask upstream what they recommend. |
Good catch! I quickly reproduced this, by a The This means a manual restart of the However, of course we still want to be able to explicitly restart However, the Condition is probably checked before the cleanup is happening, so this can still break if for some reason there's a stale socket file around - for example, when unplugging the power and leaving the stale socket file around. I'm not really sure how to fix this. It's probably too late to move this socket file to something tmpfs-backed. |
|
To recap:
The proposal from this PR tried to skip the socket unit from activation if the socket file ( Maybe the right way to fix this is letting systemd-tmpfiles create the folder on the host, by shipping a We'd need to check how this effects read-only mountpoints in containers though. What do you think? |
|
Alternative PR is up at #6285 |
|
It seems like we should start with the alternative, then see, what, if anything is needed in addition? |
The problem is that we don't always want to skip starting nix-daemon in a container, but only if we're bind-mounting the host's Nix store. You can have containers with their own Nix store, which is why we can't use |
Yes, this is addressed by #6285 - we keep the |
|
If you want to close this, I have perms to do that :) |
nix-daemon.socket is used to socket-activate nix-daemon.service when
/nix/var/nix/daemon-socket/socket is accessed.
Having a ConditionPathIsReadWrite on the /nix/var/nix/daemon-socket
directory will cause systemd to just skip if it's not present yet.
As it's the nix-daemon itself that creates this directory, we're in a
chicken-and-egg problem - as long as the folder isn't created,
nix-daemon won't start (as it's only socket-activated), and the socket
unit will get skipped, as the directory doesn't exist yet.
I think we don't actually want to skip starting the socket unit when the
directory doesn't exist yet.
This has surfaced in the systemd 250 bump (which probably did apply some
more rigid checks), where tests with containers that bind-mount
/nix/var/nix/daemon-socket started to fail.