Skip to content

Add "perpetual" unit concept, sysctl fixes, networkd fixes, systemctl color fixes, nspawn discard#4481

Merged
keszybz merged 14 commits intosystemd:masterfrom
poettering:perpetual
Nov 3, 2016
Merged

Add "perpetual" unit concept, sysctl fixes, networkd fixes, systemctl color fixes, nspawn discard#4481
keszybz merged 14 commits intosystemd:masterfrom
poettering:perpetual

Conversation

@poettering
Copy link
Member

Various smaller improvements

src/core/mount.c Outdated
return 1;
}

static void synthesize_root_mount(Manager *m) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this return an int? Is seems like a failure here should cascade up to the caller of mount enumerate. And you'd avoid the ugly { log_oom(); return; } combos.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i figure synthesize_root_mount() could return an int, indeed. will fix.

mount_enumerate() is a vtable function however. It definitely makes sense to make it return an int too, but that would mean we'd have to change the unit vtable first, and alter all unit type implementations. Makes a lot of sense, but is more invasive, hence let's leave that for a later PR...

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's a reasonable change. But it doesn't solve the problem completely, because you still end with a log of spurious log_notices in a container. Containers otherwise boot nice and quiet, so this is noticeable.

I think we should (at some point in the future), add a way to mark certain sysctl.d files with conditions, similarly to units.

@virtualization=!yes

or so, by taking ConditionX and removing the Condition prefix. Not sure about the exact syntax.

@@ -252,7 +252,7 @@ int main(int argc, char *argv[]) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we need = NULL here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true!

l = path_startswith(what, "/dev");
if (l && startswith(l, "loop"))
options = "discard";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we instead call stat on what? Comparing paths seems rather awkward.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what's the other option? checking stat() and the major/minor? So far we tried to avoid hardcoding major/minors, so that the kernel can assign them entirely dynamically as it likes (it does that only for some subsystems so far, but we really shouldn't hardcode for which one it does that and for which one it doesn't, as that might change...).

So far the preferable way to identify the type of a device was either by the subsystem string (which doesn't work here, it's always "block") or by prefix checking of the device name, which is what we do here...

Yes, doing such a prefix name check isn't pretty, but I really see no better way to handle this. All the other options I see are much worse...

@keszybz
Copy link
Member

keszybz commented Oct 27, 2016

Reviewed. One uninitialized mem access, and a couple of questions…

Maybe you should split out the last two patches for networkd into a separate PR? They are rather unrelated and could go in faster.

* usually protected their sysctls.) In all other cases log an error and make the tool fail. */

if (IN_SET(k, -EPERM, -EACCES, -EROFS, -ENOENT))
log_notice_errno(k, "Couldn't write '%s' to '%s', ignoring: %m", value, property);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this makes sense inside the container. But I think we shouldn't ignore these errors on the host.
Anyway, #4425 (comment)

-bash-4.3# systemctl --state=failed --no-legend --no-pager
systemd-sysctl.service loaded failed failed Apply Kernel Variables

is better than a "clean" boot-up

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having /proc/sys read-only (either partially or full) should be considered independent of running in a container in the host: it just means "your system shall not be able to make changes to these settings". I think a setup where a host initrd mounts /proc/sys read-only after applying some initial settings, and then handing control over to the host PID 1 is an absolutely OK usecase, that should be handled the exact same way as if we'd run in a container and a container manager had set it up like that for us: it just means "some higher power doesn't want you to touch this, live with it".

Hence, I am pretty sure we should not add an explicit container check here, or handle things differently here if we run in a container: we should just consider this the exact same case.

Copy link
Contributor

@evverx evverx Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poettering , that's about EROFS.
Hm, what about EACCESS/EPERM? Maybe, that's OK to ignore these errors inside the container: #4154
But I'm not sure about EACESS/EPERM on the host

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evverx well, I really wouldn't make a distinction here between EROFS/EPERM/EACCESS, which one you get with this is simply result of the security mechanism you pick. You can use r/o bind mounts, which means EROFS, but you could also implement something like this with selinux/aarmoer/smack/some future LSM or anything else. I think which one you use and hence which kind of permission check you see is an implementation detail and shouldn't alter our handling of it...

keszybz added a commit that referenced this pull request Oct 30, 2016
@poettering
Copy link
Member Author

or so, by taking ConditionX and removing the Condition prefix. Not sure about the exact syntax.

Hmm, not sure I like that too much. Maybe instead add "-" or so as additional line prefix, that follows the scheme we have with "-" in ExecStart= and suchlike, i.e. suppresses errors triggered by this line entirely.

…l" flag

So far "no_gc" was set on -.slice and init.scope, to units that are always
running, cannot be stopped and never exist in an "inactive" state. Since these
units are the only users of this flag, let's remodel it and rename it
"perpetual" and let's derive more funcitonality off it. Specifically, refuse
enqueing stop jobs for these units, and report that they are "unstoppable" in
the CanStop bus property.
Now that have a proper concept of "perpetual" units, let's make the root mount
one too, since it also cannot go away.
…ad-only

Let's make missing write access to /proc/sys non-fatal to the sysctl service.

This is a follow-up to 411e869 which altered
the condition for running the sysctl service to check for /proc/sys/net being
writable, accepting that /proc/sys might be read-only. In order to ensure the
boot-up stays clean in containers lower the log level for the EROFS errors
generated due to this.
This way, we can get rid of a label/goto.
Let's only check for eof once after the fgets(). There's no point in checking
EOF before the first read, and twice in each loop.
Let's place only one ternary operator.
If we turn on red color for the active column and it is not combined with
underlining, then we need to turn it off explicitly afterwards. Do that.
Make the underlining between the header and the body and between the units of
different types span the whole width of the table.

Let's never make the table wider than necessary (which is relevant due the
above).

When space is limited and we can't show the full ID or description string
prefer showing the full ID over the full description. The ID is after all
something people might want to copy/paste, while the description is mostly just
helpful decoration.
Let's make sure that our loopback files remain sparse, hence let's set
"discard" as mount option on file systems that support it if the backing device
is a loopback.
Let's propagate the error here, instead of eating it up early.

In a later change we should probably also change mount_enumerate() to propagate
errors up, but that would mean we'd have to change the unit vtable, and thus
change all unit types, hence is quite an invasive change.
@poettering
Copy link
Member Author

i force pushed a new, rebased version now with the two changes made. Please review and merge!

@keszybz keszybz merged commit 7fa6328 into systemd:master Nov 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants