sd-shutdown improvements#8429
Conversation
keszybz
left a comment
There was a problem hiding this comment.
Looks good in general, some minor nitpicks below.
| n_failed ++; | ||
| continue; | ||
| } | ||
| } |
There was a problem hiding this comment.
I don't think there's any reason for this chunk.
src/core/umount.c
Outdated
| if (r < 0) | ||
| return r; | ||
| if (!isempty(unknown_flags)) | ||
| log_warning("Ignoring unkown mount flags '%s'.", unknown_flags); |
|
|
||
| /* MS_BIND is special. If it is provided it will only make the mount-point | ||
| * read-only. If left out, the super block itself is remounted, which we want. */ | ||
| m->remount_flags = (m->remount_flags|MS_REMOUNT|MS_RDONLY) & ~MS_BIND; |
There was a problem hiding this comment.
For the whole commit: please provide justification for this change in the patch comment. What are the benefits? For which file systems does this matter?
| if (nonunmountable_path(m->path)) | ||
| /* Remount failed, but try unmounting anyway, | ||
| * unless this is a mount point we want to skip. */ | ||
| if (nonunmountable_path(m->path)) { |
There was a problem hiding this comment.
Hm, isn't this patch made obsolete by the subsequent patch to skip remount-ro on api filesystems?
This most likely amounts to no real benefits and is just here for completeness sake.
In the case of some api filesystems remounting read-only fails while unmounting succeeds.
There is little point in logging about unmounting errors if the exact mountpoint will be successfully unmounted in a later retry due unmounts below it having been removed. Additionally, don't log those errors if we are going to switch back to a initrd, because that one is also likely to finalize the remaining mountpoints. If not, it will log errors then.
|
Made some changes. Originally, I thought that the kernel might be complaining about the flags not being the same (sans the ro) when remounting the cgroup tree. But it turned out not to do anything. I decided to keep it, because the manpage said that the flags should be set that way after all. In fact, I was the one that added the code to reuse the mount options some time ago. I simply must've skipped this detail the last time, or I would've added it then. Yes, not remounting api filesystems in the first place would make the umount after failed remount obsolete. But I figured it would future-proof the whole thing. If some new (api) fs should come up that cannot be remounted but also isn't detected as api yet, we would once again be in a situation where we cannot unmount / or /oldroot just because of some (api) fs mounted below. |
|
OK, this seems good to merge. This area is tricky, so I'm not 100% sure that it's all correct, but we're pretty far from a release now, so if there are any regressions we should catch them. Let's keep an eye out. |
|
Has this been tested with a LUKS encrypted root partition? This patch was just introduced into Manjaro unstable branch. All my systems have an encrypted root partition and all crash during shutdown / reboot. Hard power off is required. This is a virtualbox screen grab, but the text is identical on bare metal systems with the same crash. The following systemd-cryptsetup errors are also generated during shutdown / reboot. With a non encrypted root partition reboot and shutdown work normally. |
|
I haven't tested this with encrypted root. Can you git bisect this to a specific commit? |
|
@sueridgepipe: please get the sources of our package and add the commits one by one after line 67 to see which commit is the culprit. |
|
I think I have a fix for you, no need to bisect. I'll push a PR shortly. |
|
I added the "ci-fails" and "merged-but-needs-fixing" labels because the CI had failed with [�[0;32m OK �[0m] Stopped Remount Root and Kernel File Systems.
[�[0;32m OK �[0m] Removed slice system-systemd\x2dfsck.slice.
[�[0;32m OK �[0m] Stopped Cryptography Setup for varcrypt.
[�[0;32m OK �[0m] Reached target Unmount All Filesystems.
[�[0;32m OK �[0m] Removed slice system-systemd\x2dcryptsetup.slice.
[�[0;32m OK �[0m] Reached target Shutdown.
[�[0;32m OK �[0m] Reached target Final Step.
Starting Power-Off...
[ 21.831426] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[ 21.831426]
[ 21.832309] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.4.0-116-generic #140-Ubuntu
[ 21.832309] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 21.832309] 0000000000000086 609837e11178841e ffff88001c73fc48 ffffffff813ffc13
[ 21.832309] ffffffff81cbc7b0 ffff88001c73fce0 ffff88001c73fcd0 ffffffff81191327
[ 21.832309] ffff880000000010 ffff88001c73fce0 ffff88001c73fc78 609837e11178841e
[ 21.832309] Call Trace:
[ 21.832309] [<ffffffff813ffc13>] dump_stack+0x63/0x90
[ 21.832309] [<ffffffff81191327>] panic+0xd3/0x227
[ 21.832309] [<ffffffff810865c4>] do_exit+0xaf4/0xb00
[ 21.832309] [<ffffffff81086653>] do_group_exit+0x43/0xb0
[ 21.832309] [<ffffffff81092e74>] get_signal+0x294/0x600
[ 21.832309] [<ffffffff8102e567>] do_signal+0x37/0x6f0
[ 21.832309] [<ffffffff8109073a>] ? signal_wake_up_state+0x2a/0x30
[ 21.832309] [<ffffffff81090876>] ? complete_signal+0x106/0x1f0
[ 21.832309] [<ffffffff810cd681>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 21.832309] [<ffffffff8100320c>] exit_to_usermode_loop+0x8c/0xd0
[ 21.832309] [<ffffffff81003c26>] prepare_exit_to_usermode+0x26/0x30
[ 21.832309] [<ffffffff8184fbe2>] retint_user+0x8/0x50
[ 21.832309] Kernel Offset: disabled
[ 21.832309] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[ 21.832309] |
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>
OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>

This should make sure the shutdown process only logs when it should. I'm not too sure about the last patch, though. It feels a bit ugly, but it gets the job done.
I am also wondering whether the sd-shutdown umount loop shouldn't be turned into a while loop. We are already checking if we are making any changes during iterations, and currently, there is potential for mounts not to get handled should we have a crazy nested setup that goes beyond FINALIZE_ATTEMPTS.
Fixes #8155