sd-shutdown improvements by medhefgo · Pull Request #8429 · systemd/systemd

medhefgo · 2018-03-12T12:56:53Z

This should make sure the shutdown process only logs when it should. I'm not too sure about the last patch, though. It feels a bit ugly, but it gets the job done.

I am also wondering whether the sd-shutdown umount loop shouldn't be turned into a while loop. We are already checking if we are making any changes during iterations, and currently, there is potential for mounts not to get handled should we have a crazy nested setup that goes beyond FINALIZE_ATTEMPTS.

Fixes #8155

keszybz

Looks good in general, some minor nitpicks below.

keszybz · 2018-03-12T15:52:31Z

src/core/umount.c

                                n_failed ++;
                                continue;
                        }
+                }


I don't think there's any reason for this chunk.

keszybz · 2018-03-12T15:56:52Z

src/core/umount.c

+                        if (r < 0)
+                                return r;
+                        if (!isempty(unknown_flags))
+                                log_warning("Ignoring unkown mount flags '%s'.", unknown_flags);


Typo in "unkown"

keszybz · 2018-03-12T15:58:57Z

src/core/umount.c

+
+                        /* MS_BIND is special. If it is provided it will only make the mount-point
+                         * read-only. If left out, the super block itself is remounted, which we want. */
+                        m->remount_flags = (m->remount_flags|MS_REMOUNT|MS_RDONLY) & ~MS_BIND;


For the whole commit: please provide justification for this change in the patch comment. What are the benefits? For which file systems does this matter?

keszybz · 2018-03-12T16:12:02Z

src/core/umount.c

-                                if (nonunmountable_path(m->path))
+                                /* Remount failed, but try unmounting anyway,
+                                 * unless this is a mount point we want to skip. */
+                                if (nonunmountable_path(m->path)) {


Hm, isn't this patch made obsolete by the subsequent patch to skip remount-ro on api filesystems?

This most likely amounts to no real benefits and is just here for completeness sake.

In the case of some api filesystems remounting read-only fails while unmounting succeeds.

There is little point in logging about unmounting errors if the exact mountpoint will be successfully unmounted in a later retry due unmounts below it having been removed. Additionally, don't log those errors if we are going to switch back to a initrd, because that one is also likely to finalize the remaining mountpoints. If not, it will log errors then.

medhefgo · 2018-03-12T17:45:51Z

Made some changes.

Originally, I thought that the kernel might be complaining about the flags not being the same (sans the ro) when remounting the cgroup tree. But it turned out not to do anything. I decided to keep it, because the manpage said that the flags should be set that way after all. In fact, I was the one that added the code to reuse the mount options some time ago. I simply must've skipped this detail the last time, or I would've added it then.

Yes, not remounting api filesystems in the first place would make the umount after failed remount obsolete. But I figured it would future-proof the whole thing. If some new (api) fs should come up that cannot be remounted but also isn't detected as api yet, we would once again be in a situation where we cannot unmount / or /oldroot just because of some (api) fs mounted below.

keszybz · 2018-03-13T08:47:04Z

OK, this seems good to merge. This area is tricky, so I'm not 100% sure that it's all correct, but we're pretty far from a release now, so if there are any regressions we should catch them. Let's keep an eye out.

sueridgepipe · 2018-03-13T23:51:03Z

Has this been tested with a LUKS encrypted root partition?

This patch was just introduced into Manjaro unstable branch. All my systems have an encrypted root partition and all crash during shutdown / reboot. Hard power off is required.

This is a virtualbox screen grab, but the text is identical on bare metal systems with the same crash.

The following systemd-cryptsetup errors are also generated during shutdown / reboot.

Mar 13 manjaro systemd-cryptsetup[14160]: Failed to deactivate: Device or resource busy
Mar 13 manjaro systemd-cryptsetup[14161]: Failed to deactivate: Device or resource busy
Mar 13 manjaro kernel: watchdog: watchdog0: watchdog did not stop!

With a non encrypted root partition reboot and shutdown work normally.

medhefgo · 2018-03-14T09:17:17Z

I haven't tested this with encrypted root. Can you git bisect this to a specific commit?

hphilm · 2018-03-14T10:46:52Z

@sueridgepipe: please get the sources of our package and add the commits one by one after line 67 to see which commit is the culprit.

keszybz · 2018-03-14T12:08:51Z

I think I have a fix for you, no need to bisect. I'll push a PR shortly.

keszybz · 2018-03-14T12:49:33Z

See #8425 and in particular b936186.

hphilm · 2018-03-14T17:29:19Z

@keszybz: I assume you mean pull #8452

evverx · 2018-03-14T18:32:24Z

I added the "ci-fails" and "merged-but-needs-fixing" labels because the CI had failed with

[�[0;32m  OK  �[0m] Stopped Remount Root and Kernel File Systems.
[�[0;32m  OK  �[0m] Removed slice system-systemd\x2dfsck.slice.
[�[0;32m  OK  �[0m] Stopped Cryptography Setup for varcrypt.
[�[0;32m  OK  �[0m] Reached target Unmount All Filesystems.
[�[0;32m  OK  �[0m] Removed slice system-systemd\x2dcryptsetup.slice.
[�[0;32m  OK  �[0m] Reached target Shutdown.
[�[0;32m  OK  �[0m] Reached target Final Step.
         Starting Power-Off...
[   21.831426] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[   21.831426] 
[   21.832309] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.4.0-116-generic #140-Ubuntu
[   21.832309] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   21.832309]  0000000000000086 609837e11178841e ffff88001c73fc48 ffffffff813ffc13
[   21.832309]  ffffffff81cbc7b0 ffff88001c73fce0 ffff88001c73fcd0 ffffffff81191327
[   21.832309]  ffff880000000010 ffff88001c73fce0 ffff88001c73fc78 609837e11178841e
[   21.832309] Call Trace:
[   21.832309]  [<ffffffff813ffc13>] dump_stack+0x63/0x90
[   21.832309]  [<ffffffff81191327>] panic+0xd3/0x227
[   21.832309]  [<ffffffff810865c4>] do_exit+0xaf4/0xb00
[   21.832309]  [<ffffffff81086653>] do_group_exit+0x43/0xb0
[   21.832309]  [<ffffffff81092e74>] get_signal+0x294/0x600
[   21.832309]  [<ffffffff8102e567>] do_signal+0x37/0x6f0
[   21.832309]  [<ffffffff8109073a>] ? signal_wake_up_state+0x2a/0x30
[   21.832309]  [<ffffffff81090876>] ? complete_signal+0x106/0x1f0
[   21.832309]  [<ffffffff810cd681>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[   21.832309]  [<ffffffff8100320c>] exit_to_usermode_loop+0x8c/0xd0
[   21.832309]  [<ffffffff81003c26>] prepare_exit_to_usermode+0x26/0x30
[   21.832309]  [<ffffffff8184fbe2>] retint_user+0x8/0x50
[   21.832309] Kernel Offset: disabled
[   21.832309] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[   21.832309]

OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements Signed-off-by: Florin Sarbu <[email protected]>

OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo Signed-off-by: Florin Sarbu <[email protected]>

OS 2.26.0+rev1 reports the following at reboot: systemd-shutdown[1]: Failed to wait for process: Protocol error systemd-shutdown[1]: Failed to wait for process: Protocol error As per systemd/systemd#8155 (comment), systemd/systemd#8429 needs to be backported. We currently backport these patches for sumo only Change-type: minor Changelog-entry: Backport systemd sd-shutdown improvements for sumo versions Signed-off-by: Florin Sarbu <[email protected]>

umount: Fix memory leak

659b153

keszybz reviewed Mar 12, 2018

View reviewed changes

keszybz added the pid1 label Mar 12, 2018

medhefgo added 6 commits March 12, 2018 18:32

umount: Add more asserts and remove some unused arguments

0494cae

umount: Decide whether to remount read-only earlier

1d62d22

umount: Provide the same mount flags too when remounting read-only

3bc341b

This most likely amounts to no real benefits and is just here for completeness sake.

umount: Try unmounting even if remounting read-only failed

8645ffd

In the case of some api filesystems remounting read-only fails while unmounting succeeds.

umount: Don't bother remounting api and ro filesystems read-only

e783b49

keszybz merged commit d4f5c00 into systemd:master Mar 13, 2018

evverx mentioned this pull request Mar 14, 2018

Fixes for bugs found by oss-fuzz #8441

Merged

evverx added ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR merged-but-needs-fixing labels Mar 14, 2018

keszybz mentioned this pull request Mar 16, 2018

Use libmount in systemd-shutdown, add tests #8452

Merged

medhefgo deleted the sd-shutdown branch March 16, 2018 16:45

poettering mentioned this pull request Jun 4, 2018

shutdown[1]: Failed to wait for process: Protocol error #8155

Closed

floion mentioned this pull request Nov 7, 2018

Backport https://github.com/systemd/systemd/pull/8429 balena-os/meta-balena#1275

Closed

floion mentioned this pull request Nov 7, 2018

systemd/sumo: Backport sd-shutdown improvements balena-os/meta-balena#1276

Merged

4 tasks

Uh oh!

Conversation

medhefgo commented Mar 12, 2018

Uh oh!

keszybz left a comment

Choose a reason for hiding this comment

Uh oh!

keszybz Mar 12, 2018

Choose a reason for hiding this comment

Uh oh!

keszybz Mar 12, 2018

Choose a reason for hiding this comment

Uh oh!

keszybz Mar 12, 2018

Choose a reason for hiding this comment

Uh oh!

keszybz Mar 12, 2018

Choose a reason for hiding this comment

Uh oh!

medhefgo commented Mar 12, 2018

Uh oh!

keszybz commented Mar 13, 2018

Uh oh!

sueridgepipe commented Mar 13, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

medhefgo commented Mar 14, 2018

Uh oh!

hphilm commented Mar 14, 2018

Uh oh!

keszybz commented Mar 14, 2018

Uh oh!

keszybz commented Mar 14, 2018

Uh oh!

hphilm commented Mar 14, 2018

Uh oh!

evverx commented Mar 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

5 participants

sueridgepipe commented Mar 13, 2018 •

edited

Loading

evverx commented Mar 14, 2018 •

edited

Loading