
systemd automounting example is not stopping cleanly

Open twz123 opened this issue 3 years ago • 1 comments

I have a problem similar, but not identical, to the one reported in #658 when managing google-drive-ocamlfuse via systemd. I followed the automounting example from the docs folder. When I stop the service unit, everything shuts down (too) quickly, and systemd status "looks good", i.e. it reports the service as stopped, with no mention of any timeouts or of signals sent to any (sub)processes.

However, it seems that it wasn't shut down cleanly. At the next startup, google-drive-ocamlfuse reports "google-drive-ocamlfuse didn't shut down correctly.".

I checked man systemd.kill and tried the unsafe KillMode=none config, which made it work for me. Every other KillMode (control-group, mixed, process) resulted in the same unclean shutdown. I can only speculate about what's going on under the hood. As far as I understand, systemd sends signals to some or all of the processes as soon as fusermount -u returns, which, for some reason, interrupts google-drive-ocamlfuse while it's shutting down. I can roughly reproduce this by terminating a manually started google-drive-ocamlfuse process via fusermount -u /path/to/drive && pkill -f '.*google-drive-ocamlfuse.*'.

A clean shutdown of a manually launched process with the -debug option looks like this:

[42.871022] TID=0: Exiting.
[42.871084] TID=0: Stopping flush DB thread (TID=2)...done
[43.057058] TID=0: Flushing cache...
[43.057093] TID=0: Flushing DB...done
[43.080855] TID=0: Storing clean shutdown flag...done
[43.085764] TID=0: CURL cleanup...done
[43.085787] TID=0: Clearing context...done

When I add the -debug flag to the unit's ExecStart setting and change the unit Type to simple (without setting any KillMode), systemd status reports this after the service is stopped:

Jun 24 14:32:46 laptop systemd[388637]: Stopping FUSE filesystem over Google Drive...
Jun 24 14:32:46 laptop google-drive-ocamlfuse[706529]: [47.639397] TID=0: Exiting.
Jun 24 14:32:46 laptop google-drive-ocamlfuse[706529]: [47.639450] TID=0: Stopping flush DB thread (TID=2)...
Jun 24 14:32:46 laptop systemd[388637]: Stopped FUSE filesystem over Google Drive.

So the shutdown evidently gets interrupted: the log stops at "Stopping flush DB thread" and never reaches "Storing clean shutdown flag".

I wouldn't recommend adding that KillMode setting to the systemd example, though, as it's considered unsafe. The man page says:

Note that it is not recommended to set KillMode= to process or even none, as this allows processes to escape the service manager's lifecycle and resource management, and to remain running even while their service is considered stopped and is assumed to not consume any resources.

And systemd status reports:

Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

None of this is a big deal in the end. The only problem is that the caches get wiped on each process start, so things take a bit longer, but everything still works. I will just live with KillMode=none for the time being. Still, might there be a way to induce some more :heart: between systemd and google-drive-ocamlfuse?

PS: As always, thanks so much for working on great software like this for free and for everybody. :v:

twz123, Jun 24 '22 12:06

Have you tried configuring a TimeoutStopSec= to delay the process termination? There is a loop to check for termination that runs every second, before joining the thread that gets killed in your example, so I think you can specify maybe 5 seconds to be on the safe side. Let me know if this is enough to solve your issue.
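As a rough sketch of what that could look like, the setting can go into the [Service] section of the unit or into a drop-in file. The drop-in path and unit name in the comment below are assumptions for illustration; use the name of the unit you actually installed. Per systemd.service(5), TimeoutStopSec= is the time systemd waits for the service to stop before terminating it forcibly with SIGKILL; whether that is enough to avoid the interruption shown above has not been confirmed in this thread.

```ini
# Hypothetical drop-in for a user unit, e.g.
#   ~/.config/systemd/user/google-drive-ocamlfuse.service.d/timeout.conf
# (unit name and path are assumptions, not taken from the docs example)
[Service]
# Allow up to 5 seconds for the service to stop before systemd kills it forcibly.
TimeoutStopSec=5
```

After adding or changing the file, systemd needs to reload its configuration (systemctl --user daemon-reload for a user unit) before the new timeout applies.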

And, thanks for your kind words about my project!

astrada, Jun 29 '22 14:06