permit bus clients to pin units to avoid automatic GC#3984
permit bus clients to pin units to avoid automatic GC#3984evverx merged 19 commits intosystemd:masterfrom
Conversation
|
hmm, CI failure appears unrelated. |
|
I've never seen the storage test fail so far, so that's at least suspicious. Anyway, retried the test(s). |
|
i have rebased the patch set on current git (since it didn't apply anymore). I added a trivial patch on top. |
|
Thank you! We were considering adding a |
|
It probably needs to extend |
|
Is this an expected behaviour? -bash-4.3# systemd-run -p CPUAccounting=1 --wait sleep 10 & sleep 1; systemctl restart dbus; kill -TERM %1; sleep 15
[1] 141
Running as unit: run-u3.service
Warning! D-Bus connection terminated.
[1]+ Terminated systemd-run -p CPUAccounting=1 --wait sleep 10
# still in memory
-bash-4.3# systemctl status run-u3 --no-pager
● run-u3.service - /usr/bin/sleep 10
Loaded: loaded (/run/systemd/transient/run-u3.service; transient; vendor preset: enabled)
Transient: yes
Active: inactive (dead) since Tue 2016-08-16 16:48:52 UTC; 23s ago
Process: 143 ExecStart=/usr/bin/sleep 10 (code=exited, status=0/SUCCESS)
Main PID: 143 (code=exited, status=0/SUCCESS) |
A lot of basic code wants to know the stack size, and it is safe if they do, hence let's permit getrlimit() (but not setrlimit()) by default. See: systemd#3970
This adds an optional "recursive" counting mode to sd_bus_track. If enabled adding the same name multiple times to an sd_bus_track object is counted individually, so that it also has to be removed the same number of times before it is gone again from the tracking object. This functionality is useful for implementing local ref counted objects that peers make take references on.
And while ware at it, also drop some references to kdbus, and stop claiming sd-bus wasn't stable yet. Also order man page references in the main sd-bus man page alphabetically.
This adds two (privileged) bus calls Ref() and Unref() to the Unit interface. The two calls may be used by clients to pin a unit into memory, so that various runtime properties aren't flushed out by the automatic GC. This is necessary to permit clients to race-freely acquire runtime results (such as process exit status/code or accumulated CPU time) on successful service termination. Ref() and Unref() are fully recursive, hence act like the usual reference counting concept in C. Taking a reference is a privileged operation, as this allows pinning units into memory which consumes resources. Transient units may also gain a reference at the time of creation, via the new AddRef property (that is only defined for transient units at the time of creation).
Let's make sure we can read the exit code/status properties exposed by PID 1 properly. Let's reuse the existing code for unsigned fields, as we just use it to copy words around, and don't calculate it.
Instead of ignoring empty strings retrieved via the bus, treat them as NULL, as it's customary in systemd.
…nknown type Make sure we return proper errors for types not understood yet.
It is useful for clients to be able to read the last CPU usage counter value of a unit even if the unit is already terminated. Hence, before destroying a cgroup's cgroup cache the last CPU usage counter and return it if the cgroup is gone.
Also, make major improvements to the an page in general.
…unction When a bus connection is closed we dispatch all reply callbacks. Do so in a new function if its own. No behaviour changes.
…mes to them In order to add a name to a bus tracking object we need to do some bus operations: we need to check if the name already exists and add match for it. Both are synchronous bus calls. While processing those we need to make sure that the tracking object is not dispatched yet, as it might still be empty, but is not going to be empty for very long. hence, block dispatching by removing the object from the dispatch queue while adding it, and readding it on error.
…racking objects immediately If the server side kicks us from the bus, from our view no names are on the bus anymore, hence let's make sure to dispatch all tracking objects immediately.
This tests in particular that disconnecting results in the tracking object's handlers to be called.
Old libdbus has a feature that the process is terminated whenever the the bus connection receives a disconnect. This is pretty useful on desktop apps (where a disconnect indicates session termination), as well as on command line apps (where we really shouldn't stay hanging in most cases if dbus daemon goes down). Add a similar feature to sd-bus, but make it opt-in rather than opt-out, like it is on libdbus. Also, if the bus is attached to an event loop just exit the event loop rather than the the whole process.
bus_connect_transport() is exclusively used from our command line tools, hence let's set exit-on-disconnect for all of them, making behaviour a bit nicer in case dbus-daemon goes down.
|
I force pushed another version now, that is rebase on current git. It also adds a short doc fix, as suggested by @arvidjaar. It also adds infrastructure to deal with the issue pointed out by @evverx: we need to handle the tracking of peers properly if we lose the bus connection and hence will never actually see the peer disappear from the bus. |
@poettering , thanks! |
This adds a concept of ref counting of units to the bus API. Two new calls on the Unit interface (Ref() and Unref()), plus the matching calls on the Manager interface (RefUnit() and UnrefUnit()) permit influencing the GC logic of systemd units, and make sure that units are not GC'ed right-away after they exited. A new pseudo-property AddRef is also added that may be specified when a transient unit is created, so that a Ref() and a StartTransientUnit() call may be combined atomically.
This is ref counting concept deals with a long-standing problem of the GC logic: a service that exited cleanly would likely be cleaned up immediately, thus losing all runtime information, such as CPU times and suchlike. With the new GC logic a client may now pin a unit into memory, thus ensuring the that exit status information can still be read before the unit is cleaned up.
Note that ref counting is only opened up to privileged clients, so that unprivileged clients cannot pin units into memory unbounded.
This PR includes a patch that extends "systemd-run" with a new "--wait" switch that makes use of this. When specified, systemd-run will wait for the created service to finish and show terse information about its runtime. This is particularly useful to see the consumed CPU time and such. Example invocation: