permit bus clients to pin units to avoid automatic GC by poettering · Pull Request #3984 · systemd/systemd

poettering · 2016-08-18T20:10:55Z

This adds a concept of ref counting of units to the bus API. Two new calls on the Unit interface (Ref() and Unref()), plus the matching calls on the Manager interface (RefUnit() and UnrefUnit()) permit influencing the GC logic of systemd units, and make sure that units are not GC'ed right-away after they exited. A new pseudo-property AddRef is also added that may be specified when a transient unit is created, so that a Ref() and a StartTransientUnit() call may be combined atomically.

This is ref counting concept deals with a long-standing problem of the GC logic: a service that exited cleanly would likely be cleaned up immediately, thus losing all runtime information, such as CPU times and suchlike. With the new GC logic a client may now pin a unit into memory, thus ensuring the that exit status information can still be read before the unit is cleaned up.

Note that ref counting is only opened up to privileged clients, so that unprivileged clients cannot pin units into memory unbounded.

This PR includes a patch that extends "systemd-run" with a new "--wait" switch that makes use of this. When specified, systemd-run will wait for the created service to finish and show terse information about its runtime. This is particularly useful to see the consumed CPU time and such. Example invocation:

# systemd-run -p CPUAccounting=1 --wait dd if=/dev/urandom of=/dev/null bs=1024 count=102400
Running as unit: run-u42547.service
Finished with result: success
Main processes terminated with: code=exited/status=0
Service runtime: 614ms
CPU time consumed: 610ms

poettering · 2016-08-18T22:40:10Z

hmm, CI failure appears unrelated.

martinpitt · 2016-08-19T10:26:08Z

I've never seen the storage test fail so far, so that's at least suspicious. Anyway, retried the test(s).

poettering · 2016-08-19T17:33:51Z

i have rebased the patch set on current git (since it didn't apply anymore). I added a trivial patch on top.

davide125 · 2016-08-19T18:35:43Z

Thank you! We were considering adding a --wait to systemd-run as well to use it kickoff expensive processes from Chef, this will save us a lot of work :-)

arvidjaar · 2016-08-19T18:40:28Z

It probably needs to extend systemctl list-units manual page by adding pinned units to list.

evverx · 2016-08-19T20:33:17Z

Is this an expected behaviour?

-bash-4.3# systemd-run -p CPUAccounting=1 --wait sleep 10 & sleep 1; systemctl restart dbus; kill -TERM %1; sleep 15
[1] 141
Running as unit: run-u3.service
Warning! D-Bus connection terminated.
[1]+  Terminated              systemd-run -p CPUAccounting=1 --wait sleep 10

# still in memory
-bash-4.3# systemctl status run-u3 --no-pager
● run-u3.service - /usr/bin/sleep 10
   Loaded: loaded (/run/systemd/transient/run-u3.service; transient; vendor preset: enabled)
Transient: yes
   Active: inactive (dead) since Tue 2016-08-16 16:48:52 UTC; 23s ago
  Process: 143 ExecStart=/usr/bin/sleep 10 (code=exited, status=0/SUCCESS)
 Main PID: 143 (code=exited, status=0/SUCCESS)

A lot of basic code wants to know the stack size, and it is safe if they do, hence let's permit getrlimit() (but not setrlimit()) by default. See: systemd#3970

This adds an optional "recursive" counting mode to sd_bus_track. If enabled adding the same name multiple times to an sd_bus_track object is counted individually, so that it also has to be removed the same number of times before it is gone again from the tracking object. This functionality is useful for implementing local ref counted objects that peers make take references on.

And while ware at it, also drop some references to kdbus, and stop claiming sd-bus wasn't stable yet. Also order man page references in the main sd-bus man page alphabetically.

This adds two (privileged) bus calls Ref() and Unref() to the Unit interface. The two calls may be used by clients to pin a unit into memory, so that various runtime properties aren't flushed out by the automatic GC. This is necessary to permit clients to race-freely acquire runtime results (such as process exit status/code or accumulated CPU time) on successful service termination. Ref() and Unref() are fully recursive, hence act like the usual reference counting concept in C. Taking a reference is a privileged operation, as this allows pinning units into memory which consumes resources. Transient units may also gain a reference at the time of creation, via the new AddRef property (that is only defined for transient units at the time of creation).

Let's make sure we can read the exit code/status properties exposed by PID 1 properly. Let's reuse the existing code for unsigned fields, as we just use it to copy words around, and don't calculate it.

Instead of ignoring empty strings retrieved via the bus, treat them as NULL, as it's customary in systemd.

…nknown type Make sure we return proper errors for types not understood yet.

It is useful for clients to be able to read the last CPU usage counter value of a unit even if the unit is already terminated. Hence, before destroying a cgroup's cgroup cache the last CPU usage counter and return it if the cgroup is gone.

Also, make major improvements to the an page in general.

…unction When a bus connection is closed we dispatch all reply callbacks. Do so in a new function if its own. No behaviour changes.

…mes to them In order to add a name to a bus tracking object we need to do some bus operations: we need to check if the name already exists and add match for it. Both are synchronous bus calls. While processing those we need to make sure that the tracking object is not dispatched yet, as it might still be empty, but is not going to be empty for very long. hence, block dispatching by removing the object from the dispatch queue while adding it, and readding it on error.

…racking objects immediately If the server side kicks us from the bus, from our view no names are on the bus anymore, hence let's make sure to dispatch all tracking objects immediately.

This tests in particular that disconnecting results in the tracking object's handlers to be called.

Old libdbus has a feature that the process is terminated whenever the the bus connection receives a disconnect. This is pretty useful on desktop apps (where a disconnect indicates session termination), as well as on command line apps (where we really shouldn't stay hanging in most cases if dbus daemon goes down). Add a similar feature to sd-bus, but make it opt-in rather than opt-out, like it is on libdbus. Also, if the bus is attached to an event loop just exit the event loop rather than the the whole process.

bus_connect_transport() is exclusively used from our command line tools, hence let's set exit-on-disconnect for all of them, making behaviour a bit nicer in case dbus-daemon goes down.

poettering · 2016-08-22T15:45:55Z

I force pushed another version now, that is rebase on current git. It also adds a short doc fix, as suggested by @arvidjaar. It also adds infrastructure to deal with the issue pointed out by @evverx: we need to handle the tracking of peers properly if we lose the bus connection and hence will never actually see the peer disappear from the bus.

evverx · 2016-08-23T09:35:31Z

we need to handle the tracking of peers properly if we lose the bus connection and hence will never actually see the peer disappear from the bus

@poettering , thanks!
This fixes the issue.

poettering added pid1 run labels Aug 18, 2016

poettering changed the title ~~permit bus clients to pin units to avoid automatica GC~~ permit bus clients to pin units to avoid automatic GC Aug 18, 2016

This was referenced Aug 19, 2016

Watch the stream of service status changes #3726

Open

seccomp: missing getrlimit from default whitelisting #3970

Closed

poettering force-pushed the refcnt branch from 1aa1961 to 6c12ba6 Compare August 19, 2016 17:31

poettering added 18 commits August 22, 2016 14:17

seccomp: make sure getrlimit() is among the default permitted syscalls

4a4485a

A lot of basic code wants to know the stack size, and it is safe if they do, hence let's permit getrlimit() (but not setrlimit()) by default. See: systemd#3970

man: don't claim arguments where const that actually are not

512e3bb

man: document sd_bus_track objects

ecddb2b

And while ware at it, also drop some references to kdbus, and stop claiming sd-bus wasn't stable yet. Also order man page references in the main sd-bus man page alphabetically.

bus-util: support mapping signed integers with bus_map_properties()

bdf97b8

Let's make sure we can read the exit code/status properties exposed by PID 1 properly. Let's reuse the existing code for unsigned fields, as we just use it to copy words around, and don't calculate it.

bus-util: treat an empty string as a NULL

0b83b8a

Instead of ignoring empty strings retrieved via the bus, treat them as NULL, as it's customary in systemd.

bus-util: make sure map_basic() returns EOPNOTSUPP if called for an u…

8b3b6f5

…nknown type Make sure we return proper errors for types not understood yet.

run: optionally, wait for the service to finish and show its result

2a453c2

man: document the new --wait switch of systemd-run

014c4c9

Also, make major improvements to the an page in general.

core: let's use set_contains() where appropriate

390bc2b

sd-bus: split out handling of reply callbacks on close into its own f…

217fcc7

…unction When a bus connection is closed we dispatch all reply callbacks. Do so in a new function if its own. No behaviour changes.

sd-bus: when the server-side disconnects, make sure to dispatch all t…

232f367

…racking objects immediately If the server side kicks us from the bus, from our view no names are on the bus anymore, hence let's make sure to dispatch all tracking objects immediately.

sd-bus: add a small test case for sd_bus_track

70cb8b7

This tests in particular that disconnecting results in the tracking object's handlers to be called.

update TODO

2c5f295

bus-util: turn on exit-on-disconnect for all command line tools

3830349

bus_connect_transport() is exclusively used from our command line tools, hence let's set exit-on-disconnect for all of them, making behaviour a bit nicer in case dbus-daemon goes down.

poettering force-pushed the refcnt branch from 6c12ba6 to 3830349 Compare August 22, 2016 15:44

evverx merged commit 6afe14f into systemd:master Aug 26, 2016

evverx mentioned this pull request Aug 26, 2016

Add new ProtectKernelTunables=, ProtectControlGroups=, ProtectSystem=strict unit file settings and more #4018

Closed

davide125 mentioned this pull request Oct 21, 2016

RFE: add --wait to systemd-mount #4437

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

permit bus clients to pin units to avoid automatic GC#3984

permit bus clients to pin units to avoid automatic GC#3984
evverx merged 19 commits intosystemd:masterfrom
poettering:refcnt

poettering commented Aug 18, 2016

Uh oh!

poettering commented Aug 18, 2016

Uh oh!

martinpitt commented Aug 19, 2016

Uh oh!

poettering commented Aug 19, 2016

Uh oh!

davide125 commented Aug 19, 2016

Uh oh!

arvidjaar commented Aug 19, 2016

Uh oh!

evverx commented Aug 19, 2016

Uh oh!

poettering commented Aug 22, 2016

Uh oh!

evverx commented Aug 23, 2016

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

poettering commented Aug 18, 2016

Uh oh!

poettering commented Aug 18, 2016

Uh oh!

martinpitt commented Aug 19, 2016

Uh oh!

poettering commented Aug 19, 2016

Uh oh!

davide125 commented Aug 19, 2016

Uh oh!

arvidjaar commented Aug 19, 2016

Uh oh!

evverx commented Aug 19, 2016

Uh oh!

poettering commented Aug 22, 2016

Uh oh!

evverx commented Aug 23, 2016

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

5 participants