Skip to content

libcontainerd/supervisor: refactoring and cleanup in preparation of custom containerd.toml#43945

Merged
tianon merged 10 commits intomoby:masterfrom
thaJeztah:libcontainer_supervisor_cleanups
Aug 11, 2022
Merged

libcontainerd/supervisor: refactoring and cleanup in preparation of custom containerd.toml#43945
tianon merged 10 commits intomoby:masterfrom
thaJeztah:libcontainer_supervisor_cleanups

Conversation

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Aug 10, 2022

libcontainerd/supervisor: remove unused RWMutex

This RWMutex was added in 9c4570a (#20662), and used in the remote.Client() method. Commit dd2e19e (#37149) split the code for client and daemon, but did not remove the mutex.

libcontainerd/supervisor: remove unused remote.rootDir

libcontainerd/supervisor: platformCleanup(): use canonical socket address

Consider Address() (Config.GRPC.Addres) to be the source of truth for the location of the containerd socket.

libcontainerd/supervisor: use correct logger

Don't call logrus directly, but use the logger that was set.

libcontainerd/supervisor: make supervisor adjust OOM score for containerd

Containerd, like dockerd has a OOMScore configuration option to adjust its own OOM score. In dockerd, this option was added when default installations were not yet running the daemon as a systemd unit, which made it more complicated to set the score, and adding a daemon option was convenient.

A binary adjusting its own score has been frowned upon, as it's more logical to make that the responsibility of the process manager starting the daemon, which is what we did for dockerd in 2157853 (#42373).

There have been discussions on deprecating the daemon flag for dockerd, and similar discussions have been happening for containerd.

This patch changes how we set the OOM score for the containerd child process, and to have dockerd (supervisor) set the OOM score, as it's acting as process manager in this case (performing a role similar to systemd otherwise).

With this patch, the score is still adjusted as usual, but not written to the containerd configuration file;

dockerd --oom-score-adjust=-123
cat /proc/$(pidof containerd)/oom_score_adj
-123

As a follow-up, we may consider to adjust the containerd OOM score based on the daemon's own score instead of on the cli.OOMScoreAdjust configuration so that we will also adjust the score in situations where dockerd's OOM score was set through other ways (systemd or manually adjusting the cgroup). A TODO was added for this.

libcontainerd/supervisor: store location of pidFile

Adding a remote.pidFile to store the location instead of re-constructing its location each time. Also performing a small refactor to use strconv.Itoa instead of fmt.Sprintf.

libcontainerd/supervisor: store location of config-file

Adding a remote.configFile to store the location instead of re-constructing its location each time. Also fixing a minor inconsistency in the error formats.

libcontainerd/supervisor: don't write log-level to config file

The --log-level flag overrides whatever is in the containerd configuration file;
https://github.com/containerd/containerd/blob/f033f6ff85ce397202ee69de22b4e3a4bf4093f5/cmd/containerd/command/main.go#L339-L352

Given that we set that flag when we start the containerd binary, there is no need to write it both to the generated config-file and pass it as flag.

This patch also slightly changes the behavior; as both dockerd and containerd use "info" as default log-level, don't set the log-level if it's the default.

cmd/dockerd: initContainerD(): clean-up some logs

Change the log-level for messages about starting the managed containerd instance to be the same as for the main API. And remove a redundant debug-log.

With this patch:

dockerd
INFO[2022-08-11T11:46:32.573299176Z] Starting up
INFO[2022-08-11T11:46:32.574304409Z] containerd not running, starting managed containerd
INFO[2022-08-11T11:46:32.575289181Z] started new containerd process                address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=5370

cmd/dockerd: initContainerd() use early return

  • return early if we're expecting a system-containerd
  • rename initContainerD to initContainerd ':)
  • remove .Config to reduce verbosity

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah
Copy link
Member Author

follow-up PR in #43946

@thaJeztah
Copy link
Member Author

Ah, booh

cmd/dockerd/daemon_unix.go:160:2: printf: github.com/docker/docker/vendor/github.com/sirupsen/logrus.Info call has possible formatting directive %s (govet)
	logrus.Info("started managed containerd, listening on %s", cli.ContainerdAddr)
	^

This RWMutex was added in 9c4570a, and used in
the `remote.Client()` method. Commit dd2e19e
split the code for client and daemon, but did not remove the mutex.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the libcontainer_supervisor_cleanups branch from 6e9b011 to aa9f49c Compare August 10, 2022 21:17
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important that functional options are always optional as there are no compile-time checks to catch mistakes. The compiler will happily compile a function call with no optional arguments so a function call with no optional arguments should do something reasonable. It looks like the WithPIDFile and WithPlatformDefaults options are mandatory as supervisor.Start will either fail to start containerd, or will start then immediately kill it, unless both options are passed in. The WithPIDFile option is easy enough to make truly optional simply by writing the PID file iff the option is passed in. As for WithPlatformDefaults, a default is usually implied and only the overrides are explicit options. I think (*remote).setDefaults and having the rootDir argument on supervisor.Start was the right design. Someone using the WithCustomConfigFile option would be free to pass the empty string "" for the stateDir argument, but they have to be explicit about it. I expect reviewers to see an empty-string argument as a sign that the function call needs more intense scrutiny. Forgetting a mandatory optional argument, on the other hand, leaves no clues at the call site that something might be amiss, so a reviewer not already familiar with that function may skim right past it without a second thought.

@thaJeztah
Copy link
Member Author

I think it's important that functional options are always optional as there are no compile-time checks to catch mistakes.
...
I think (*remote).setDefaults and having the rootDir argument on supervisor.Start was the right design.

Yes, I agree. I know I played with some variations and moving setting the defaults to an optional argument may not be the right thing to do; I initially un-exported the option and appended it to opts inside Start(), but at some point moved it.

As to "what arguments are optional", things are a bit more fuzzy; effectively, the only things required are;

  • the socket location (as this is what dockerd needs to connect)
  • a location for the PIDfile (which can use a default that's based on the socket location (socket-path + .pid))

Currently, both of those use a default that's based on execDir (out of convenience), but setting the "root" and "exec" directories themselves are only needed when we manage containerd's configuration to make them use a location different from containerd's defaults.

With the work on containerd integration, we should actually start to consider using containerd's own defaults as default as well, so that manually starting dockerd for debugging continues to use the same configuration as when running dockerd as a service, i.e.;

systemctl stop docker

dockerd --debug

@thaJeztah
Copy link
Member Author

I'll update some bits in this PR, and will move the WithPidFile() and WithPlatformDefaults() changes to the other PR and work a bit on them.

@thaJeztah thaJeztah force-pushed the libcontainer_supervisor_cleanups branch from aa9f49c to 02c6080 Compare August 11, 2022 12:06
…ress

Consider Address() (Config.GRPC.Addres) to be the source of truth for
the location of the containerd socket.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Don't call logrus directly, but use the logger that was set.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
…nerd

Containerd, like dockerd has a OOMScore configuration option to adjust its own
OOM score. In dockerd, this option was added when default installations were not
yet running the daemon as a systemd unit, which made it more complicated to set
the score, and adding a daemon option was convenient.

A binary adjusting its own score has been frowned upon, as it's more logical to
make that the responsibility of the process manager _starting_ the daemon, which
is what we did for dockerd in 2157853.

There have been discussions on deprecating the daemon flag for dockerd, and
similar discussions have been happening for containerd.

This patch changes how we set the OOM score for the containerd child process,
and to have dockerd (supervisor) set the OOM score, as it's acting as process
manager in this case (performing a role similar to systemd otherwise).

With this patch, the score is still adjusted as usual, but not written to the
containerd configuration file;

    dockerd --oom-score-adjust=-123
    cat /proc/$(pidof containerd)/oom_score_adj
    -123

As a follow-up, we may consider to adjust the containerd OOM score based on the
daemon's own score instead of on the `cli.OOMScoreAdjust` configuration so that
we will also adjust the score in situations where dockerd's OOM score was set
through other ways (systemd or manually adjusting the cgroup). A TODO was added
for this.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Adding a remote.pidFile to store the location instead of re-constructing its
location each time. Also performing a small refactor to use `strconv.Itoa`
instead of `fmt.Sprintf`.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Adding a remote.configFile to store the location instead of re-constructing its
location each time. Also fixing a minor inconsistency in the error formats.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
the `--log-level` flag overrides whatever is in the containerd configuration file;
https://github.com/containerd/containerd/blob/f033f6ff85ce397202ee69de22b4e3a4bf4093f5/cmd/containerd/command/main.go#L339-L352

Given that we set that flag when we start the containerd binary, there is no need
to write it both to the generated config-file and pass it as flag.

This patch also slightly changes the behavior; as both dockerd and containerd use
"info" as default log-level, don't set the log-level if it's the default.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Change the log-level for messages about starting the managed containerd instance
to be the same as for the main API. And remove a redundant debug-log.

With this patch:

    dockerd
    INFO[2022-08-11T11:46:32.573299176Z] Starting up
    INFO[2022-08-11T11:46:32.574304409Z] containerd not running, starting managed containerd
    INFO[2022-08-11T11:46:32.575289181Z] started new containerd process                address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=5370
cmd/dockerd: initContainerD(): clean-up some logs

Signed-off-by: Sebastiaan van Stijn <[email protected]>
- return early if we're expecting a system-containerd
- rename `initContainerD` to `initContainerd` ':)
- remove .Config to reduce verbosity

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the libcontainer_supervisor_cleanups branch from 02c6080 to 226e071 Compare August 11, 2022 12:11
@@ -144,17 +153,16 @@ func (r *remote) getContainerdPid() (int, error) {
}

func (r *remote) getContainerdConfig() (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I am not a fan of functions with side-effects masquerading as getters. And returning the path to the config file hides the fact there is a single source of truth.

Suggested change
func (r *remote) getContainerdConfig() (string, error) {
func (r *remote) writeContainerdConfig(path string) error {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh. I agree!! "get" is a pretty bad name for this (and I had to look a few times "where is that thing created??)

I initially had a commit that renamed it to generateContainerdConfig(), but combined with #43946, that no longer made sense (that PR changes it to skip the generating and only returns the name when using a custom config), so I removed that change.

Let me look at some of that in the follow-up as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"ensure" ? 👀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure could work; it stands out enough to make anyone reading the code aware "it's doing something"

@thaJeztah
Copy link
Member Author

FWIW: #43945 (comment)

which can use a default that's based on the socket location (socket-path + .pid)

So I was about to make a change like this (pid always/defaulting to same location as .socket), but then recalled "windows", which uses a named-pipe for the "socket", so I'll need to give that some thinking.

@tianon tianon merged commit 4d70822 into moby:master Aug 11, 2022
@thaJeztah thaJeztah deleted the libcontainer_supervisor_cleanups branch August 11, 2022 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants