Skip to content

daemon: read-copy-update the daemon config#43980

Merged
corhere merged 6 commits intomoby:masterfrom
corhere:rcu-daemon-config
Jun 1, 2023
Merged

daemon: read-copy-update the daemon config#43980
corhere merged 6 commits intomoby:masterfrom
corhere:rcu-daemon-config

Conversation

@corhere
Copy link
Contributor

@corhere corhere commented Aug 17, 2022

- What I did
Made daemon config reloading transactional and almost entirely atomic. Config reloading will either succeed or leave the daemon state untouched. Reloading no longer pauses the world. Daemon operations see a consistent and immutable view of the daemon config, even when racing a reload.

- How I did it

  • Made the config.Config struct copyable by removing the embedded mutexes
  • Implemented read-copy-update of the daemon config to replace the embedded mutex for synchronization
    • Reading from daemon config is request-scoped, like contexts: any operations which need to access the daemon config capture a reference to an immutable snapshot and pass it into any callees
  • Implemented two-phase commit to manage the side effects of config reloading
    • Side effects such as registryService.ReplaceConfig() are not necessarily atomic w.r.t. the reload transaction as they can take effect before the updated config is applied to configStore
  • Modified how runtimes wrapper scripts are written so that existing wrapper scripts are no longer deleted on reload
    • reloads do not invalidate old versions of the config that concurrent daemon operations might be holding references to, which might reference old wrapper scripts
    • FS errors while updating the wrapper scripts can no longer leave containers or the daemon in a bad state

- How to verify it
CI

- Description for the changelog

  • Configuration reloading has been made more robust: in the event that an error is encountered during the reload process, there is no longer a chance for some of the options to be applied anyway.

- A picture of a cute animal (not mandatory but encouraged)

@corhere corhere changed the title Rcu daemon config daemon: read-copy-update the daemon config Aug 17, 2022
@corhere corhere force-pushed the rcu-daemon-config branch 2 times, most recently from 6f1ae80 to 50eaf9f Compare August 19, 2022 00:41
@corhere corhere force-pushed the rcu-daemon-config branch 4 times, most recently from 556565e to 9f1d591 Compare August 31, 2022 22:28
@corhere corhere marked this pull request as ready for review August 31, 2022 22:28
@corhere corhere requested a review from cpuguy83 as a code owner August 31, 2022 22:28
@corhere corhere added this to the v-next milestone Sep 1, 2022
@corhere corhere added the kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. label Sep 1, 2022
@rumpl rumpl modified the milestones: 24.0.0, v-future Apr 24, 2023
Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation details look pretty hairy, but it does seem to be pretty comprehensive (including new tests 👀)

)

// RegistryHosts returns registry configuration in containerd resolvers format.
func (conf *Config) RegistryHosts() docker.RegistryHosts {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer's note: this is just moved from daemon/daemon.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, I had to move it back. On a positive note, reloaded registry configuration will now be picked up by Buildkit!

Comment on lines +110 to +179
// A non-standard Base32 encoding which lacks vowels to avoid accidentally
// spelling naughty words. Don't use this to encode any data which requires
// compatibility with anything outside of the currently-running process.
var base32Disemvoweled = base32.NewEncoding("0123456789BCDFGHJKLMNPQRSTVWXYZ-")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂 ❤️

(Should this also remove "vowel looking" numbers like 0, 1, 3, 4?)

@cpuguy83
Copy link
Member

This LGTM, sadly a nasty rebase.
Sorry for the long wait on reviewing this.

@corhere corhere force-pushed the rcu-daemon-config branch 2 times, most recently from 4be5647 to 9007e62 Compare May 24, 2023 20:47
@corhere
Copy link
Contributor Author

corhere commented May 24, 2023

It was a nasty rebase, though on the bright side, I spotted and fixed some mistakes in the process and made some improvements.

@corhere corhere force-pushed the rcu-daemon-config branch 2 times, most recently from b0c1a58 to a8fb1df Compare May 25, 2023 15:00
backend Backend
daemon experimentalProvider
routes []router.Route
features *map[string]bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 🤦
I totally missed that this was happening.

@thaJeztah
Copy link
Member

discussing with @corhere - he wants to look at splitting out one fix from this PR to make it back-portable, so let's wait with merging for that.

@corhere corhere force-pushed the rcu-daemon-config branch from 5c45f9e to 2340c39 Compare May 25, 2023 21:15
@corhere
Copy link
Contributor Author

corhere commented May 25, 2023

@thaJeztah all done: the first two commits are now the backport-able fixes. I went through the diffs again to confirm that those are the only changes worth backporting.

@corhere corhere force-pushed the rcu-daemon-config branch 2 times, most recently from 27fc3db to c1caa19 Compare May 26, 2023 20:00
corhere added 6 commits June 1, 2023 14:43
Passing around a bare pointer to the map of configured features in order
to propagate to consumers changes to the configuration across reloads is
dangerous. Map operations are not atomic, so concurrently reading from
the map while it is being updated is a data race as there is no
synchronization. Use a getter function to retrieve the current features
map so the features can be retrieved race-free.

Remove the unused features argument from the build router.

Signed-off-by: Cory Snider <[email protected]>
Historically, daemon.RegistryHosts() has returned a docker.RegistryHosts
callback function which closes over a point-in-time snapshot of the
daemon configuration. When constructing the BuildKit builder at daemon
startup, the return value of daemon.RegistryHosts() has been used.
Therefore the BuildKit builder would use the registry configuration as
it was at daemon startup for the life of the process, even if the
registry configuration is changed and the configuration reloaded.
Provide BuildKit with a RegistryHosts callback which reflects the
live daemon configuration after reloads so that registry operations
performed by BuildKit always use the same configuration as the rest of
the daemon.

Signed-off-by: Cory Snider <[email protected]>
Config reloading has interleaved validations and other fallible
operations with mutating the live daemon configuration. The daemon
configuration could be left in a partially-reloaded state if any of the
operations returns an error. Mutating a copy of the configuration and
atomically swapping the config struct on success is not currently an
option as config values are not copyable due to the presence of
sync.Mutex fields. Introduce a two-phase commit protocol to defer any
mutations of the daemon state until after all fallible operations have
succeeded.

Reload transactions are not yet entirely hermetic. The platform
reloading logic for custom runtimes on *nix could still leave the
directory of generated runtime wrapper scripts in an indeterminate state
if an error is encountered.

Signed-off-by: Cory Snider <[email protected]>
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <[email protected]>
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <[email protected]>
The daemon has made a habit of mutating the DefaultRuntime and Runtimes
values in the Config struct to merge defaults. This would be fine if it
was a part of the regular configuration loading and merging process,
as is done with other config options. The trouble is it does so in
surprising places, such as in functions with 'verify' or 'validate' in
their name. It has been necessary in order to validate that the user has
not defined a custom runtime named "runc" which would shadow the
built-in runtime of the same name. Other daemon code depends on the
runtime named "runc" always being defined in the config, but merging it
with the user config at the same time as the other defaults are merged
would trip the validation. The root of the issue is that the daemon has
used the same config values for both validating the daemon runtime
configuration as supplied by the user and for keeping track of which
runtimes have been set up by the daemon. Now that a completely separate
value is used for the latter purpose, surprising contortions are no
longer required to make the validation work as intended.

Consolidate the validation of the runtimes config and merging of the
built-in runtimes into the daemon.setupRuntimes() function. Set the
result of merging the built-in runtimes config and default default
runtime on the returned runtimes struct, without back-propagating it
onto the config.Config argument.

Signed-off-by: Cory Snider <[email protected]>
Iv3-0

This comment was marked as spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/daemon Core Engine impact/changelog kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/4-merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants