Skip to content

Fix concurrency issue in dind#48850

Merged
thaJeztah merged 1 commit intomoby:masterfrom
JSchltggr:fix-sed-nix-dind
Feb 1, 2025
Merged

Fix concurrency issue in dind#48850
thaJeztah merged 1 commit intomoby:masterfrom
JSchltggr:fix-sed-nix-dind

Conversation

@JSchltggr
Copy link
Copy Markdown

When dind is run alongside with other processes, it might happen that the first xargs call fails (ignored) but then also the call to sed fails. Leading to the following error messages and dind terminating (set -e) :

echo: write error: No such process
sed: write error

This fixes the behavior by ignoring non 0 exit codes from sed

- What I did
Start a dind container which spawns some short living processes. Sometimes the call to sed will fail, because the process is already killed.

- How I did it
Create a simple Dockerfile and entrypoint:

Dockerfile

FROM docker:27.1.2-dind

RUN apk add --no-cache bash

COPY --chmod=755 entrypoint.sh /app/bin/entrypoint.sh
WORKDIR /app/bin
ENTRYPOINT ["/app/bin/entrypoint.sh"]

entrypoint.sh

#!/bin/bash

WAIT_MAX_RETRIES=60
DOCKER_SOCKET=/var/run/docker.sock

# fork a command
fork() { (setsid "$@" &); }

is_docker_ready() {
    [[ -S "${DOCKER_SOCKET}" ]] && [[ -r "${DOCKER_SOCKET}" ]] && docker stats --no-stream &>/dev/null
}

wait_for(){
    check_function="${1}" && shift

    # Initialize retry counter
    retry_count=0

    # Loop until check_function is ready or max retries is reached
    while ! "${check_function}"; do
        if [[ "${retry_count}" -ge "${WAIT_MAX_RETRIES}" ]]; then
            echo "${check_function} was not ready after ${WAIT_MAX_RETRIES} seconds."
            exit 1
        fi

        echo "Waiting for ${check_function}... Attempt $((retry_count+1))/${WAIT_MAX_RETRIES}"
        sleep 1
        ((retry_count++))
    done

    echo "${check_function} is ready!"
}

#
# Main
#

# Create some short living processes
fork bash -c 'while true; do ls; done &>/dev/null'

# start docker
if ! is_docker_ready; then
    (
        # dockerd-entrypoint always spawns a tini processes
        # which, if TINI_SUBREAPER is not defined, expects to
        # to be PID=1 and hence failes sometimes
        # (possibly due concurrency issues)
        # This is why we tell tini here to register as child
        # subreaper. Which will make tini reap only processes
        # belonging to its subtree and allows it to run
        # as PID!=1. Note this only works on kernel >= 3.4
        #
        # ref:
        # - https://github.com/krallin/tini
        # - https://man7.org/linux/man-pages/man2/pr_set_child_subreaper.2const.html
        export TINI_SUBREAPER=1
        fork bash -x dockerd-entrypoint.sh
    )
fi

wait_for is_docker_ready

- How to verify it
Build the container

docker build -t dind .

Run the container in a loop and wait for the error to occur

while docker run --privileged dind true; do true; done

- Description for the changelog

ignore non 0 exit codes from sed

- A picture of a cute animal (not mandatory but encouraged)

@JSchltggr JSchltggr requested a review from tianon as a code owner November 11, 2024 16:06
@thaJeztah thaJeztah added dco/no Automatically set by a bot when one of the commits lacks proper signature area/contrib labels Nov 11, 2024
@JSchltggr JSchltggr force-pushed the fix-sed-nix-dind branch 2 times, most recently from 978045f to 51f4c9d Compare November 13, 2024 08:47
@tianon
Copy link
Copy Markdown
Member

tianon commented Nov 13, 2024

I've run into and tried to fix this exact same race several times, and it's complex because if this code fails, I think Docker-in-Docker will fail too, so it has to succeed and simply ignoring failure isn't correct. 😅

Every time I run into this, I fix it by checking whether dockerd is ready yet from a separate container. I bet you could emulate something similar by doing something like unshare --pid dind dockerd instead so it gets started in a totally new PID namespace. 🤔

@JSchltggr
Copy link
Copy Markdown
Author

JSchltggr commented Nov 14, 2024

Just to give you a bit of background I am running minikube with dind in a gitlab pipeline.

I think Docker-in-Docker will fail too, so it has to succeed and simply ignoring failure isn't correct.

Since I started ignoring sed failures by patching the dind script, docker in docker spins up successfully and I've never encountered the problem again and my pipelines run just fine.

Every time I run into this, I fix it by checking whether dockerd is ready yet from a separate container.

I can do this, but I don't understand how this would fix my problem. As dind terminates when sed returns with a non zero exit code. Which means dockerd never actually starts / becomes ready. If this is a race and sed needs to run successfully, should dind then maybe retry the sed line until it succeeds?

setup_cgroup_controllers(){
    sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
        > /sys/fs/cgroup/cgroup.subtree_control \
        &>/dev/null
}
# busy wait for setup_cgroup_controllers to complete
while ! setup_cgroup_controllers; do sleep 0; done

What do you think @tianon ?

@tianon
Copy link
Copy Markdown
Member

tianon commented Nov 16, 2024

Hmm, that's not a bad idea, although it would need to include the xargs line above it too, I think (since that's the line that moves all the processes such that the following line can succeed). 🤔

I don't think we need a function, and something like this could probably work (although needs some testing):

	mkdir -p /sys/fs/cgroup/init
	# this happens in a loop because things like "docker exec" on our dind
	# container will create new processes, which creates a race between our
	# moving everything to "init" and enabling subtree_control
	while ! {
		# move the processes from the root group to the /init group,
		# otherwise writing subtree_control fails with EBUSY.
		# An error during moving non-existent process (i.e., "cat") is ignored.
		xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
		# enable controllers
		sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
			> /sys/fs/cgroup/cgroup.subtree_control
	}; do true; done

(I don't think we need to worry too much about the error messages here -- this basically just puts the existing code into a tight little loop, as you've proposed.)

I forgot what I'd done previously to reproduce this reliably (I recall having a ~100% failure rate in the past), but with the following I managed to get something that does reproduce the race/failure some percentage of the time, and after applying the above change I was no longer able to reproduce:

#!/usr/bin/env bash
set -Eeuo pipefail -x

docker run -di --rm --name dind --mount "type=bind,src=$PWD/hack/dind,dst=/dind,ro" --entrypoint /dind --privileged docker:dind dockerd --log-level error
trap 'docker rm -vf dind' EXIT
docker logs -f dind &

while state="$(docker container inspect --format '{{ .State.Status }}' dind)" && [ "$state" = 'running' ] && ! docker exec dind docker version; do true; done

@Enteee
Copy link
Copy Markdown

Enteee commented Nov 18, 2024

@tianon : Awesome !

A few questions:

  • Are you aware then set -e is not inherited into the { ... }? This will print Should never get here:
#!/usr/bin/env bash
set -e
while ! {
    false
    true
}; do sleep 0; done
echo "Should never get here"
  • Did you keep the errorignore (... || :) in the xargs line on purpose?
  • If sed fails the xargs line will run again: Does this not cause side-effects?

@tianon
Copy link
Copy Markdown
Member

tianon commented Nov 21, 2024

  • Are you aware then set -e is not inherited into the { ... }? This will print Should never get here:

Yes, that's not a big deal in our case because the xargs line includes || : (my original version used && isntead of { ...; } for this very reason, but I realized the delta/diff would be smaller and easier to maintain if it was "just" indentation changes because it doesn't matter for us)

  • Did you keep the errorignore (... || :) in the xargs line on purpose?
  • If sed fails the xargs line will run again: Does this not cause side-effects?

Yes, in fact we need the xargs line to run again if the sed fails -- the xargs line is the one that "moves" processes from the root cgroup into the (newly created) init cgroup, and new processes showing up is why the sed fails, so while it doesn't matter as much for your specific case, if we want to solve for the general problem we have to assume the new processes that have shown up are not necessarily short-lived (and thus need to also be moved before we try again).

@Enteee
Copy link
Copy Markdown

Enteee commented Nov 26, 2024

ok, @tianon . Glad you considered all these things. 🚀
Then I'd say this is looking good. What does now need to happen for this to be merged?

Comment thread hack/dind Outdated
Comment on lines +65 to +68
xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
# enable controllers
sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
> /sys/fs/cgroup/cgroup.subtree_control
> /sys/fs/cgroup/cgroup.subtree_control || :
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#48850 (comment)

Suggested change
xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
# enable controllers
sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
> /sys/fs/cgroup/cgroup.subtree_control
> /sys/fs/cgroup/cgroup.subtree_control || :
# this happens in a loop because things like "docker exec" on our dind
# container will create new processes, which creates a race between our
# moving everything to "init" and enabling subtree_control
while ! {
# move the processes from the root group to the /init group,
# otherwise writing subtree_control fails with EBUSY.
# An error during moving non-existent process (i.e., "cat") is ignored.
xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
# enable controllers
sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
> /sys/fs/cgroup/cgroup.subtree_control
}; do true; done

(I'm happy to update this PR on your behalf if you'd prefer! ❤️)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Thank you! @JSchltggr can you add this change? Otherwise, I can also open a new merge request.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot tianon!
Unfortunately it was not possible to commit the suggestion directly. I've added the changes to another commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably ok to squash the 2 commits if possible to keep a cleaner git history

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah can you not just squash merge?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We disabled squash merge on this repository as it discards GPG signing of the PR author, and generally prefer the contributor to squash commits (or organise PRs into logical commits, which is part of the review process), so if @JSchltggr is able to do a squash, that'd be great, otherwise I can manually squash and push to the branch through my maintainer permssions

Copy link
Copy Markdown

@Enteee Enteee Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two weeks without response from @JSchltggr , can you maybe go forward and squash merge this @thaJeztah ? Thank you lots 🙏

@Enteee
Copy link
Copy Markdown

Enteee commented Dec 20, 2024

@tianon : changes were added by @JSchltggr 🚀 I guess this can be merged now?

@Enteee
Copy link
Copy Markdown

Enteee commented Jan 29, 2025

@thaJeztah we got the squash 🎉 . Thank you @JSchltggr . Can this be merged now?

Copy link
Copy Markdown
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!!

@thaJeztah thaJeztah added status/4-merge and removed dco/no Automatically set by a bot when one of the commits lacks proper signature labels Feb 1, 2025
@thaJeztah thaJeztah added this to the 28.0.0 milestone Feb 1, 2025
@thaJeztah thaJeztah added the kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. label Feb 1, 2025
@thaJeztah thaJeztah merged commit 96ded2a into moby:master Feb 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/contrib kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/4-merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants