Description
We're running docker:dind in a Kubernetes pod for CI. With the Docker 20 version of dind (which now uses the v2 shim), this is broken and fails with an error similar to:
docker: Error response from daemon: io.containerd.runc.v2: failed to adjust OOM score for shim: set shim OOM score: write /proc/211/oom_score_adj: invalid argument
The valid range for oom_score_adj is -1000 to 1000. By default, Kubernetes uses 1000 for BestEffort. The logic in https://github.com/containerd/containerd/blob/master/runtime/v2/shim/util_unix.go#L62 adds 1 to the existing value, so in this set-up it tries to write 1001, which results in the invalid argument error. (NOTE: This is not reproducible with Docker Desktop, where it uses -500 for BestEffort!)
When oom_score_adj is already set to 1000 (BestEffort), it does not make sense to add 1. We should consider adding a check for that case in AdjustOOMScore.
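A minimal sketch of the kind of check I have in mind, assuming the shim score is simply the inherited value plus 1. The names nextShimOOMScoreAdj and oomScoreAdjMax are hypothetical and are not taken from the containerd source; this only illustrates capping the value at the upper bound.

```go
package main

import "fmt"

// oomScoreAdjMax is the largest value accepted for /proc/<pid>/oom_score_adj.
// (Hypothetical constant name, used here for illustration only.)
const oomScoreAdjMax = 1000

// nextShimOOMScoreAdj is a hypothetical helper: it returns the inherited
// score plus 1, capped at oomScoreAdjMax so the result is never out of range.
func nextShimOOMScoreAdj(parent int) int {
	score := parent + 1
	if score > oomScoreAdjMax {
		return oomScoreAdjMax
	}
	return score
}

func main() {
	// 1000 is the BestEffort case from this report; -500 is the Docker Desktop case.
	for _, parent := range []int{-500, 0, 999, 1000} {
		fmt.Printf("parent=%d shim=%d\n", parent, nextShimOOMScoreAdj(parent))
	}
}
```

With a cap like this, a parent score of 1000 would leave the shim at 1000 instead of attempting to write 1001.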
Steps to reproduce the issue:
kubectl run mydind --privileged --image docker:dind (or specifically docker:20.10.0-dind)
kubectl exec mydind -- docker run hello-world
kubectl exec mydind -- sh -c 'echo 1001 > /proc/1/oom_score_adj'
kubectl delete pod mydind
Describe the results you received:
The container fails with:
$ kubectl exec mydind -- docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
0e03bdcc26d7: Pulling fs layer
0e03bdcc26d7: Verifying Checksum
0e03bdcc26d7: Download complete
0e03bdcc26d7: Pull complete
Digest: sha256:1a523af650137b8accdaed439c17d684df61ee4d74feac151b5b337bd29e7eec
Status: Downloaded newer image for hello-world:latest
docker: Error response from daemon: io.containerd.runc.v2: failed to adjust OOM score for shim: set shim OOM score: write /proc/211/oom_score_adj: invalid argument
: exit status 1: unknown.
time="2020-12-13T17:17:00Z" level=error msg="error waiting for container: context canceled"
command terminated with exit code 125
Output of containerd --version:
containerd github.com/containerd/containerd v1.4.3 269548fa27e0089a8b8278fc4fc781d7f65a939b