Skip to content

rootless: fix getCurrentOOMScoreAdj (fix rootless docker in kubernetes)#42189

Merged
tiborvass merged 1 commit intomoby:masterfrom
AkihiroSuda:specconv-fix-trimspace
Apr 1, 2021
Merged

rootless: fix getCurrentOOMScoreAdj (fix rootless docker in kubernetes)#42189
tiborvass merged 1 commit intomoby:masterfrom
AkihiroSuda:specconv-fix-trimspace

Conversation

@AkihiroSuda
Copy link
Copy Markdown
Member

@AkihiroSuda AkihiroSuda commented Mar 23, 2021

- What I did

Fix #40068: rootless docker in kubernetes: "getting the final child's pid from pipe caused \"EOF\": unknown"

- How I did it

Fixed getCurrentOOMScoreAdj() to trim "\n" before calling strconv.Atoi.

- How to verify it
rootless.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: rootless
spec:
  containers:
    # dind:local is from "docker:20.10.5-dind" but the following files are updated:
    # - /usr/local/bin/dockerd     (for this PR)
    # - /usr/local/bin/containerd* (for https://github.com/containerd/containerd/pull/4845)
    - image: dind:local
      imagePullPolicy: Never
      name: rootless
      env:
        - name: DOCKER_HOST
          value: unix:///run/user/1000/docker.sock
      securityContext:
        privileged: true
$ kind create cluster
$ kubectl apply -f rootless.yaml
$ kubectl exec rootless -- docker run hello-world
...
Hello from Docker!

- Description for the changelog

Fix `rootless docker in kubernetes: "getting the final child's pid from pipe caused "EOF": unknown"

- A picture of a cute animal (not mandatory but encouraged)
🐧

@AkihiroSuda AkihiroSuda changed the title rootless: fix getCurrentOOMScoreAdj rootless: fix getCurrentOOMScoreAdj (fix rootless docker in kubernetes) Mar 23, 2021
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering; if the issue is that we're trying to set an OOM score that's out of range; should we check limits in toRootless, and clip to max / min (-1000 .... 1000)?

if spec.Process.OOMScoreAdj != nil && *spec.Process.OOMScoreAdj < currentOOMScoreAdj {
*spec.Process.OOMScoreAdj = currentOOMScoreAdj
}

Comment thread rootless/specconv/specconv_linux.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know under what circumstances this fails? Would it be an os.IsNotExist()? (in that case, should we just ignore?)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never happens, IIUC

Comment thread rootless/specconv/specconv_linux.go Outdated
Comment on lines 27 to 30
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log the value before trimming? (also simplifies the code a bit)

Suggested change
s := string(b)
i, err := strconv.Atoi(strings.TrimSpace(s))
if err != nil {
logrus.WithError(err).Warn("failed to parse /proc/self/oom_score_adj (%q)", s)
i, err := strconv.Atoi(strings.TrimSpace(string(b)))
if err != nil {
logrus.WithError(err).Warn("failed to parse /proc/self/oom_score_adj (%q)", b)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already logging untrimmed string

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh! 🤦 yes, it's just the extra variable to convert []byte to string

@AkihiroSuda
Copy link
Copy Markdown
Member Author

No, not out of range.

Docker always attempts to launch a container with oomScoreAdj=0, but the daemon's oomScoreAdj is set to 1000 by Kube, so we need to adjust the oomScoreAdj of the container being created from 0 to 1000.

`getCurrentOOMScoreAdj()` was broken because `strconv.Atoi()` was called
without trimming "\n".

Fix issue 40068: `rootless docker in kubernetes: "getting the final child's pid from
pipe caused \"EOF\": unknown"

Signed-off-by: Akihiro Suda <[email protected]>
@AkihiroSuda AkihiroSuda force-pushed the specconv-fix-trimspace branch from da44f09 to d6ddfb6 Compare March 24, 2021 05:42
@thaJeztah
Copy link
Copy Markdown
Member

Docker always attempts to launch a container with oomScoreAdj=0

I recall changing the daemon itself to not adjust if no custom oom-score is set (#41527); in this case, the daemon runs inside a container with an OOM-score "X", but the daemon itself does not have an --oom-score-adj set (so we don't adjust); shouldn't we do the same for containers running in that environment? i.e., we didn't adjust their score, so inherit the current value (which on a host would be "don't adjust", but in this container would be "x")? I would consider the oom-score to be "relative to" the current environment. Would that work?

@AkihiroSuda
Copy link
Copy Markdown
Member Author

I think that's possible by changing OomScoreAdj int to OomScoreAdj *int for allowing nil:

OomScoreAdj int // Container preference for OOM-killing

But that doesn't work with existing client implementations.

So I suggest merging this PR, and looking into possibility of the API change in a separate PR.

@thaJeztah
Copy link
Copy Markdown
Member

But that doesn't work with existing client implementations.
So I suggest merging this PR, and looking into possibility of the API change in a separate PR.

Yes, we can fix that in a follow-up; I went looking for the original commit that added --oom-score-adj; d3af7f2 (#16277). From that PR it looks like we just added 0 as default; given that that is the default value (and not a choice of the user), and effectively should mean "don't adjust", I think we may even be able to change without changing it to a pointer (just consider 0 to mean "don't adjust"); I see we currently don't have an option to update the value in docker update, so from that perspective, it should not be ambiguous as well (if 0 could mean "reset to default" - not sure if even possible)

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah
Copy link
Copy Markdown
Member

@cpuguy83 PTAL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rootless docker in kubernetes: "getting the final child's pid from pipe caused \"EOF\"": unknown"

3 participants