Conversation
If systemd takes more than 1 second to create a unit, startUnit() times out with a warning and runc proceeds (creating cgroups via the fs manager, and so on). At that point runc and systemd are racing, and multiple scenarios are possible. In one such scenario, by the time runc calls the systemd manager's Apply(), the unit is not yet created; the dbusConnection.SetUnitProperties() call fails with "unit xxx.scope not found", and the whole container start fails as well.

To eliminate the race, return an error when the timeout is hit. To reduce the chance of failure, increase the timeout from 1 to 30 seconds, so as not to error out too early on a busy/slow system (times of 3-5 seconds are not unrealistic). While at it, as the timeout is quite long now, make sure not to leave a stray timer behind.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 3844789)
Signed-off-by: Kir Kolyshkin <[email protected]>
It is not good practice to use the name `err` for a function's named error return value, as it is too easily shadowed by local variables. Switch to `retErr`.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 8699596)
Signed-off-by: Kir Kolyshkin <[email protected]>
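A minimal sketch of why the rename helps (the function names here are illustrative, not runc's): inside a deferred cleanup, `retErr` unambiguously refers to the function's own return value, while an inner `err` is clearly a local.

```go
package main

import (
	"errors"
	"fmt"
)

func cleanup() error { return errors.New("cleanup failed") }

// start uses retErr as its named return. Inside the deferred function,
// retErr (the function's return value) cannot be confused with or
// shadowed by the local err from cleanup().
func start() (retErr error) {
	defer func() {
		if retErr != nil {
			if err := cleanup(); err != nil {
				retErr = fmt.Errorf("%w (cleanup: %v)", retErr, err)
			}
		}
	}()
	return errors.New("start failed")
}

func main() {
	fmt.Println(start())
}
```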
When the cgroup configuration is invalid (some parameters can't be set, etc.), p.manager.Set fails, the error is returned, and then we try to remove the cgroups (by calling p.manager.Destroy) in a defer. The problem is, the container init is not yet killed (as it is killed by the caller, i.e. (*linuxContainer).start), so cgroup removal fails like this:

> time="2020-09-26T07:46:25Z" level=warning msg="Failed to remove cgroup (will retry)" error="rmdir /sys/fs/cgroup/net_cls,net_prio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod28ce6e74_694c_4b77_a953_dc01e182ac76.slice/crio-f6984c5eeb6c6b49ff3f036bdcb9ded317b3d0b2469ebbb35705442a2afd98c2.scope: device or resource busy"
> ...
> time="2020-09-26T07:46:27Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/net_cls,net_prio/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod28ce6e74_694c_4b77_a953_dc01e182ac76.slice/crio-f6984c5eeb6c6b49ff3f036bdcb9ded317b3d0b2469ebbb35705442a2afd98c2.scope: device or resource busy"

The above is repeated for every controller, and looks quite scary.

To fix, move the init termination into the defer mentioned above. Do the same in (*setnsProcess).start() for uniformity.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit dc42459)
Signed-off-by: Kir Kolyshkin <[email protected]>
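The ordering fix can be sketched like this. All types and methods below are illustrative stand-ins (not runc's real manager/process types): the point is that the deferred cleanup kills init *before* calling Destroy, so the cgroup's rmdir no longer fails with "device or resource busy" because a live process still occupies it.

```go
package main

import "fmt"

// process stands in for the container init; alive mimics "still in the cgroup".
type process struct{ alive bool }

func (p *process) terminate() { p.alive = false }

// manager stands in for a cgroup manager whose Destroy does rmdir.
type manager struct{ init *process }

func (m *manager) Destroy() error {
	if m.init.alive {
		// Mimics rmdir on a non-empty cgroup.
		return fmt.Errorf("remove cgroup: device or resource busy")
	}
	return nil
}

func start(p *process, m *manager) (retErr error) {
	defer func() {
		if retErr != nil {
			p.terminate() // kill init *before* removing the cgroup
			if err := m.Destroy(); err != nil {
				fmt.Println("failed to remove cgroup:", err)
			}
		}
	}()
	// Mimics p.manager.Set failing on invalid cgroup configuration.
	return fmt.Errorf("setting cgroup config for procHooks process failed")
}

func main() {
	p := &process{alive: true}
	fmt.Println(start(p, &manager{init: p})) // Destroy succeeds quietly now
}
```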
When we call terminate(), we kill the process, and wait returns an error indicating the process was killed. This is exactly what we expect here, so there is no reason to treat it as an error.

Before this patch, when a container with invalid cgroup parameters is started:

> WARN[0000] unable to terminate initProcess error="signal: killed"
> ERRO[0000] container_linux.go:366: starting container process caused: process_linux.go:495: container init caused: process_linux.go:458: setting cgroup config for procHooks process caused: failed to write "555": open /sys/fs/cgroup/blkio/user.slice/xx33/blkio.weight: permission denied

After:

> ERRO[0000] container_linux.go:366: starting container process caused: process_linux.go:495: container init caused: process_linux.go:458: setting cgroup config for procHooks process caused: failed to write "555": open /sys/fs/cgroup/blkio/user.slice/xx33/blkio.weight: permission denied

I.e. the useless warning is gone.

NOTE this breaks a couple of integration test cases, since they expected a particular message on the second line; with "signal: killed" removed, it is now on the first line. Fix those, too.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit a4d5e8a)
Signed-off-by: Kir Kolyshkin <[email protected]>
kolyshkin pushed a commit that referenced this pull request on Feb 4, 2021:
Aleksa Sarai (2):
  VERSION: back to development
  VERSION: release 1.0.0~rc93

Vote: +6 -0 #1
Closes opencontainers#2784
kolyshkin added a commit that referenced this pull request on May 15, 2023:
When a hook has failed, the error message looks like this:

> error running hook: error running hook #1: exit status 1, stdout: ...

The two problems here are:
1. it is impossible to know what kind of hook it was;
2. "error running hook" stuttering.

Change that to:

> error running createContainer hook #1: exit status 1, stdout: ...

Signed-off-by: Kir Kolyshkin <[email protected]>
kolyshkin added a commit that referenced this pull request on Jun 23, 2024:
MAINTAINERS: add Rodrigo Campos

LGTMs: crosbymichael mrunalp dqminh hqhq cyphar AkihiroSuda kolyshkin thaJeztah lifubang
Vote: +8 -0 #1