Fix a race in tty code leading to \r in output#2723
Merged
mrunalp merged 4 commits intoopencontainers:masterfrom Jan 13, 2021
Merged
Fix a race in tty code leading to \r in output#2723mrunalp merged 4 commits intoopencontainers:masterfrom
mrunalp merged 4 commits intoopencontainers:masterfrom
Conversation
kolyshkin
commented
Jan 7, 2021
|
|
||
| // Repeat to increase chances to catch a race; see | ||
| // https://github.com/opencontainers/runc/issues/2425. | ||
| for i := 0; i < 300; i++ { |
Contributor
Author
There was a problem hiding this comment.
Now, this commit is totally optional -- I am not sure myself I want it, since the bug is fixed and it takes about 5s on my laptop for this test now.
The TestExecInTTY test case is sometimes failing like this: > execin_test.go:332: unexpected carriage-return in output "PID USER TIME COMMAND\r\n 1 root 0:00 cat\r\n 7 root 0:00 ps\r\n" or this: > execin_test.go:332: unexpected carriage-return in output "PID USER TIME COMMAND\r\n 1 root 0:00 cat\n 7 root 0:00 ps\n" (this is easy to repro with `go test -run TestExecInTTY -count 1000`). This is caused by a race between - an Init() (in this case it is is (*linuxSetnsInit.Init(), but (*linuxStandardInit).Init() is no different in this regard), which creates a pty pair, sends pty master to runc, and execs the container process, and - a parent runc process, which receives the pty master fd and calls ClearONLCR() on it. One way of fixing it would be to add a synchronization mechanism between these two, so Init() won't exec the process until the parent sets the flag. This seems excessive, though, as we can just move the ClearONLCR() call to Init(), putting it right after console.NewPty(). Note that bug only happens in the TestExecInTTY test case, but from looking at the code it seems like it can happen in runc run or runc exec, too. Signed-off-by: Kir Kolyshkin <[email protected]>
There's no need to check err for nil. Signed-off-by: Kir Kolyshkin <[email protected]>
Simplify the tty code by using 1 goroutine instead of 2. Improve error reporting by wrapping the errors. Signed-off-by: Kir Kolyshkin <[email protected]>
This is to increase the chance to hit the recently fixed race. Signed-off-by: Kir Kolyshkin <[email protected]>
Contributor
Author
|
@cyphar PTAL |
This was referenced Jan 8, 2021
Contributor
Author
|
@AkihiroSuda @mrunalp PTAL This fixes a (now) frequent test failure which I am tired of seeing. |
AkihiroSuda
approved these changes
Jan 12, 2021
mrunalp
approved these changes
Jan 13, 2021
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The TestExecInTTY test case is sometimes failing like this:
or this:
(this is easy to repro with
go test -run TestExecInTTY -count 1000).This is caused by a race between
Init()(in this case it is is(*linuxSetnsInit.Init(), but(*linuxStandardInit).Init()is no different in this regard),which creates a pty pair, sends pty master to runc, and execs
the container process,
and
ClearONLCR()on it.One way of fixing it would be to add a synchronization mechanism
between these two, so
Init()won't exec the process until the parentsets the flag. This seems excessive, though, as we can just move
the
ClearONLCR()call toInit(), putting it right afterconsole.NewPty().Note that bug was only seen in the
TestExecInTTYtest case, butfrom looking at the code it seems like it can also happen during
runc runorrunc exec.While at it:
Fixes: #2425