fail on file not found for shim reconnect on containerd restart#3659

Merged
crosbymichael merged 1 commit into containerd:master from katiewasnothere:shimreconnectupstream
Sep 18, 2019

@katiewasnothere

Previously, on Windows, when containerd restarted and attempted to reconnect to shim pipes, it would wait up to 5 seconds before returning a failure whenever a shim's pipe file was not found. If containerd is still attempting to connect to shims after 30 seconds, the Windows service manager kills containerd. This change splits connecting to a shim into two cases: shim starting and shim reconnecting. In the former, containerd still waits up to 5 seconds for a starting shim to serve its pipe. In the latter, containerd returns file not found immediately when attempting to connect to a nonexistent or dead shim.

Signed-off-by: Kathryn Baldauf [email protected]
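
The two cases described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: it uses a Unix domain socket from the Go standard library as a portable stand-in for a Windows named pipe, and the names `dialOnce`, `dialWithRetry`, and `compareDials` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// dialOnce makes a single attempt and reports any failure immediately.
// Hypothetical stand-in for the reconnect case.
func dialOnce(address string) (net.Conn, error) {
	return net.DialTimeout("unix", address, time.Second)
}

// dialWithRetry retries while the endpoint does not exist yet, giving a
// starting server up to serveTimeout to appear. Hypothetical stand-in for
// the shim-start case.
func dialWithRetry(address string, serveTimeout time.Duration) (net.Conn, error) {
	deadline := time.Now().Add(serveTimeout)
	for {
		conn, err := dialOnce(address)
		if err == nil {
			return conn, nil
		}
		// Only "not found" is worth retrying; give up once the deadline passes.
		if !errors.Is(err, os.ErrNotExist) || time.Now().After(deadline) {
			return nil, err
		}
		time.Sleep(10 * time.Millisecond)
	}
}

// compareDials dials an address that nobody serves with both strategies and
// returns how long each one took before giving up.
func compareDials(serveTimeout time.Duration) (reconnect, start time.Duration, err error) {
	missing := filepath.Join(os.TempDir(), fmt.Sprintf("no-shim-%d.sock", os.Getpid()))

	begin := time.Now()
	if _, dialErr := dialOnce(missing); !errors.Is(dialErr, os.ErrNotExist) {
		return 0, 0, fmt.Errorf("reconnect: want not-exist error, got %v", dialErr)
	}
	reconnect = time.Since(begin)

	begin = time.Now()
	if _, dialErr := dialWithRetry(missing, serveTimeout); !errors.Is(dialErr, os.ErrNotExist) {
		return 0, 0, fmt.Errorf("start: want not-exist error, got %v", dialErr)
	}
	start = time.Since(begin)
	return reconnect, start, nil
}

func main() {
	reconnect, start, err := compareDials(200 * time.Millisecond)
	if err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Printf("reconnect gave up after %v, start gave up after %v\n", reconnect, start)
}
```

With many dead shims, the retrying path would cost its full timeout per shim, which is how a restart could run into the 30-second service manager limit; the single-attempt path avoids that accumulation.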

@theopenlab-ci

theopenlab-ci Bot commented Sep 17, 2019

Build succeeded.

@codecov-io

Codecov Report

Merging #3659 into master will decrease coverage by 0.04%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #3659      +/-   ##
==========================================
- Coverage   42.39%   42.34%   -0.05%     
==========================================
  Files         127      127              
  Lines       14048    14063      +15     
==========================================
  Hits         5955     5955              
- Misses       7193     7208      +15     
  Partials      900      900

| Flag | Coverage Δ |
|---|---|
| #linux | 45.9% <0%> (-0.01%) ⬇️ |
| #windows | 37.23% <0%> (-0.05%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| runtime/v2/shim/util_windows.go | 0% <0%> (ø) ⬆️ |
| runtime/v2/shim/util_unix.go | 0% <0%> (ø) ⬆️ |
| runtime/v2/shim_unix.go | 77.77% <0%> (ø) ⬆️ |
| runtime/v2/shim_windows.go | 17.77% <0%> (ø) ⬆️ |
| runtime/v2/shim.go | 1.18% <0%> (ø) ⬆️ |
| runtime/v2/binary.go | 0% <0%> (ø) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 87bff67...b4211d9.

@crosbymichael
Member

LGTM

Contributor

@jterry75 jterry75 left a comment

LGTM

@crosbymichael crosbymichael merged commit 324a947 into containerd:master Sep 18, 2019
@katiewasnothere katiewasnothere deleted the shimreconnectupstream branch December 19, 2019 23:42
eginez added a commit to eginez/containerd that referenced this pull request Apr 9, 2026
The shim "start" helper returns the named pipe address before the
daemon process has created the pipe via winio.ListenPipe(). On busy
Windows systems, containerd may try to connect before the pipe exists.

Add awaitPipeReady() — the start helper now polls the pipe address
(up to 5s, 10ms intervals) before writing the bootstrap result to
stdout. This follows hcsshim's readiness pattern where the shim
verifies its endpoint is ready before signaling the parent.
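
The polling step described above might look roughly like this. It is a minimal sketch under assumptions, not the commit's `awaitPipeReady()`: the helper `awaitReady`, its `probe` callback, and the file-based `demoAwait` are hypothetical, and a file appearing on disk stands in for the named pipe being created.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// awaitReady is a hypothetical helper: it calls probe every interval until
// probe succeeds or timeout elapses, wrapping the last probe error on timeout.
func awaitReady(probe func() error, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := probe()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("endpoint not ready after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// demoAwait simulates a daemon that publishes its endpoint only after delay,
// while the caller waits for it with the given timeout. A plain file is an
// assumed stand-in for the real endpoint.
func demoAwait(delay, timeout time.Duration) error {
	dir, err := os.MkdirTemp("", "await-ready")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	endpoint := filepath.Join(dir, "endpoint")

	go func() {
		time.Sleep(delay)
		// Best effort: the directory may already be gone if the waiter timed out.
		_ = os.WriteFile(endpoint, nil, 0o600)
	}()

	probe := func() error {
		_, statErr := os.Stat(endpoint)
		return statErr
	}
	return awaitReady(probe, timeout, 10*time.Millisecond)
}

func main() {
	// Endpoint appears after 50ms, well inside the 5s budget.
	fmt.Println("late but in time:", demoAwait(50*time.Millisecond, 5*time.Second))
	// Endpoint would appear only after the 100ms budget has run out.
	fmt.Println("too late:", demoAwait(500*time.Millisecond, 100*time.Millisecond))
}
```

Only after such a wait returns nil would the helper print the address to its parent, so the parent's first connection attempt should find the endpoint already in place.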

As a safety net, also parameterize makeConnection() with a dialer so
binary.Start() uses AnonDialer (retry) for new shims while loadShim()
keeps AnonReconnectDialer (fail-fast) for reconnects per containerd#3659.
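
Passing the dialer as a parameter could be sketched as below. The signatures are assumptions made for illustration and are not taken from containerd: `dialer`, `fakeEndpoint`, and the simplified `makeConnection`, `startShim`, and `loadShim` are hypothetical, and an in-memory fake replaces the real pipe so the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// dialer is an assumed function shape for a connection strategy.
type dialer func(address string, timeout time.Duration) (net.Conn, error)

// makeConnection (simplified, hypothetical) delegates to whichever dialer
// the caller supplies instead of hard-coding one strategy.
func makeConnection(address string, timeout time.Duration, dial dialer) (net.Conn, error) {
	conn, err := dial(address, timeout)
	if err != nil {
		return nil, fmt.Errorf("connect to %s: %w", address, err)
	}
	return conn, nil
}

// fakeEndpoint becomes reachable only after failuresLeft failed attempts.
type fakeEndpoint struct {
	failuresLeft int
	attempts     int
}

func (e *fakeEndpoint) dialOnce() (net.Conn, error) {
	e.attempts++
	if e.failuresLeft > 0 {
		e.failuresLeft--
		return nil, os.ErrNotExist
	}
	client, server := net.Pipe()
	server.Close() // nothing is served in this sketch
	return client, nil
}

// failFast tries exactly once (reconnect-style strategy).
func (e *fakeEndpoint) failFast(_ string, _ time.Duration) (net.Conn, error) {
	return e.dialOnce()
}

// retrying keeps trying while the endpoint is missing (start-style strategy).
func (e *fakeEndpoint) retrying(_ string, timeout time.Duration) (net.Conn, error) {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := e.dialOnce()
		if err == nil || !errors.Is(err, os.ErrNotExist) || time.Now().After(deadline) {
			return conn, err
		}
		time.Sleep(time.Millisecond)
	}
}

// startShim and loadShim differ only in the dialer they hand over.
func startShim(e *fakeEndpoint) (net.Conn, error) {
	return makeConnection("fake-shim", time.Second, e.retrying)
}

func loadShim(e *fakeEndpoint) (net.Conn, error) {
	return makeConnection("fake-shim", time.Second, e.failFast)
}

func main() {
	reconnecting := &fakeEndpoint{failuresLeft: 3}
	_, err := loadShim(reconnecting)
	fmt.Println("loadShim attempts:", reconnecting.attempts, "error:", err)

	starting := &fakeEndpoint{failuresLeft: 3}
	conn, err := startShim(starting)
	if err == nil {
		conn.Close()
	}
	fmt.Println("startShim attempts:", starting.attempts, "error:", err)
}
```

Keeping the strategy at the call site means the connection code itself never has to guess whether a missing endpoint is a slow start or a dead shim.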

On Unix, awaitPipeReady() is a no-op: domain sockets appear atomically.

Signed-off-by: Esteban Ginez <[email protected]>
eginez added a commit to eginez/containerd that referenced this pull request Apr 10, 2026
kairosci pushed a commit to kairosci/containerd that referenced this pull request May 1, 2026
kairosci pushed a commit to kairosci/containerd that referenced this pull request May 2, 2026