Skip to content

[27.x backport] Clear RWLayer reference under container lock#49239

Merged
thaJeztah merged 1 commit intomoby:27.xfrom
vvoland:49228-27.x
Jan 8, 2025
Merged

[27.x backport] Clear RWLayer reference under container lock#49239
thaJeztah merged 1 commit intomoby:27.xfrom
vvoland:49228-27.x

Conversation

@vvoland
Copy link
Copy Markdown
Contributor

@vvoland vvoland commented Jan 8, 2025

This is a fix for #49227

- What I did

- How I did it

- How to verify it

I tested it manually with steps mentioned in Reproduce in #49227:

  • added artifical delay after nil check in getInspectData, compiled with make dynbinary and run

    --- a/daemon/inspect.go
    +++ b/daemon/inspect.go
    @@ -179,6 +179,9 @@ func (daemon *Daemon) getInspectData(daemonCfg *config.Config, container *contai
                    return nil, errdefs.System(errors.New("RWLayer of container " + container.ID + " is unexpectedly nil"))
            }
    
    +       for i := 0; i < 1000000000; i++ {
    +       }
    +
            graphDriverData, err := container.RWLayer.Metadata()
            if err != nil {
                    if container.Dead {
  • run inspecting a container in background: while true; do docker inspect test; done

  • tried starting and stopping a container: docker run --rm --name test ubuntu:24.04 'echo test'

Without the fix goroutine handling inspect call panics while holding container lock. With the fix there is no panic. I haven't tested other places where nil check was added in #36242 (there could be a similar issue there).

Goroutine stacktrace observed without the fix
2025/01/07 15:25:42 http: panic serving @: runtime error: invalid memory address or nil pointer dereference
goroutine 1393 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1947 +0xbe
panic({0x560e6e161840?, 0x560e6fbad0d0?})
	/usr/local/go/src/runtime/panic.go:785 +0x132
github.com/docker/docker/layer.(*referencedRWLayer).Metadata(0xc0011a2100?)
	<autogenerated>:1 +0x17
github.com/docker/docker/daemon.(*Daemon).getInspectData(0xc0005c2008, 0xc00053f708, 0xc000ed0588)
	/go/src/github.com/docker/docker/daemon/inspect.go:185 +0xf5b
github.com/docker/docker/daemon.(*Daemon).ContainerInspect(0xc0005c2008, {0x560e6e6275a0, 0xc0011a0270}, {0xc000642166?, 0x10?}, {0x50?})
	/go/src/github.com/docker/docker/daemon/inspect.go:33 +0xd7
github.com/docker/docker/api/server/router/container.(*containerRouter).getContainersByName(0xc000e91440, {0x560e6e6275a0, 0xc0011a0270}, {0x560e6e620500, 0xc001238060}, 0xc0012521f0?, 0xc0011a0090)
	/go/src/github.com/docker/docker/api/server/router/container/inspect.go:20 +0xc4
github.com/docker/docker/api/server/middleware.(*ExperimentalMiddleware).WrapHandler.ExperimentalMiddleware.WrapHandler.func1({0x560e6e6275a0, 0xc0011a0270}, {0x560e6e620500, 0xc001238060}, 0xc001136a00, 0xc0011a0090)
	/go/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0xa3
github.com/docker/docker/api/server/middleware.(*VersionMiddleware).WrapHandler.VersionMiddleware.WrapHandler.func1({0x560e6e6275a0, 0xc0011a0240}, {0x560e6e620500, 0xc001238060}, 0xc001136a00, 0xc0011a0090)
	/go/src/github.com/docker/docker/api/server/middleware/version.go:84 +0x2b3
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1({0x560e6e6275a0, 0xc0011a0240}, {0x560e6e620500, 0xc001238060}, 0xc001136a00, 0xc0011a0090)
	/go/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0x34a
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1({0x560e6e620500, 0xc001238060}, 0xc0011368c0)
	/go/src/github.com/docker/docker/api/server/server.go:55 +0x1ae
net/http.HandlerFunc.ServeHTTP(0x560e6e6275a0?, {0x560e6e620500?, 0xc001238060?}, 0x560e6bede5b8?)
	/usr/local/go/src/net/http/server.go:2220 +0x29
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP(0xc000459180, {0x560e6e618f40, 0xc0011340e0}, 0xc001136780, {0x560e6e5fa1c0, 0xc000e0dde8})
	/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:218 +0x1083
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1({0x560e6e618f40?, 0xc0011340e0?}, 0x560e6bed6601?)
	/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:74 +0x35
net/http.HandlerFunc.ServeHTTP(0x560e6e6275a0?, {0x560e6e618f40?, 0xc0011340e0?}, 0x560e6bed6680?)
	/usr/local/go/src/net/http/server.go:2220 +0x29
net/http.HandlerFunc.ServeHTTP(0xc0011363c0?, {0x560e6e618f40?, 0xc0011340e0?}, 0x560e6c01ee89?)
	/usr/local/go/src/net/http/server.go:2220 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0002909c0, {0x560e6e618f40, 0xc0011340e0}, 0xc001136140)
	/go/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:212 +0x1e2
net/http.serverHandler.ServeHTTP({0xc001262330?}, {0x560e6e618f40?, 0xc0011340e0?}, 0x6?)
	/usr/local/go/src/net/http/server.go:3210 +0x8e
net/http.(*conn).serve(0xc000a86000, {0x560e6e6275a0, 0xc000af0330})
	/usr/local/go/src/net/http/server.go:2092 +0x5d0
created by net/http.(*Server).Serve in goroutine 420
	/usr/local/go/src/net/http/server.go:3360 +0x485

- Description for the changelog

Fix a potential race condition error when deleting a container.

Previously the RWLayer reference was cleared without holding the
container lock. This could lead to goroutine panics in various places
that use the container.RWLayer because nil checks introduced in moby#36242
where not sufficient as the reference could change right before the use.

Fixes moby#49227

Signed-off-by: Tadeusz Dudkiewicz <[email protected]>
(cherry picked from commit 97dc305)
Signed-off-by: Paweł Gronowski <[email protected]>
@vvoland vvoland added this to the 27.5.0 milestone Jan 8, 2025
@vvoland vvoland self-assigned this Jan 8, 2025
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants