Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containerd 1.7.24 crash due to "fatal error: concurrent map iteration and map write" in CRI plugin #11302

Closed
austinvazquez opened this issue Jan 22, 2025 · 7 comments
Labels
area/cri Container Runtime Interface (CRI) kind/bug priority/P1

Comments

@austinvazquez
Copy link
Member

Description

Similar to #7797, I have a report of containerd crash for a k8s workload on a recent build of containerd v1.7.24. From the stack trace, the issue stems from this line where HTTP stream handler connection (h.conn) (type interface) is provided as value to klog.ErrorS which results in a fmt.Sprintf("%+v", value) on this line.

All other references of h.conn are using %p to avoid formatting of the connection.

Steps to reproduce the issue

Describe the results you received and expected

Stack trace

fatal error: concurrent map iteration and map write
goroutine 585192 [running]:
reflect.mapiternext(0x55a990967469?)
        /usr/libexec/go-1.22/src/runtime/map.go:1397 +0x13
reflect.(*MapIter).Next(0xc0008eb428?)
        /usr/libexec/go-1.22/src/reflect/value.go:2005 +0x74
internal/fmtsort.Sort({0x55a99201f620?, 0xc0011c8878?, 0xc0011c8870?})
        /usr/libexec/go-1.22/src/internal/fmtsort/sort.go:62 +0x1d3
fmt.(*pp).printValue(0xc000bfe410, {0x55a99201f620?, 0xc0011c8878?, 0x2c?}, 0x76, 0x2)
        /usr/libexec/go-1.22/src/fmt/print.go:816 +0x988
fmt.(*pp).printValue(0xc000bfe410, {0x55a9921b1a80?, 0xc0011c8870?, 0x55a9909bec90?}, 0x76, 0x1)
        /usr/libexec/go-1.22/src/fmt/print.go:853 +0x11be
fmt.(*pp).printValue(0xc000bfe410, {0x55a99218a440?, 0xc0011c8870?, 0x0?}, 0x76, 0x0)
        /usr/libexec/go-1.22/src/fmt/print.go:921 +0xaf4
fmt.(*pp).printArg(0xc000bfe410, {0x55a99218a440, 0xc0011c8870}, 0x76)
        /usr/libexec/go-1.22/src/fmt/print.go:759 +0x4bc
fmt.(*pp).doPrintf(0xc000bfe410, {0x55a991b85625, 0x3}, {0xc0008ebc50, 0x1, 0x1})
        /usr/libexec/go-1.22/src/fmt/print.go:1075 +0x37e
fmt.Sprintf({0x55a991b85625, 0x3}, {0xc0008ebc50, 0x1, 0x1})
        /usr/libexec/go-1.22/src/fmt/print.go:239 +0x53
k8s.io/klog/v2/internal/serialize.Formatter.AnyToString({0xc0023c2b60?}, {0x55a99218a440?, 0xc0011c8870?})
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/internal/serialize/keyvalues.go:250 +0x5f
k8s.io/klog/v2/internal/serialize.Formatter.KVFormat({0x1?}, 0xc0023c2b60, {0x55a991f45a20, 0x55a9922b46a0}, {0x55a99218a440, 0xc0011c8870})
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/internal/serialize/keyvalues.go:237 +0x1e6
k8s.io/klog/v2/internal/serialize.Formatter.KVListFormat({0xc0023c2b60?}, 0xc0023c2b60, {0xc0022fd560, 0x6, 0xf?})
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/internal/serialize/keyvalues.go:166 +0x3b
k8s.io/klog/v2/internal/serialize.KVListFormat(...)
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/internal/serialize/keyvalues.go:171
k8s.io/klog/v2.(*loggingT).printS(0x55a993103100, {0x55a9922b8f48, 0xc000e89c80}, 0x2, 0x1, {0x55a991b9d3a2, 0xf}, {0xc0022fd560, 0x6, 0x6})
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/klog.go:798 +0x17d
k8s.io/klog/v2.(*loggingT).errorS(0x55a991e310d8?, {0x55a9922b8f48, 0xc000e89c80}, 0x7fa3f869c400?, {0x0?, 0x0?}, 0x55a9918604e0?, {0x55a991b9d3a2?, 0xc001582ea0?}, {0xc0022fd560, ...})
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/klog.go:771 +0x165
k8s.io/klog/v2.ErrorS(...)
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/vendor/k8s.io/klog/v2/klog.go:1584
github.com/containerd/containerd/pkg/cri/streaming/portforward.(*httpStreamHandler).portForward(0xc0012e2cb0, 0xc001c783c0)
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/pkg/cri/streaming/portforward/httpstream.go:282 +0x6cd
created by github.com/containerd/containerd/pkg/cri/streaming/portforward.(*httpStreamHandler).run in goroutine 571738
        /home/builder/rpmbuild/BUILD/containerd-1.7.24/GOPATH/src/github.com/containerd/containerd/pkg/cri/streaming/portforward/httpstream.go:237 +0x44c

What version of containerd are you using?

containerd v1.7.24

Any other relevant information

Although report is from containerd v1.7.24, containerd v1.6 is almost certainly affected. Currently unsure if containerd v2.0 is affected as the CRI plugin has been refactored to use k8s.io/kubelet/pkg/cri/streaming/portforward instead of instead.

Show configuration if it is related to CRI plugin.

No response

@dosubot dosubot bot added the area/cri Container Runtime Interface (CRI) label Jan 22, 2025
@austinvazquez
Copy link
Member Author

Perhaps a short term fix would be to have containerd's portforward package fmt.Sprintf("%p", h.conn) use before calling klog.ErrorS.

@ningmingxiao
Copy link
Contributor

#11306 will fix

@austinvazquez
Copy link
Member Author

After doing some more research, this issue probably does not affect mainline or v2.x+. klog module has moved to using JSON encoder for the default formatter of non-default types (see kubernetes/klog@d731661).

From a proof of concept https://go.dev/play/p/Xmg1mYMSWHF, maps without JSON annotation would not be iterated.

Going down the rabbit hole, I do not really see implementations of httpstream.Connection annotating their fields to JSON.
e.g. https://github.com/moby/spdystream/blob/77eb080123d208697674a07b74ceaf94c98bee8b/connection.go#L192

@austinvazquez
Copy link
Member Author

As testing of this issue is difficult, I wrote a small program to reproduce the error and show the proposed fix will resolve the issue.

See https://go.dev/play/p/_GaPVoSOr_7.

Also slight correction of my earlier comment for JSON encoder. After further research, the JSON encoder can still exhibit the crash if the field is exported and not labeled with json:"-" to ignore the field during encoding. I attempted to follow through the implementations of Connection interface to find if any structures exported a map; however, I was unable to find anything concrete. Perhaps it is hidden through some embedding; regardless no reason to block the proposed fix.

@djdongjin
Copy link
Member

@austinvazquez I believe this is fixed by #11306?

@austinvazquez
Copy link
Member Author

Yes thanks for reminding me. Closing.

@github-project-automation github-project-automation bot moved this from Todo to Done in Issue Management Mar 1, 2025
@ningmingxiao
Copy link
Contributor

#11319 @djdongjin this pr should merge into 1.6.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) kind/bug priority/P1
Projects
Archived in project
Development

No branches or pull requests

4 participants