Description
Discussed in #11927
Originally posted by nealharris January 12, 2024
This feels like a bug to me, but I'm not exactly sure how to reproduce it, so I'm not yet willing to open this as an issue in the project.
We're currently running Linkerd 2.14.1 in production. We try to keep up to date, and always run the latest stable release. However, for each stable release after 2.14.1, we've had some trouble when trying to upgrade.
Specifically, a few days after we upgrade our staging environment, we notice restarts in the destination controller. In the logs, I see errors like the following:
ERROR 2024-01-12T08:00:46.599129659Z [resource.labels.containerName: destination] fatal error: concurrent map iteration and map write
ERROR 2024-01-12T08:00:46.602373609Z [resource.labels.containerName: destination] {}
ERROR 2024-01-12T08:00:46.602422320Z [resource.labels.containerName: destination] goroutine 824961 [running]:
ERROR 2024-01-12T08:00:46.602427273Z [resource.labels.containerName: destination] github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).add(...)
ERROR 2024-01-12T08:00:46.602430571Z [resource.labels.containerName: destination] /linkerd-build/controller/api/destination/endpoint_translator.go:192
ERROR 2024-01-12T08:00:46.602434365Z [resource.labels.containerName: destination] github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).processUpdate(0xc0043d8ea0, {0x1b12480?, 0xc000e1e2e8?})
ERROR 2024-01-12T08:00:46.602436814Z [resource.labels.containerName: destination] /linkerd-build/controller/api/destination/endpoint_translator.go:183 +0x12d
ERROR 2024-01-12T08:00:46.602439176Z [resource.labels.containerName: destination] github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Start.func1()
ERROR 2024-01-12T08:00:46.602441799Z [resource.labels.containerName: destination] /linkerd-build/controller/api/destination/endpoint_translator.go:167 +0x37
ERROR 2024-01-12T08:00:46.602444278Z [resource.labels.containerName: destination] created by github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Start
ERROR 2024-01-12T08:00:46.602446869Z [resource.labels.containerName: destination] /linkerd-build/controller/api/destination/endpoint_translator.go:163 +0x56
ERROR 2024-01-12T08:00:46.602449332Z [resource.labels.containerName: destination] {}
ERROR 2024-01-12T08:00:46.602479161Z [resource.labels.containerName: destination] goroutine 1 [chan receive, 2169 minutes]:
ERROR 2024-01-12T08:00:46.602482727Z [resource.labels.containerName: destination] github.com/linkerd/linkerd2/controller/cmd/destination.Main({0xc0000500e0, 0xa, 0xa})
ERROR 2024-01-12T08:00:46.602486313Z [resource.labels.containerName: destination] /linkerd-build/controller/cmd/destination/main.go:162 +0xf65
ERROR 2024-01-12T08:00:46.602496863Z [resource.labels.containerName: destination] main.main()
ERROR 2024-01-12T08:00:46.602499788Z [resource.labels.containerName: destination] /linkerd-build/controller/cmd/main.go:23 +0x191
ERROR 2024-01-12T08:00:46.602502201Z [resource.labels.containerName: destination] {}
ERROR 2024-01-12T08:00:46.602505639Z [resource.labels.containerName: destination] goroutine 48 [select]:
ERROR 2024-01-12T08:00:46.602508155Z [resource.labels.containerName: destination] go.opencensus.io/stats/view.(*worker).start(0xc000146200)
ERROR 2024-01-12T08:00:46.602513801Z [resource.labels.containerName: destination] /go/pkg/mod/[email protected]/stats/view/worker.go:292 +0xad
ERROR 2024-01-12T08:00:46.602516690Z [resource.labels.containerName: destination] created by go.opencensus.io/stats/view.init.0
ERROR 2024-01-12T08:00:46.602521661Z [resource.labels.containerName: destination] /go/pkg/mod/[email protected]/stats/view/worker.go:34 +0x8d
ERROR 2024-01-12T08:00:46.602524146Z [resource.labels.containerName: destination] {}
ERROR 2024-01-12T08:00:46.602526628Z [resource.labels.containerName: destination] goroutine 185 [sync.Cond.Wait, 10 minutes]:
ERROR 2024-01-12T08:00:46.602533518Z [resource.labels.containerName: destination] sync.runtime_notifyListWait(0xc000612028, 0xa90)
ERROR 2024-01-12T08:00:46.602543572Z [resource.labels.containerName: destination] /usr/local/go/src/runtime/sema.go:517 +0x14c
ERROR 2024-01-12T08:00:46.602548176Z [resource.labels.containerName: destination] sync.(*Cond).Wait(0xc0042b0800?)
ERROR 2024-01-12T08:00:46.602551037Z [resource.labels.containerName: destination] /usr/local/go/src/sync/cond.go:70 +0x8c
ERROR 2024-01-12T08:00:46.602558126Z [resource.labels.containerName: destination] k8s.io/client-go/tools/cache.(*DeltaFIFO).Pop(0xc000612000, 0xc000616010)
ERROR 2024-01-12T08:00:46.602560684Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/tools/cache/delta_fifo.go:575 +0x256
ERROR 2024-01-12T08:00:46.602569140Z [resource.labels.containerName: destination] k8s.io/client-go/tools/cache.(*controller).processLoop(0xc000618000)
ERROR 2024-01-12T08:00:46.602571907Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:192 +0x36
ERROR 2024-01-12T08:00:46.602592218Z [resource.labels.containerName: destination] k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x40d91f?)
ERROR 2024-01-12T08:00:46.602612454Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:226 +0x3e
ERROR 2024-01-12T08:00:46.602621037Z [resource.labels.containerName: destination] k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xb331c5?, {0x22a2200, 0xc000614030}, 0x1, 0x0)
ERROR 2024-01-12T08:00:46.602624141Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:227 +0xb6
ERROR 2024-01-12T08:00:46.602627223Z [resource.labels.containerName: destination] k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000618078?, 0x3b9aca00, 0x0, 0x0?, 0x8bb2c97000?)
ERROR 2024-01-12T08:00:46.602629739Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:204 +0x89
ERROR 2024-01-12T08:00:46.602632922Z [resource.labels.containerName: destination] k8s.io/apimachinery/pkg/util/wait.Until(...)
ERROR 2024-01-12T08:00:46.602635384Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/backoff.go:161
ERROR 2024-01-12T08:00:46.602649519Z [resource.labels.containerName: destination] k8s.io/client-go/tools/cache.(*controller).Run(0xc000618000, 0x0)
ERROR 2024-01-12T08:00:46.602666866Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:163 +0x385
ERROR 2024-01-12T08:00:46.602693511Z [resource.labels.containerName: destination] k8s.io/client-go/tools/cache.(*sharedIndexInformer).Run(0xc00051a210, 0xc000129fd0?)
ERROR 2024-01-12T08:00:46.602705904Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:503 +0x542
ERROR 2024-01-12T08:00:46.602710075Z [resource.labels.containerName: destination] k8s.io/client-go/informers.(*sharedInformerFactory).Start.func1()
ERROR 2024-01-12T08:00:46.602713741Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/informers/factory.go:150 +0x6b
ERROR 2024-01-12T08:00:46.602717251Z [resource.labels.containerName: destination] created by k8s.io/client-go/informers.(*sharedInformerFactory).Start
ERROR 2024-01-12T08:00:46.602721222Z [resource.labels.containerName: destination] /go/pkg/mod/k8s.io/[email protected]/informers/factory.go:148 +0x22a
The logs above are from our most recent upgrade attempt, which was to 2.14.8.
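For context on the first line of the trace: the Go runtime aborts with `fatal error: concurrent map iteration and map write` when it detects one goroutine ranging over a map while another goroutine writes to it. The sketch below shows the general pattern that avoids this class of crash, a map guarded by a `sync.RWMutex`. It is only an illustration under assumptions: `guardedSet`, `add`, and `count` are hypothetical names, and this is not Linkerd's `endpointTranslator` code or its actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedSet is a hypothetical stand-in for any map shared between
// goroutines. It is NOT Linkerd's endpointTranslator.
type guardedSet struct {
	mu    sync.RWMutex
	items map[string]struct{}
}

func newGuardedSet() *guardedSet {
	return &guardedSet{items: make(map[string]struct{})}
}

// add writes to the map while holding the exclusive lock.
func (s *guardedSet) add(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = struct{}{}
}

// count iterates over the map while holding the shared lock,
// so the iteration can never overlap with a write in add.
func (s *guardedSet) count() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	n := 0
	for range s.items {
		n++
	}
	return n
}

func main() {
	s := newGuardedSet()
	var wg sync.WaitGroup
	wg.Add(2)

	// Writer goroutine: inserts 1000 distinct keys.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			s.add(fmt.Sprintf("endpoint-%d", i))
		}
	}()

	// Reader goroutine: iterates over the map while the writer is running.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = s.count()
		}
	}()

	wg.Wait()
	fmt.Println(s.count()) // prints 1000
}
```

Removing the lock calls from `add` and `count` makes the same program unsafe, and it may then crash with the error shown in the logs. Running a build with `go run -race` is a common way to surface this kind of unsynchronized access before it shows up as a production crash.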