Describe the bug
I observed several possibly related issues while using ingress-sds:
- On a larger cluster, ingress-sds seems to leak memory, on the order of 2-3 GB over 3 or 4 days; this seems to correlate with bursts of messages like:
2019-04-09T23:39:34.206560Z info SDS: push key/cert pair from node agent to proxy: "router~10.24.24.97~istio-ingressgateway-598b5d6fdc-7qrgj.istio-system~istio-system.svc.cluster.local-39832"
There are tens of these per second and thousands per day:
k logs -n istio-system istio-ingressgateway-598b5d6fdc-7qrgj ingress-sds --tail 10000 | grep "push key/cert pair from node agent to proxy" | grep "2019-04-09T23:39:34" | wc -l
38
k logs -n istio-system istio-ingressgateway-598b5d6fdc-7qrgj ingress-sds | grep "push key/cert pair from node agent to proxy" | grep "2019-04-09" | wc -l
21975
- I was able to reproduce similar behavior locally by creating a secret with an empty cert. In that case the push loop never ends and the process quickly consumes memory until the secret is removed.
(The leak described above can be reproduced to some extent by leaving this running for a while; however, once the cert is fixed, most of the memory is released after some time.)
k logs -n istio-system istio-ingressgateway-6796fc8477-d48qv ingress-sds | grep "2019-04-09T23:52:19" | grep "push key/cert pair from node agent to proxy" | wc -l
554
- In at least one case, the sds agent died when the secret in question was removed:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x11d88db]
goroutine 631 [running]:
istio.io/istio/security/pkg/nodeagent/sds.pushSDS(0xc4297c4200, 0x15db980, 0x0)
/workspace/go/src/istio.io/istio/security/pkg/nodeagent/sds/sdsservice.go:344 +0x1db
istio.io/istio/security/pkg/nodeagent/sds.(*sdsservice).StreamSecrets(0xc4203f7040, 0x168ab00, 0xc421028100, 0x0, 0x0)
/workspace/go/src/istio.io/istio/security/pkg/nodeagent/sds/sdsservice.go:213 +0x7aa
istio.io/istio/vendor/github.com/envoyproxy/go-control-plane/envoy/service/discovery/v2._SecretDiscoveryService_StreamSecrets_Handler(0x13b9180, 0xc4203f7040, 0x1686480, 0xc420224580, 0x1f55830, 0xc4233ee300)
/workspace/go/src/istio.io/istio/vendor/github.com/envoyproxy/go-control-plane/envoy/service/discovery/v2/sds.pb.go:175 +0xb2
istio.io/istio/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc420311680, 0x168c7e0, 0xc42065a900, 0xc4370be000, 0xc420406720, 0x1f1f7e0, 0x0, 0x0, 0x0)
/workspace/go/src/istio.io/istio/vendor/google.golang.org/grpc/server.go:1124 +0x911
istio.io/istio/vendor/google.golang.org/grpc.(*Server).handleStream(0xc420311680, 0x168c7e0, 0xc42065a900, 0xc4370be000, 0x0)
/workspace/go/src/istio.io/istio/vendor/google.golang.org/grpc/server.go:1212 +0x12b1
istio.io/istio/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc4200445c0, 0xc420311680, 0x168c7e0, 0xc42065a900, 0xc4370be000)
/workspace/go/src/istio.io/istio/vendor/google.golang.org/grpc/server.go:686 +0x9f
created by istio.io/istio/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
/workspace/go/src/istio.io/istio/vendor/google.golang.org/grpc/server.go:684 +0xa1
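For illustration only, here is a minimal Go sketch of the kind of guard that might avoid the panic above. It assumes, without confirmation from the trace, that the connection's cached secret becomes nil once the Kubernetes secret is deleted. The types `connection` and `secretItem` and the function `pushSecret` are hypothetical stand-ins, not the actual code in sdsservice.go.

```go
package main

import (
	"errors"
	"fmt"
)

// secretItem and connection are hypothetical stand-ins for the node agent's
// internal types; the field names are assumptions, not the real Istio structs.
type secretItem struct {
	ResourceName     string
	CertificateChain []byte
	PrivateKey       []byte
}

type connection struct {
	proxyID string
	secret  *secretItem // assumed to become nil once the secret is deleted
}

var errNoSecret = errors.New("no secret cached for connection")

// pushSecret mimics the shape of a push. It returns an error instead of
// dereferencing when the connection or its cached secret is nil.
func pushSecret(con *connection) error {
	if con == nil || con.secret == nil {
		return errNoSecret
	}
	fmt.Printf("pushing %q to %s (%d cert bytes)\n",
		con.secret.ResourceName, con.proxyID, len(con.secret.CertificateChain))
	return nil
}

func main() {
	// Simulate a connection whose secret was removed.
	con := &connection{proxyID: "router~10.24.24.97", secret: nil}
	if err := pushSecret(con); err != nil {
		fmt.Println("skipped push:", err)
	}
}
```

With a guard like this, the stream handler could log the error and keep the stream open rather than crash the whole agent.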
Expected behavior
The sds should neither leak memory nor crash when a secret is missing its certificate.
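As a rough sketch of what I mean, the agent could validate the certificate before attempting any push and reject the secret early. The helper name `validateCertPEM` is hypothetical, and where such a check would live in the node agent is an assumption on my part. It uses only the Go standard library (`encoding/pem`, `crypto/x509`).

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// validateCertPEM is a hypothetical helper: it rejects a certificate that is
// empty, is not PEM-encoded, or does not parse as an X.509 certificate.
// Only the first PEM block is checked.
func validateCertPEM(certPEM []byte) error {
	if len(certPEM) == 0 {
		return errors.New("certificate is empty")
	}
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return errors.New("certificate contains no PEM block")
	}
	if block.Type != "CERTIFICATE" {
		return fmt.Errorf("unexpected PEM block type %q", block.Type)
	}
	if _, err := x509.ParseCertificate(block.Bytes); err != nil {
		return fmt.Errorf("certificate does not parse: %v", err)
	}
	return nil
}

func main() {
	// An empty cert, as produced by `touch empty.cert.pem` in the reproduction.
	if err := validateCertPEM([]byte{}); err != nil {
		fmt.Println("rejecting secret:", err)
	}
}
```

Rejecting the secret once, with a single log line, would replace the endless push loop I am seeing with the empty cert.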
Steps to reproduce the bug
Follow https://istio.io/docs/tasks/traffic-management/secure-ingress/sds/
When creating the secret, pass an empty file for the cert:
touch empty.cert.pem
kubectl create -n istio-system secret generic httpbin-credential \
--from-file=key=httpbin.example.com/3_application/private/httpbin.example.com.key.pem \
--from-file=cert=empty.cert.pem
Version
bin/istioctl version --remote
client version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.1"}
citadel version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
galley version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
ingressgateway version: version.BuildInfo{Version:"bdda7cfcf5ba1397e6e0e2629d53114c9ea8fc14", GitRevision:"bdda7cfcf5ba1397e6e0e2629d53114c9ea8fc14", User:"mjog", Host:"devinstance.c.mixologist-142215.internal", GolangVersion:"go1.10.1", DockerHub:"docker.io/istio", BuildStatus:"Clean", GitTag:"1.1.0-snapshot.4-592-gbdda7cf"}
pilot version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
policy version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
sidecar-injector version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
telemetry version: version.BuildInfo{Version:"1.1.2", GitRevision:"2b1331886076df103179e3da5dc9077fed59c989-dirty", User:"root", Host:"35adf5bb-5570-11e9-b00d-0a580a2c0205", GolangVersion:"go1.10.4", DockerHub:"docker.io/istio", BuildStatus:"Modified", GitTag:"1.1.1"}
kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T21:04:45Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T20:56:12Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Installation
helm template install/kubernetes/helm/istio --set global.mtls.enabled=true --set gateways.istio-egressgateway.enabled=false --set gateways.istio-ingressgateway.sds.enabled=true --name istio --namespace istio-system > ./istio.yaml
kubectl apply -f istio.yaml
Environment
Docker Edge for Mac (local)
custom-built cluster on Azure nodes (remote, issue 1)
Cluster state
empty cluster with freshly installed Istio (for the local reproduction)
cluster served by a CI pipeline with a handful of random applications (remote)