Memory leak with distributed tracing enabled #13990

@baryluk

What happened?

Adding --experimental-enable-distributed-tracing works, but causes a memory leak of about 1 GB per hour in our setup. Instead of the expected ~2 GB, memory usage reached about 12 GB in 7 hours.

What did you expect to happen?

Stable memory usage around 1.8-2.0 GB of RSS.

How can we reproduce it (as minimally and precisely as possible)?

Run with --experimental-enable-distributed-tracing for a few hours. It is sufficient to enable it on one member.
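
To put a number on the growth while reproducing, something like the following could be used. This is only a minimal sketch, not part of etcd: it assumes a Linux host where /proc/<pid>/status is readable, and the helper names (parse_rss_kib, growth_gib_per_hour, watch) are hypothetical. It reads VmRSS for a given PID at a fixed interval and averages the growth between the first and last sample.

```python
import re
import time
from pathlib import Path


def parse_rss_kib(status_text: str) -> int:
    """Extract VmRSS (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def growth_gib_per_hour(samples):
    """Average RSS growth in GiB/hour between the first and last sample.

    Each sample is a (seconds, rss_kib) pair.
    """
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    return (r1 - r0) / (1024 ** 2) / ((t1 - t0) / 3600)


def watch(pid: int, interval_s: float = 60.0, count: int = 10):
    """Collect `count` RSS samples for `pid`, one every `interval_s` seconds."""
    samples = []
    for _ in range(count):
        text = Path(f"/proc/{pid}/status").read_text()
        samples.append((time.monotonic(), parse_rss_kib(text)))
        time.sleep(interval_s)
    return samples
```

For example, growth_gib_per_hour(watch(pid, 60, 60)) would sample the etcd process once a minute for an hour, where pid is the etcd PID as seen from wherever the script runs. Plugging in the figures from this report (about 2 GB at start, about 12 GB after 7 hours) gives an average of roughly 1.4 GB per hour.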

Anything else we need to know?

The tracing collector endpoint doesn't need to be configured or listening. Having otelcol on 4317 doesn't change anything (beyond actually making tracing work).

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.5.0
Git SHA: f99cada05
Go Version: go1.16.6
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.5.0
API version: 3.5

Etcd configuration (command line flags or environment variables)

etcd, Kubernetes, OKD / OpenShift 4.9, 3 members.

etcd --experimental-enable-distributed-tracing \
  --logger=zap \
  --log-level=info \
  --initial-advertise-peer-urls=https://10.10.0.102:2380 \
  --cert-file=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-serving-master-1.example.com.crt \
  --key-file=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-serving-master-1.example.com.key \
  --trusted-ca-file=/etc/kubernetes/static-pod-certs/configmaps/etcd-serving-ca/ca-bundle.crt \
  --client-cert-auth=true \
  --peer-cert-file=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-master-1.example.com..crt \
  --peer-key-file=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-master-1.example.com.key \
  --peer-trusted-ca-file=/etc/kubernetes/static-pod-certs/configmaps/etcd-peer-client-ca/ca-bundle.crt \
  --peer-client-cert-auth=true \
  --advertise-client-urls=https://10.10.0.102:2379 \
  --listen-client-urls=https://0.0.0.0:2379,unixs://10.10.0.102:0 \
  --listen-peer-urls=https://0.0.0.0:2380 \
  --metrics=extensive \
  --listen-metrics-urls=https://0.0.0.0:9978

Running in CRI-O.

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

[root@master-1 /]# etcdctl member list -w table
+---------------+---------+----------------------+--------------------------+--------------------------+------------+
|      ID       | STATUS  |         NAME         |        PEER ADDRS        |       CLIENT ADDRS       | IS LEARNER |
+---------------+---------+----------------------+--------------------------+--------------------------+------------+
| 10f8cf6269xxx | started | master-2.example.com | https://10.10.0.103:2380 | https://10.10.0.103:2379 |      false |
| a2bbe7149xxx  | started | master-1.example.com | https://10.10.0.102:2380 | https://10.10.0.102:2379 |      false |
| acb2c160xxx   | started | master-0.example.com | https://10.10.0.101:2380 | https://10.10.0.101:2379 |      false |
+---------------+---------+----------------------+--------------------------+--------------------------+------------+

Relevant log output

No fatal issues in the logs.
