Excessive memory usage since upgrade to stable-2.7.0 #4069

@tomstreet

We ran a Kubernetes cluster (v1.11.5) in AKS for about a year using linkerd stable-2.0.0. We recently built a new AKS cluster (v1.15.7) and upgraded linkerd to stable-2.7.0, and since the upgrade we are seeing very high memory usage from the linkerd proxies.

We have a service that sits on the edge and proxies requests to the correct service in the cluster. In the old cluster, the linkerd proxy in each pod for this service used roughly 13MiB of memory, but in the new cluster it rises to over 1GiB within roughly 30 minutes and keeps growing. The new cluster only carries light testing traffic, so requests per second are in the single digits; the old cluster sees roughly 20rps at peak times.
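For anyone trying to reproduce the measurement, here is a minimal sketch of how the per-pod proxy memory could be tracked. It parses text in the shape printed by `kubectl top pod --containers` and picks out the `linkerd-proxy` container. The pod names and sample values are hypothetical, and the helper names (`parse_mib`, `proxy_memory_mib`, `growth_mib_per_min`) are my own, not part of any linkerd or kubectl tooling. Going from roughly 13MiB to 1GiB in 30 minutes works out to an average of about 34MiB per minute.

```python
# Sketch only: parses text shaped like `kubectl top pod --containers` output.
# Pod names and numbers below are hypothetical sample data.

SAMPLE_TOP_OUTPUT = """\
POD              NAME            CPU(cores)   MEMORY(bytes)
edge-abc12       edge            10m          120Mi
edge-abc12       linkerd-proxy   3m           1043Mi
edge-def34       edge            9m           118Mi
edge-def34       linkerd-proxy   2m           13312Ki
"""


def parse_mib(quantity: str) -> float:
    """Convert a Kubernetes memory quantity (Ki/Mi/Gi or plain bytes) to MiB."""
    units = {"Ki": 1 / 1024, "Mi": 1.0, "Gi": 1024.0}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity) / (1024 * 1024)  # no suffix: treat as bytes


def proxy_memory_mib(top_output: str, container: str = "linkerd-proxy") -> dict:
    """Return {pod name: memory in MiB} for the given container name."""
    usage = {}
    for line in top_output.splitlines():
        fields = line.split()
        # Data rows have four columns: pod, container, CPU, memory.
        if len(fields) == 4 and fields[1] == container:
            usage[fields[0]] = parse_mib(fields[3])
    return usage


def growth_mib_per_min(start_mib: float, end_mib: float, minutes: float) -> float:
    """Average growth rate between two samples taken `minutes` apart."""
    return (end_mib - start_mib) / minutes


if __name__ == "__main__":
    for pod, mib in sorted(proxy_memory_mib(SAMPLE_TOP_OUTPUT).items()):
        print(f"{pod}: {mib:.0f} MiB")
    # Figures from this report: ~13MiB rising to ~1GiB in ~30 minutes.
    print(f"{growth_mib_per_min(13, 1024, 30):.1f} MiB/min")
```

In practice the sample string would be replaced by the real command output captured at intervals, assuming resource metrics are available in the cluster.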

It is the same edge service version in both clusters.

Azure tells me the OS on all the nodes for the new cluster is:
SKU: aks-ubuntu-1604-201912
Version: 2019.12.11

linkerd check output for the new cluster:

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-identity
----------------
√ certificate config is valid
√ trust roots are using supported crypto algorithm
√ trust roots are within their validity period
√ trust roots are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust root

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ tap api service is running

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

I'm not sure what else I can check. Any help is appreciated.
