Description
We have been running a Kubernetes cluster (v1.11.5) in AKS for a year or so using linkerd stable-2.0.0. Recently we built a new AKS cluster (v1.15.7) and upgraded linkerd to stable-2.7.0, but since upgrading we are seeing very high memory usage from the linkerd proxies.
We have a service that sits on the edge and proxies requests to the correct service in the cluster. In the old cluster, the linkerd proxy in each pod for this service used roughly 13MiB of memory, but in the new cluster we are seeing it rise to over 1GiB within roughly 30 minutes, and it keeps growing. The new cluster only has light testing traffic, so requests per second are in the single digits; the old cluster sees roughly 20rps at peak times.
It is the same edge service version in both clusters.
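To put a number on the growth described above, here is a minimal Python sketch that pulls the proxy container's memory out of `kubectl top pod --containers` text and works out a growth rate. The function names and the sample pod name are made up for illustration, and the parsing assumes the usual four-column layout (POD, NAME, CPU, MEMORY), which may differ between kubectl versions.

```python
def proxy_memory_mib(top_output, container="linkerd-proxy"):
    """Return {pod: MiB} for one container, parsed from
    `kubectl top pod --containers` text (assumed 4-column layout)."""
    units = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
    usage = {}
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[1] != container:
            continue
        pod, mem = fields[0], fields[3]
        suffix = mem[-2:]
        if suffix not in units:
            continue  # unknown unit, skip rather than guess
        usage[pod] = float(mem[:-2]) * units[suffix]
    return usage


def growth_mib_per_min(start_mib, end_mib, minutes):
    """Average memory growth in MiB per minute over the interval."""
    return (end_mib - start_mib) / minutes


# Hypothetical sample; the pod name and CPU values are invented.
SAMPLE = """\
POD            NAME            CPU(cores)   MEMORY(bytes)
edge-7d9f-abc  edge            12m          85Mi
edge-7d9f-abc  linkerd-proxy   3m           1024Mi
"""

if __name__ == "__main__":
    print(proxy_memory_mib(SAMPLE))
    # Figures from this report: roughly 13MiB rising to about 1GiB in 30 minutes.
    print(growth_mib_per_min(13, 1024, 30))
```

With the figures from this report, that works out to an average of roughly 34MiB per minute per proxy, despite single-digit requests per second.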
Azure tells me the OS on all the nodes for the new cluster is:
SKU: aks-ubuntu-1604-201912
Version: 2019.12.11
linkerd check output - for the new cluster
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist
linkerd-identity
----------------
√ certificate config is valid
√ trust roots are using supported crypto algorithm
√ trust roots are within their validity period
√ trust roots are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust root
linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ tap api service is running
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match
Status check results are √
I'm not really sure what else I can check. Any help appreciated.