Bug Report
From time to time we run into connection problems. When we enabled debug logs, we noticed that the proxy was trying to connect to outdated IP addresses.
What is the issue?
The proxy keeps trying to connect to a stale IP address of the endpoint.
How can it be reproduced?
We are still trying to reproduce it, but so far with no success. One thing we've observed that could be a potential trigger is a fast, large scale-down of the target pods, from ~10 instances to 2.
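While attempting a reproduction, a check along these lines could flag stale targets right after the scale-down. This is only a sketch: it assumes the service's Endpoints object has been fetched as JSON (for example with `kubectl get endpoints <service> -o json`) and that the IPs the proxy tried to reach have already been pulled out of the debug logs. The function names `endpoint_ips` and `stale_targets` are hypothetical helpers, not part of Linkerd or kubectl.

```python
def endpoint_ips(endpoints):
    """Collect every pod IP listed in a Kubernetes Endpoints object (parsed JSON)."""
    ips = set()
    # "subsets" may be missing or null when the service has no endpoints.
    for subset in endpoints.get("subsets") or []:
        # Not-ready addresses still belong to existing pods, so they are
        # not treated as stale here (an assumption of this sketch).
        for key in ("addresses", "notReadyAddresses"):
            for addr in subset.get(key) or []:
                ips.add(addr["ip"])
    return ips


def stale_targets(attempted_ips, endpoints):
    """Return the attempted IPs that no longer appear in the Endpoints object."""
    return sorted(set(attempted_ips) - endpoint_ips(endpoints))
```

Any IP this returns is one the proxy tried to reach even though Kubernetes no longer lists it for the service, which is the symptom described above.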
Logs, error output, etc
10.208.6.184 is the destination service (SVC) IP.
Logs related to this svc IP - unfortunately they are in JSON format since I exported them from GCP.
I also have all the logs with TRACE and DEBUG for ~20 minutes after the problem occurred (~60K entries). I can share them too if you need them!
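For anyone digging through the full export, a filter like the following could narrow the ~60K entries down to those that mention the service IP. It is a rough sketch that assumes the GCP export is a JSON array of log entries. It makes no assumption about which field holds the message and simply searches each serialized entry. The function name `entries_mentioning` is hypothetical.

```python
import json
import re


def entries_mentioning(entries, ip):
    """Return the log entries whose serialized JSON contains `ip` as a whole address."""
    # Reject matches that are part of a longer address, such as
    # 110.208.6.184 or 10.208.6.1845, while still allowing "ip:port".
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
    return [entry for entry in entries if pattern.search(json.dumps(entry))]
```

With the export loaded via `json.load`, calling `entries_mentioning(entries, "10.208.6.184")` would keep only the entries related to the destination service IP above.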
linkerd check output
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
linkerd-api
-----------
√ control plane pods are ready
√ can initialize the client
√ can query the control plane API
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match
Status check results are √
Linkerd extensions checks
=========================
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ can initialize the client
√ viz extension self-check
Status check results are √
Environment
- Kubernetes Version: v1.20.8-gke.2100
- Cluster Environment: GKE with kube-proxy
- Linkerd version: stable-2.10.2