Linkerd fails during node outage

## Bug Report

### What is the issue?
During an outage of a node where some Linkerd components (`linkerd-destination` and `linkerd-identity`) were running, Linkerd appears to keep sending traffic to pods on the failed node. Linkerd is installed in **HA mode** in my cluster.

### How can it be reproduced?
1. Install Linkerd in HA mode.
2. Mesh two apps, where app A talks to app B or vice versa.
3. Stop the node where one of the Linkerd components, `linkerd-destination`, is running. (The goal is to simulate a node outage, which happened to us and led us to this bug.) The pod waits 5 minutes, per the eviction timeout, and is then rescheduled on a new node.
4. At random, one or two of the app pods fail to make calls to the other app.
5. The issue happens only when nodes shut down ungracefully.
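
For step 3, it helps to know which node hosts the control plane pods before stopping anything. The following is a minimal sketch, not part of the original report: it assumes the pod list comes from `kubectl get pods -n linkerd -o json` (the `linkerd` namespace is an assumption about the install), and that pod names start with the component names mentioned above. The function name `nodes_hosting` is hypothetical.

```python
import json


def nodes_hosting(pods_json, name_prefixes=("linkerd-destination", "linkerd-identity")):
    """Map node name -> control plane pods scheduled on it.

    pods_json is the text of a Kubernetes PodList in JSON form.
    name_prefixes are assumed pod name prefixes for the components of interest.
    """
    pod_list = json.loads(pods_json)
    nodes = {}
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        # spec.nodeName is absent while a pod is still unscheduled.
        node = pod.get("spec", {}).get("nodeName")
        if node and name.startswith(tuple(name_prefixes)):
            nodes.setdefault(node, []).append(name)
    return nodes
```

A node that appears in the result is a candidate for the simulated outage, for example by piping the `kubectl` output into a small script that calls `nodes_hosting` on standard input.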

### linkerd-proxy logs

```
2020-06-18T20:48:30.790810742Z [ 14263.548533981s]  WARN outbound:accept{peer.addr=172.21.14.116:54590}:source{target.addr=172.17.119.128:80}: linkerd2_app_core::errors: Failed to proxy request: request timed out
2020-06-18T20:48:30.798501272Z [ 14263.555872710s]  WARN outbound:accept{peer.addr=172.21.14.116:52074}:source{target.addr=172.17.119.128:80}: linkerd2_app_core::errors: Failed to proxy request: request timed out
2020-06-18T20:48:31.317647919Z [ 14264.75155158s]  WARN outbound:accept{peer.addr=172.21.14.116:52074}:source{target.addr=172.17.119.128:80}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
2020-06-18T20:48:31.317687819Z [ 14264.75275658s]  WARN outbound:accept{peer.addr=172.21.14.116:54590}:source{target.addr=172.17.119.128:80}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
2020-06-18T20:48:32.322513181Z [ 14265.80191420s]  WARN outbound:accept{peer.addr=172.21.14.116:52074}:source{target.addr=172.17.119.128:80}: linkerd2_app_core::errors: Failed to proxy request: Service in fail-fast
```

```
2020-06-18T22:36:56.281636047Z [   110.160896474s]  WARN outbound:accept{peer.addr=172.21.14.119:50512}:source{target.addr=172.17.140.43:80}:logical{addr=commandproxy-svc.commandproxy:80}:profile:balance{addr=commandproxy-svc.commandproxy.svc.cluster.local:80}:endpoint{peer.addr=172.21.1.17:80}: rustls::session: Sending fatal alert BadCertificate
2020-06-18T22:36:56.785530039Z [   110.664675566s]  WARN outbound:accept{peer.addr=172.21.14.119:50512}:source{target.addr=172.17.140.43:80}:logical{addr=commandproxy-svc.commandproxy:80}:profile:balance{addr=commandproxy-svc.commandproxy.svc.cluster.local:80}:endpoint{peer.addr=172.21.1.17:80}: rustls::session: Sending fatal alert BadCertificate
2020-06-18T22:36:57.288315626Z [   111.167593054s]  WARN outbound:accept{peer.addr=172.21.14.119:50512}:source{target.addr=172.17.140.43:80}:logical{addr=commandproxy-svc.commandproxy:80}:profile:balance{addr=commandproxy-svc.commandproxy.svc.cluster.local:80}:endpoint{peer.addr=172.21.1.17:80}: rustls::session: Sending fatal alert BadCertificate
2020-06-18T22:36:57.79501223Z [   111.674226057s]  WARN outbound:accept{peer.addr=172.21.14.119:50512}:source{target.addr=172.17.140.43:80}:logical{addr=commandproxy-svc.commandproxy:80}:profile:balance{addr=commandproxy-svc.commandproxy.svc.cluster.local:80}:endpoint{peer.addr=172.21.1.17:80}: rustls::session: Sending fatal alert BadCertificate
```
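
To see at a glance which targets are failing and how, the proxy warnings can be tallied per target address and message. The sketch below is an illustration only: the regular expressions are inferred from the log lines in this report rather than from any documented log format, and the helper names `parse_line` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the lines above: timestamp, [uptime], level, then the rest.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+\[\s*(?P<uptime>[\d.]+)s\]\s+(?P<level>[A-Z]+)\s+(?P<rest>.+)$"
)
TARGET_RE = re.compile(r"target\.addr=([\d.]+:\d+)")
ENDPOINT_RE = re.compile(r"endpoint\{peer\.addr=([\d.]+:\d+)\}")


def parse_line(line):
    """Return a dict describing one proxy log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # The span context contains no ": ", so the first two splits isolate
    # the span context and the module path; the remainder is the message.
    parts = match.group("rest").split(": ", 2)
    if len(parts) != 3:
        return None
    spans, module, message = parts
    target = TARGET_RE.search(spans)
    endpoint = ENDPOINT_RE.search(spans)
    return {
        "timestamp": match.group("timestamp"),
        "level": match.group("level"),
        "module": module,
        "message": message,
        "target": target.group(1) if target else None,
        "endpoint": endpoint.group(1) if endpoint else None,
    }


def summarize(lines):
    """Count occurrences of each (target address, message) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["target"], record["message"])] += 1
    return counts
```

Feeding the output of `kubectl logs` for the `linkerd-proxy` container into `summarize` should make it easier to tell whether the failures concentrate on endpoints that lived on the stopped node.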

### `linkerd check` output

```
kubernetes-api
--------------                                                                      
√ can initialize the client                                                         
√ can query the Kubernetes API                                                      
                                                                                    
kubernetes-version                                                                  
------------------                                                                  
√ is running the minimum Kubernetes API version                                     
√ is running the minimum kubectl version                                            
                                                                                    
linkerd-existence                                                                   
-----------------                                                                   
√ 'linkerd-config' config map exists                                                
√ heartbeat ServiceAccount exist                                                    
√ control plane replica sets are ready                                              
√ no unschedulable pods                                                             
√ controller pod is running                                                         
√ can initialize the client                                                         
√ can query the control plane API                                                   
                                                                                    
linkerd-config                                                                      
--------------                                                                      
√ control plane Namespace exists                                                    
√ control plane ClusterRoles exist                                                  
√ control plane ClusterRoleBindings exist                                           
√ control plane ServiceAccounts exist                                               
√ control plane CustomResourceDefinitions exist                                     
√ control plane MutatingWebhookConfigurations exist                                 
√ control plane ValidatingWebhookConfigurations exist                               
√ control plane PodSecurityPolicies exist                                           
                                                                                    
linkerd-identity                                                                    
----------------                                                                    
√ certificate config is valid                                                       
√ trust roots are using supported crypto algorithm                                  
√ trust roots are within their validity period                                      
√ trust roots are valid for at least 60 days                                        
√ issuer cert is using supported crypto algorithm                                   
√ issuer cert is within its validity period                                         
√ issuer cert is valid for at least 60 days                                         
√ issuer cert is issued by the trust root                                           
                                                                                    
linkerd-api                                                                         
-----------                                                                         
√ control plane pods are ready                                                      
√ control plane self-check                                                          
√ [kubernetes] control plane can talk to Kubernetes                                 
√ [prometheus] control plane can talk to Prometheus                                 
√ tap api service is running                                                        
                                                                                    
linkerd-version                                                                     
---------------                                                                     
√ can determine the latest version                                                  
‼ cli is up-to-date                                                                 
    is running version 2.7.1 but the latest stable version is 2.8.1                 
    see https://linkerd.io/checks/#l5d-version-cli for hints                        
                                                                                    
control-plane-version                                                               
---------------------                                                               
‼ control plane is up-to-date                                                       
    is running version 2.7.1 but the latest stable version is 2.8.1                 
    see https://linkerd.io/checks/#l5d-version-control for hints                    
√ control plane and cli versions match                                              
                                                                                    
linkerd-ha-checks                                                                   
-----------------                                                                   
√ pod injection disabled on kube-system                                             
                                                                                    
Status check results are √ 
```
### Environment
- Kubernetes Version:  1.16.9
- Cluster Environment:  AKS
- Linkerd version:  stable-2.7.1

### Possible solution

### Additional context
There is a similar open [GitHub issue](https://github.com/linkerd/linkerd2/issues/3854); I am not 100 percent sure they are the same.
