Closed
linkerd/linkerd2-proxy
#1341
Description
What is the issue?
While testing gRPC retries using the emojivoto application, I found that the gRPC requests between the web and voting services are not retryable because the gRPC protocol doesn't have a content-length header, which is part of the logic that is used to determine whether a POST request is retryable.
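The decision described above can be sketched roughly as follows. This is only an illustration inferred from the `not retryable req.has_body=true req.content_length=None` trace line in the logs below, not the proxy's actual code; the function name `is_retryable` and the `MAX_BUFFERED_BYTES` cap are assumptions made for the example.

```python
from typing import Optional

# Hypothetical cap on how many body bytes could be buffered for a replay.
# The real limit used by the proxy is not stated in this issue.
MAX_BUFFERED_BYTES = 64 * 1024


def is_retryable(
    has_body: bool,
    content_length: Optional[int],
    max_buffered: int = MAX_BUFFERED_BYTES,
) -> bool:
    """Decide whether a request could be replayed on failure."""
    if not has_body:
        # Nothing to buffer, so the request can simply be re-sent.
        return True
    if content_length is None:
        # A gRPC request lands here: it has a body but no content-length,
        # so the body size is unknown up front.
        return False
    # A body with a declared size is retryable only if it fits in the buffer.
    return content_length <= max_buffered


# Mirrors the logged case: has_body=true, content_length=None.
print(is_retryable(has_body=True, content_length=None))
```

Under these assumptions, any request with a body and no declared length is rejected regardless of how small the body actually is, which is the behavior reported here for the VoteDoughnut call.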
How can it be reproduced?
- Deploy linkerd 2.11 to a cluster (assumes the CLI is installed):
  linkerd install | kubectl apply -f -
- Deploy emojivoto:
  kubectl apply -f https://run.linkerd.io/emojivoto.yml
- Scale vote-bot to 0 replicas, because we want to target only the VoteDoughnut route
- Deploy the voting ServiceProfile with isRetryable enabled for all routes:
  kubectl apply -f https://raw.githubusercontent.com/BuoyantIO/emojivoto/main/training/service-profiles/voting-svc-profile.yml
- Set the proxy-log-level to linkerd=trace,info for the web pod (see the docs)
- Port forward to the web service:
  kubectl port-forward svc/web -n emojivoto 8080:80
- Make a request to vote for the doughnut:
  curl http://localhost:8080/api/vote?choice=:doughnut:
- View the proxy logs for the web pod and you will see something like the output below
Logs, error output, etc
2021-10-21T17:33:24.352503923Z [ 564.043062s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}: linkerd_stack_tracing: service ready=true ok=true
2021-10-21T17:33:24.352518037Z [ 564.043072s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}: linkerd_stack_tracing: service request=Request { method: POST, uri: http://voting-svc.emojivoto:8080/emojivoto.v1.VotingService/VoteDoughnut, version: HTTP/2.0, headers: {"content-type": "application/grpc", "user-agent": "grpc-go/1.29.1", "te": "trailers", "grpc-trace-bin": "AACG+V5sxnQNK5cC/n1VpjOjAU3DqwK1qCUqAgA"}, body: BoxBody }
2021-10-21T17:33:24.352667528Z [ 564.043231s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}: linkerd_service_profiles::http::route_request: Using configured route condition=All([Method(POST), Path(^/emojivoto\.v1\.VotingService/VoteDoughnut$)])
2021-10-21T17:33:24.352675121Z [ 564.043242s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}: linkerd_retry: retryable=true
2021-10-21T17:33:24.352679109Z [ 564.043246s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}: linkerd_app_core::retry: not retryable req.has_body=true req.content_length=None
2021-10-21T17:33:24.352696058Z [ 564.043270s] TRACE ThreadId(01) outbound:accept{client.addr=10.42.0.91:41620}:server{orig_dst=10.43.129.115:8080}:profile:http{v=h2}:logical{dst=voting-svc.emojivoto.svc.cluster.local:8080}:concrete{addr=voting-svc.emojivoto.svc.cluster.local:8080}:endpoint{server.addr=10.42.0.75:8080}: linkerd_reconnect: Ready
linkerd check output
Linkerd core checks
===================
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match
linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match
Status check results are √
Linkerd extensions checks
=========================
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ can initialize the client
√ viz extension self-check
Status check results are √
Environment
- Kubernetes Version: v1.21.3+k3s1
- Cluster Environment: k3d
- Host OS: PopOS
- Linkerd version: 2.11
Possible solution
I'm still researching the gRPC protocol, so no proposed solution. After discussing with the Linkerd maintainers, there are a few things to take into consideration for a solution:
- The current implementation is for POST requests, which do have a content-length header, so the solution should preserve support for POST bodies
- gRPC can be Unary or Streaming, which should be taken into consideration
- The gRPC protocol includes a Message-Length field as part of a DATA frame, but there may be a more reliable way to cover both H1 POST requests and gRPC requests
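To make the Message-Length point concrete, here is a minimal sketch of reading it from the start of a gRPC request body. In the gRPC over HTTP/2 wire format, each message is prefixed by a 1-byte compressed flag and a 4-byte big-endian length. The helper name `grpc_message_length` is hypothetical, and this is not a proposal for how the proxy should implement it.

```python
import struct
from typing import Optional

# gRPC length-prefixed message: 1-byte compressed flag + 4-byte big-endian length.
GRPC_PREFIX_LEN = 5


def grpc_message_length(data: bytes) -> Optional[int]:
    """Return the declared length of the first gRPC message, if the prefix is present."""
    if len(data) < GRPC_PREFIX_LEN:
        # Not enough bytes yet to read the prefix.
        return None
    _compressed_flag, length = struct.unpack(">BI", data[:GRPC_PREFIX_LEN])
    return length


# A unary request carrying a 3-byte, uncompressed message.
frame = b"\x00" + (3).to_bytes(4, "big") + b"abc"
print(grpc_message_length(frame))
```

Note that this only describes the first message. A streaming call can carry further messages after it, so the prefix alone does not bound the total body size, which ties back to the Unary versus Streaming consideration above.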