proxy: metrics, tap do not classify errors as failures #2934

@olix0r

When a proxy observes an error (i.e. a socket error), it does not record any response_total metrics, nor does it emit response tap events. It should do both, consulting the profile-provided classifier so that these errors are classified as failures.
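To make the expected behavior concrete, here is a rough toy model in Python. It is not the proxy's code: the `ResponseMetrics` class, its method names, and the default "5xx is a failure" rule are all hypothetical stand-ins for the profile-provided classifier. The only point it illustrates is that an errored request should still produce a `response_total` increment, classified as a failure.

```python
from collections import Counter


class ResponseMetrics:
    """Toy model of request/response counters (hypothetical, not the proxy's API)."""

    def __init__(self, is_failure_status=lambda status: status >= 500):
        # `is_failure_status` stands in for the profile-provided classifier.
        self.request_total = 0
        self.response_total = Counter()
        self._is_failure_status = is_failure_status

    def on_request(self):
        self.request_total += 1

    def on_response(self, status):
        # Normal path: classify by HTTP status.
        kind = "failure" if self._is_failure_status(status) else "success"
        self.response_total[kind] += 1

    def on_error(self, error):
        # Behavior this issue asks for: an errored request (e.g. a socket
        # error) still counts as a response, classified as a failure.
        # Treating every error as a failure is an assumption of this sketch.
        self.response_total["failure"] += 1
```

With this model, every `on_request` is eventually matched by either `on_response` or `on_error`, so the sum over `response_total` catches up with `request_total` instead of lagging behind it.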


I have a test environment configured as follows:

  • An injected client pod that repeatedly tries to send HTTP requests.
  • An uninjected server pod that accepts connections, reads a few bytes, and closes the connection, causing errors.
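For anyone wanting to reproduce this, a server with the behavior described above could look roughly like the following. This is a sketch, not the actual test server: the function name `serve_and_drop` and its parameters are made up for illustration, and the number of bytes read before closing is arbitrary.

```python
import socket
import threading


def serve_and_drop(host="127.0.0.1", port=0, read_bytes=4, max_conns=None):
    """Accept connections, read a few bytes, then close without replying.

    Returns (bound_port, thread). With port=0 the OS picks a free port.
    `max_conns=None` serves until the process exits.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()

    def loop():
        handled = 0
        while max_conns is None or handled < max_conns:
            conn, _ = listener.accept()
            with conn:
                # Read a few bytes of the request, then drop the connection.
                conn.recv(read_bytes)
            handled += 1
        listener.close()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return listener.getsockname()[1], thread
```

Calling `serve_and_drop(host="0.0.0.0", port=3000)` would match the port seen in the tap output in this issue; the client then sees each request fail with a closed or reset connection rather than an HTTP response.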

We see that request_total counters are incremented, without corresponding response_total metrics:

:; linkerd metrics deploy/client | grep -e ^request_total -e ^response_total
request_total{authority="target.default.svc.cluster.local:3000",direction="outbound",dst_deployment="server",dst_namespace="default",dst_pod="server-748ddd886d-2xwt4",dst_pod_template_hash="748ddd886d",dst_service="target",dst_serviceaccount="default",tls="no_identity",no_tls_reason="not_provided_by_service_discovery"} 4240
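The gap can be quantified by summing both counters from the metrics output. The helper below is a minimal sketch that assumes the plain `name{labels} value` line shape shown above (no trailing timestamps); the function names are hypothetical.

```python
import re

# Matches lines such as: request_total{a="b",c="d"} 4240
_SAMPLE = re.compile(r'^(?P<name>\w+)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')


def counter_totals(metrics_text, names=("request_total", "response_total")):
    """Sum the values of the named counters across all label sets."""
    totals = {name: 0.0 for name in names}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match and match.group("name") in totals:
            totals[match.group("name")] += float(match.group("value"))
    return totals


def unanswered_requests(metrics_text):
    """Requests counted without any corresponding response_total sample."""
    totals = counter_totals(metrics_text)
    return totals["request_total"] - totals["response_total"]
```

Fed the output above, `unanswered_requests` reports that all 4240 requests lack a matching response count, which is the symptom described here.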

Tap behaves similarly:

:; linkerd tap deploy/client --max-rps 10
req id=1:0 proxy=out src=10.0.0.19:53496 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:1 proxy=out src=10.0.0.19:53512 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:2 proxy=out src=10.0.0.19:53494 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:3 proxy=out src=10.0.0.19:53504 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:4 proxy=out src=10.0.0.19:53498 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:5 proxy=out src=10.0.0.19:53500 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
end id=1:0 proxy=out src=10.0.0.19:53496 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
req id=1:6 proxy=out src=10.0.0.19:53502 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
req id=1:7 proxy=out src=10.0.0.19:53506 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
end id=1:2 proxy=out src=10.0.0.19:53494 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:3 proxy=out src=10.0.0.19:53504 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:1 proxy=out src=10.0.0.19:53512 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
req id=1:8 proxy=out src=10.0.0.19:53510 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
end id=1:5 proxy=out src=10.0.0.19:53500 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:4 proxy=out src=10.0.0.19:53498 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:6 proxy=out src=10.0.0.19:53502 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:7 proxy=out src=10.0.0.19:53506 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
req id=1:9 proxy=out src=10.0.0.19:53508 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery :method=GET :authority=target:3000 :path=/
end id=1:8 proxy=out src=10.0.0.19:53510 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B
end id=1:9 proxy=out src=10.0.0.19:53508 dst=10.0.0.55:3000 tls=not_provided_by_service_discovery duration=0µs response-length=0B

Here, end events are emitted, but they do not reflect the actual latency or response status: each one reports duration=0µs and response-length=0B. No rsp events appear at all, and the tap stream hangs waiting for additional messages.
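The same pattern can be checked mechanically on a captured tap stream. The sketch below assumes the text format shown above, where each line starts with the event kind (`req`, `rsp`, or `end`) followed by an `id=` field; `summarize_tap` is a hypothetical helper, not part of the linkerd CLI.

```python
def summarize_tap(tap_text):
    """Group tap lines by stream id and flag streams that never saw a rsp event."""
    seen = {}  # stream id -> set of event kinds, in first-seen order
    for line in tap_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ("req", "rsp", "end"):
            continue
        if not fields[1].startswith("id="):
            continue
        stream_id = fields[1][len("id="):]
        seen.setdefault(stream_id, set()).add(fields[0])

    return {
        # Streams that ended without any rsp event: the case in this issue.
        "ended_without_rsp": [
            s for s, kinds in seen.items() if "end" in kinds and "rsp" not in kinds
        ],
        # Streams with a req but no end event yet.
        "still_open": [s for s, kinds in seen.items() if "end" not in kinds],
    }
```

On the capture above, every stream id from 1:0 through 1:9 lands in `ended_without_rsp`.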
