Skip to content

Expand 'linkerd edges' to work with TCP connections#5040

Merged
alpeb merged 2 commits intomainfrom
alpeb/edges-tcp
Oct 12, 2020
Merged

Expand 'linkerd edges' to work with TCP connections#5040
alpeb merged 2 commits intomainfrom
alpeb/edges-tcp

Conversation

@alpeb
Copy link
Member

@alpeb alpeb commented Oct 5, 2020

Fixes #4999

Before:

$ bin/linkerd edges po -owide
SRC                                   DST                                    SRC_NS    DST_NS    CLIENT_ID   SERVER_ID   SECURED
linkerd-prometheus-764ddd4f88-t6c2j   rabbitmq-controller-5c6cf7cc6d-8lxp2   linkerd   default                           √
linkerd-prometheus-764ddd4f88-t6c2j   temp                                   linkerd   default                           √

After:

$ bin/linkerd edges po -owide
SRC                                   DST                                    SRC_NS    DST_NS    CLIENT_ID         SERVER_ID         SECURED
temp                                  rabbitmq-controller-5c6cf7cc6d-5fpsc   default   default   default.default   default.default   √
linkerd-prometheus-66fb97b7fc-vpnxf   rabbitmq-controller-5c6cf7cc6d-5fpsc   linkerd   default                                       √
linkerd-prometheus-66fb97b7fc-vpnxf   temp                                   linkerd   default                                       √

With the latest proxy upgrade to v2.113.0 (#5037), the tcp_open_total metric now contains the client_id label so that we can replace the http-only metric response_total with this one to determine edges for TCP-only connections.

This change basically performs the same query as before, but two times, one for response_total and another for tcp_open_total. For each resulting entry, the latter is kept if client_id is present, otherwise the former is used (if present at all). That way things keep on working for older proxies.

Disclaimers:

  • This doesn't fix Client ID in edges command can be wrong #3706: if two sources connect to the same destination there's no way to tell them appart from the metrics perspective and their edges can get mangled. To fix that, the proxy would have to expose src_resource labels in the tcp_open_total total inbound metric.
  • Note connections coming from prometheus are still unidentified. The reason is those hit the proxy's admin server (instead of the main container) which doesn't expose metrics.

Fixes #4999

Before:
```
$ bin/linkerd edges po -owide
SRC                                   DST                                    SRC_NS    DST_NS    CLIENT_ID   SERVER_ID   SECURED
linkerd-prometheus-764ddd4f88-t6c2j   rabbitmq-controller-5c6cf7cc6d-8lxp2   linkerd   default                           √
linkerd-prometheus-764ddd4f88-t6c2j   temp                                   linkerd   default                           √

```

After:
```
$ bin/linkerd edges po -owide
SRC                                   DST                                    SRC_NS    DST_NS    CLIENT_ID         SERVER_ID         SECURED
temp                                  rabbitmq-controller-5c6cf7cc6d-5fpsc   default   default   default.default   default.default   √
linkerd-prometheus-66fb97b7fc-vpnxf   rabbitmq-controller-5c6cf7cc6d-5fpsc   linkerd   default                                       √
linkerd-prometheus-66fb97b7fc-vpnxf   temp                                   linkerd   default                                       √
```

With the latest proxy upgrade to v2.113.0 (#5037), the `tcp_open_total` metric now contains the `client_id` label so that we can replace the http-only metric `response_total` with this one to determine edges for TCP-only connections.

This change basically performs the same query as before, but two times, one for `response_total` and another for `tcp_open_total`. For each resulting entry, the latter is kept if `client_id` is present, otherwise the former is used (if present at all). That way things keep on working for older proxies.

Disclaimers:
- This doesn't fix #3706: if two sources connect to the same destination there's no way to tell them appart from the metrics perspective and their edges can get mangled. To fix that, the proxy would have to expose `src_resource` labels in the `tcp_open_total` total inbound metric.
- Note connections coming from prometheus are still unidentified. The reason is those hit the proxy's admin server (instead of the main container) which doesn't expose metrics.
@alpeb alpeb requested a review from a team as a code owner October 5, 2020 21:24
Copy link
Member

@zaharidichev zaharidichev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@kleimkuhler kleimkuhler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@alpeb alpeb merged commit 777b06a into main Oct 12, 2020
@alpeb alpeb deleted the alpeb/edges-tcp branch October 12, 2020 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mTLS for TCP not working as expected with AMQP traffic Client ID in edges command can be wrong

3 participants