Signed-off-by: Alex Leong <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #6574 +/- ##
==========================================
- Coverage 45.39% 45.33% -0.06%
==========================================
Files 182 182
Lines 24364 24272 -92
Branches 290 290
==========================================
- Hits 11059 11004 -55
+ Misses 12525 12491 -34
+ Partials 780 777 -3
alpeb
left a comment
Very nice refactor. This simplified approach makes a lot more sense, and I now see some edges that were missing before 👍
I'm curious about the replacement of tcp_open_total with tcp_open_connections: does this mean we only care about current live connections and not about all the connections made during the pods' lifetimes?
    clientID, err := s.getPodIdentity(string(sample.Metric[model.LabelName("pod")]), key.srcNs)
    if err != nil {
        return nil, err
    }
I think we should just log a warning and skip the sample when the pod referred to in a metric can no longer be found, instead of returning an error. Even if tcp_open_connections means current connections, I did see errors after scaling down pods and calling linkerd viz edges shortly after.
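To make the suggestion concrete, here is a minimal sketch of the "warn and skip" behavior. It is not the PR's code: `sample`, `identityLookup`, `clientIDs`, and the `errPodNotFound` sentinel are hypothetical stand-ins for the Prometheus sample type and the pod lookup, and the real lookup may signal a missing pod differently.

```go
package main

import (
	"errors"
	"log"
)

// errPodNotFound is a hypothetical sentinel standing in for whatever the
// real pod lookup returns when a pod no longer exists.
var errPodNotFound = errors.New("pod not found")

// sample is a simplified, hypothetical stand-in for a Prometheus sample.
type sample struct {
	pod       string
	namespace string
}

// identityLookup resolves a pod to its client identity.
type identityLookup func(pod, namespace string) (string, error)

// clientIDs resolves the identity for each sample. A sample whose pod can no
// longer be found is logged and skipped; any other error still aborts.
func clientIDs(samples []sample, lookup identityLookup) ([]string, error) {
	ids := []string{}
	for _, s := range samples {
		id, err := lookup(s.pod, s.namespace)
		if errors.Is(err, errPodNotFound) {
			log.Printf("warning: skipping sample for missing pod %s/%s", s.namespace, s.pod)
			continue
		}
		if err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, nil
}
```

With this shape, a stale metric for a scaled-down pod drops one row from the output rather than failing the whole request.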
Everything looks good, and this also makes edges work with namespaces, etc. as expected.
But there seems to be a bit of a staleness issue. (Was this already known?)
I installed the emojivoto application, and scaled up the web deployment. Edges resources appear as we would expect.
➜ k -n emojivoto scale deploy/web --replicas 3
deployment.apps/web scaled
➜ ./bin/go-run cli viz edges po -n emojivoto -owide
SRC DST SRC_NS DST_NS CLIENT_ID SERVER_ID SECURED
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-5f9k4 emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-6r7gd emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-7wbsg emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-h6z7m emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-p4kpl emojivoto emojivoto default.emojivoto web.emojivoto √
web-5f86686c4d-5f9k4 emoji-696d9d8f95-vvkjb emojivoto emojivoto web.emojivoto emoji.emojivoto √
web-5f86686c4d-5f9k4 voting-ff4c54b8d-746nc emojivoto emojivoto web.emojivoto voting.emojivoto √
web-5f86686c4d-6r7gd emoji-696d9d8f95-vvkjb emojivoto emojivoto web.emojivoto emoji.emojivoto √
web-5f86686c4d-6r7gd voting-ff4c54b8d-746nc emojivoto emojivoto web.emojivoto voting.emojivoto √
web-5f86686c4d-7wbsg emoji-696d9d8f95-vvkjb emojivoto emojivoto web.emojivoto emoji.emojivoto √
web-5f86686c4d-7wbsg voting-ff4c54b8d-746nc emojivoto emojivoto web.emojivoto voting.emojivoto √
prometheus-759b9b47f8-d5g7j emoji-696d9d8f95-vvkjb linkerd-viz emojivoto prometheus.linkerd-viz emoji.emojivoto √
prometheus-759b9b47f8-d5g7j vote-bot-6d7677bb68-zwc2v linkerd-viz emojivoto prometheus.linkerd-viz default.emojivoto √
prometheus-759b9b47f8-d5g7j voting-ff4c54b8d-746nc linkerd-viz emojivoto prometheus.linkerd-viz voting.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-5f9k4 linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-6r7gd linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-7wbsg linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-h6z7m linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-p4kpl linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
Now, after scaling down, the client-side metrics for the removed web pods are gone, but the edges where those pods are on the server side still exist.
➜ k -n emojivoto scale deploy/web --replicas 1
❯ ./bin/go-run cli viz edges po -n emojivoto -owide
SRC DST SRC_NS DST_NS CLIENT_ID SERVER_ID SECURED
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-5f9k4 emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-6r7gd emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-7wbsg emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-h6z7m emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-p4kpl emojivoto emojivoto default.emojivoto web.emojivoto √
web-5f86686c4d-5f9k4 emoji-696d9d8f95-vvkjb emojivoto emojivoto web.emojivoto emoji.emojivoto √
web-5f86686c4d-5f9k4 voting-ff4c54b8d-746nc emojivoto emojivoto web.emojivoto voting.emojivoto √
prometheus-759b9b47f8-d5g7j emoji-696d9d8f95-vvkjb linkerd-viz emojivoto prometheus.linkerd-viz emoji.emojivoto √
prometheus-759b9b47f8-d5g7j vote-bot-6d7677bb68-zwc2v linkerd-viz emojivoto prometheus.linkerd-viz default.emojivoto √
prometheus-759b9b47f8-d5g7j voting-ff4c54b8d-746nc linkerd-viz emojivoto prometheus.linkerd-viz voting.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-5f9k4 linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-6r7gd linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-7wbsg linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-h6z7m linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-p4kpl linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
This also seems to be a problem with the current linkerd binary:
➜ linkerd viz edges po -n emojivoto -owide
SRC DST SRC_NS DST_NS CLIENT_ID SERVER_ID SECURED
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-5f9k4 emojivoto emojivoto default.emojivoto web.emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-6r7gd emojivoto emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-7wbsg emojivoto emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-h6z7m emojivoto emojivoto √
vote-bot-6d7677bb68-zwc2v web-5f86686c4d-p4kpl emojivoto emojivoto √
web-5f86686c4d-5f9k4 emoji-696d9d8f95-vvkjb emojivoto emojivoto web.emojivoto emoji.emojivoto √
web-5f86686c4d-5f9k4 voting-ff4c54b8d-746nc emojivoto emojivoto web.emojivoto voting.emojivoto √
prometheus-759b9b47f8-d5g7j emoji-696d9d8f95-vvkjb linkerd-viz emojivoto prometheus.linkerd-viz emoji.emojivoto √
prometheus-759b9b47f8-d5g7j vote-bot-6d7677bb68-zwc2v linkerd-viz emojivoto prometheus.linkerd-viz default.emojivoto √
prometheus-759b9b47f8-d5g7j voting-ff4c54b8d-746nc linkerd-viz emojivoto prometheus.linkerd-viz voting.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-5f9k4 linkerd-viz emojivoto prometheus.linkerd-viz web.emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-6r7gd linkerd-viz emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-7wbsg linkerd-viz emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-h6z7m linkerd-viz emojivoto √
prometheus-759b9b47f8-d5g7j web-5f86686c4d-p4kpl linkerd-viz emojivoto √
Should we skip the metrics when we know that the pod no longer exists, or should we continue to show edges from the whole metric history even if they might not exist right now? (In that case, shouldn't we also show the client-side edges for the removed resource? 🤔)
dadjeibaah
left a comment
This looks great! Much easier to reason about. I left a few TIOLI comments; LGTM modulo the other reviewers' comments.
    }

    edges = append(edges, edgesHTTP...)
    edges := []*pb.Edge{}
TIOLI: Since we already know the number of edges we need to add, we could initialize a slice of length len(edgesMap) and then populate edges by index inside the range loop.
You're right that it's more efficient to initialize the array to the final size, but I find the append style to be more readable, so I prefer it in cases where the performance difference isn't significant.
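For reference, the two styles under discussion look roughly like this. This is only a sketch: `edge` is a hypothetical stand-in for `*pb.Edge`, and the map key type is assumed to be a string for illustration.

```go
package main

// edge is a hypothetical stand-in for the real *pb.Edge type.
type edge struct {
	src, dst string
}

// collectAppend builds the slice with append, the style kept in this PR.
func collectAppend(edgesMap map[string]*edge) []*edge {
	edges := []*edge{}
	for _, e := range edgesMap {
		edges = append(edges, e)
	}
	return edges
}

// collectPrealloc sizes the slice up front. Ranging over a map yields no
// numeric index, so a separate counter is needed.
func collectPrealloc(edgesMap map[string]*edge) []*edge {
	edges := make([]*edge, len(edgesMap))
	i := 0
	for _, e := range edgesMap {
		edges[i] = e
		i++
	}
	return edges
}
```

A middle ground is `make([]*edge, 0, len(edgesMap))` followed by append, which keeps the append style while avoiding reallocation.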
    var l5dns string
    var l5dtrustdomain string
TIOLI: could also be written as:
    - var l5dns string
    - var l5dtrustdomain string
    + var l5dns, l5dtrustdomain string
I've updated this to use |
Pothulapati
left a comment
LGTM! Also, nice fix on using sum instead!
kleimkuhler
left a comment
This is a great simplification of how edges are collected. It works well and glad to see this test back 👍
    // This test has been disabled because it can fail due to
    // https://github.com/linkerd/linkerd2/issues/3706
    // This test should be updated and re-enabled when that issue is addressed.
    /*
…ng information. (#6627)

Since #6574 merged, the `TestDirectEdges` test has been flaky. Occasionally the test fails with the following error:

```
2021-08-05T22:08:06.2860090Z --- FAIL: TestDirectEdges (73.39s)
2021-08-05T22:08:06.2860719Z edges_test.go:161: Expected output:
2021-08-05T22:08:06.2861172Z \[
2021-08-05T22:08:06.2861480Z \{
2021-08-05T22:08:06.2861884Z "src": "prometheus",
2021-08-05T22:08:06.2862642Z "src_namespace": "external\-prometheus",
2021-08-05T22:08:06.2863356Z "dst": "slow-cooker",
2021-08-05T22:08:06.2864211Z "dst_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2865267Z "client_id": "prometheus.external\-prometheus",
2021-08-05T22:08:06.2866374Z "server_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2867120Z "no_tls_reason": ""
2021-08-05T22:08:06.2867492Z \},
2021-08-05T22:08:06.2867804Z \{
2021-08-05T22:08:06.2868211Z "src": "prometheus",
2021-08-05T22:08:06.2868933Z "src_namespace": "external\-prometheus",
2021-08-05T22:08:06.2869481Z "dst": "terminus",
2021-08-05T22:08:06.2870311Z "dst_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2871318Z "client_id": "prometheus.external\-prometheus",
2021-08-05T22:08:06.2872435Z "server_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2873180Z "no_tls_reason": ""
2021-08-05T22:08:06.2873563Z \},
2021-08-05T22:08:06.2873888Z \{
2021-08-05T22:08:06.2874421Z "src": "slow-cooker",
2021-08-05T22:08:06.2875282Z "src_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2875936Z "dst": "terminus",
2021-08-05T22:08:06.2876751Z "dst_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2877906Z "client_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2879151Z "server_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2879887Z "no_tls_reason": ""
2021-08-05T22:08:06.2880258Z \}
2021-08-05T22:08:06.2880579Z \]
2021-08-05T22:08:06.2880885Z
2021-08-05T22:08:06.2881227Z actual:
2021-08-05T22:08:06.2881560Z [
2021-08-05T22:08:06.2881876Z {
2021-08-05T22:08:06.2882429Z "src": "slow-cooker",
2021-08-05T22:08:06.2883268Z "src_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2883925Z "dst": "terminus",
2021-08-05T22:08:06.2884759Z "dst_namespace": "linkerd-direct-edges-test",
2021-08-05T22:08:06.2885901Z "client_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2887153Z "server_id": "default.linkerd-direct-edges-test",
2021-08-05T22:08:06.2887896Z "no_tls_reason": ""
2021-08-05T22:08:06.2888250Z }
2021-08-05T22:08:06.2888572Z ]
2021-08-05T22:08:06.2888870Z
```

What is happening is that Prometheus has not scraped the workloads yet, so the edges from Prometheus to slow-cooker and terminus are missing from the actual output. While it remains unclear exactly why this happens, this change addresses two potential problems:

1. Print the cluster's pods so that we can determine if the Prometheus pod is injected. If it's not injected, we'll be able to debug further with that information.
2. Introduce retries to the `TestEdges` test. As we already have retries in `TestDirectEdges`, this makes a similar change to `TestEdges`.

Signed-off-by: Kevin Leimkuhler <[email protected]>
Fixes linkerd#3706

The implementation of the `linkerd viz edges` command works by gathering HTTP and TCP metrics in both the inbound and outbound directions and combining this data in dubious ways. We make the implementation simpler and more correct by instead doing the following:

* Gather TCP metrics only
  * (this drops support for very old proxy versions which do not expose the `tcp_open_connections` metric)
* Gather outbound metrics only
  * (all meshed edges will have a src in the mesh and will be present in the outbound metrics)
* Outbound metrics do not have a `client_id` label, so we fill in this missing data by inspecting the source pod via the k8s API and reconstructing that pod's TLS identity based on its service account name and namespace.

Signed-off-by: Alex Leong <[email protected]>
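As a rough illustration of the last bullet, the identity reconstruction could look like the sketch below. It assumes the conventional Linkerd identity layout of `<service account>.<namespace>.serviceaccount.identity.<linkerd namespace>.<trust domain>`; `podInfo` and `podIdentity` are hypothetical names, and the real code reads the pod through the k8s API rather than taking a struct.

```go
package main

import "fmt"

// podInfo is a hypothetical, trimmed-down view of the fields needed from a
// Kubernetes pod; the real code would read these from the k8s API.
type podInfo struct {
	serviceAccount string
	namespace      string
}

// podIdentity rebuilds a pod's TLS identity from its service account name and
// namespace. The layout used here is an assumption based on the conventional
// Linkerd identity format, not a copy of the PR's implementation.
func podIdentity(p podInfo, l5dns, l5dtrustdomain string) string {
	return fmt.Sprintf("%s.%s.serviceaccount.identity.%s.%s",
		p.serviceAccount, p.namespace, l5dns, l5dtrustdomain)
}
```

For a pod running as service account `web` in namespace `emojivoto`, with control plane namespace `linkerd` and trust domain `cluster.local`, this yields `web.emojivoto.serviceaccount.identity.linkerd.cluster.local`, which the CLI output above appears to shorten to `web.emojivoto`.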