outbound: initial tests for TCP mTLS #658

Closed
hawkw wants to merge 10 commits into main from eliza/more-tests
@hawkw hawkw commented Sep 16, 2020

This branch uses the new test-support code from #655 to add some initial
tests for TCP mTLS on the outbound router. The new tests use the mock IO
types and service discovery resolver added in #655, and test the
outbound TCP stack in isolation. Therefore, we don't run the rest of the
proxy, open any sockets, or spawn multiple threads.

In order to write these tests it was necessary to do some additional
internal refactoring. In particular, we were previously passing a mock
connect service into the outbound TCP stack. However, the connect stack
is responsible for handling TLS handshakes, so a mock at that level
bypasses TLS entirely. To run tests with TLS, I added a new function
that builds the TCP connect stack around an arbitrary base service, so
that the stack can be layered over a mock connector.
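As a rough illustration of that refactoring (not the code in this branch), the sketch below layers a TLS-aware connector over any base connector. The `Connect` trait, `Endpoint`, `TlsLayer`, `MaybeTls`, `build_connect_with`, and `MockConnect` are all hypothetical, synchronous stand-ins; the real stack is built from `tower` services and performs an actual handshake.

```rust
use std::io;

/// Hypothetical stand-in for the proxy's endpoint target type.
#[derive(Clone, Debug)]
pub struct Endpoint {
    pub addr: String,
    /// TLS identity received from service discovery, if any.
    pub server_id: Option<String>,
}

/// Hypothetical, synchronous stand-in for a connect service.
pub trait Connect {
    type Io;
    fn connect(&mut self, target: &Endpoint) -> io::Result<Self::Io>;
}

/// The inner IO plus the identity the TLS layer would have negotiated.
#[derive(Debug)]
pub struct MaybeTls<I> {
    pub io: I,
    pub peer_id: Option<String>,
}

/// Wraps a base connector; a real layer would run the handshake here.
pub struct TlsLayer<C> {
    inner: C,
}

impl<C: Connect> Connect for TlsLayer<C> {
    type Io = MaybeTls<C::Io>;

    fn connect(&mut self, target: &Endpoint) -> io::Result<Self::Io> {
        let io = self.inner.connect(target)?;
        // Assumption: this sketch only tags the IO with the target's identity.
        Ok(MaybeTls { io, peer_id: target.server_id.clone() })
    }
}

/// Builds the connect stack around an arbitrary base connector.
pub fn build_connect_with<C: Connect>(base: C) -> TlsLayer<C> {
    TlsLayer { inner: base }
}

/// A mock base connector that hands back an empty in-memory buffer.
pub struct MockConnect;

impl Connect for MockConnect {
    type Io = Vec<u8>;

    fn connect(&mut self, _target: &Endpoint) -> io::Result<Vec<u8>> {
        Ok(Vec::new())
    }
}
```

Because the builder is generic over the base connector, production code can pass a socket-backed connector while tests pass `MockConnect`, and the TLS layer is exercised in both cases.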

Additionally, I've added a vendored version of the
tokio::io::DuplexStream type, which implements AsyncRead and
AsyncWrite for an in-memory channel. This exists upstream, but has yet
to be released, and the upstream implementation only exists for Tokio
0.3, so I just copied and pasted it. This was necessary to perform TLS
handshakes on a mock IO. The pre-configured tokio-test::io mock IOs
don't work in this case, since the tokio-rustls API is proactive
rather than reactive: it provides a future that performs a handshake on
an IO type, rather than an IO type that performs a handshake when
written to.
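To make the idea of an in-memory duplex concrete, here is a minimal synchronous analogue using only the standard library. It is not the vendored `DuplexStream` (which is async and implements `AsyncRead`/`AsyncWrite`); the `Duplex` type and `duplex()` function are hypothetical names for this sketch. The point is that each end is a live peer, so a handshake driven from one side can be answered by real code on the other side rather than by a pre-scripted mock.

```rust
use std::collections::VecDeque;
use std::io::{self, Read, Write};
use std::sync::{Arc, Mutex};

type Buf = Arc<Mutex<VecDeque<u8>>>;

/// One end of an in-memory, bidirectional byte pipe (hypothetical type).
pub struct Duplex {
    rx: Buf,
    tx: Buf,
}

/// Creates a connected pair: bytes written to one end are read from the other.
pub fn duplex() -> (Duplex, Duplex) {
    let a: Buf = Arc::default();
    let b: Buf = Arc::default();
    (
        Duplex { rx: a.clone(), tx: b.clone() },
        Duplex { rx: b, tx: a },
    )
}

impl Read for Duplex {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let mut buf = self.rx.lock().unwrap();
        if buf.is_empty() {
            // Nothing buffered yet; an async version would register a waker.
            return Err(io::ErrorKind::WouldBlock.into());
        }
        let n = out.len().min(buf.len());
        for (slot, byte) in out.iter_mut().zip(buf.drain(..n)) {
            *slot = byte;
        }
        Ok(n)
    }
}

impl Write for Duplex {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        // Unbounded buffer: writes always succeed in this sketch.
        self.tx.lock().unwrap().extend(data.iter().copied());
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```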

Finally, it was necessary to add a test feature to the linkerd2-io
crate that enables using the test IO Mock and DuplexStream types as
BoxedIos.

This branch adds tests for the following:

  • the proxy uses mTLS for client-first and server-first TCP connections
    when it receives a TLS identity from service discovery
  • the proxy uses mTLS for client-first and server-first TCP connections
    only if it receives a TLS identity from service discovery ---
    plaintext ports on the same IP are still plaintext
  • if a connection is accepted when an address does not have a TLS hint,
    and then one is received before the upstream connection is opened, it
    will use TLS

Signed-off-by: Eliza Weisman <[email protected]>
(also: make everything not be server-first!)

Signed-off-by: Eliza Weisman <[email protected]>

@kleimkuhler kleimkuhler left a comment

This looks great! I have two nonblocking comments, but otherwise the tests seem good to me.

self.build_tcp_connect_with(connect, local_identity, metrics)
}

pub fn build_tcp_connect_with<C>(

Member

Is this really necessary? Can we provide a service and check its target?

In other words, why do we actually care about validating the socket/connection behavior? It comes at the cost of... a lot of verbosity.

self.build_tcp_connect_with(connect, local_identity, metrics)
}

pub fn build_tcp_connect_with<C>(

Member

If we do need this, let's call it build_tcp_endpoint or something... and I think we should have a build_tcp_connect that only produces the connect layer...

C: tower::Service<TcpEndpoint, Error = std::io::Error> + Clone + Send + Unpin,
<C as tower::Service<TcpEndpoint>>::Future: Send + 'static,
<C as tower::Service<TcpEndpoint>>::Response:
tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin + Send + 'static,

Member

If we do need to keep this, let's please use another intermediary type attr instead of having to do all of the as tower::Service noise.
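One possible reading of this suggestion is a helper trait with a blanket impl, so the fully qualified `as Service<...>` bounds are written once instead of at every use site. The sketch below is an assumption about what was meant, not code from the PR: `Service` is a simplified synchronous stand-in for `tower::Service`, and `Connect`, `MockService`, and `connect_once` are hypothetical names.

```rust
use std::io;

/// Simplified, synchronous stand-in for `tower::Service` (assumption).
pub trait Service<T> {
    type Response;
    type Error;
    fn call(&mut self, target: T) -> Result<Self::Response, Self::Error>;
}

/// Stand-in for the endpoint target type.
pub struct TcpEndpoint(pub &'static str);

/// Helper trait that bundles the bounds a connector must satisfy.
pub trait Connect: Clone {
    type Io: io::Read + io::Write;
    fn connect(&mut self, ep: TcpEndpoint) -> io::Result<Self::Io>;
}

/// The qualified-path noise lives here and nowhere else.
impl<C> Connect for C
where
    C: Service<TcpEndpoint, Error = io::Error> + Clone,
    <C as Service<TcpEndpoint>>::Response: io::Read + io::Write,
{
    type Io = <C as Service<TcpEndpoint>>::Response;

    fn connect(&mut self, ep: TcpEndpoint) -> io::Result<Self::Io> {
        self.call(ep)
    }
}

/// Callers now need a single, readable bound.
pub fn connect_once<C: Connect>(mut connect: C, ep: TcpEndpoint) -> io::Result<C::Io> {
    connect.connect(ep)
}

/// A mock service whose IO echoes the target address back as bytes.
#[derive(Clone)]
pub struct MockService;

impl Service<TcpEndpoint> for MockService {
    type Response = io::Cursor<Vec<u8>>;
    type Error = io::Error;

    fn call(&mut self, ep: TcpEndpoint) -> io::Result<Self::Response> {
        Ok(io::Cursor::new(ep.0.as_bytes().to_vec()))
    }
}
```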

olix0r pushed a commit that referenced this pull request Oct 1, 2020
In order to have `linkerd edges` return non-empty values for a raw TCP
connection's `CLIENT_ID`, the proxy's `tcp_open_total` metric needs to
include the `client_id` label for inbound connections, like the
`request_total` metric for HTTP connections does.

This PR changes the `TlsStatus` metric label type to include a peer
identity in the `Conditional::Some` case, rather than `()`. This means
that all metrics with TLS labels will now include the peer identity as
a label.
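A minimal sketch of the shape of this change, assuming simplified types: `Conditional` and `TlsStatus` are named in this message, while `NoTlsReason`, its variants, and `inbound_labels` are hypothetical names chosen for illustration. The label strings follow the inbound excerpts shown in this message; the real label formatting code may differ.

```rust
/// Either a value, or the reason the value is absent.
pub enum Conditional<C, R> {
    Some(C),
    None(R),
}

/// Hypothetical reasons for a connection not being TLS'd.
pub enum NoTlsReason {
    Loopback,
    NoTlsFromRemote,
}

impl NoTlsReason {
    fn as_str(&self) -> &'static str {
        match self {
            NoTlsReason::Loopback => "loopback",
            NoTlsReason::NoTlsFromRemote => "no_tls_from_remote",
        }
    }
}

/// Previously the `Some` case carried `()`; here it carries the peer identity.
pub struct TlsStatus(pub Conditional<String, NoTlsReason>);

impl TlsStatus {
    /// Renders the TLS labels for an inbound connection (hypothetical helper).
    pub fn inbound_labels(&self) -> String {
        match &self.0 {
            Conditional::Some(id) => {
                format!("tls=\"true\",client_id=\"{}\"", id)
            }
            Conditional::None(reason) => {
                format!("tls=\"no_identity\",no_tls_reason=\"{}\"", reason.as_str())
            }
        }
    }
}
```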

I've manually verified that this works by running Linkerd locally and
scraping the metrics.

For example, here's an excerpt from Grafana:
```
tcp_open_total{peer="src",direction="inbound",tls="no_identity",no_tls_reason="no_tls_from_remote"} 44
tcp_open_total{peer="dst",direction="inbound",tls="no_identity",no_tls_reason="loopback"} 2
tcp_open_total{peer="src",direction="inbound",tls="true",client_id="linkerd-prometheus.linkerd.serviceaccount.identity.linkerd.cluster.local"} 1
```
And from Prometheus:
```
tcp_open_total{peer="dst",authority="10.42.0.25:4191",direction="outbound",dst_control_plane_ns="linkerd",dst_deployment="linkerd-grafana",dst_namespace="linkerd",dst_pod="linkerd-grafana-65597cf467-vq456",dst_pod_template_hash="65597cf467",dst_serviceaccount="linkerd-grafana",tls="true",server_id="linkerd-grafana.linkerd.serviceaccount.identity.linkerd.cluster.local"} 1
tcp_open_total{peer="dst",authority="10.42.0.25:3000",direction="outbound",dst_control_plane_ns="linkerd",dst_deployment="linkerd-grafana",dst_namespace="linkerd",dst_pod="linkerd-grafana-65597cf467-vq456",dst_pod_template_hash="65597cf467",dst_serviceaccount="linkerd-grafana",tls="true",server_id="linkerd-grafana.linkerd.serviceaccount.identity.linkerd.cluster.local"} 1
```

I'd like to have automated tests for this, but I'd prefer to not have to
write them in the integration style, and use the isolated mock service
style instead. So, tests can be added once #658 lands.

Refs: linkerd/linkerd2#4999
Fixes: linkerd/linkerd2#5031
@hawkw hawkw closed this in #693 Oct 6, 2020
hawkw added a commit that referenced this pull request Oct 6, 2020
This branch introduces a second pass at unit tests for TCP mTLS in the
outbound proxy, without the complexity of actually performing handshakes
on mock IOs (as proposed in #658). The new tests just rely on assertions
that the connect stack receives the expected peer identity metadata. We
can test that the handshake is performed correctly in separate tests for
_just_ the TLS client layer, while avoiding the complexity necessary to
use mock IOs in the existing connect stack. This also means we don't 
have to actually load and parse all the test key material we use in the
integration tests.
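The assertion style described above might look roughly like the following sketch, in which the mock connector never touches IO and only checks the identity on the target it receives. `Endpoint`, `resolve`, and `ExpectIdentity` are hypothetical, synchronous stand-ins and not the test-support types from this branch.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical target type carrying the discovered peer identity.
#[derive(Clone, Debug, PartialEq)]
pub struct Endpoint {
    pub addr: String,
    pub identity: Option<String>,
}

/// Stand-in for the stack under test: attaches the identity hint, if any,
/// that service discovery returned for this exact address.
pub fn resolve(addr: &str, hints: &HashMap<String, String>) -> Endpoint {
    Endpoint {
        addr: addr.to_string(),
        identity: hints.get(addr).cloned(),
    }
}

/// Mock connector that fails unless the target has the expected identity.
pub struct ExpectIdentity {
    expected: Option<String>,
    pub calls: usize,
}

impl ExpectIdentity {
    pub fn new(expected: Option<&str>) -> Self {
        Self {
            expected: expected.map(str::to_string),
            calls: 0,
        }
    }

    pub fn connect(&mut self, target: &Endpoint) -> io::Result<()> {
        self.calls += 1;
        if target.identity == self.expected {
            Ok(())
        } else {
            Err(io::Error::new(
                io::ErrorKind::Other,
                format!("unexpected identity {:?} for {}", target.identity, target.addr),
            ))
        }
    }
}
```

A test can then cover both the TLS port and a plaintext port on the same IP by constructing one `ExpectIdentity` per expectation, with no key material involved.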

If this approach seems better, I'll open further PRs to add more tests
in this style.

Closes #658 

Signed-off-by: Eliza Weisman <[email protected]>
@olix0r olix0r deleted the eliza/more-tests branch May 25, 2021 15:49

3 participants