
Conversation

@hawkw hawkw commented Nov 15, 2022

This branch implements header-based routing using the new client policy
API added in linkerd/linkerd2-proxy-api#165.

Overview

In particular, this branch does the following:

Client policy discovery:

  • Add a new linkerd-client-policy crate with internal types
    representing the client policies and code for converting protobuf
    policy messages to those representations
  • Add a new policy module in linkerd-app-outbound with an API client
  • Add client policy discovery to the outbound proxy's push_discover
    stack.

This means that profile discovery and client policy discovery:

  • Are stored in the same cache, keyed by original destination IP address
    (a rough sketch of this shared cache follows this list).
  • Currently always occur for all destinations, although the client policy
    result is dropped when no profile is resolved or if the profile is a
    direct endpoint.
  • Occur in roughly the same place in the proxy, and don't require
    propagating original destination addresses to new parts of the outbound
    stack.
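
For illustration only, here is a minimal sketch of the shape of that shared, per-original-destination cache. The `DiscoveryCache`, `Discovery`, `Profile`, and `ClientPolicy` types and the stubbed lookups below are hypothetical stand-ins, not the proxy's actual discovery code:

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the real discovery result types.
#[derive(Clone, Debug)]
struct Profile;
#[derive(Clone, Debug)]
struct ClientPolicy;

// Both discovery results live in one entry, keyed by the original
// destination address.
#[derive(Clone, Debug)]
struct Discovery {
    profile: Option<Profile>,
    policy: Option<ClientPolicy>,
}

#[derive(Clone, Default)]
struct DiscoveryCache {
    inner: Arc<Mutex<HashMap<SocketAddr, Discovery>>>,
}

impl DiscoveryCache {
    fn get_or_discover(&self, orig_dst: SocketAddr) -> Discovery {
        let mut cache = self.inner.lock().unwrap();
        cache
            .entry(orig_dst)
            .or_insert_with(|| Discovery {
                // Both lookups always occur; the policy result is dropped
                // later if no profile resolves or if the profile is a
                // direct endpoint.
                profile: lookup_profile(orig_dst),
                policy: lookup_policy(orig_dst),
            })
            .clone()
    }
}

// Stubbed lookups standing in for the profile and policy API clients.
fn lookup_profile(_orig_dst: SocketAddr) -> Option<Profile> {
    Some(Profile)
}

fn lookup_policy(_orig_dst: SocketAddr) -> Option<ClientPolicy> {
    Some(ClientPolicy)
}

fn main() {
    let cache = DiscoveryCache::default();
    let dst: SocketAddr = "10.0.0.1:8080".parse().unwrap();
    // The second call hits the cache rather than re-running discovery.
    let first = cache.get_or_discover(dst);
    let second = cache.get_or_discover(dst);
    println!("{:?} / {:?}", first, second);
}
```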

Client policy HTTPRoute matching:

  • Add a router implementation in linkerd-client-policy for matching
    requests against the HTTPRoutes defined in a client policy (a
    simplified sketch of header matching follows this list)
  • Add a new stack in the outbound proxy's push_http_logical stack that
    is used instead of the current Service Profile-based HTTP logical
    stack when a client policy is present
  • Add the HTTPRoute router to the client policy stack
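
To give a concrete (if simplified) picture of the matching step, here is a self-contained sketch of header-based route selection. This is not the actual `linkerd-client-policy` matcher, which implements the full HTTPRoute match semantics; the `HeaderMatch` and `RouteRule` types are invented for the example, and it uses the `http` crate for request types:

```rust
use http::Request;

// Invented types for this example; the real client-policy types model the
// full HTTPRoute match spec (paths, methods, query params, etc.).
#[derive(Debug)]
struct HeaderMatch {
    name: &'static str,
    value: &'static str,
}

#[derive(Debug)]
struct RouteRule {
    headers: Vec<HeaderMatch>,
    backend: &'static str,
}

/// Returns the first rule whose header matches are all satisfied by the
/// request. A rule with no header matches matches every request.
fn find_route<'r, B>(rules: &'r [RouteRule], req: &Request<B>) -> Option<&'r RouteRule> {
    rules.iter().find(|rule| {
        rule.headers.iter().all(|m| {
            req.headers()
                .get(m.name)
                .map(|v| v.as_bytes() == m.value.as_bytes())
                .unwrap_or(false)
        })
    })
}

fn main() {
    let rules = vec![
        RouteRule {
            headers: vec![HeaderMatch { name: "x-canary", value: "true" }],
            backend: "web-canary",
        },
        // Default rule: no header matches, so it matches everything.
        RouteRule { headers: vec![], backend: "web" },
    ];

    let req = Request::builder()
        .uri("http://web.ns.svc.cluster.local/")
        .header("x-canary", "true")
        .body(())
        .unwrap();

    let route = find_route(&rules, &req).expect("some rule should match");
    println!("request routed to backend {:?}", route.backend);
}
```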

Client policy traffic splitting:

  • Move the traffic split implementation from linkerd-service-profiles
    into linkerd-client-policy, and update linkerd-service-profiles to
    depend on the implementation in linkerd-client-policy.
  • Separate the traffic split middleware into fixed and dynamic
    (updated from discovery) traffic splits, because updates to the set of
    routes in a client policy will be handled by the policy router.
  • Build fixed traffic splits for each set of backends defined in an
    HTTPRoute rule in the client policy stack (a rough sketch of such a
    split follows this list).
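
As a rough illustration of what a fixed split over an HTTPRoute rule's backends amounts to, here is a minimal sketch. The `Backend` type is invented for the example, the real implementation is a middleware layer rather than a free function, and it assumes the `rand` crate:

```rust
use rand::Rng;

// Invented stand-in for a client-policy backend: a logical name plus the
// weight from the HTTPRoute rule's backendRefs.
#[derive(Debug)]
struct Backend {
    addr: &'static str,
    weight: u32,
}

/// Picks a backend at random, proportionally to its weight. A "fixed"
/// split is built once from a rule's backends; a "dynamic" split would
/// instead rebuild this set whenever discovery pushes an update.
fn pick<'a, R: Rng>(backends: &'a [Backend], rng: &mut R) -> Option<&'a Backend> {
    let total: u32 = backends.iter().map(|b| b.weight).sum();
    if total == 0 {
        return None;
    }
    let mut roll = rng.gen_range(0..total);
    backends.iter().find(|b| {
        if roll < b.weight {
            true
        } else {
            roll -= b.weight;
            false
        }
    })
}

fn main() {
    let backends = [
        Backend { addr: "web-v1.ns.svc.cluster.local:8080", weight: 90 },
        Backend { addr: "web-v2.ns.svc.cluster.local:8080", weight: 10 },
    ];
    let mut rng = rand::thread_rng();
    for _ in 0..5 {
        let backend = pick(&backends, &mut rng).expect("weights are nonzero");
        println!("-> {}", backend.addr);
    }
}
```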

Not Currently Implemented

  • backendRefs that refer to a direct endpoint rather than a logical
    name are currently ignored
  • integration tests for client policy have yet to be written
  • only traffic splitting is currently implemented (no other policies
    can be set)
  • per-route metrics (open question)
  • client policy and Service Profile API lookups currently run
    sequentially, but could be parallelized (with some rewriting of the
    current profile discovery code)

adleong commented Nov 16, 2022

> The API response for server policies includes metadata for both the
> entire server policy and for the individual routes in that policy.
> On the other hand, the client policy API response only includes
> metadata for HTTP routes. Presumably we will want to add metadata for
> the whole policy message as well?

It looks like in the "inbound" API, there is metadata for each authorization and each httproute. This makes sense because they correspond to k8s resources (e.g. to AuthorizationPolicy and HttpRoute). For "outbound", we attach HttpRoute resources to Services directly. We could have a top level metadata field which describes the Service resource, but this doesn't seem super useful to me.

hawkw commented Nov 16, 2022

> We could have a top level metadata field which describes the Service resource, but this doesn't seem super useful to me.

Hmm, yeah, I suppose the service would already be recorded in other labels...

adleong commented Nov 28, 2022

Food for thought:

If I understand correctly, in the proxy we do a profile lookup in order to get a logical addr (i.e. a Service DNS name), but then throw that away and instead do a client policy lookup using the orig dst. The controller then uses that target IP address to look up, in a Service index, the Service name with that IP. This is redundant.

Would it make sense to move the client policy lookup earlier in the proxy: before the lookup of the logical addr? I think this helps us move toward a world where the service profile API isn't used at all when a client policy exists.

hawkw added a commit that referenced this pull request Dec 6, 2022
Depends on #1992

This branch adds a very rough implementation of traffic splitting based
on client policy backends.

Client policy traffic splitting is implemented by moving the existing
traffic split implementation into the `linkerd-client-policy` crate, and
changing it to operate on the `Backend` type. The `Target` structs stored
in a ServiceProfile are replaced with the `Backend` type, so that the
same traffic split layer can work with a list of backends provided by
either a client policy or a ServiceProfile. The `Logical` target type
has new impls of `Param<Vec<Backend>>` and `Param<BackendStream>` that
choose whether to return backends from client policy discovery or from
the ServiceProfile, based on whether a client policy was discovered for
that logical destination.

This could use some polish, and some cases aren't yet implemented
(client policy backends may be either a named address _or_ a direct
endpoint address, which is not supported by ServiceProfiles, so we'll
need to add additional switching code to handle that case). However,
this should be a decent working steel thread.
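
To illustrate the selection logic described above, here is a minimal, self-contained sketch of the `Param<Vec<Backend>>` idea. The `Param` trait, `Logical`, and `Backend` below are simplified local definitions rather than the actual linkerd types, and the `Param<BackendStream>` half is omitted:

```rust
// Simplified local stand-ins for the real linkerd types.
trait Param<T> {
    fn param(&self) -> T;
}

#[derive(Clone, Debug)]
struct Backend(&'static str);

#[derive(Clone)]
struct Logical {
    // Backends from client policy discovery, if a policy was resolved.
    policy_backends: Option<Vec<Backend>>,
    // Backends derived from the ServiceProfile's traffic split targets.
    profile_backends: Vec<Backend>,
}

impl Param<Vec<Backend>> for Logical {
    fn param(&self) -> Vec<Backend> {
        // Prefer the client policy's backends when a policy was discovered
        // for this logical destination; otherwise fall back to the profile.
        self.policy_backends
            .clone()
            .unwrap_or_else(|| self.profile_backends.clone())
    }
}

fn main() {
    let logical = Logical {
        policy_backends: Some(vec![Backend("web-v2")]),
        profile_backends: vec![Backend("web-v1")],
    };
    let backends: Vec<Backend> = logical.param();
    println!("splitting over {:?}", backends);
}
```
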
@hawkw hawkw changed the title from "rough draft of client policy plumbing" to "add header-based routing using the outbound policy API" Dec 6, 2022
hawkw added 22 commits December 6, 2022 13:02
Signed-off-by: Eliza Weisman <[email protected]>
these are served on separate ports
This branch adds a rough draft of a router for selecting client policy
routes. Right now, once a route is selected, it just gets logged as a
demo that route matching works. A follow-up PR will implement HTTPRoute
traffic splitting once a route is matched.

Depends on #2021

hawkw commented Dec 7, 2022

Updated the PR description to reflect the current state of this branch.

Comment on lines 43 to 47
impl<T, N> NewService<T> for NewServiceRouter<N>
where
    T: Param<Option<Receiver>> + Clone,
    N: NewService<(Option<RoutePolicy>, T)> + Clone,
{
Member

Do we really need to handle these as optional? We should use the stack to enforce that these are present; and all requests must have routes that go through this service, so I'd rather not promulgate optionality to the inner stack.

Member

I know this is different from how it's handled in service profiles; but I think we want to drive towards a state where there is a route description on all requests.

Contributor Author

The route is no longer optional as of fc32de7; the policy rx is still an Option, but I could change this to operate on a separate target type (there's already a TODO about this)...
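
For illustration, the difference under discussion boils down to the bound on the inner stack. The `NewService` trait, `RoutePolicy`, and `Inner` below are simplified local stand-ins, not the real linkerd-stack definitions:

```rust
// Simplified local stand-in for the stack trait involved.
trait NewService<T> {
    type Service;
    fn new_service(&self, target: T) -> Self::Service;
}

#[derive(Clone, Debug)]
struct RoutePolicy;

// Before: the inner stack's target carries an Option, so every inner layer
// has to account for the "no route matched" case.
fn build_with_optional_route<N, T>(inner: &N, route: Option<RoutePolicy>, target: T) -> N::Service
where
    N: NewService<(Option<RoutePolicy>, T)>,
{
    inner.new_service((route, target))
}

// After: the router only builds an inner service once a route has matched,
// so optionality never reaches the inner stack.
fn build_with_route<N, T>(inner: &N, route: RoutePolicy, target: T) -> N::Service
where
    N: NewService<(RoutePolicy, T)>,
{
    inner.new_service((route, target))
}

// A trivial inner stack used to exercise the sketch.
struct Inner;

impl<T: std::fmt::Debug> NewService<(RoutePolicy, T)> for Inner {
    type Service = String;
    fn new_service(&self, (route, target): (RoutePolicy, T)) -> String {
        format!("service for {:?} / {:?}", route, target)
    }
}

fn main() {
    let svc = build_with_route(&Inner, RoutePolicy, "logical-target");
    println!("{}", svc);
}
```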

@hawkw hawkw force-pushed the eliza/client-policy-api branch from f70e578 to 7255fcd December 9, 2022 23:49
hawkw added 2 commits December 9, 2022 15:50
This reverts commit b2b0bd1. That was a
work-in-progress change that I didn't mean to commit, my bad!
Comment on lines 178 to 184
// for now, the client-policy route stack is just a fixed traffic split
let policy = concrete
    .check_new_service::<(ConcreteAddr, Logical), _>()
    .push_map_target(
        |(concrete, PolicyRoute { route: _, logical }): (ConcreteAddr, PolicyRoute)| {
            (concrete, logical)
        },
    )
    .push(policy::split::layer())
@olix0r olix0r Dec 12, 2022

I may be wrong, but my reading of this is that the policy split creates an inner stack for every backend of every route. Does this mean that multiple routes to a single backend service do not share load balancers/connections? Or where are the load balancers cached if not here?

Contributor Author

Yes, that's correct; we should probably add a caching layer here so that each concrete address's client stack is shared across multiple routes that share that backend.

When we add more forms of per-route policy, though (e.g. different retry policies, HTTPRoute filters, and per-route metrics), we'll want to ensure that the portions of that stack that may differ based on which route is used are before the client stack cache.
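
To sketch the caching idea being discussed (hypothetical types only; the proxy would presumably reuse its existing stack-caching middleware rather than a hand-rolled map like this):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins: a concrete (resolved) address and the client
// stack built for it (in the proxy this would be a load balancer).
type ConcreteAddr = String;

#[derive(Clone)]
struct Client {
    addr: ConcreteAddr,
}

/// Shares one client per concrete address, so two routes splitting to the
/// same backend reuse the same balancer/connections. Per-route policy
/// (retries, filters, metrics) would be layered outside this cache so it
/// is not shared across routes.
#[derive(Clone, Default)]
struct ClientCache {
    inner: Arc<Mutex<HashMap<ConcreteAddr, Client>>>,
}

impl ClientCache {
    fn get_or_build(&self, addr: &str) -> Client {
        let mut cache = self.inner.lock().unwrap();
        cache
            .entry(addr.to_string())
            .or_insert_with(|| Client { addr: addr.to_string() })
            .clone()
    }
}

fn main() {
    let cache = ClientCache::default();
    // Two different routes that both send traffic to the same backend
    // end up sharing one cached client.
    let a = cache.get_or_build("web.ns.svc.cluster.local:8080");
    let b = cache.get_or_build("web.ns.svc.cluster.local:8080");
    assert_eq!(a.addr, b.addr);
    println!("both routes share a client for {}", a.addr);
}
```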

Comment on lines +128 to +154
// Any per-route policy has to go between the split layer and
// the concrete stack cache, since we _don't_ want to share
// client policy across routes that have the same backend!
.push_map_target(
    |(concrete, PolicyRoute { route: _, logical }): (ConcreteAddr, PolicyRoute)| {
        (concrete, logical)
    },
)
.push(policy::split::layer())
.push_on_service(
    svc::layers()
        .push(svc::layer::mk(svc::SpawnReady::new))
        .push(
            rt.metrics
                .proxy
                .stack
                .layer(stack_labels("http", "HTTPRoute")),
        )
        .push(svc::FailFast::layer("HTTPRoute", dispatch_timeout))
        .push_spawn_buffer(buffer_capacity),
)
.check_new_clone::<PolicyRoute>()
.push_map_target(|(route, logical)| {
    tracing::debug!(?route);
    PolicyRoute { route, logical }
})
.push(policy::http::NewServiceRouter::layer())
Member

If the concrete stack is cached, can we omit the cache/buffer around the split so that every route gets its own split?

If so, perhaps it makes sense to pull it all into one layer that is Logical-outside and Concrete-inside... We'll need to consider how/where per-route functionality is added for things like retries/timeouts.

@hawkw hawkw closed this Feb 22, 2023
@olix0r olix0r deleted the eliza/client-policy-api branch March 7, 2023 21:49