Skip to content

Use authority override from metadata#458

Merged
olix0r merged 7 commits intomasterfrom
zd/use-auth-override
Apr 21, 2020
Merged

Use authority override from metadata#458
olix0r merged 7 commits intomasterfrom
zd/use-auth-override

Conversation

@zaharidichev
Copy link
Member

@zaharidichev zaharidichev commented Mar 10, 2020

Overwrite the authority of the outbound req based on the data received from dst service. Whenever we averwrite the URI auth, we also overwrite the canonical_dst header and set the request to have an absolute form instead of origin one.

Signed-off-by: Zahari Dichev [email protected]

@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch 2 times, most recently from da211a7 to 2eba86d Compare March 12, 2020 19:41
@olix0r
Copy link
Member

olix0r commented Mar 18, 2020

I have some style comments... but before that: can you elaborate on why this is related to absolute-form? It’s not immediately obvious to me how the two concepts are related.

Cc @seanmonstar

@zaharidichev
Copy link
Member Author

@olix0r Maybe that should be made more explicit, but my reasoning is that if we overwrite the authority, we are sending that to a proxy (nginx). If that is the case the RFC states that:

The absoluteURI form is REQUIRED when the request is being made to a proxy. The proxy is requested to forward the request or service it from a valid cache, and return the response.

@olix0r
Copy link
Member

olix0r commented Mar 18, 2020

How does that interact with HTTP2 upgrading though?

My gut instinct is that we should decouple “overwrite the authority” and “force absolute form “...

If we just overwrite the header without forcing absolute form, do things still work?

Also keep in mind that the gateway is a linkerd implementation detail and not a generalized proxy

@zaharidichev
Copy link
Member Author

How does that interact with HTTP2 upgrading though?

As far as I understand, this only pertains to HTTP1. So if H1 is upgraded to H2, not much changes, then the request is downgraded on the other side it comes in abs form. Is that what you are asking about?

If we just overwrite the header without forcing absolute form, do things still work?

Yes.

Also keep in mind that the gateway is a linkerd implementation detail and not a generalized proxy

Yes that is correct. The only reason I did is because based on my conversations with @seanmonstar I reached the conclusion that setting to abs form is the semantically correct thing to do. That being said, this does not mean my conclusion was right. And I also think we can live without it.

@olix0r
Copy link
Member

olix0r commented Mar 18, 2020

@zaharidichev my concern about absolute-form is mostly about how it may interact with client caching (since absolute-form is a client-wide setting). Let me dig in a little more today so I can make a reasonable recommendation.

Copy link
Contributor

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to defer to Oliver about whether or absolute-form is the right thing here. In the mean time, I left some low-priority feedback to address if we decide to move forward with this approach.

Comment on lines 51 to 55
let abs_form = if endpoint.abs_form() {
true
} else {
endpoint.inner.settings.was_absolute_form()
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think this is more concisely expressed as

Suggested change
let abs_form = if endpoint.abs_form() {
true
} else {
endpoint.inner.settings.was_absolute_form()
};
let abs_form = endpoint.abs_form() || endpoint.inner.settings.was_absolute_form();

trace!(
header = %orig_proto::L5D_ORIG_PROTO,
was_absolute,
abs_form,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since the name here will be the name of the structured field in the emitted trace event, i have a slight preference for keeping it was_absolute — the meaning is clearer IMO.

Comment on lines 84 to 88
let abs_form = if endpoint.abs_form() {
true
} else {
endpoint.inner.settings.was_absolute_form()
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: similarly, this can be

Suggested change
let abs_form = if endpoint.abs_form() {
true
} else {
endpoint.inner.settings.was_absolute_form()
};
let abs_form = endpoint.abs_form() || endpoint.inner.settings.was_absolute_form();

trace!(
header = %orig_proto::L5D_ORIG_PROTO,
%was_absolute,
%abs_form,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: again, was_absolute will be clearer when reading logs, in my book...

can_upgrade,
inner,
was_absolute,
was_absolute: abs_form,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: and, if we kept the name was_absolute for the local variable, we could just use local variable shorthand here as we did previously...

Comment on lines 134 to 138
let abs_from = if target.abs_form() {
true
} else {
was_absolute_form
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
let abs_from = if target.abs_form() {
true
} else {
was_absolute_form
};
let abs_from = target.abs_form() || was_absolute_form;

}

pub trait ForceAbsForm {
fn abs_form(&self) -> bool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: since this returns a boolean, i would prefer to call it something starting with is_ or should_, like

Suggested change
fn abs_form(&self) -> bool;
fn is_abs_form(&self) -> bool;

also, the meaning of the trait is not super clear — my understanding is that if this returns true, that means that the authority should be converted to absolute form, not that it already is in absolute form, is that correct? it would be nice if we could add comments , or, ideally, rework the naming a bit, to make this clearer?

@olix0r
Copy link
Member

olix0r commented Mar 23, 2020

@zaharidichev In order to get things slightly easier for me to move this forward, is it possible to split this PR such that we have 1 change that only wires the AuthorityOverride metadata through discovery? This change will be easy to ship, and it will also help me to review a narrower change about authority overwriting.

I think I remain -1 on coupling this feature with absolute-form, but it will be easier to propose an alternative once we've got the baseline changes onto master.

@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch from c490f34 to 0d69efc Compare March 24, 2020 11:18
@zaharidichev zaharidichev changed the base branch from master to zd/wire-auth-override March 24, 2020 11:18
@zaharidichev
Copy link
Member Author

@olix0r Thanks for the feedback. I have split it into two PRs. The first one that just wires the metadata coming from the dst API is #462. Then we have this which is a PR agains it that uses the change. I have also addressed the feedback from @hawkw

@zaharidichev zaharidichev force-pushed the zd/wire-auth-override branch from 31d96ec to e7aafd0 Compare March 24, 2020 12:41
@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch from 0d69efc to a35885d Compare March 24, 2020 12:41
#[derive(Clone, Debug)]
pub struct MakeSvc<E, H, M> {
extractor: E,
canonical_dst_header: H,
Copy link
Member

@olix0r olix0r Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprising to me that this is coupled to the canonical dst header. Why is this necessary? Is it possible to move the layer that applies this header so that it's set after the overwrite is performed, for instance?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So.. yes in theory we can do that. It just made sense to me that we have this separate layer that overwrites authoirty and the canonical_dst header as a separate step, even though the canonical_dst was already set. Also, seems to me the canonical_dst is set from the Target.Addr at the moment via header_from_target, which is not modified by the authority_overwrite layer. We can split all of that up. I am not sure coupling the authority overwrite to the canonical_dst troubles me so much as we actually want to overwrite both, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My view is that this module shouldn't know anything about canonical dst headers. What other headers does it need to know about? This shouldn't have to grow to know about arbitrary Linkerd headers. Should this module take a list of arbitrary headers to overwrite? How do we decouple this from this particular linkerd feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it even appropriate to set the canonical dst header, since we're setting a canonical authority anyway? Is it more appropriate to clear it entirely? How do we expect service profiles to interact with this in general?

Copy link
Member Author

@zaharidichev zaharidichev Apr 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olix0r Good questions! So the reason I am overwriting the canonical dst and host at that point is because of the situation that arises when doing traffic splitting. What happens is the following:

Lets say we have a ts that looks like:

apex-svc: backend-zero-svc (on the local cluster)
leafsvc: backend-two-svc-remote (on the remote cluster).

If we do not set the canonical dst the request will leave the proxy with the floowing values for these headers:

l5d-dst-canonical: backend-zero-svc.default.svc.cluster.local:8888
host: backend-zero-svc:8888

So when this request arrives at the remote gateway, it cannot be routed. This is why I decided to overwrite these at that point. Maybe I am missing the context as to whether this is appropriate or not, but it seemed correct to me. Wrt to service profiles, I am not sure thie interferes with them too much. The profiles on the outbound side a resolved before this rewriting is applied. The profiles on the incoming side (the remote cluster) are resolved based on the canonical_dst that we set, which will be equal to the name of the service that actually exists on the remote cluster. So this seems correct to me. I guess the same thing will work if we blank the value of this header, because then the incoming Recognize will rely on the authority of the request. So any suggestion as to how we want to handle that without coupling are more than welcome as I might be missing a great deal of context here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I follow what you're saying. Here's what's tripping me up: within a cluster, service profiles are only ever applied on a logical name (i.e. in the face of a traffic split, the inbound proxy uses the canonical-form logical dst to resolve the service profile). Typically, both the outbound and inbound proxies refer to the same service profile data (though there's some added complexity about why that's not always the case). So, I guess my question is: as a user, what sort of service name would i reference when configuring a service profile for inbound traffic on the remote cluster? Do we have a concrete example of the routing logic in play for a traffic-split that goes through a gateway? That would help me be clearer on this.

My general concern about the coupling with canonical-dst is that if there's one such case, there are likely to be others. Does it make sense to take a list of headers to clear in this case? Or should we support adding a Proxy (as we do for service profile routes) to support adding arbitrary logic to gateway traffic?

@zaharidichev zaharidichev changed the base branch from zd/wire-auth-override to master April 9, 2020 08:38
@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch from a35885d to 2d4186f Compare April 9, 2020 08:42
@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch from 41cfb70 to d65261d Compare April 9, 2020 10:08
Signed-off-by: Zahari Dichev <[email protected]>
@zaharidichev zaharidichev force-pushed the zd/use-auth-override branch from ffd67d8 to dae1707 Compare April 14, 2020 09:41
olix0r and others added 2 commits April 20, 2020 22:44
* Remove the IntoAbsForm trait

This change modifies the `HasSettings` trait to return a `Settings`
(rather than a reference). The `Settings` type is a simple type with a
few bools, so it's safe to copy freely.

This enables us to implement `HasSettings` for `http::HttpEndpoint` by
overriding the `was_absolute_from` when appropriate without having to
introduce an additional trait

* Replace ExtractAuthority with CanOverrideAuthority (#481)

The extractor pattern is useful when we need to use additional
configuration or when we don't control the type over which we're
extracting a value (like an http::Request). But in the case of authority
overrides, we have all of the information we need on the target type.

This change replaces the ExtractAuthority strategy with a
CanOverrideAuthority trait that must be implemented by targets.

I've also modified logging to be at the debug level (request
modification seems useful to observe when debugging), and I've renamed
some variables to match the tracing output.

I think, as a follow up, I'd prefer that we call this module
`override_authority` (rather than "overwrite"), as this matches the
naming in the controller/API.

I've also moved `layer => Layer::new`, which is the newer idiom we're
targetting.
OrigProtoUpgrade shouldn't have to know about authority overrides. This
change uses the `HasSettings` implementation on endpoint to hide this
detail.
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sticking with this, @zaharidichev !

@olix0r olix0r requested review from a team and removed request for adleong April 21, 2020 16:51
} => Settings::Http1 {
keep_alive,
wants_h1_upgrade,
// Always use absolute form when an onverride is present.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo:

Suggested change
// Always use absolute form when an onverride is present.
// Always use absolute form when an override is present.

@olix0r olix0r merged commit c72c5f5 into master Apr 21, 2020
@olix0r olix0r deleted the zd/use-auth-override branch April 21, 2020 22:15
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Apr 22, 2020
This release introduces a per-endpoint authority-override feature. This
is driven by the destination controller and is needed to support
mutli-cluster gateways.

---

* Update to Rust 1.42.0 (linkerd/linkerd2-proxy#483)
* Adjust metric description. (linkerd/linkerd2-proxy#484)
* Use authority override from metadata (linkerd/linkerd2-proxy#458)
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Apr 22, 2020
This release introduces a per-endpoint authority-override feature. This
is driven by the destination controller and is needed to support
mutli-cluster gateways.

---

* Update to Rust 1.42.0 (linkerd/linkerd2-proxy#483)
* Adjust metric description. (linkerd/linkerd2-proxy#484)
* Use authority override from metadata (linkerd/linkerd2-proxy#458)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants