Skip to content

Introduce configurable values for protocol detection#11536

Merged
mateiidavid merged 8 commits intomainfrom
matei/detect-timeout
Nov 2, 2023
Merged

Introduce configurable values for protocol detection#11536
mateiidavid merged 8 commits intomainfrom
matei/detect-timeout

Conversation

@mateiidavid
Copy link
Member

Introduce configurable values for protocol detection

This change allows users to configure protocol detection timeout values
(outbound and inbound). Certain environments may find that protocol
detection inhibits debugging and makes it harder to reason with a
client's behaviour. In such cases (and not only) it may be deseriable to
change the default protocol detection timeout to a higher value than the
default 10s.

Through this change, users may configure their timeout values either
with install-time settings or through annotations; this follows our
usual proxy configuration model. The proxy uses different timeout values
for the inbound and outbound stacks (even though they use the same
default value) and this change respects that by adding two separate
fields.

Signed-off-by: Matei David [email protected]

This change allows users to configure protocol detection timeout values
(outbound and inbound). Certain environments may find that protocol
detection inhibits debugging and makes it harder to reason with a
client's behaviour. In such cases (and not only) it may be deseriable to
change the default protocol detection timeout to a higher value than the
default 10s.

Through this change, users may configure their timeout values either
with install-time settings or through annotations; this follows our
usual proxy configuration model. The proxy uses different timeout values
for the inbound and outbound stacks (even though they use the same
default value) and this change respects that by adding two separate
fields.

Signed-off-by: Matei David <[email protected]>
Signed-off-by: Matei David <[email protected]>
Signed-off-by: Matei David <[email protected]>
@mateiidavid mateiidavid requested a review from a team as a code owner October 26, 2023 13:12
@mateiidavid mateiidavid reopened this Oct 26, 2023
Copy link
Member

@adleong adleong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving users the opportunity to configure this timeout may encourage users to try tweaking this value, especially if they are reacting to seeing protocol detection timeouts. This may lead to a frustrating experience since there isn't really a "right" value for this timeout and changing it is unlikely to do what users want or expect.

What do you think, instead, of having this be a boolean helm value inboundDisableProtocolDetectionTimeout or something along those lines which simply sets the proxy's inbound protocol detection timeout to some absurdly high value (years? max_value? something like that). This should have the same effect but be less confusing.

Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the unit test fixes 👍

Signed-off-by: Matei David <[email protected]>
@mateiidavid mateiidavid merged commit 1e6a019 into main Nov 2, 2023
@mateiidavid mateiidavid deleted the matei/detect-timeout branch November 2, 2023 14:03
olix0r added a commit that referenced this pull request Nov 2, 2023
This edge release fixes two bugs in the Destination controller that could cause
outbound connections to hang indefinitely.

* helm: Introduce configurable values for protocol detection ([#11536])
* destination: Fix GetProfiles error when address is opaque and unmeshed ([#11556])
* destination: Return NotFound for unknown pod names ([#11540])
* proxy: Log controller errors at WARN
* proxy: Fix grpc_status metric labels for inbound traffic

[#11536]: #11536
[#11556]: #11556
[#11540]: #11540
@olix0r olix0r mentioned this pull request Nov 2, 2023
olix0r added a commit that referenced this pull request Nov 2, 2023
This edge release fixes two bugs in the Destination controller that could cause
outbound connections to hang indefinitely.

* helm: Introduce configurable values for protocol detection ([#11536])
* destination: Fix GetProfiles error when address is opaque and unmeshed ([#11556])
* destination: Return NotFound for unknown pod names ([#11540])
* proxy: Log controller errors at WARN
* proxy: Fix grpc_status metric labels for inbound traffic

[#11536]: #11536
[#11556]: #11556
[#11540]: #11540
adleong pushed a commit that referenced this pull request Nov 16, 2023
This change allows users to configure protocol detection timeout values
(outbound and inbound). Certain environments may find that protocol
detection inhibits debugging and makes it harder to reason with a
client's behaviour. In such cases (and not only) it may be deseriable to
change the default protocol detection timeout to a higher value than the
default 10s.

Through this change, users may configure their timeout values either
with install-time settings or through annotations; this follows our
usual proxy configuration model. The proxy uses different timeout values
for the inbound and outbound stacks (even though they use the same
default value) and this change respects that by adding two separate
fields.

Signed-off-by: Matei David <[email protected]>
@adleong adleong mentioned this pull request Nov 16, 2023
adleong added a commit that referenced this pull request Nov 16, 2023
This stable release improves observability for the control plane by adding
additional logging to the destination controller and by adding histograms which
can detect Kubernetes informer lag. It also adds the ability to configure
protocol detection.

* Improved logging in the destination controller by adding the client pod's
  name to the logging context. This will improve visibility into the messages
  sent and received by the control plane from a specific proxy ([#11532])
* helm: Introduce configurable values for protocol detection ([#11536])
* Fixed an issue where the Destination controller could stop processing service
  profile updates, if a proxy subscribed to those updates stops reading them;
  this is a followup to the issue [#11491] fixed in [stable-2.14.2] ([#11546])
* In the Destination controller, added informer lag histogram metrics to track
  whenever the Kubernetes objects watched by the controller are falling behind
  the state in the kube-apiserver ([#11534])
* proxy: Fix grpc_status metric labels for inbound traffic

[stable-2.14.2]: https://github.com/linkerd/linkerd2/releases/tag/stable-2.14.2
[#11532]: #11532
[#11536]: #11536
[#11546]: #11546
[#11534]: #11534

---------

Signed-off-by: Matei David <[email protected]>
Signed-off-by: Alex Leong <[email protected]>
Co-authored-by: Matei David <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants