gate: Detect disconnected inner services in readiness #2491

Merged
olix0r merged 1 commit into main from ver/gatereadiness
Oct 25, 2023

olix0r (Member) commented Oct 25, 2023

If Gate becomes ready, it assumes the inner service remains ready indefinitely.

Load balancers rely on lazy and redundant readiness checking to avoid disconnected endpoints.

This change fixes the Gate to ensure that the inner service is always polled whenever the gate is polled.

olix0r requested a review from a team as a code owner on October 25, 2023 19:26

olix0r (Member, Author) commented Oct 25, 2023

It looks like this behavior was specifically added in 172523f. Based on my reading of that commit message, it seems like this change fixes this by dropping the stored permit.
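
The readiness behavior discussed in this pull request can be illustrated with a minimal, standard-library-only sketch. It is not the proxy's implementation: the `Ready` trait, the `Endpoint` type, the `poll_ready_sticky` method, and the `readiness_after_disconnect` helper are hypothetical names introduced here, and the model leaves out whatever state the real gate tracks beyond a stored permit. The `permit` field only mirrors the field name visible in the diff. The sketch contrasts a gate that trusts a stored permit with one that clears it and re-polls the inner service on every poll.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Readiness = Poll<Result<(), &'static str>>;

/// Hypothetical stand-in for a service's readiness check (not the proxy's API).
trait Ready {
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Readiness;
}

/// Hypothetical endpoint: ready while connected, pending once disconnected.
/// A real service would also register the waker before returning `Pending`.
struct Endpoint {
    connected: bool,
}

impl Ready for Endpoint {
    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Readiness {
        if self.connected {
            Poll::Ready(Ok(()))
        } else {
            Poll::Pending
        }
    }
}

/// Simplified gate that only models the stored permit.
struct Gate<S> {
    inner: S,
    permit: Poll<()>,
}

impl<S: Ready> Gate<S> {
    fn new(inner: S) -> Self {
        Self { inner, permit: Poll::Pending }
    }

    /// Assumed pre-fix behavior: a stored permit short-circuits the inner poll.
    fn poll_ready_sticky(&mut self, cx: &mut Context<'_>) -> Readiness {
        if self.permit.is_ready() {
            return Poll::Ready(Ok(()));
        }
        self.poll_inner(cx)
    }

    /// Behavior after the change: clear the permit, then always poll the inner service.
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Readiness {
        self.permit = Poll::Pending;
        self.poll_inner(cx)
    }

    /// Polls the inner service and stores a permit only if it is ready.
    fn poll_inner(&mut self, cx: &mut Context<'_>) -> Readiness {
        let ready = self.inner.poll_ready(cx);
        if let Poll::Ready(Ok(())) = ready {
            self.permit = Poll::Ready(());
        }
        ready
    }
}

/// Waker that does nothing, so the sketch can poll without an async runtime.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether each variant still reports ready after the endpoint disconnects,
/// as `(sticky_gate_ready, repolling_gate_ready)`.
fn readiness_after_disconnect() -> (bool, bool) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut sticky = Gate::new(Endpoint { connected: true });
    assert!(sticky.poll_ready_sticky(&mut cx).is_ready());
    sticky.inner.connected = false;
    let sticky_ready = sticky.poll_ready_sticky(&mut cx).is_ready();

    let mut fixed = Gate::new(Endpoint { connected: true });
    assert!(fixed.poll_ready(&mut cx).is_ready());
    fixed.inner.connected = false;
    let fixed_ready = fixed.poll_ready(&mut cx).is_ready();

    (sticky_ready, fixed_ready)
}
```

In this model, `readiness_after_disconnect()` returns `(true, false)`: the sticky gate keeps reporting ready after the endpoint goes away, while the re-polling gate falls back to pending, which is the property a load balancer needs in order to skip a disconnected endpoint.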

Comment on lines +176 to +178
// If we previously polled to ready and acquired a permit, clear it so
// we can reestablish readiness without holding it.
self.permit = Poll::Pending;
Contributor

thanks for the comment, this makes sense to me.

}

#[tokio::test]
async fn gate_repolls_back_to_pending() {
Contributor

i think this behavior is complex enough that it would be nice to have a comment here explaining which specific properties we are testing for.

hawkw (Contributor) commented Oct 25, 2023

olix0r merged commit 4f68425 into main on Oct 25, 2023
olix0r deleted the ver/gatereadiness branch on October 25, 2023 19:55
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Nov 2, 2023
This release includes several bugfixes. Notably, inbound proxies would
not properly reflect grpc-status in metrics by default.

Furthermore, proxies now log warnings when they receive unexpected
error responses from the control plane.

---

* chore: change `rust-toolchain` file to toml format (linkerd/linkerd2-proxy#2487)
* gate: Detect disconnected inner services in readiness (linkerd/linkerd2-proxy#2491)
* Bump ahash to v0.8.5 (linkerd/linkerd2-proxy#2498)
* gate: Fix readiness deadlock (linkerd/linkerd2-proxy#2493)
* Log a warning when the controller clients receive an error (linkerd/linkerd2-proxy#2499)
* inbound: Fix gRPC response classification (linkerd/linkerd2-proxy#2496)

Signed-off-by: Oliver Gould <[email protected]>