Skip reconnect for unrecoverable transport disconnects by kevinyang372 · Pull Request #10096 · warpdotdev/warp

kevinyang372 · 2026-05-04T22:04:04Z

Description

We are seeing ~850 Sentry errors in 2 days (WARP-CLIENT-BETA-STABLE-7JNH) from doomed reconnect attempts after SSH disconnects caused by system sleep / network loss.

When the Mac sleeps, the SSH TCP connection dies but the ControlMaster process stays alive locally (no keepalives configured). The remote-server-proxy reader task detects EOF and triggers reconnect, but reconnecting through the same ControlMaster is futile since its TCP connection is dead. Both attempts fail with "Response channel closed before receiving a reply" and the errors get reported to Sentry.

Root cause: the reconnect flow had no way to distinguish a recoverable disconnect (remote server process crashed, SSH connection still alive) from an unrecoverable one (SSH connection itself is dead).

Fix: Add is_reconnectable(exit_status) to the RemoteTransport trait so each transport can decide whether a reconnect is viable. SshTransport returns false when the exit code is 255 (SSH connection-level error) or the process was signal-killed, indicating the ControlMaster connection is dead. mark_session_disconnected consults the transport before entering the reconnect loop.

This is a required trait method (no default impl) so future transports must explicitly consider reconnectability.
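As a rough illustration of the fix described above, the trait method and the SSH implementation might look like the sketch below. Only `is_reconnectable`, `RemoteTransport`, `SshTransport`, `RemoteServerExitStatus`, and the `code` / `signal_killed` fields come from this PR; the field types and the `None` arm are assumptions, since the quoted diff is cut off before that arm.

```rust
/// Simplified stand-in for the exit-status type named in the PR.
/// Field names follow the quoted diff; the exact types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RemoteServerExitStatus {
    /// Exit code of the local transport process, if it exited normally.
    pub code: Option<i32>,
    /// True when the local transport process was terminated by a signal.
    pub signal_killed: bool,
}

pub trait RemoteTransport {
    /// Required method (no default body), so every transport has to
    /// state explicitly whether a reconnect is worth attempting.
    fn is_reconnectable(&self, exit_status: Option<&RemoteServerExitStatus>) -> bool;
}

pub struct SshTransport;

impl RemoteTransport for SshTransport {
    fn is_reconnectable(&self, exit_status: Option<&RemoteServerExitStatus>) -> bool {
        match exit_status {
            // Exit code 255 or a signal kill is taken to mean the
            // ControlMaster connection is dead.
            Some(s) => s.code != Some(255) && !s.signal_killed,
            // ASSUMPTION: the quoted diff does not show this arm. This
            // sketch allows a reconnect when no exit status is known.
            None => true,
        }
    }
}
```

Because the method has no default body, a new transport that forgets to implement it fails to compile rather than silently inheriting SSH-specific behavior.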

Changes

RemoteTransport trait (transport.rs): Added required is_reconnectable method
SshTransport (ssh_transport.rs): Implements is_reconnectable — returns false for exit code 255 / signal kill
RemoteServerManager (manager.rs): mark_session_disconnected calls transport.is_reconnectable() before attempting reconnect; extracted finalize_disconnect helper to deduplicate the disconnect-and-emit pattern
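The manager-side change listed above can be sketched roughly as follows. This is not the code from `manager.rs`: `try_reconnect`, `SessionState`, the `emitted` vector, `FakeTransport`, and the attempt limit are hypothetical stand-ins, with the limit of two inferred from the description's "Both attempts fail". Only the names `mark_session_disconnected`, `finalize_disconnect`, and `is_reconnectable` are taken from the PR.

```rust
/// Simplified exit status; field names follow the quoted diff.
pub struct RemoteServerExitStatus {
    pub code: Option<i32>,
    pub signal_killed: bool,
}

pub trait RemoteTransport {
    fn is_reconnectable(&self, exit_status: Option<&RemoteServerExitStatus>) -> bool;
    /// Hypothetical: performs one reconnect attempt, true on success.
    fn try_reconnect(&mut self) -> bool;
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionState {
    Connected,
    Disconnected,
}

/// Assumed attempt limit for this sketch.
const MAX_RECONNECT_ATTEMPTS: usize = 2;

pub struct RemoteServerManager<T: RemoteTransport> {
    pub transport: T,
    pub state: SessionState,
    /// Stand-in for whatever event mechanism the real manager uses.
    pub emitted: Vec<&'static str>,
}

impl<T: RemoteTransport> RemoteServerManager<T> {
    pub fn mark_session_disconnected(&mut self, exit_status: Option<&RemoteServerExitStatus>) {
        // Ask the transport first; skip the reconnect loop entirely
        // when it reports the disconnect as unrecoverable.
        if !self.transport.is_reconnectable(exit_status) {
            self.finalize_disconnect();
            return;
        }
        for _ in 0..MAX_RECONNECT_ATTEMPTS {
            if self.transport.try_reconnect() {
                self.state = SessionState::Connected;
                return;
            }
        }
        self.finalize_disconnect();
    }

    /// Shared disconnect-and-emit step used by both exit paths.
    fn finalize_disconnect(&mut self) {
        self.state = SessionState::Disconnected;
        self.emitted.push("session_disconnected");
    }
}

/// Fake transport that only exists to exercise the flow above.
pub struct FakeTransport {
    pub reconnectable: bool,
    pub reconnect_succeeds: bool,
    pub attempts: usize,
}

impl RemoteTransport for FakeTransport {
    fn is_reconnectable(&self, _exit_status: Option<&RemoteServerExitStatus>) -> bool {
        self.reconnectable
    }
    fn try_reconnect(&mut self) -> bool {
        self.attempts += 1;
        self.reconnect_succeeds
    }
}
```

With a transport that reports `reconnectable: false`, `attempts` stays at zero and the session goes straight to `Disconnected`, which is the behavior intended to stop the doomed reconnect attempts from reaching Sentry.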

Linked Issue

Sentry: WARP-CLIENT-BETA-STABLE-7JNH (https://warpdotdev.sentry.io/issues/7456268110/)

Testing

cargo clippy passes on both remote_server and warp crates
All 48 remote_server tests pass

Agent Mode

Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Conversation

oz-for-oss · 2026-05-04T22:04:16Z

@kevinyang372

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

kevinyang372 · 2026-05-04T22:04:17Z

Skip reconnect for unrecoverable transport disconnects #10096 👈 (View in Graphite)
master

This stack of pull requests is managed by Graphite. Learn more about stacking.

oz-for-oss

Overview

This PR adds a transport-level reconnectability check and skips the reconnect loop when SSH reports certain exit statuses.

Concerns

SSH exit status 255 is treated as always non-reconnectable, but OpenSSH also returns 255 when a remote command is killed by a signal or terminates without sending an exit-status. Those cases can happen when the remote proxy or daemon dies while the ControlMaster is still usable, so this change can skip the intended reconnect path for recoverable failures.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes


oz-for-oss · 2026-05-04T22:37:35Z

+    /// ControlMaster is futile.
+    fn is_reconnectable(&self, exit_status: Option<&RemoteServerExitStatus>) -> bool {
+        match exit_status {
+            Some(s) => s.code != Some(255) && !s.signal_killed,


⚠️ [IMPORTANT] OpenSSH returns 255 not only for ControlMaster/socket failures, but also when the remote command dies by signal or no exit status is received; treating every 255 as non-reconnectable skips reconnect for recoverable proxy/daemon crashes.
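To make the reviewer's concern concrete, the sketch below models three underlying causes and the status the local `ssh` process would report for each, taking the review comment's claim about exit code 255 at face value. The `Cause` enum and `observed_status` function are hypothetical and exist only for illustration; the predicate body is the one from the quoted diff, with an assumed `None` arm.

```rust
/// Hypothetical underlying causes of a disconnect.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cause {
    /// The SSH connection itself is dead (for example after sleep).
    SshConnectionDead,
    /// The remote command died by signal while SSH stayed usable.
    RemoteCommandKilledBySignal,
    /// The remote command exited on its own with this code.
    RemoteExited(i32),
}

/// Simplified exit status; field names follow the quoted diff.
pub struct RemoteServerExitStatus {
    pub code: Option<i32>,
    pub signal_killed: bool,
}

/// What the local ssh process reports for each cause, following the
/// review comment's claim that both failure kinds surface as 255.
pub fn observed_status(cause: Cause) -> RemoteServerExitStatus {
    let code = match cause {
        Cause::SshConnectionDead => 255,
        Cause::RemoteCommandKilledBySignal => 255,
        Cause::RemoteExited(c) => c,
    };
    // The local ssh process exits normally in all three cases.
    RemoteServerExitStatus { code: Some(code), signal_killed: false }
}

/// Predicate from the quoted diff; the `None` arm is an assumption.
pub fn is_reconnectable(exit_status: Option<&RemoteServerExitStatus>) -> bool {
    match exit_status {
        Some(s) => s.code != Some(255) && !s.signal_killed,
        None => true,
    }
}
```

Under these assumptions the predicate cannot tell the first two causes apart: both are classified as non-reconnectable, even though the second is the recoverable case the reconnect path was meant to handle.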


Only reconnect on reconnect-able errors

48f9b97

cla-bot Bot added the cla-signed label May 4, 2026

kevinyang372 changed the title from "Only reconnect on reconnect-able errors" to "Skip reconnect for unrecoverable transport disconnects" May 4, 2026

kevinyang372 requested a review from moirahuang May 4, 2026 22:25

moirahuang approved these changes May 4, 2026

kevinyang372 merged commit 39ff0d2 into master May 4, 2026
36 checks passed

kevinyang372 deleted the kevin/only-attempt-reconnect-on-reconnectable-errors branch May 4, 2026 22:35

oz-for-oss Bot reviewed May 4, 2026
