fix(transport): add SSE keepalive and configure HTTP server timeouts for proxy chain

## Problem

SSE connections through the proxy chain (Cloudflare → Envoy → Node.js) are killed after ~125 seconds of idle time because no keepalive mechanism exists.

### Evidence from Production Logs

Envoy access logs show every SSE GET request dying at ~125s (125.0-125.1s in the sample below) with the `DR` (Downstream Reset) flag and **0 response body bytes**:

```
[2026-01-23T10:23:54] "GET / HTTP/2" 200 DR 0 0 125032 1 "claude-code/2.1.17"
[2026-01-23T10:26:01] "GET / HTTP/2" 200 DR 0 0 125014 2 "claude-code/2.1.17"
[2026-01-23T10:28:07] "GET / HTTP/2" 200 DR 0 0 125030 2 "claude-code/2.1.17"
[2026-01-23T10:30:13] "GET / HTTP/2" 200 DR 0 0 125009 1 "claude-code/2.1.17"
```
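To check whether the same pattern holds across a larger log sample, the lines can be parsed and filtered. This is a minimal sketch, not part of the proposed fix. `parseAccessLogLine` and `isIdleKill` are hypothetical helpers, and the field order (status, flags, bytes received, bytes sent, duration in ms, one further numeric field) is an assumption inferred from the sample above rather than taken from the Envoy log format configuration.

```typescript
// Hypothetical parser for the access-log lines quoted above.
// Assumed field order: status, response flags, bytes received, bytes sent,
// duration (ms), one further numeric field (ignored here), user agent.
interface AccessLogEntry {
  timestamp: string;
  request: string;
  status: number;
  flags: string;
  bytesReceived: number;
  bytesSent: number;
  durationMs: number;
  userAgent: string;
}

const LINE_PATTERN =
  /^\[([^\]]+)\] "([^"]+)" (\d{3}) (\S+) (\d+) (\d+) (\d+) \d+ "([^"]*)"$/;

function parseAccessLogLine(line: string): AccessLogEntry | null {
  const m = LINE_PATTERN.exec(line.trim());
  if (!m) return null; // line does not match the assumed layout
  return {
    timestamp: m[1],
    request: m[2],
    status: Number(m[3]),
    flags: m[4],
    bytesReceived: Number(m[5]),
    bytesSent: Number(m[6]),
    durationMs: Number(m[7]),
    userAgent: m[8],
  };
}

// True for a downstream reset with no bytes sent after roughly two minutes.
function isIdleKill(entry: AccessLogEntry): boolean {
  return (
    entry.flags === "DR" &&
    entry.bytesSent === 0 &&
    entry.durationMs >= 120_000 &&
    entry.durationMs <= 130_000
  );
}
```

Running every log line through `parseAccessLogLine` and counting the entries where `isIdleKill` returns `true` gives a quick measure of how many streams were cut at the idle limit.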

### Proxy Chain Configuration (Verified)

```
Client → Cloudflare (idle timeout ~100-125s) → Envoy (lb1) → Node.js (192.168.95.53:3333)
```

**Envoy config is correct** (no restrictive timeouts for this route):
- Route: `timeout: 0s`, `idle_timeout: 0s`
- HCM: `stream_idle_timeout: 86400s`
- Cluster: TCP keepalive configured (300s/30s/5 probes)

**Cloudflare is the bottleneck** — kills connections with no data after ~100-125s.

### Sub-Problems

#### 1. No SSE Heartbeat/Keepalive

The MCP SDK's `StreamableHTTPServerTransport` has no built-in ping/heartbeat mechanism. During long tool calls (up to 47s with retries in `enhancedFetch`), the SSE stream is completely idle.

The SSE spec supports comment-based keepalives (any line starting with a colon is a comment):
```
: ping\n\n
```

These are ignored by clients but keep the connection alive through proxies.
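The client-side behavior can be illustrated with a minimal sketch of SSE parsing. `parseSseData` is a hypothetical helper written for this issue, not the MCP SDK's parser. It handles only `data:` fields and `\n` line endings, which is enough to show that comment lines never reach the application.

```typescript
// Minimal SSE parsing sketch: collects `data:` payloads per event and
// skips comment lines, which begin with a colon.
function parseSseData(chunk: string): string[] {
  const events: string[] = [];
  let dataLines: string[] = [];

  for (const line of chunk.split("\n")) {
    if (line === "") {
      // A blank line dispatches the event, if any data was collected.
      if (dataLines.length > 0) events.push(dataLines.join("\n"));
      dataLines = [];
    } else if (line.startsWith(":")) {
      continue; // comment such as ": ping", dropped by the client
    } else if (line.startsWith("data:")) {
      const value = line.slice("data:".length);
      // A single leading space after the colon is not part of the value.
      dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
    }
    // Other fields (event, id, retry) are omitted from this sketch.
  }
  return events;
}
```

Feeding it `": ping\n\ndata: hello\n\n: ping\n\n"` yields a single event with the payload `hello`. The two pings still put bytes on the wire, which is what resets the proxy's idle timer.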

#### 2. Node.js HTTP Server Default Timeouts

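The HTTP server has no explicit timeout configuration, so the Node.js defaults apply:
- `keepAliveTimeout` defaults to **5000ms** (Node.js 24)
- `headersTimeout` defaults to **60000ms**

When more than 5 seconds pass between tool calls on the same HTTP/1.1 connection, Node.js closes the idle socket, and the proxy's next request on that connection can fail with a connection reset.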
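The values in effect can be confirmed on the deployed runtime before and after the change. The snippet below is a small sketch using only `node:http`. `readDefaultTimeouts` is a hypothetical helper name, and the server it creates is never bound to a port.

```typescript
import { createServer } from "node:http";

// Read the timeout values a freshly created server starts with.
// The server is never listened on, so no port or handle is held open.
function readDefaultTimeouts(): {
  keepAliveTimeout: number;
  headersTimeout: number;
  timeout: number;
} {
  const server = createServer();
  return {
    keepAliveTimeout: server.keepAliveTimeout,
    headersTimeout: server.headersTimeout,
    timeout: server.timeout,
  };
}

console.log(readDefaultTimeouts());
```

The printed numbers depend on the Node.js version, so they should be compared against the defaults listed above rather than assumed.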

### Proposed Solution

#### SSE Heartbeat (every 30s)

After the SSE stream is established, send periodic keepalive comments:

```typescript
// In server.ts, once the SSE response headers have been sent.
// Assumes `res` is the Node.js ServerResponse that carries the SSE stream.
const heartbeatInterval = setInterval(() => {
  if (res.writableEnded || res.destroyed) {
    clearInterval(heartbeatInterval);
    return;
  }
  res.write(": ping\n\n"); // SSE comment line, ignored by clients
}, 30000); // Every 30 seconds, well under Cloudflare's ~100s limit

// Clean up when the client or a proxy closes the connection
res.on("close", () => clearInterval(heartbeatInterval));
```

#### HTTP Server Timeouts

```typescript
import * as http from "node:http";

// `app` is the existing request handler already defined in server.ts
const httpServer = http.createServer(app);
httpServer.keepAliveTimeout = 620000; // 620s, above Cloudflare's max (600s Enterprise)
httpServer.headersTimeout = 625000;   // Must be > keepAliveTimeout
httpServer.timeout = 0;               // No socket inactivity timeout for streaming
```
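Because `headersTimeout` has to stay above `keepAliveTimeout`, it may be safer to derive one from the other than to hard-code both. The following is a sketch under that assumption. `applyProxyTimeouts`, `TimeoutConfigurable`, and the 5000ms default margin are hypothetical names and values chosen to match the numbers above. The structural interface means an `http.Server` can be passed directly.

```typescript
// Structural subset of http.Server used here, so the helper is easy to test.
interface TimeoutConfigurable {
  keepAliveTimeout: number;
  headersTimeout: number;
  timeout: number;
}

interface ProxyTimeoutOptions {
  keepAliveTimeoutMs: number;
  // Extra time added on top of keepAliveTimeout for headersTimeout.
  headersTimeoutMarginMs?: number;
}

function applyProxyTimeouts(
  server: TimeoutConfigurable,
  { keepAliveTimeoutMs, headersTimeoutMarginMs = 5000 }: ProxyTimeoutOptions,
): void {
  if (!Number.isFinite(keepAliveTimeoutMs) || keepAliveTimeoutMs <= 0) {
    throw new RangeError("keepAliveTimeoutMs must be a positive number");
  }
  if (!Number.isFinite(headersTimeoutMarginMs) || headersTimeoutMarginMs <= 0) {
    throw new RangeError("headersTimeoutMarginMs must be a positive number");
  }
  server.keepAliveTimeout = keepAliveTimeoutMs;
  // Derived value keeps headersTimeout strictly above keepAliveTimeout.
  server.headersTimeout = keepAliveTimeoutMs + headersTimeoutMarginMs;
  server.timeout = 0; // no socket inactivity timeout for long-lived streams
}
```

Calling `applyProxyTimeouts(httpServer, { keepAliveTimeoutMs: 620000 })` produces the same three values as the snippet above.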

### Fix Checklist

- [ ] Add SSE keepalive ping (`: ping\n\n` every 30s) to Streamable HTTP transport
- [ ] Configure `keepAliveTimeout` on HTTP server (620s for proxy compatibility)
- [ ] Configure `headersTimeout` on HTTP server (> keepAliveTimeout)
- [ ] Set `server.timeout = 0` for SSE streaming support
- [ ] Make heartbeat interval configurable via env var (`GITLAB_SSE_HEARTBEAT_MS`)
- [ ] Add integration test: verify SSE stream survives > 125s with heartbeat
- [ ] Document proxy chain timeout requirements
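For the configurable interval, one possible shape is a small resolver that falls back to 30s on missing or invalid input and clamps the result. This is a sketch only. `resolveHeartbeatMs` and the 1s and 90s bounds are assumptions, with the upper bound chosen to stay below Cloudflare's ~100s idle limit.

```typescript
const DEFAULT_HEARTBEAT_MS = 30_000;
const MIN_HEARTBEAT_MS = 1_000;  // assumed lower bound to avoid flooding the stream
const MAX_HEARTBEAT_MS = 90_000; // assumed upper bound, below the ~100s idle limit

// Resolve the heartbeat interval from an environment-like object.
function resolveHeartbeatMs(env: Record<string, string | undefined>): number {
  const raw = env["GITLAB_SSE_HEARTBEAT_MS"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_HEARTBEAT_MS;

  const parsed = Number(raw);
  // Reject non-numeric and fractional values instead of guessing.
  if (!Number.isInteger(parsed)) return DEFAULT_HEARTBEAT_MS;

  return Math.min(MAX_HEARTBEAT_MS, Math.max(MIN_HEARTBEAT_MS, parsed));
}
```

In `server.ts` this would be called once at startup as `resolveHeartbeatMs(process.env)` and the result passed to `setInterval`.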

### Impact

**Serious**: all SSE connections die after ~2 minutes, even with the transport routing fix (#138) applied.

### Files to Modify

- `src/server.ts` — HTTP server timeout configuration + SSE heartbeat setup

### Relationship

Independent of #138 (transport routing). Can be developed and tested in parallel. Both fixes are required for stable production operation.
