Bug Description
Claude Code sends SIGTERM to all stdio-based MCP servers simultaneously, 10–60 seconds after successful connection and handshake. No errors precede the kill; the servers are healthy and actively responding to tool calls. The timeout interval shrinks over the session lifetime (60s → 30s → 10s). The only recovery is a manual /mcp reconnection, and the reconnected servers are killed again shortly afterwards.
This is a systemic issue affecting every stdio MCP server configured in the session. Cloud-hosted MCPs (Gmail, Google Calendar via claude.ai) are unaffected because they use a different transport.
Root Cause Analysis
I deployed three layers of instrumentation to trace the root cause:
1. strace on Claude Code's process tree
sudo strace -p <claude_pid> -e kill,tgkill -f -t
Captured:
1480540 21:57:45 kill(1501128, SIGINT) = 0 # Main Claude PID kills one MCP
1501518 21:57:57 kill(1501129, SIGTERM) = 0 # Child wrapper kills another MCP
1480540 21:58:02 kill(1501627, SIGINT) = 0 # Main Claude PID kills another
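PID 1501518 is a short-lived Claude child process that appears to act as an MCP lifecycle wrapper: one is spawned around each MCP server and deliberately sends SIGTERM to kill it. The main Claude PID (1480540) sends SIGINT to other MCP servers directly.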
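To see at a glance which PID sends which signal, the captured strace lines can be tallied. This is a minimal sketch, assuming the output was saved to a file and that each line has the `<pid> <time> kill(<target>, <signal>)` shape shown above; the function name `summarize_kills` is hypothetical, and `tgkill` lines are ignored.

```shell
#!/bin/bash
# Tally kill() calls per sender PID and signal from saved strace output.
# Assumes lines shaped like: "<pid> <HH:MM:SS> kill(<target>, <SIGNAL>) = 0"
summarize_kills() {
    awk '
        $3 ~ /^kill\(/ {
            sig = $4
            sub(/\).*/, "", sig)          # "SIGINT)" -> "SIGINT"
            count[$1 " " sig]++
        }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort
}
```

Running `summarize_kills strace.log` prints one line per sender and signal (sender PID, signal name, count), which makes it easy to separate kills issued by the main process from those issued by wrapper processes.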
2. Watchdog process monitor
A polling script that tracks MCP child processes by PID and logs when they appear or disappear:
[21:40:25] NEW: PID=1480580 (chrome-devtools-mcp) fd0=socket fd1=/dev/null
[21:40:25] NEW: PID=1480584 (typst-mcp) fd0=socket fd1=socket
[21:40:25] NEW: PID=1480766 (fli-mcp) fd0=socket fd1=socket
[21:40:25] NEW: PID=1480803 (mcp-stdio-proxy.sh) fd0=socket fd1=socket
[21:40:25] NEW: PID=1480840 (outlook-owa) fd0=socket fd1=socket
[21:41:05] GONE: PID=1480584 (typst-mcp) — exit code: 127
[21:41:05] GONE: PID=1480580 (chrome-devtools-mcp) — exit code: 127
[21:41:05] GONE: PID=1480766 (fli-mcp) — exit code: 127
[21:41:05] GONE: PID=1480803 (mcp-stdio-proxy.sh) — exit code: 127
[21:41:05] GONE: PID=1480840 (outlook-owa) — exit code: 127
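All 5 MCP servers disappeared in the same poll, 40 seconds after startup. The watchdog script polls every 10 seconds, so these timestamps are accurate only to within one polling interval.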
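The lifetime of each server can be computed directly from the NEW and GONE lines. The following is a sketch that assumes the abbreviated `[HH:MM:SS] NEW|GONE: PID=<pid> (<name>)` format shown in the excerpt above, with both events on the same day; `mcp_lifetimes` is a hypothetical helper name.

```shell
#!/bin/bash
# Print each MCP server's lifetime in seconds from NEW/GONE watchdog lines.
# Assumes "[HH:MM:SS] NEW: PID=<pid> (<name>) ..." and the matching GONE line.
mcp_lifetimes() {
    awk '
        function secs(ts,   p) {          # "[21:40:25]" -> seconds since midnight
            gsub(/^\[|\]$/, "", ts)
            split(ts, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        $2 == "NEW:"  { start[$3] = secs($1) }
        $2 == "GONE:" && ($3 in start) {
            print $3, $4, (secs($1) - start[$3]) "s"
        }
    ' "$1"
}
```

Identical lifetimes across unrelated servers point at a common external cause rather than independent crashes.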
3. JSON-RPC stdio proxy
A transparent bidirectional proxy that logs all JSON-RPC messages between Claude Code and an MCP server:
[21:40:20] C->S: initialize request
[21:40:20] S->C: initialize response (success, 16 tools listed)
[21:40:20] C->S: notifications/initialized
[21:40:20] C->S: tools/list
[21:40:20] S->C: tools/list response (success)
[21:41:01] PROXY: SIGTERM received
[21:41:01] PROXY: Server died with signal TERM (143)
No errors, no failed requests, no compaction event. Clean SIGTERM 41 seconds after a successful handshake.
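The raw proxy log contains full JSON-RPC payloads, so a condensed view like the one above has to be derived from it. A minimal sketch, assuming the `[timestamp] C->S: <json>` line format written by mcp-stdio-proxy.sh and one JSON message per line; `summarize_proxy_log` is a hypothetical name. Only messages that carry a `"method"` field (requests and notifications) are printed; responses and PROXY lines are skipped.

```shell
#!/bin/bash
# Reduce raw proxy log lines to "[timestamp] direction: method".
# Lines without a "method" field (responses, PROXY messages) are dropped.
summarize_proxy_log() {
    sed -nE \
        's/^\[([^]]+)\] (C->S|S->C): .*"method"[[:space:]]*:[[:space:]]*"([^"]+)".*/[\1] \2: \3/p' \
        "$1"
}
```

The last method printed before the SIGTERM entry shows what the client was doing when the server was killed.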
Hypotheses Ruled Out
| Hypothesis | Evidence | Verdict |
|---|---|---|
| Context compaction | No PostCompact hook fired; happens too early in session | ❌ Eliminated |
| Individual MCP crashes | All 5 die simultaneously with same exit code | ❌ Eliminated |
| MCP server idle timeout | Called tools right after reconnect; still killed 10s later | ❌ Eliminated |
| Hooks killing MCPs | Audited all hooks in ~/.claude/hooks/; none target MCPs | ❌ Eliminated |
| External process (cron/reaper) | Only systemd timer runs at 2am, only targets orphans (PPID=1) | ❌ Eliminated |
Conclusion
Claude Code has an internal stdio timeout/lifecycle mechanism that kills healthy MCP servers. Evidence:
- strace confirms CC spawns a wrapper process per MCP that sends SIGTERM
- CC changelog mentions a fix for "MCP stdio server timeout not killing child process" — confirming this timeout exists by design
- MCP_TIMEOUT env var exists to configure it, but the default appears too aggressive
- The timeout fires even when MCPs are healthy and actively responding
Impact
This effectively breaks the MCP extensibility model for power users. Anyone running multiple stdio MCPs (browser automation, email, calendars, databases, custom tools) loses their entire tool surface repeatedly throughout a session. The failure is silent — no error message, no warning. Tools simply stop being available.
Prior issues reporting this symptom were auto-closed as duplicates or for inactivity, but none identified the root cause:
Reproduction
- Configure 3+ stdio MCP servers in ~/.claude.json
- Start a Claude Code session
- Verify MCPs connect via /mcp
- Wait 10–60 seconds
- All stdio MCPs will disconnect simultaneously
Reproduction instrumentation
mcp-watchdog.sh — polls child processes, detects when MCPs appear/disappear
#!/bin/bash
# Usage: mcp-watchdog.sh <claude_pid>
# Run in a separate terminal. Auto-stops when Claude exits.
LOG="$HOME/.claude/logs/mcp-disconnect-debug.log"
INTERVAL=10
CLAUDE_PID="${1:?Usage: mcp-watchdog.sh <claude_pid>}"
MCP_PATTERNS="chrome-devtools-mcp|typst-mcp|fli-mcp|google-tasks|outlook-owa/server|discord.*server"
mkdir -p "$(dirname "$LOG")"  # the log directory may not exist yet
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] [watchdog] $*" >> "$LOG"; }
declare -A PREV_PIDS
log "=== WATCHDOG STARTED for Claude PID=$CLAUDE_PID ==="
while kill -0 "$CLAUDE_PID" 2>/dev/null; do
    declare -A CURR_PIDS
    # Snapshot the direct children of Claude Code that look like MCP servers.
    for pid in $(pgrep -P "$CLAUDE_PID" 2>/dev/null); do
        CMDLINE=$(cat /proc/$pid/cmdline 2>/dev/null | tr '\0' ' ' | head -c 200)
        [ -z "$CMDLINE" ] && continue
        echo "$CMDLINE" | grep -qE "$MCP_PATTERNS" || continue
        FD0=$(readlink /proc/$pid/fd/0 2>/dev/null || echo "GONE")
        FD1=$(readlink /proc/$pid/fd/1 2>/dev/null || echo "GONE")
        STATE="$CMDLINE|$FD0|$FD1"
        CURR_PIDS[$pid]="$STATE"
        if [ -z "${PREV_PIDS[$pid]}" ]; then
            SHORT=$(echo "$CMDLINE" | grep -oE '[^ ]*mcp[^ ]*|chrome-devtools|typst|google-tasks|outlook|discord' | head -1)
            log " NEW: PID=$pid ($SHORT) fd0=$FD0 fd1=$FD1"
        fi
    done
    # Report PIDs that were present in the previous poll but are missing now.
    for pid in "${!PREV_PIDS[@]}"; do
        if [ -z "${CURR_PIDS[$pid]}" ]; then
            SHORT=$(echo "${PREV_PIDS[$pid]}" | grep -oE '[^ ]*mcp[^ ]*|chrome-devtools|typst|google-tasks|outlook|discord' | head -1)
            log " GONE: PID=$pid ($SHORT) - process disappeared!"
        fi
    done
    # The current snapshot becomes the baseline for the next poll.
    unset PREV_PIDS; declare -A PREV_PIDS
    for pid in "${!CURR_PIDS[@]}"; do PREV_PIDS[$pid]="${CURR_PIDS[$pid]}"; done
    unset CURR_PIDS
    sleep "$INTERVAL"
done
log "=== WATCHDOG STOPPED ==="
mcp-stdio-proxy.sh — logs all bidirectional JSON-RPC traffic between CC and an MCP server
#!/bin/bash
# Usage: mcp-stdio-proxy.sh <logfile> <command> [args...]
# Configure in ~/.claude.json as the MCP command, wrapping the real server.
LOGFILE="${1:?Usage: mcp-stdio-proxy.sh <logfile> <command> [args...]}"
shift; COMMAND="${1:?Missing command}"; shift
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $1: $2" >> "$LOGFILE"; }
log "PROXY" "=== PROXY STARTED (PID=$$, PPID=$PPID) ==="
log "PROXY" "Command: $COMMAND $*"
# Named pipes connect the logging loops to the real server's stdin/stdout.
TMPDIR=$(mktemp -d /tmp/mcp-proxy-XXXXXX)
C2S="$TMPDIR/c2s"; S2C="$TMPDIR/s2c"
mkfifo "$C2S" "$S2C"
cleanup() {
    log "PROXY" "=== CLEANUP (signal=${1:-EXIT}) ==="
    if kill -0 "$SERVER_PID" 2>/dev/null; then
        log "PROXY" "Server still alive at cleanup"
    else
        # Exit codes above 128 mean the server was killed by signal (code - 128).
        wait "$SERVER_PID" 2>/dev/null; ec=$?
        [ "$ec" -gt 128 ] && log "PROXY" "Server died with signal $((ec-128)) ($(kill -l $((ec-128)) 2>/dev/null))"
        [ "$ec" -le 128 ] && [ "$ec" -ne 0 ] && log "PROXY" "Server exited with code $ec"
    fi
    kill "$C2S_PID" "$S2C_PID" "$SERVER_PID" 2>/dev/null
    rm -rf "$TMPDIR"
}
trap 'cleanup TERM' TERM; trap 'cleanup INT' INT
# Keep the proxy's own stdin/stdout on fds 3 and 4 for the forwarding loops.
exec 3<&0; exec 4>&1
"$COMMAND" "$@" < "$C2S" > "$S2C" 2>> "$LOGFILE" &
SERVER_PID=$!
# Client -> server: log each line, then forward it to the server.
( while IFS= read -r line <&3; do log "C->S" "$line"; echo "$line"; done > "$C2S" ) &
C2S_PID=$!
# Server -> client: log each line, then forward it to Claude Code.
( while IFS= read -r line; do log "S->C" "$line"; echo "$line" >&4; done < "$S2C" ) &
S2C_PID=$!
wait "$SERVER_PID" 2>/dev/null; SERVER_EXIT=$?
cleanup "server-exit"; exit "$SERVER_EXIT"
Expected Behavior
- Stdio MCP servers should remain connected for the lifetime of the session unless they crash or the user disconnects them
- If a timeout exists by design, it should only fire when the MCP server is genuinely unresponsive (not responding to ping/heartbeat), not on a wall-clock timer
- MCP_TIMEOUT default should be documented
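To illustrate what a responsiveness-based check could look like, as opposed to a wall-clock timer, here is a rough sketch of a ping probe. It is not how Claude Code works internally. It assumes newline-delimited JSON-RPC over stdio, a server that answers the MCP `ping` request, and a reply that echoes the id without extra whitespace; `probe_server` and the id value `"probe"` are hypothetical names chosen for the example. It requires bash 4 or later for `coproc`.

```shell
#!/bin/bash
# Liveness probe sketch: start a server, send a JSON-RPC "ping", and treat it as
# responsive only if a reply carrying the same id arrives within the timeout.
# Usage: probe_server <timeout_seconds> <command> [args...]
probe_server() {
    local timeout="$1"; shift
    local reply status=1
    # exec replaces the coproc subshell so that kill reaches the server itself.
    coproc SERVER { exec "$@"; }
    local in_fd=${SERVER[0]} out_fd=${SERVER[1]} pid=$SERVER_PID
    printf '%s\n' '{"jsonrpc":"2.0","id":"probe","method":"ping"}' >&"$out_fd"
    if IFS= read -r -t "$timeout" reply <&"$in_fd" \
        && [[ $reply == *'"id":"probe"'* ]]; then
        status=0   # server answered in time
    fi
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    return "$status"
}
```

With a check of this kind, a server that keeps answering pings would never be terminated, while a hung server would be detected after a single missed reply.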
Environment
- Platform: Ubuntu Linux (x86_64)
- Claude Code: 2.1.86
- MCP servers tested: chrome-devtools-mcp, outlook-owa, google-tasks, typst-mcp, flights-mcp (all stdio)
- Not affected: Gmail, Google Calendar (cloud-hosted, different transport)