Health monitor false-positive stale-socket restarts on idle Discord channels #58339
Description
Problem
The channel health monitor restarts idle-but-connected Discord channels every ~35 minutes. When no dispatch events occur within the 30-minute staleEventThresholdMs window, evaluateChannelHealth() returns stale-socket — even though the WebSocket is alive.
Impact: 63 unnecessary restarts over 3 days (23 on Mar 29, 39 on Mar 30, 1 on Mar 31 before fix).
[health-monitor] [discord:default] health-monitor: restarting (reason: stale-socket)
# repeats every ~35 min (30 min threshold + 5 min check interval)
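The ~35 minute cadence falls out of the two timers. A minimal sketch of the arithmetic, assuming checks fire on a fixed interval counted from the last dispatch event and that the comparison is strict (`>`), as in the snippet quoted under Root Cause. `firstStaleCheckMs` is a hypothetical helper for illustration, not a function in the codebase:

```typescript
// Hypothetical helper: elapsed time (ms) at which a periodic health check
// first observes eventAge > staleEventThresholdMs.
// Assumption: the first check fires one interval after the last event.
function firstStaleCheckMs(staleEventThresholdMs: number, checkIntervalMs: number): number {
  if (checkIntervalMs <= 0) {
    throw new Error("checkIntervalMs must be positive");
  }
  let elapsedMs = checkIntervalMs;
  // Strict comparison: a check at exactly the threshold does not trip it.
  while (!(elapsedMs > staleEventThresholdMs)) {
    elapsedMs += checkIntervalMs;
  }
  return elapsedMs;
}

const MINUTE_MS = 60_000;
const restartAfterMs = firstStaleCheckMs(30 * MINUTE_MS, 5 * MINUTE_MS);
console.log(restartAfterMs / MINUTE_MS); // 35
```

The check at exactly 30 minutes passes because 30 is not greater than 30, so the first failing check is the one at 35 minutes.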
Root Cause
In channel-health-policy.ts, the eventAge > staleEventThresholdMs branch returns stale-socket without checking snapshot.connected:
if (eventAge > policy.staleEventThresholdMs) {
  return { healthy: false, reason: "stale-socket" }; // no connected check
}
This conflates "no user activity" with "dead socket". Checking snapshot.connected is sufficient because Discord Gateway uses a single WebSocket for both heartbeats and dispatch events (opcode 0). If the connection were truly dead, heartbeat ACKs would stop and discord.js would set connected = false. There is no scenario where heartbeats succeed but event delivery silently fails.
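The false positive can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the real module: the interfaces `IdleSnapshot`, `IdlePolicy`, and `IdleResult` and the function `evaluateBeforeFix` are hypothetical names, and only the fields mentioned in this issue are modelled.

```typescript
// Assumed, simplified shapes; the real types in channel-health-policy.ts may differ.
interface IdleSnapshot {
  connected: boolean;
  lastEventAt: number; // epoch ms of the last dispatch event
}
interface IdlePolicy {
  now: number; // epoch ms at evaluation time
  staleEventThresholdMs: number;
}
interface IdleResult {
  healthy: boolean;
  reason: string;
}

// Pre-fix behaviour as described above: event age alone decides staleness.
function evaluateBeforeFix(snapshot: IdleSnapshot, policy: IdlePolicy): IdleResult {
  const eventAge = policy.now - snapshot.lastEventAt;
  if (eventAge > policy.staleEventThresholdMs) {
    return { healthy: false, reason: "stale-socket" }; // connected is never consulted
  }
  return { healthy: true, reason: "healthy" };
}

// A channel that is connected but has seen no dispatch events for 35 minutes.
const evaluatedAt = Date.UTC(2026, 2, 30, 12, 0, 0);
const idleButConnected: IdleSnapshot = {
  connected: true,
  lastEventAt: evaluatedAt - 35 * 60_000,
};
const buggyVerdict = evaluateBeforeFix(idleButConnected, {
  now: evaluatedAt,
  staleEventThresholdMs: 30 * 60_000,
});
console.log(buggyVerdict.reason); // "stale-socket", despite connected === true
```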
Fix
  const eventAge = policy.now - snapshot.lastEventAt;
  if (eventAge > policy.staleEventThresholdMs) {
+   if (snapshot.connected === true) {
+     return { healthy: true, reason: "healthy" };
+   }
    return { healthy: false, reason: "stale-socket" };
  }
Result
After deploying at 2026-03-31T00:42Z: 0 stale-socket restarts in 9+ hours (previously ~2/hour). healthVersion reached 215+, confirming repeated healthy evaluations.
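For regression coverage, the patched branch can be exercised with a small self-contained check. This is a sketch under stated assumptions: `Snapshot`, `Policy`, `Verdict`, and `evaluateAfterFix` are hypothetical names, and the final fall-through `return` stands in for whatever other branches the real evaluateChannelHealth() contains.

```typescript
// Assumed, simplified shapes covering only the fields used in the fix.
interface Snapshot {
  connected: boolean;
  lastEventAt: number; // epoch ms of the last dispatch event
}
interface Policy {
  now: number; // epoch ms at evaluation time
  staleEventThresholdMs: number;
}
interface Verdict {
  healthy: boolean;
  reason: "healthy" | "stale-socket";
}

// Post-fix behaviour: an old last event only counts as stale
// when the socket is not reported as connected.
function evaluateAfterFix(snapshot: Snapshot, policy: Policy): Verdict {
  const eventAge = policy.now - snapshot.lastEventAt;
  if (eventAge > policy.staleEventThresholdMs) {
    if (snapshot.connected === true) {
      return { healthy: true, reason: "healthy" };
    }
    return { healthy: false, reason: "stale-socket" };
  }
  // Assumption for this sketch: anything within the threshold is healthy.
  return { healthy: true, reason: "healthy" };
}

const checkTime = Date.UTC(2026, 2, 31, 1, 0, 0);
const thirtyMinPolicy: Policy = { now: checkTime, staleEventThresholdMs: 30 * 60_000 };
const idleAt = checkTime - 35 * 60_000;

const idleConnected = evaluateAfterFix({ connected: true, lastEventAt: idleAt }, thirtyMinPolicy);
const idleDisconnected = evaluateAfterFix({ connected: false, lastEventAt: idleAt }, thirtyMinPolicy);

console.log(idleConnected.reason);    // "healthy"
console.log(idleDisconnected.reason); // "stale-socket"
```

The second case matters as much as the first: a channel that is both idle and disconnected should still be restarted.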