fix(gateway): distinguish disconnected from stuck in health-monitor restart reason#36436
Merged
steipete merged 2 commits into openclaw:main on Mar 8, 2026
Contributor
Greptile Summary
This PR fixes a logging accuracy issue in the health-monitor restart path. The function
Changes:
Verification:
Confidence Score: 5/5
Last reviewed commit: be13e1f
be13e1f to 8cbb14f
steipete
added a commit
to Sid-Qin/openclaw
that referenced
this pull request
Mar 8, 2026
…estart reason
resolveChannelRestartReason did not handle the "disconnected" evaluation reason explicitly, so it fell through to "stuck". This conflates a clean WebSocket drop (e.g. Discord 1006) with a genuinely stuck channel, making logs misleading and preventing future policy differentiation.
Add "disconnected" to ChannelRestartReason and handle it before the catch-all "stuck" return.
Closes openclaw#36404
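The change described in this commit message can be pictured roughly as follows. This is a minimal sketch, not the code from the PR: the shape of the evaluation object, the full set of reason values, and the mappings for not-running and stale-socket are assumptions. Only the explicit "disconnected" branch placed ahead of the catch-all "stuck" return is taken from the description.

```typescript
// Assumed shapes for illustration; the real definitions live in
// src/gateway/channel-health-policy.ts and may differ.
type ChannelHealthEvaluationReason =
  | "not-running"
  | "stale-socket"
  | "disconnected"
  | "stuck";

// "disconnected" is the member this PR adds to the union.
type ChannelRestartReason =
  | "not-running"
  | "stale-socket"
  | "disconnected"
  | "stuck";

interface ChannelHealthEvaluation {
  healthy: boolean;
  reason: ChannelHealthEvaluationReason;
}

function resolveChannelRestartReason(
  evaluation: ChannelHealthEvaluation,
): ChannelRestartReason {
  // Assumed pre-existing branches (the mapping shown here is hypothetical).
  if (evaluation.reason === "not-running") return "not-running";
  if (evaluation.reason === "stale-socket") return "stale-socket";
  // New explicit branch: a dropped connection is reported as such
  // instead of falling through to the catch-all below.
  if (evaluation.reason === "disconnected") return "disconnected";
  // Catch-all: anything not handled above is reported as "stuck".
  return "stuck";
}
```

Without the third branch, a "disconnected" evaluation would reach the final return and be logged as "stuck", which is the behavior the commit message describes.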
8cbb14f to c8e859e
Contributor
Landed via temp rebase onto main.
Thanks @Sid-Qin!
openperf
pushed a commit
to openperf/moltbot
that referenced
this pull request
Mar 8, 2026
mcaxtr
pushed a commit
to mcaxtr/openclaw
that referenced
this pull request
Mar 8, 2026
Saitop
pushed a commit
to NomiciAI/openclaw
that referenced
this pull request
Mar 8, 2026
GordonSH-oss
pushed a commit
to GordonSH-oss/openclaw
that referenced
this pull request
Mar 9, 2026
jenawant
pushed a commit
to jenawant/openclaw
that referenced
this pull request
Mar 10, 2026
dhoman
pushed a commit
to dhoman/chrono-claw
that referenced
this pull request
Mar 11, 2026
senw-developers
pushed a commit
to senw-developers/va-openclaw
that referenced
this pull request
Mar 17, 2026
V-Gutierrez
pushed a commit
to V-Gutierrez/openclaw-vendor
that referenced
this pull request
Mar 17, 2026
alexey-pelykh
pushed a commit
to remoteclaw/remoteclaw
that referenced
this pull request
Mar 22, 2026
alexey-pelykh
added a commit
to remoteclaw/remoteclaw
that referenced
this pull request
Mar 22, 2026
…1796)
* fix(ci): stabilize detect-secrets baseline (cherry picked from commit 08597e8)
* fix(gateway): distinguish disconnected from stuck in health-monitor restart reason
  resolveChannelRestartReason did not handle the "disconnected" evaluation reason explicitly, so it fell through to "stuck". This conflates a clean WebSocket drop (e.g. Discord 1006) with a genuinely stuck channel, making logs misleading and preventing future policy differentiation. Add "disconnected" to ChannelRestartReason and handle it before the catch-all "stuck" return. Closes openclaw#36404 (cherry picked from commit 066d589)
* fix: land health-monitor disconnected reason label (openclaw#36436) (thanks @Sid-Qin) (cherry picked from commit 1e05f14)
* fix: restore Telegram webhook-mode health after restarts
  Landed from contributor PR openclaw#39313 by @fellanH. Co-authored-by: Felix Hellström <[email protected]> (cherry picked from commit 9d7d961)
* fix(chat): preserve sender labels in dashboard history (cherry picked from commit 930caea)
* refactor(channels): share native command session targets (cherry picked from commit e381ab6)
---------
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: SidQin-cyber <[email protected]>
Co-authored-by: Felix Hellström <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Summary
- resolveChannelRestartReason does not handle the disconnected evaluation reason explicitly; it falls through to "stuck", conflating a clean WebSocket drop (e.g. Discord code 1006) with a genuinely stalled channel.
- reason: stuck appears in logs for bots that simply lost their WebSocket connection. This makes triage harder and prevents future policy logic from treating disconnects differently (e.g. faster reconnect, skip cooldown).
- Added "disconnected" to the ChannelRestartReason union type and added an explicit branch in resolveChannelRestartReason that returns "disconnected" when the evaluation reason is "disconnected". Added a corresponding test case.
- Not changed: evaluateChannelHealth, health-monitor scheduling, restart behavior, or any channel provider code. The restart still happens; only the logged/reported reason is now accurate.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Health-monitor logs now show reason: disconnected instead of reason: stuck when a channel's WebSocket connection drops while the bot is otherwise running normally.
Security Impact (required)
No, No, No, No, No
Repro + Verification
Environment
Steps
Expected
reason: disconnected
Actual
reason: stuck before the fix; reason: disconnected after it.
Evidence
TypeScript compiles cleanly. All 8 tests in channel-health-policy.test.ts pass.
New test case:
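The test added by the PR is not reproduced on this page. As a rough illustration of what such a case checks, the sketch below runs a small table of evaluation reasons through a stand-in for resolveChannelRestartReason. The stand-in function, its input shape, and the set of reason values are assumptions made for this example, and plain thrown errors are used in place of whatever test framework the repository uses.

```typescript
// Stand-in for the real function in src/gateway/channel-health-policy.ts.
// Input shape and value sets are assumptions made for this sketch.
type Reason = "not-running" | "stale-socket" | "disconnected" | "stuck";

function resolveChannelRestartReason(evaluation: { reason: Reason }): Reason {
  if (evaluation.reason === "not-running") return "not-running";
  if (evaluation.reason === "stale-socket") return "stale-socket";
  if (evaluation.reason === "disconnected") return "disconnected";
  return "stuck";
}

// Each row pairs an evaluation reason with the restart reason expected for it.
const restartReasonCases: Array<{ input: Reason; expected: Reason }> = [
  { input: "disconnected", expected: "disconnected" }, // the case this PR adds
  { input: "stuck", expected: "stuck" },
  { input: "not-running", expected: "not-running" },
  { input: "stale-socket", expected: "stale-socket" },
];

restartReasonCases.forEach(({ input, expected }) => {
  const actual = resolveChannelRestartReason({ reason: input });
  if (actual !== expected) {
    throw new Error(`reason ${input}: expected ${expected}, got ${actual}`);
  }
});
```

The first row is the one that would have failed before the fix, since a "disconnected" evaluation used to come back as "stuck".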
Human Verification (required)
disconnected evaluation now returns "disconnected" restart reason; stuck evaluation still returns "stuck"; not-running and stale-socket paths unchanged.
Compatibility / Migration
Yes, No, No
Failure Recovery (if this breaks)
"stuck" is restored.
src/gateway/channel-health-policy.ts
Risks and Mitigations
ChannelRestartReason values. Mitigation: Grep confirmed reason is only used in log messages (channel-health-monitor.ts line 153), not in conditional logic.
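Should conditional logic on the restart reason be introduced later, for instance the faster reconnect or skipped cooldown mentioned in the summary, an exhaustive switch is one way to make the compiler flag any union member that goes unhandled. The sketch below is hypothetical throughout: restartCooldownMs does not exist in this PR, the cooldown values are invented for illustration, and the union members other than "disconnected" and "stuck" are assumed.

```typescript
// Assumed union; only "disconnected" and "stuck" are confirmed by the PR text.
type ChannelRestartReason =
  | "not-running"
  | "stale-socket"
  | "disconnected"
  | "stuck";

// Hypothetical policy helper, not part of the PR.
function restartCooldownMs(reason: ChannelRestartReason): number {
  switch (reason) {
    case "disconnected":
      return 0; // illustrative: no cooldown after a clean drop
    case "not-running":
    case "stale-socket":
    case "stuck":
      return 30_000; // illustrative value only
    default: {
      // Compile-time guard: if a new member is added to the union without a
      // matching case, `reason` is no longer `never` here and this line fails
      // to type-check.
      const unhandled: never = reason;
      throw new Error(`unhandled restart reason: ${String(unhandled)}`);
    }
  }
}
```

With a guard like this in place, widening ChannelRestartReason again would surface as a type error at every switch that needs updating, rather than as a silent fall-through.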