[Bug]: Telegram polling loop hangs indefinitely when downloading large files (no download timeout) #40074
Description
Bug type
Crash (process/app exits or hangs)
Summary
When a large file (photo or document) is sent via Telegram,
the inbound polling loop hangs indefinitely and stops
processing all subsequent messages. The gateway process
stays alive but Billy (or any Telegram bot account) becomes
unresponsive until a full gateway restart.
Root cause (traced in source):
In fetchRemoteMedia → readResponseWithLimit, the response
body is read via a streaming reader.read() loop with no
timeout or AbortController signal. If Telegram's file
download stalls mid-stream (e.g., large file, network
congestion), the loop never resolves and permanently blocks
the Telegram polling loop for that account.
The overflow (maxBytes) check works correctly when
Content-Length is present — but Telegram does not always
send Content-Length for large files, so the size check is
skipped and the blocking download begins.
Relevant code: local-roots-ClyFgL1E.js →
readResponseWithLimit and fetchRemoteMedia
Expected behavior: The download should be aborted after a
reasonable timeout (e.g., 30s) and the message skipped with
a warning — not block the polling loop.
Workaround: Set channels.telegram.mediaMaxMb to a low value
(e.g., 4) so files with a Content-Length header are
rejected before download. Does not help when Content-Length
is absent.
Suggested fix: Pass an AbortSignal with a timeout (e.g.,
AbortSignal.timeout(30_000)) into the fetchWithSsrFGuard
call inside fetchRemoteMedia.
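To make the root cause and the suggested fix concrete, here is a minimal sketch of a bounded body reader. It is not OpenClaw's actual readResponseWithLimit: the names readBodyWithLimit and rejectOnAbort are hypothetical, and the sketch assumes Node 18+ (global ReadableStream, AbortSignal, Buffer). It counts bytes while streaming, so the cap applies even when Content-Length is absent, and it races every reader.read() against the abort signal, so a stalled stream rejects instead of hanging.

```javascript
// Hypothetical helper for illustration only; not OpenClaw's implementation.

// Returns a promise that rejects as soon as the signal aborts.
function rejectOnAbort(signal) {
  return new Promise((_, reject) => {
    if (signal.aborted) {
      reject(signal.reason);
      return;
    }
    signal.addEventListener("abort", () => reject(signal.reason), { once: true });
  });
}

// Reads a web ReadableStream into a Buffer, enforcing a byte cap and an abort signal.
async function readBodyWithLimit(body, { maxBytes, signal }) {
  const reader = body.getReader();
  const aborted = rejectOnAbort(signal);
  aborted.catch(() => {}); // avoid an unhandled rejection if the abort fires after we finish
  const chunks = [];
  let total = 0;
  try {
    while (true) {
      // Without this race, a stalled read() would never settle.
      const { done, value } = await Promise.race([reader.read(), aborted]);
      if (done) break;
      total += value.byteLength;
      // Enforced per chunk, so it does not depend on a Content-Length header.
      if (total > maxBytes) {
        throw new Error(`media exceeds ${maxBytes} bytes`);
      }
      chunks.push(value);
    }
    return Buffer.concat(chunks);
  } catch (err) {
    // Best-effort cancel so the underlying connection can be released.
    reader.cancel(err).catch(() => {});
    throw err;
  } finally {
    reader.releaseLock();
  }
}
```

A caller could pass something like `{ maxBytes: 5 * 1024 * 1024, signal: AbortSignal.timeout(30_000) }`, matching the 30s figure proposed above. Whether fetchWithSsrFGuard forwards a signal to the underlying fetch is not confirmed here, which is why the sketch guards the read loop itself.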
Steps to reproduce
- Set up OpenClaw with a Telegram bot account (polling
  mode)
- Send a photo or file ≥5MB (the default mediaMaxMb) to
  the bot via Telegram, particularly as a document/file
  rather than a compressed photo, so Telegram does not send a
  Content-Length header
- Observe that the bot stops responding to all subsequent
  messages
Expected behavior
The file download should time out after a reasonable period
(e.g. 30s), the oversized/stalled message should be
skipped with a warning logged, and the polling loop should
continue processing subsequent messages normally. If
sendOversizeWarning is enabled, the user should receive a
reply like
Actual behavior
The polling loop for the affected bot account hangs
indefinitely. All subsequent Telegram messages, including
plain text, are never processed. The gateway process
remains alive and healthy (no crash, no OOM), so systemd
sees nothing to restart. The health monitor's stale-socket
check eventually fires (after ~300s) and restarts the
socket layer, but the polling loop remains blocked, so the
restart does not recover the bot.
A full gateway restart (openclaw gateway restart) is
required to recover.
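The expected behavior (skip the stalled message, log a warning, keep polling) can be sketched as a per-update deadline around the handler. This is a hedged illustration, not OpenClaw's polling code: processUpdates, withDeadline, and the handleUpdate callback are hypothetical names, and the warning text is invented for the example.

```javascript
// Hypothetical sketch; names and log format are assumptions, not OpenClaw's code.

// Runs task(signal) but rejects after timeoutMs even if the task ignores the signal.
function withDeadline(task, timeoutMs) {
  const controller = new AbortController();
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`timed out after ${timeoutMs} ms`);
      controller.abort(err); // lets a cooperative task stop its download
      reject(err);
    }, timeoutMs);
  });
  const run = Promise.resolve().then(() => task(controller.signal));
  return Promise.race([run, deadline]).finally(() => clearTimeout(timer));
}

// Handles updates one by one; a failed or stalled update is skipped with a warning.
async function processUpdates(updates, handleUpdate, { timeoutMs, warn = console.warn }) {
  const skipped = [];
  for (const update of updates) {
    try {
      await withDeadline((signal) => handleUpdate(update, signal), timeoutMs);
    } catch (err) {
      warn(`[telegram] skipping update ${update.update_id}: ${err?.message ?? err}`);
      skipped.push(update.update_id);
    }
  }
  return skipped;
}
```

The design choice here is that the loop enforces the deadline itself instead of trusting each handler to honor the signal, so one stuck media download cannot block later text messages.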
OpenClaw version
2026.3.7 (and 2026.3.2)
Operating system
macos
Install method
npm
Logs, screenshots, and evidence
Last successful message processed:
2026-03-08T14:20:38.902Z [telegram] sendMessage ok
chat=8473334963 message=2884
Health monitor detected stale socket and attempted restart
~33 minutes later:
2026-03-08T14:53:04.753Z [health-monitor] [telegram:billy]
health-monitor: restarting (reason: stale-socket)
2026-03-08T14:53:04.818Z [telegram] [billy] starting
provider (@billybtng_bot)
No further activity from the Billy account for the next 83
minutes despite two plain-text "Ping?" messages being sent
by the user. Telegram API confirmed the messages were
queued but never consumed:
Pending updates: 2
update_id: 943947904 → text: "Ping?"
update_id: 943947905 → text: "Ping"
Update offset file (update-offset-billy.json) remained
unchanged from 14:20:38 until manual gateway restart at
16:17, confirming the polling loop was permanently blocked,
not retrying.
Impact and severity
Severity: High
- Complete loss of Telegram responsiveness: all bots on
  the affected account stop responding to every message type
  (text, commands, media) until a manual gateway restart
- Silent failure: no error is surfaced to the user via
  Telegram, no alert is triggered, no log entry marks the
  hang; the only signal is absence of responses
- Health monitor ineffective: the existing recovery
  mechanism (stale-socket restart) does not fix the
  condition, giving a false sense of resilience
- Data loss: Telegram updates consumed during the hung
  polling cycle (in this case 3 updates: IDs
  943947901–943947903) are permanently lost from the queue
  and cannot be replayed after recovery
- Reproducible with a single message: any user with access
  to the bot can inadvertently (or deliberately) trigger a
  full outage by sending one large file
Additional information
No response