[BUG]: fix telegram webhook should wait for abort signal instead of returning #20309

Closed
kesor wants to merge 2 commits into openclaw:main from kesor:fix/telegram-websocket-abort

@kesor (Contributor) commented Feb 18, 2026

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: monitorTelegramProvider returned immediately after starting the webhook server. The gateway interpreted the return as a channel exit and scheduled an automatic restart, and the restart failed with EADDRINUSE because the first webhook server was still bound to the port.
  • Why it matters: Webhook mode is unusable, because the gateway crashes or loops while trying to rebind a port that is already in use.
  • What changed: Added a Promise in the monitor that waits for the abort signal after starting the webhook, keeping the monitor function alive until shutdown. The webhook startup function itself returns immediately with its handle, avoiding deadlocks.
  • What did NOT change (scope boundary): Polling mode, bot creation, webhook registration with Telegram API, webhook startup function signature, or any other channel implementations.
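
A minimal sketch of the "wait for the abort signal" pattern described in the bullets above. It is not the code from this PR. The names WebhookHandle, waitForAbort, monitorWebhook and the startWebhook parameter are hypothetical; only the overall shape (start the webhook, get the handle back at once, then stay pending until abort) comes from the description. Stopping the server from the monitor's finally block is also an assumption, made to keep the example self-contained.

```typescript
// Hypothetical handle type; the real webhook handle is not shown in this PR.
interface WebhookHandle {
  stop(): void;
}

// Resolves once the signal aborts; resolves at once if it already has.
function waitForAbort(signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve) => {
    if (signal.aborted) {
      resolve();
      return;
    }
    signal.addEventListener("abort", () => resolve(), { once: true });
  });
}

// Starts the webhook, then stays pending until shutdown is requested.
async function monitorWebhook(
  startWebhook: () => Promise<WebhookHandle>,
  abortSignal?: AbortSignal,
): Promise<void> {
  // The startup function hands back its handle without blocking.
  const handle = await startWebhook();
  if (!abortSignal) {
    return; // no signal supplied, so there is nothing to wait on
  }
  try {
    await waitForAbort(abortSignal);
  } finally {
    handle.stop(); // assumed cleanup; the real monitor may delegate this
  }
}
```

Keeping the wait in the monitor rather than in the startup function means the caller of startWebhook still gets its handle right away, which is the deadlock the description says it avoids.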

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

User-visible / Behavior Changes

Telegram webhook mode now works correctly without crashing. The gateway no longer attempts to restart the Telegram channel when using webhook mode, eliminating the EADDRINUSE port conflict errors.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux NixOS 25.11
  • Runtime/container: systemd
  • Model/provider: not relevant
  • Integration/channel (if any): Telegram
  • Relevant config (redacted):
{
  "channels": {
    "telegram": {
      "webhookUrl": "https://xxx.xxxx.xxx/telegram/webhook",
      "webhookSecret": "xxxxx",
      "webhookPath": "/telegram/webhook",
      "webhookHost": "127.0.0.1"
    }
  }
}

Steps

  1. Configure Telegram webhook listener (see configuration above)
  2. Start gateway
  3. Observe logs showing webhook starting successfully
  4. Gateway immediately logs "auto-restart attempt 1/10 in 5s"
  5. Second start attempt fails with EADDRINUSE: address already in use 127.0.0.1:8787

Expected

  • Webhook server starts once and stays running until gateway shutdown
  • No automatic restart attempts
  • Telegram messages are received and processed via webhook

Actual (before fix)

  • Webhook server starts successfully
  • Monitor function returns immediately
  • Gateway interprets this as channel exit and schedules restart
  • Second start attempt fails with port conflict
  • Gateway crashes or enters restart loop
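
To illustrate why an early return turns into a restart loop, here is a hypothetical supervisor sketch. The gateway's real restart logic is not shown in this PR, so supervise, returnsImmediately and waitsForAbort are invented names, and the 5s delay seen in the logs is left out. The only assumption carried over from the description is that any return from the monitor is treated as a channel exit.

```typescript
// Hypothetical supervisor loop: every settle of `run` counts as an exit
// and leads to another start, up to `maxRestarts` restarts.
async function supervise(
  run: (signal: AbortSignal) => Promise<void>,
  signal: AbortSignal,
  maxRestarts: number,
): Promise<number> {
  let starts = 0;
  while (!signal.aborted && starts <= maxRestarts) {
    starts += 1;
    try {
      await run(signal);
    } catch {
      // A thrown error (for example EADDRINUSE on rebind) is also an exit.
    }
  }
  return starts; // how many times the channel was started
}

// Pre-fix shape: resolves as soon as the (pretend) server is up.
const returnsImmediately = async (_signal: AbortSignal): Promise<void> => {};

// Post-fix shape: stays pending until the signal aborts.
const waitsForAbort = (signal: AbortSignal): Promise<void> =>
  new Promise<void>((resolve) => {
    if (signal.aborted) return resolve();
    signal.addEventListener("abort", () => resolve(), { once: true });
  });
```

Under these assumptions, supervise(returnsImmediately, signal, 10) starts the channel again and again until the restart budget runs out, while supervise(waitsForAbort, signal, 10) starts it once and only finishes after the signal aborts.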

Actual (after fix)

  • Webhook server starts and monitor stays running
  • No restart attempts
  • Telegram webhook receives and processes messages correctly

Evidence

Attach at least one:

  • Trace/log snippets
  • Test added

Before fix:

2026-02-18T14:09:00.683Z [telegram] [default] starting provider (@XxxXxxBot)
2026-02-18T14:09:01.076Z [telegram] webhook listening on https://xxx.xxxx.xxx/telegram/webhook
2026-02-18T14:09:01.077Z [telegram] [default] auto-restart attempt 1/10 in 5s
2026-02-18T14:09:07.019Z [telegram] [default] starting provider (@XxxXxxBot)
2026-02-18T16:09:07.396+02:00 [openclaw] Uncaught exception: Error: listen EADDRINUSE: address already in use 127.0.0.1:8787

After fix:

2026-02-18T16:15:00.123Z [telegram] [default] starting provider (@XxxXxxBot)
2026-02-18T16:15:00.456Z [telegram] webhook listening on https://xxx.xxxx.xxx/telegram/webhook
[no restart attempts, webhook stays running]

Test added:
Added test "webhook mode waits for abort signal before returning" in monitor.test.ts that verifies the monitor stays running until abort is triggered.
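
For readers without the diff, one way such a check could be written is sketched below. This is not the test from monitor.test.ts: it uses no test framework, and stillPendingAfter, fakeMonitor and checkWaitsForAbort are hypothetical names. The idea is to race the monitor's promise against a short timer to confirm it is still pending, then abort and confirm it settles.

```typescript
// Resolves to true if `p` has not settled within `ms` milliseconds.
async function stillPendingAfter(p: Promise<unknown>, ms: number): Promise<boolean> {
  const pending = Symbol("pending");
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<symbol>((resolve) => {
    timer = setTimeout(() => resolve(pending), ms);
  });
  try {
    return (await Promise.race([p, timeout])) === pending;
  } finally {
    clearTimeout(timer);
  }
}

// Stand-in for the monitor under test (hypothetical, not the real function).
function fakeMonitor(signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve) => {
    if (signal.aborted) return resolve();
    signal.addEventListener("abort", () => resolve(), { once: true });
  });
}

// The check itself: pending before abort, settled after.
async function checkWaitsForAbort(): Promise<{ before: boolean; after: boolean }> {
  const controller = new AbortController();
  const running = fakeMonitor(controller.signal);
  const before = await stillPendingAfter(running, 20);
  controller.abort();
  const after = await stillPendingAfter(running, 20);
  return { before, after };
}
```

A passing run would report before as true and after as false. A timer-based "still pending" check can only show that the promise did not settle within the window, so the window length is a trade-off between test speed and confidence.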

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • Configured Telegram webhook mode with external reverse proxy
    • Started gateway and confirmed webhook server starts and stays running
    • Verified no automatic restart attempts in logs
    • Confirmed Telegram messages are received via webhook
    • Ran existing tests to confirm no regressions
  • Edge cases checked:
    • Gateway shutdown properly stops webhook server
    • Abort signal correctly triggers cleanup
    • Webhook startup function returns handle immediately (no deadlock)
  • What you did not verify:
    • Behavior with multiple Telegram accounts (only tested single account)
    • Interaction with Telegram API rate limits during webhook registration

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Switch to polling mode by removing webhookUrl from Telegram config, or revert the commit
  • Files/config to restore: src/telegram/monitor.ts (primary change)
  • Known bad symptoms reviewers should watch for: Monitor not stopping on shutdown, webhook server not responding to Telegram updates, memory leak from unclosed connections

Risks and Mitigations

  • Risk: If abort signal handling is broken, monitor might not stop cleanly on gateway shutdown
    • Mitigation: Abort signal is checked before waiting; pattern matches other channels; test added to verify behavior
  • Risk: Promise never resolves if abort signal is never fired
    • Mitigation: Only waits when abortSignal is provided and not already aborted; gateway shutdown always fires abort signals

@openclaw-barnacle bot added the "channel: telegram" and "size: XS" labels Feb 18, 2026
@kesor force-pushed the fix/telegram-websocket-abort branch from d026cfe to e0f4c6a February 18, 2026 19:27

@greptile-apps bot (Contributor) left a comment

1 file reviewed, 1 comment

@kesor force-pushed the fix/telegram-websocket-abort branch from e0f4c6a to dade1d0 February 18, 2026 19:28

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d026cfe337

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The webhook server starts successfully but the monitor function was
returning immediately, causing the gateway to think the channel exited
and trigger restarts.

The fix waits for the abort signal in the monitor (not in the webhook
startup function) so the webhook server can return its handle immediately
while the monitor stays running until shutdown.

@kesor force-pushed the fix/telegram-websocket-abort branch from dade1d0 to 8af0292 February 18, 2026 19:32

Add test to ensure the monitor function stays running in webhook mode
until the abort signal is fired, preventing premature returns that
would cause the gateway to restart the channel.

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle bot added the "stale" label Feb 28, 2026

@openclaw-barnacle

Closing due to inactivity.
If you believe this PR should be revived, post in #pr-thunderdome-dangerzone on Discord to talk to a maintainer.
That channel is the escape hatch for high-quality PRs that get auto-closed.
