fix(telegram): preserve sticky IPv4 fallback across polling restarts#48184

Open
kiki830621 wants to merge 1 commit into openclaw:main from PsychQuant:fix/sticky-ipv4-polling-restart

Conversation

@kiki830621

Summary

  • Call resolveTelegramFetch() once in the TelegramPollingSession constructor and reuse the result across bot recreations via a new resolvedFetch option
  • The sticky IPv4 fallback state (closure variable stickyIpv4FallbackEnabled) now survives the full polling session lifetime instead of resetting on every restart
  • Pass account network config to polling session so transport is resolved with correct settings

Problem

When autoSelectFamily=true (Node 22+ default) and the host has unstable IPv6 routes to api.telegram.org, the fetch layer correctly detects failures and enables a sticky IPv4-only dispatcher. However, each polling restart calls createTelegramBot(), which invokes resolveTelegramFetch() anew, creating a new fetch closure with stickyIpv4FallbackEnabled = false. This causes a repeating cycle:

IPv6 timeout → sticky IPv4 fallback → works → polling stall (~90s)
→ restart → new transport → sticky lost → IPv6 timeout again

Observed: 47 polling stalls in 20 hours on a host with a residential ISP (unstable IPv6 to Telegram).
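The failure mode can be reduced to a small model (hypothetical names; the real resolveTelegramFetch() wires up undici dispatchers, which are elided here): the sticky flag lives in a closure, so re-resolving the fetch on every restart silently discards it.

```typescript
type FetchLike = (url: string) => Promise<string>;

// Stand-in for resolveTelegramFetch(): sticky state lives in the closure.
function resolveFetchModel(transport: { calls: string[] }): FetchLike {
  let stickyIpv4FallbackEnabled = false; // reset every time this runs
  return async (url) => {
    if (!stickyIpv4FallbackEnabled) {
      transport.calls.push("ipv6-attempt"); // simulated failing IPv6 try
      stickyIpv4FallbackEnabled = true;     // fall back and stay on IPv4
    }
    transport.calls.push("ipv4");
    return `ok:${url}`;
  };
}

async function demoBug() {
  const transport = { calls: [] as string[] };
  // Two polling "restarts", each resolving a fresh fetch closure:
  for (let restart = 0; restart < 2; restart++) {
    const fetchFn = resolveFetchModel(transport); // sticky state lost here
    await fetchFn("getUpdates");
    await fetchFn("getUpdates");
  }
  // Each restart pays the IPv6 failure again:
  console.log(transport.calls.filter((c) => c === "ipv6-attempt").length); // prints 2
}
demoBug();
```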

Fix

TelegramPollingSession now resolves the fetch function once in its constructor and passes it to each createTelegramBot() call via opts.resolvedFetch. The bot uses the pre-resolved fetch (preserving sticky state) when available, falling back to resolveTelegramFetch() for non-polling callers.
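The lifetime change can be sketched as follows (simplified stand-ins for resolveTelegramFetch, createTelegramBot, and TelegramPollingSession, not the real signatures): the session owns one resolved fetch, and the `?? ` fallback keeps the old path for callers that pass nothing.

```typescript
type FetchLike = (url: string) => Promise<string>;

function resolveTelegramFetchModel(log: string[]): FetchLike {
  let stickyIpv4FallbackEnabled = false;
  return async (url) => {
    if (!stickyIpv4FallbackEnabled) {
      log.push("ipv6-probe"); // paid once per resolved closure
      stickyIpv4FallbackEnabled = true;
    }
    return `ok:${url}`;
  };
}

function createBotModel(opts: { resolvedFetch?: FetchLike }, log: string[]) {
  // Use the pre-resolved fetch when provided; otherwise resolve fresh
  // (the path non-polling callers keep taking).
  return { fetchFn: opts.resolvedFetch ?? resolveTelegramFetchModel(log) };
}

class PollingSessionModel {
  #sharedFetch: FetchLike;
  constructor(private log: string[]) {
    this.#sharedFetch = resolveTelegramFetchModel(log); // resolved once
  }
  recreateBot() {
    return createBotModel({ resolvedFetch: this.#sharedFetch }, this.log);
  }
}

async function demoFix() {
  const log: string[] = [];
  const session = new PollingSessionModel(log);
  for (let i = 0; i < 3; i++) {
    await session.recreateBot().fetchFn("getUpdates");
  }
  console.log(log.length); // prints 1: one probe per session, not per restart
}
demoFix();
```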

Test plan

  • Existing fetch.test.ts and polling-session.test.ts still pass
  • On a host with unstable IPv6: verify autoSelectFamily=true (default-node22) is logged only once per session (not after each restart)
  • On a host with unstable IPv6: verify polling stalls no longer trigger repeated IPv6 timeouts
  • Non-polling bot creation (webhook mode, one-shot) is unaffected (no resolvedFetch passed)

Fixes #48177

When autoSelectFamily=true (Node 22+ default) and the host has unstable
IPv6 routes to api.telegram.org, the fetch layer correctly detects
failures and enables a sticky IPv4-only dispatcher. However, each
polling restart recreates the bot via createTelegramBot(), which calls
resolveTelegramFetch() anew, resetting the sticky state. This causes
a repeating cycle: IPv6 timeout → fallback to IPv4 → polling stall →
restart → sticky lost → IPv6 timeout again.

Fix: resolve the Telegram fetch function once in TelegramPollingSession
and reuse it across bot recreations via a new `resolvedFetch` option on
createTelegramBot(). The sticky IPv4 fallback state now survives the
full polling session lifetime.

Fixes openclaw#48177

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR fixes a repeating IPv6-timeout / polling-stall cycle on Node 22+ hosts with unstable IPv6 routes to api.telegram.org by resolving the undici fetch dispatcher once per TelegramPollingSession lifetime instead of on every bot recreation.

Key changes:

  • TelegramPollingSession now calls resolveTelegramFetch() once in its constructor and stores the result in the private #sharedFetch field, so the closure variable stickyIpv4FallbackEnabled (and the lazily-created IPv4 dispatcher) survive across all polling restarts.
  • createTelegramBot accepts a new optional resolvedFetch parameter; when provided, it is used directly instead of calling resolveTelegramFetch() again — correctly short-circuiting the "reset sticky state" path.
  • monitor.ts forwards account.config.network to TelegramPollingSession so the shared fetch is resolved with the same network settings createTelegramBot would have used, keeping transport decisions consistent.
  • Non-polling callers (webhook mode, one-shot scripts) are entirely unaffected — they never pass resolvedFetch, so their code path is unchanged.

The implementation is minimal and well-scoped. The abort-signal wrapper and network-error tagging wrapper in bot.ts are correctly layered on top of whichever fetch is chosen, so the sticky dispatcher state in the shared fetch is preserved while per-request abort signals still work as expected.
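The layering claim can be illustrated with a hedged sketch (withAbort and the fetch shape are hypothetical, not the actual bot.ts wrappers): a per-request abort wrapper composes over the shared fetch without touching its closure state, so aborting one request never disturbs the sticky dispatcher.

```typescript
type FetchLike = (url: string, init?: { signal?: AbortSignal }) => Promise<string>;

// Hypothetical per-request wrapper: layers an abort signal over the shared
// fetch; the shared closure is never recreated.
function withAbort(base: FetchLike, signal: AbortSignal): FetchLike {
  return (url, init) => {
    if (signal.aborted) return Promise.reject(new Error("aborted"));
    return base(url, { ...init, signal });
  };
}

async function demoLayering() {
  let calls = 0; // stands in for sticky state inside the shared closure
  const shared: FetchLike = async (url) => { calls++; return `ok:${url}`; };

  const live = new AbortController();
  await withAbort(shared, live.signal)("getUpdates");

  const dead = new AbortController();
  dead.abort();
  await withAbort(shared, dead.signal)("getUpdates").catch(() => {});

  console.log(calls); // prints 1: the aborted request never reached the shared fetch
}
demoLayering();
```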

Confidence Score: 5/5

  • This PR is safe to merge — the change is narrow, logically sound, and has no impact on non-polling code paths.
  • The fix directly models the correct lifetime for the fetch closure: one fetch instance per polling session rather than one per bot recreation. No existing logic is removed; resolvedFetch is purely additive and the existing ?? fallback guarantees backward compatibility for every non-polling caller. The only external-facing change in monitor.ts is passing a value that was already available on the account config, making the two resolution sites consistent. No edge cases were found that would alter existing behavior.
  • No files require special attention.

Last reviewed commit: 501e27c
