[Bug]: Gateway process ignores SIGTERM during Telegram polling recovery — requires SIGKILL #51100

@p3nchan

Description

When a gateway is restarted via launchctl kickstart -k (which sends SIGTERM) while the Telegram polling runner is in a stall/recovery state, the gateway process does not exit. A second SIGTERM also has no effect. Only kill -9 (SIGKILL) terminates the process.

This causes the new gateway process (spawned by launchd) to fail to bind the port, resulting in a period where the gateway is completely unreachable.

Steps to reproduce

  1. Run two gateways on the same Mac (both using the Telegram channel)
  2. Wait for a Telegram polling stall to occur (happens every 1–2 hours in our environment)
  3. During or shortly after a stall recovery, restart the gateway via launchd:
    sudo launchctl kickstart -k system/ai.openclaw.gateway.nyagu
    
  4. Observe that the old process (e.g., PID 8096) does not exit
  5. The new process starts but cannot bind the port
  6. kill <old_pid> (SIGTERM) — still does not exit
  7. Only kill -9 <old_pid> (SIGKILL) works
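
For context, here is one way a Node process can end up in this state. It is not confirmed as the cause here. Registering a SIGTERM handler replaces Node's default terminate-on-SIGTERM behavior, so if the handler waits on work that never finishes, the process stays up until it receives SIGKILL. The sketch below reproduces only that mechanism on a POSIX system. The child script and `demonstrate` are hypothetical and use no openclaw code.

```javascript
'use strict';
const { spawn } = require('node:child_process');

// Hypothetical child: its SIGTERM handler awaits work that never completes,
// standing in for a shutdown path stuck behind a stalled polling runner.
const childSource = `
  const stuck = new Promise(() => {});   // never settles
  setInterval(() => {}, 1000);           // keeps the event loop alive
  process.on('SIGTERM', async () => { await stuck; process.exit(0); });
  console.log('ready');                  // handler is installed before this line
`;

// Sends SIGTERM, waits waitMs, records whether the child is still alive,
// then falls back to SIGKILL and reports how the child actually ended.
function demonstrate(waitMs = 500) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource], {
      stdio: ['ignore', 'pipe', 'inherit'],
    });
    let aliveAfterSigterm = false;
    child.once('error', reject);
    child.once('exit', (code, signal) => resolve({ aliveAfterSigterm, signal }));
    child.stdout.once('data', () => {
      child.kill('SIGTERM');
      setTimeout(() => {
        aliveAfterSigterm = child.exitCode === null && child.signalCode === null;
        child.kill('SIGKILL');
      }, waitMs);
    });
  });
}
```

Calling `demonstrate()` resolves with `aliveAfterSigterm` and the terminating `signal`, which mirrors steps 6 and 7 above in miniature.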

Observed timeline

22:47:46  Telegram polling stall (113s), restarting in 6.86s
22:50:xx  launchctl kickstart -k → SIGTERM sent
22:50:xx  New process PID 40145 spawned, old PID 8096 still alive
22:51:xx  kill 8096 → no effect
22:51:xx  kill -9 8096 → process terminated
22:52:40  New gateway finally binds port and starts

Expected behavior

Gateway should handle SIGTERM gracefully within a bounded time (e.g., 30s). If the Telegram polling runner is blocking shutdown, it should be force-cancelled after a timeout, and the process should exit cleanly.
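
A minimal sketch of what "force-cancelled after a timeout" could look like, assuming the polling runner exposes some async stop function. `stopWithDeadline` and `pollingRunner` are hypothetical names, not openclaw APIs. The helper races the stop call against a deadline and aborts a signal when the deadline passes, so a cooperative runner can cancel in-flight work.

```javascript
'use strict';

// Hypothetical helper: resolves to 'stopped' if stop() finishes in time,
// 'failed' if stop() rejects or throws, and 'timeout' if the deadline passes first.
function stopWithDeadline(stop, ms) {
  const controller = new AbortController();
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => {
      controller.abort(); // ask the runner to cancel in-flight work
      resolve('timeout');
    }, ms);
  });
  const stopped = Promise.resolve()
    .then(() => stop(controller.signal))
    .then(() => 'stopped', () => 'failed');
  // Clear the timer on every path so it never delays a clean exit.
  return Promise.race([stopped, deadline]).finally(() => clearTimeout(timer));
}

// Usage sketch (pollingRunner and its stop() signature are assumptions):
// process.once('SIGTERM', async () => {
//   const outcome = await stopWithDeadline(
//     (signal) => pollingRunner.stop({ signal }), 30_000);
//   process.exit(outcome === 'stopped' ? 0 : 1);
// });
```

The caller decides what to do on `'timeout'`. Exiting non-zero, as in the commented usage, keeps the failure visible to launchd and in logs.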

Suggested fix

Add a shutdown watchdog: if the process hasn't exited within N seconds (e.g., 30s) after receiving SIGTERM, force-exit with process.exit(1).

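process.on('SIGTERM', () => {
  // Arm the watchdog first so it is in place even if gracefulShutdown() throws.
  setTimeout(() => {
    console.error('Shutdown timed out, forcing exit');
    process.exit(1);
  }, 30_000).unref(); // unref(): the timer alone never keeps the process alive
  gracefulShutdown();
});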
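
To check that a watchdog like this bounds shutdown time, a small harness can spawn a child whose graceful shutdown never completes and confirm that it still exits on its own. This is only a sketch on a POSIX system: the child script, `childSource`, and `exitAfterSigterm` are hypothetical, and the stuck shutdown is simulated with a promise that never settles.

```javascript
'use strict';
const { spawn } = require('node:child_process');

// Builds a child script whose SIGTERM handler arms the watchdog and then
// waits forever, standing in for a shutdown that never completes.
const childSource = (watchdogMs) => `
  setInterval(() => {}, 1000); // keeps the event loop alive
  process.on('SIGTERM', async () => {
    setTimeout(() => {
      console.error('Shutdown timed out, forcing exit');
      process.exit(1);
    }, ${watchdogMs}).unref();
    await new Promise(() => {}); // simulated stuck graceful shutdown
  });
  console.log('ready');
`;

// Sends one SIGTERM and reports how, and how quickly, the child exited.
function exitAfterSigterm(watchdogMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource(watchdogMs)], {
      stdio: ['ignore', 'pipe', 'inherit'],
    });
    let sentAt = 0;
    child.once('error', reject);
    child.once('exit', (code, signal) =>
      resolve({ code, signal, elapsedMs: Date.now() - sentAt }));
    child.stdout.once('data', () => {
      sentAt = Date.now();
      child.kill('SIGTERM');
    });
  });
}
```

One caveat: a timer-based watchdog runs on the event loop, so it only helps when the loop is still turning. If the process were blocked in synchronous code, neither the SIGTERM handler nor the timer would run, and the fix would need a different mechanism.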

Environment

  • openclaw v2026.3.13
  • macOS 15.4 (Darwin 25.3.0)
  • Node v24.14.0
  • Gateway managed by launchd (LaunchDaemon)
  • Telegram channel with network.autoSelectFamily: false
