[Bug]: health-monitor stale-socket breaks webhook mode with self-signed certs #39303

### Bug type

Regression (worked before, now fails)

### Summary

Version: 2026.3.2 (stable), Node v22.22.1, arm64

Setup: 3 Telegram bots using webhook mode with self-signed cert (direct IP, nginx reverse proxy on port 8443)

Problem: health-monitor fires stale-socket every ~35 min even in webhook mode, where no long-polling socket exists. When it restarts channels, it calls setWebhook WITHOUT re-uploading the self-signed certificate. Telegram then can't verify the TLS certificate → stops delivering webhooks → all bots go dead.

Logs (49 events in 7 hours):

[health-monitor] [telegram:default] restarting (reason: stale-socket)
[health-monitor] [telegram:dev] restarting (reason: stale-socket)
[health-monitor] [telegram:finance] restarting (reason: stale-socket)
[health-monitor] [whatsapp:default] restarting (reason: stale-socket)
(repeats every ~35 min)
After each restart: getWebhookInfo shows has_custom_certificate: false

Workaround: Cron job every 1 min checks has_custom_certificate and re-registers webhook with cert if needed.
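For reference, the check the cron job performs can be sketched roughly as below. This is a minimal Python sketch, not the actual script in use: the function names and the `reregister` callback are made up for illustration. Only `getWebhookInfo` and its `url` / `has_custom_certificate` fields come from the Telegram Bot API.

```python
import json
import urllib.request
from typing import Callable

API_BASE = "https://api.telegram.org"


def fetch_webhook_info(token: str) -> dict:
    """Call getWebhookInfo and return its `result` object."""
    url = f"{API_BASE}/bot{token}/getWebhookInfo"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    if not payload.get("ok"):
        raise RuntimeError(f"getWebhookInfo failed: {payload}")
    return payload["result"]


def needs_reregistration(info: dict, expected_url: str) -> bool:
    """True when the webhook lost its custom cert or points somewhere else."""
    lost_cert = not info.get("has_custom_certificate", False)
    wrong_url = info.get("url") != expected_url
    return lost_cert or wrong_url


def check_once(
    token: str,
    expected_url: str,
    reregister: Callable[[], None],  # hypothetical: re-runs setWebhook with the cert
    fetch: Callable[[str], dict] = fetch_webhook_info,
) -> bool:
    """Run one check; returns True if the webhook had to be re-registered."""
    info = fetch(token)
    if needs_reregistration(info, expected_url):
        reregister()
        return True
    return False
```

Cron would call `check_once` once per minute for each bot token, with `reregister` doing the actual setWebhook upload.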

Suggested fix (any of these):

1. Skip stale-socket detection in webhook mode (no socket to go stale)
2. Persist + re-upload self-signed cert on channel restart
3. Add config option webhookCertPath so cert is auto-included in every setWebhook call
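To make the suggestions concrete, here is a rough sketch of the intended logic. It is written in Python for brevity and is not OpenClaw code: the `Channel` type, its fields, the staleness threshold and the `certificate_path` key are all hypothetical. Only the proposed `webhookCertPath` option name comes from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Channel:
    # Hypothetical model of a channel as the health monitor might see it.
    name: str
    mode: str  # "polling" or "webhook"
    seconds_since_socket_activity: float
    webhook_url: Optional[str] = None
    webhook_cert_path: Optional[str] = None  # the proposed webhookCertPath option


def is_stale_socket(channel: Channel, stale_after_s: float) -> bool:
    """Fix 1: a webhook channel holds no long-polling socket, so it is never stale."""
    if channel.mode == "webhook":
        return False
    return channel.seconds_since_socket_activity > stale_after_s


def set_webhook_args(channel: Channel) -> dict:
    """Fixes 2 and 3: carry the certificate into every setWebhook call."""
    args = {"url": channel.webhook_url}
    if channel.webhook_cert_path:
        # Hypothetical key; the caller would upload this file as `certificate`.
        args["certificate_path"] = channel.webhook_cert_path
    return args
```

Either change alone would avoid the outage: with fix 1 the restart never happens in webhook mode, and with fixes 2 or 3 a restart no longer drops the certificate.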


### Steps to reproduce

1. Configure 3 Telegram bot accounts with webhook mode using self-signed certificate (direct IP, no domain)
2. Set webhookUrl, webhookPort, webhookPath per account in openclaw.json
3. Manually register webhooks via Telegram Bot API setWebhook with certificate parameter (self-signed PEM)
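4. Wait ~35 minutes for health-monitor to restart the channels, then call getWebhookInfo (it reports has_custom_certificate: false)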
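For step 3, the registration is a multipart/form-data POST to setWebhook with the PEM file in the `certificate` field. A minimal stdlib-only Python sketch of building that request follows. The helper name, the `cert.pem` filename and the example values in the usage note are assumptions. `url` and `certificate` are the Bot API parameter names.

```python
import urllib.request
import uuid
from typing import Optional


def build_set_webhook_request(
    token: str,
    webhook_url: str,
    cert_pem: bytes,
    boundary: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a setWebhook request that uploads a self-signed cert."""
    boundary = boundary or uuid.uuid4().hex
    sep = b"--" + boundary.encode()
    crlf = b"\r\n"
    body = b"".join([
        # Plain form field: the public webhook URL.
        sep + crlf,
        b'Content-Disposition: form-data; name="url"' + crlf + crlf,
        webhook_url.encode() + crlf,
        # File field: the public key certificate in PEM format.
        sep + crlf,
        b'Content-Disposition: form-data; name="certificate"; filename="cert.pem"' + crlf,
        b"Content-Type: application/octet-stream" + crlf + crlf,
        cert_pem + crlf,
        # Closing boundary.
        sep + b"--" + crlf,
    ])
    return urllib.request.Request(
        f"https://api.telegram.org/bot{token}/setWebhook",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending it is then `urllib.request.urlopen(build_set_webhook_request(token, "https://<server-ip>:8443/<webhookPath>", open("cert.pem", "rb").read()))`, repeated once per bot account.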

### Expected behavior

In webhook mode, health-monitor should not perform stale-socket detection (there is no long-polling socket). If it must restart channels, it should re-upload the self-signed certificate when calling setWebhook.


### Actual behavior

health-monitor detects "stale-socket" every ~35 min even in webhook mode. When it restarts channels, it calls setWebhook WITHOUT re-uploading the self-signed certificate. Telegram then reports has_custom_certificate: false, SSL verification fails, and all webhook deliveries stop. All bots become unresponsive.

49 stale-socket events observed in 7 hours:

[health-monitor] [telegram:default] restarting (reason: stale-socket)
[health-monitor] [telegram:dev] restarting (reason: stale-socket)
[health-monitor] [telegram:finance] restarting (reason: stale-socket)
[health-monitor] [whatsapp:default] restarting (reason: stale-socket)
(repeats every ~35 min)


### OpenClaw version

2026.3.2 (build 85377a2)

### Operating system

Ubuntu 24.04 arm64 (AWS EC2 m7g.2xlarge)

### Install method

npm global

### Logs, screenshots, and evidence

```shell
# health-monitor fires stale-socket every ~35 min in webhook mode (49 events in 7 hours):

Mar 07 17:30:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)
Mar 07 17:30:52 [health-monitor] [telegram:dev] restarting (reason: stale-socket)
Mar 07 17:30:52 [health-monitor] [telegram:finance] restarting (reason: stale-socket)
Mar 07 17:30:52 [health-monitor] [whatsapp:default] restarting (reason: stale-socket)
Mar 07 18:05:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)
Mar 07 18:40:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)
Mar 07 19:15:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)
... (continues every ~35 min)

# After each restart, Telegram webhook cert is lost:
# getWebhookInfo shows has_custom_certificate: false
# Telegram stops delivering webhooks → all bots unresponsive

# Workaround: cron job every 1 min re-registers webhook with cert
```
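The ~35 min figure can be checked directly from the timestamps above. A small Python sketch: the helper name is made up, and the year is an assumption because the log lines do not include one.

```python
from datetime import datetime

LOG_LINES = [
    "Mar 07 17:30:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)",
    "Mar 07 17:30:52 [health-monitor] [telegram:dev] restarting (reason: stale-socket)",
    "Mar 07 18:05:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)",
    "Mar 07 18:40:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)",
    "Mar 07 19:15:52 [health-monitor] [telegram:default] restarting (reason: stale-socket)",
]


def restart_intervals_minutes(lines, channel, year=2026):
    """Minutes between consecutive stale-socket restarts of one channel.

    `year` is assumed; it only matters for parsing, not for the intervals.
    """
    stamps = []
    for line in lines:
        if f"[{channel}]" in line and "reason: stale-socket" in line:
            # The first 15 characters are the timestamp, e.g. "Mar 07 17:30:52".
            stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]


print(restart_intervals_minutes(LOG_LINES, "telegram:default"))  # → [35.0, 35.0, 35.0]
```

The restarts are exactly 35 minutes apart, which points to a fixed timer rather than real socket activity.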

### Impact and severity

Affected: All Telegram webhook users with self-signed certificates (direct IP, no domain)
Severity: High (blocks all message delivery)
Frequency: 100% repro, triggers automatically every ~35 minutes
Consequence: All Telegram bots become completely unresponsive after each health-monitor restart. Users receive no replies. Messages sent during the dead window are lost. In our case, 49 stale-socket restarts were logged in 7 hours, affecting all 3 bot accounts.


### Additional information

First known bad version: 2026.3.2 (only version tested).

The issue has two root causes: (1) stale-socket detection runs in webhook mode where no long-polling socket exists, and (2) channel restart calls setWebhook without re-uploading the self-signed certificate.

Temporary workaround: a cron job running every 1 minute that checks getWebhookInfo for has_custom_certificate: false and re-registers the webhook with the cert file if needed.

Suggested fix: add a webhookCertPath config option and include the cert in every setWebhook call, or skip stale-socket detection when webhook mode is active.

