[Feature] Add /ready readiness endpoint distinct from /health #27139
Closed as not planned
Labels
stale (marked as stale due to inactivity)
Description
Problem
OpenClaw exposes /health and /healthz endpoints, but these only indicate that the gateway process is alive. They do not reflect whether the system is actually ready to process messages -- i.e., whether channel providers are connected, model providers are reachable, and the memory database is open.
This distinction matters for:
- Docker/Kubernetes deployments: Liveness and readiness probes are fundamentally different concepts. Using /health for both means containers are marked "ready" before channels are connected, leading to dropped messages during startup.
- LaunchAgent/systemd monitoring: Knowing the process is alive is not the same as knowing it can serve requests. A gateway with a crashed Discord provider but a live HTTP listener reports healthy while silently dropping all Discord messages.
- Automated update pipelines: After openclaw update, a post-update script needs to verify the gateway is fully operational, not just that it started without crashing.
Proposal
Add a GET /ready endpoint that returns:
{
  "ready": true,
  "channels": {
    "discord": { "connected": true, "latencyMs": 45 },
    "telegram": { "connected": true, "latencyMs": 120 }
  },
  "memory": {
    "open": true,
    "chunks": 1247
  },
  "uptime": 3600
}

When any critical subsystem is not ready, return HTTP 503 with "ready": false and the failing component.
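To make the 200/503 rule concrete, here is a minimal Python sketch of how a handler might assemble the payload proposed above. It is not OpenClaw's implementation: the function name `build_ready_response`, the `failing` field, and the choice to treat every channel plus the memory database as critical are assumptions for illustration only.

```python
from http import HTTPStatus


def build_ready_response(channels, memory, uptime_s):
    """Return (status_code, payload) for a hypothetical GET /ready handler.

    channels: dict like {"discord": {"connected": True, "latencyMs": 45}}
    memory:   dict like {"open": True, "chunks": 1247}
    """
    # Assumption: every configured channel and the memory DB count as critical.
    failing = [
        f"channels.{name}"
        for name, state in channels.items()
        if not state.get("connected")
    ]
    if not memory.get("open"):
        failing.append("memory")

    ready = not failing
    payload = {
        "ready": ready,
        "channels": channels,
        "memory": memory,
        "uptime": uptime_s,
    }
    if not ready:
        # "failing" is an assumed field name; the proposal only says the
        # response should identify the failing component.
        payload["failing"] = failing

    status = HTTPStatus.OK if ready else HTTPStatus.SERVICE_UNAVAILABLE
    return int(status), payload
```

With a disconnected Discord channel this returns status 503, `"ready": false`, and `["channels.discord"]` under the assumed `failing` key. It only reads state that is already held in memory, so it performs no network or model calls.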
Design considerations
- Auth: Should require the same auth as other gateway RPC endpoints (not unauthenticated like /healthz).
- Lightweight: Must not trigger expensive operations (no embedding calls, no LLM probes). Just check connection state and database handle validity.
- Extensible: Plugins/extensions should be able to register their own readiness checks.
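One possible shape for the extensibility point is a small registry that plugins add named checks to. The sketch below is in Python purely for illustration. `ReadinessRegistry`, `register`, and `evaluate` are hypothetical names, not an existing OpenClaw API, and each check is assumed to be a cheap, synchronous callable that only inspects state already in memory.

```python
from typing import Callable, Dict, Tuple

# A check returns True when its subsystem is ready. By assumption it must be
# cheap: inspect a connection flag or a DB handle, never call out to a model.
ReadinessCheck = Callable[[], bool]


class ReadinessRegistry:
    """Hypothetical registry that core subsystems and plugins register with."""

    def __init__(self) -> None:
        self._checks: Dict[str, ReadinessCheck] = {}

    def register(self, name: str, check: ReadinessCheck) -> None:
        if name in self._checks:
            raise ValueError(f"readiness check already registered: {name}")
        self._checks[name] = check

    def evaluate(self) -> Tuple[bool, Dict[str, bool]]:
        """Run every check; a check that raises is reported as not ready."""
        results: Dict[str, bool] = {}
        for name, check in self._checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                results[name] = False
        return all(results.values()), results
```

A plugin would call something like `registry.register("plugin.example", lambda: client.is_connected)`, and the /ready handler would map the overall flag to 200 or 503. Treating an exception as "not ready" keeps one misbehaving plugin from breaking the endpoint itself.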
Use cases
- Kubernetes: readinessProbe: httpGet: path: /ready ensures pods only receive traffic when fully initialized.
- Post-update verification: curl -f http://localhost:PORT/ready in update scripts to confirm successful restart.
- Monitoring dashboards: Display per-subsystem health status (memory DB, Discord connection, Telegram polling, etc.).
- Multi-agent setups: Orchestrator can check if a specific agent's gateway is ready before routing tasks.
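For the post-update and orchestrator cases, a single curl call may race the restart, so a script would likely poll until the gateway reports ready. The following is a rough Python sketch under stated assumptions: `fetch_ready` and `wait_until_ready` are hypothetical helper names, the response is assumed to be the JSON body proposed above, and authentication is omitted because the issue does not specify the mechanism.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_ready(url, timeout=2.0):
    """GET the URL and return (status, parsed_json); a 503 is a normal result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; a 503 from /ready may still carry a body.
        try:
            body = json.loads(err.read().decode("utf-8"))
        except ValueError:
            body = {}
        return err.code, body


def wait_until_ready(url, attempts=30, delay_s=1.0,
                     fetch=fetch_ready, sleep=time.sleep):
    """Poll until the endpoint answers 200 with "ready": true, or give up."""
    for attempt in range(attempts):
        try:
            status, body = fetch(url)
        except (OSError, ValueError):
            # Connection refused during restart, or a non-JSON (e.g. HTML) body.
            status, body = None, {}
        if status == 200 and isinstance(body, dict) and body.get("ready") is True:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

An update script could then exit non-zero when `wait_until_ready("http://localhost:PORT/ready")` returns `False`, with PORT replaced by the gateway's port. Checking the `ready` field in addition to the status code guards against a 200 response that is not the readiness payload.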
References
- [Bug]: /health and /healthz can return Control UI HTML 200 instead of machine health payload #18446 (/healthz returns incorrect content type)
- Docker deployment discussions in community Discord