[Bug]: macOS launchd: gateway restart can leave stale parent process and runtime/RPC state inconsistent #39074
Closed
Labels
bug, regression
Description
Bug type
Regression (worked before, now fails)
Summary
On macOS, `openclaw gateway` can get into an inconsistent state after network disruption and LaunchAgent restarts.
`openclaw gateway status` reports `Runtime: running` but `RPC probe: failed`, and the process actually listening on the gateway port has a different PID than the one reported as the runtime PID.
This appears to be a service restart / shutdown cleanup issue under launchd.
Steps to reproduce
- Install and run OpenClaw as a macOS LaunchAgent-managed gateway.
- Confirm the healthy baseline:
  - `openclaw gateway status`
  - expected: `Runtime: running` and `RPC probe: ok`
- Trigger a network disruption or route/interface switch while the gateway is active.
  - In my case this happened when the machine’s network path changed after disconnecting an external display / dock.
- Restart the gateway service:
  - `openclaw gateway restart`
- Check gateway state again:
  - `openclaw gateway status`
  - `lsof -nP -iTCP:18789`
- Observe that:
  - `gateway status` may show `Runtime: running`
  - `RPC probe` may fail
  - the PID reported by the service runtime may differ from the PID actually listening on the gateway port
- In this state, replies may fail or the gateway may require another restart / manual cleanup before becoming healthy again.
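The PID comparison in the last steps can be scripted. The following is a minimal sketch, not part of OpenClaw: `listener_pids` and `compare_pids` are hypothetical helper names, and the runtime PID is assumed to be copied by hand from the `openclaw gateway status` output, since its exact format is not shown here.

```shell
#!/bin/sh
# Sketch: check whether the PID reported by the service runtime is the one
# that actually owns the gateway port. Helper names are hypothetical.

# Print the PID(s) listening on the given TCP port (prints nothing if none).
listener_pids() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN -t 2>/dev/null
}

# Usage: compare_pids RUNTIME_PID [LISTENER_PID...]
# Prints "consistent" if RUNTIME_PID is among the listener PIDs,
# otherwise prints "mismatch" and returns non-zero.
compare_pids() {
  runtime_pid=$1
  shift
  for pid in "$@"; do
    if [ "$pid" = "$runtime_pid" ]; then
      echo "consistent"
      return 0
    fi
  done
  echo "mismatch"
  return 1
}
```

For example, with the runtime PID taken from the status output, `compare_pids "$RUNTIME_PID" $(listener_pids 18789)` would print `mismatch` in the broken state described in this report.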
Expected behavior
After a network disruption and a service restart, the gateway should recover to a single healthy state automatically.
Specifically:
- `openclaw gateway status` should report:
  - `Runtime: running`
  - `RPC probe: ok`
- the LaunchAgent runtime PID should match the actual process listening on the gateway port
- no stale parent/child gateway processes should remain
- the gateway should resume handling requests without requiring manual cleanup or repeated restarts
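The first expectation can be checked mechanically. This is a rough sketch that assumes the status output contains the literal strings `Runtime: running` and `RPC probe: ok` as quoted in this report; `status_is_healthy` is a hypothetical helper, not an OpenClaw command.

```shell
#!/bin/sh
# Sketch: succeed only if the status text on stdin contains both healthy
# markers quoted in this report. The helper name is hypothetical.
status_is_healthy() {
  status_text=$(cat)
  printf '%s\n' "$status_text" | grep -q 'Runtime: running' &&
    printf '%s\n' "$status_text" | grep -q 'RPC probe: ok'
}
```

It could be used as `openclaw gateway status | status_is_healthy && echo healthy`. It does not cover the PID match or stale-process expectations, which need a separate check against the port listener.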
Actual behavior
After the network disruption and restart, the gateway could enter an inconsistent state where:
- `openclaw gateway status` reported `Runtime: running`
- `RPC probe` failed
- the PID reported by the LaunchAgent runtime differed from the PID actually listening on `127.0.0.1:18789`
- replies could fail or stall until the service was manually cleaned up and restarted again
I also observed status output like:
RPC probe: failed
gateway closed (1006 abnormal closure (no close frame))
Port 18789 is already in use.
Another process is listening on this port.
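To tell these symptoms apart when triaging, the status text can be matched against the messages quoted above. This is an illustrative sketch only: `classify_status` and its labels are hypothetical, and it assumes the messages appear verbatim in the output.

```shell
#!/bin/sh
# Sketch: map status text on stdin to a coarse symptom label.
# The function name and labels are hypothetical; the matched strings are
# the ones quoted in this report.
classify_status() {
  status_text=$(cat)
  case $status_text in
    *"already in use"*)    echo "port-conflict" ;;  # another process holds the port
    *"RPC probe: failed"*) echo "rpc-failed" ;;     # runtime up, RPC unreachable
    *"RPC probe: ok"*)     echo "healthy" ;;
    *)                     echo "unknown" ;;
  esac
}
```

The port-conflict case is checked first because, in the state reported here, it appears together with `RPC probe: failed` and points more directly at a stale process holding the port.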
OpenClaw version
OpenClaw: `2026.3.2`
Operating system
macOS 26.3
Install method
Node: `/opt/homebrew/Cellar/node/25.6.1/bin/node`
Logs, screenshots, and evidence
Impact and severity
No response
Additional information
No response