Bug: local loopback gateway diagnostics show contradictory unreachable/missing-scope results in 2026.3.13

OpenClaw 2026.3.13 local loopback gateway diagnostic inconsistency report
Date: 2026-03-14
Environment: local gateway on Linux, bind=loopback, gateway.auth.mode=token

Summary
-------
After upgrading to OpenClaw 2026.3.13, the gateway appears operational for real usage (Feishu chat works, Web UI via SSH tunnel works, CLI is usable), but local diagnostic commands report contradictory results.

The most visible symptom is:

  Gateway │ local · ws://127.0.0.1:18789 (local loopback) · unreachable (missing scope: operator.read)

At the same time:
- the gateway service is running
- `openclaw gateway status` shows RPC probe: ok
- some gateway RPC calls succeed
- normal product functionality continues to work

This strongly suggests a regression or inconsistency in the local loopback diagnostic / auth / aggregation path, rather than a true reachability failure.

Version / Environment
---------------------
- OpenClaw version: 2026.3.13
- Commit shown by CLI: 61d171a
- OS: Linux 6.6.117-45.1.oc9.x86_64 (x64)
- Node: 22.22.0
- Gateway mode: local
- Gateway bind: loopback
- Gateway auth mode: token
- Gateway URL: ws://127.0.0.1:18789
- Dashboard URL: http://127.0.0.1:18789/

Observed Behavior
-----------------
1. `openclaw gateway status` reports the gateway is healthy:
   - Runtime: running
   - Listening: 127.0.0.1:18789
   - RPC probe: ok

2. `openclaw status` still reports:
   - Gateway ... unreachable (missing scope: operator.read)

3. `openclaw gateway probe --json` reports:
   - ok: false
   - degraded: false
   - target localLoopback connect.error: timeout
   - rpcOk: false

4. Gateway functionality is still available in practice:
   - Feishu messaging works
   - Web UI via SSH tunnel works
   - local CLI works for many operations

5. RPC behavior is inconsistent across methods:
   - `openclaw gateway call system-presence --json` succeeded and returned JSON
   - `openclaw gateway call health --json` failed with:
     gateway closed (1000 normal closure): no close reason

6. Gateway logs show mixed failure modes for local probe traffic:
   - `missing scope: operator.read` on some RPC methods
   - `handshake timeout` on some loopback probe connections
   - `closed before connect` on others

Why this looks incorrect
------------------------
If the gateway were truly unreachable, all related diagnostics should fail consistently.
Instead, the following all happen in the same environment:
- service is running
- RPC probe says ok
- system-presence succeeds
- probe JSON says timeout
- overview says unreachable / missing scope
- real user-facing functionality works

This suggests the issue is not real gateway unreachability, but an inconsistency between multiple internal diagnostic/auth paths.

Investigation performed
-----------------------
The following checks were performed:

1. Confirmed config state:
   - gateway.mode = local
   - gateway.bind = loopback
   - gateway.auth.mode = token
   - shared token exists in config

2. Confirmed the local CLI device is paired and has full operator scopes.
   The local CLI device has operator token scopes including:
   - operator.read
   - operator.write
   - operator.admin
   - operator.approvals
   - operator.pairing

   This means the problem is not simply that the local CLI device lacks operator.read.

3. Rotated the local CLI device operator token with the same full scopes.
   Result: no change.
   - `openclaw status` still showed missing scope / unreachable
   - `openclaw gateway probe --json` still timed out

4. Restarted the gateway service.
   Result: no change.
   - `openclaw gateway status` still showed RPC probe: ok
   - `openclaw status` still showed unreachable (missing scope: operator.read)
   - `openclaw gateway probe --json` still timed out
   - `openclaw gateway call system-presence --json` still succeeded
   - `openclaw gateway call health --json` still failed with close code 1000

Code-path clues found locally
-----------------------------
Inspection of the installed 2026.3.13 code suggests multiple auth/diagnostic paths are involved.

Relevant observations from local code inspection:

1. For local loopback probe targets, device identity is intentionally disabled in probe logic.
2. For gateway calls on loopback with explicit token/password, device identity is also not attached.
3. Docs and code indicate that `missing scope: operator.read` is treated as a degraded diagnostic state, not necessarily a true connection failure.

This may explain why different code paths produce different outcomes:
- one path can successfully identify / use an operator role
- another path becomes scope-limited or times out
- the overview then aggregates this into an incorrect `unreachable` status
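
To make the suspected divergence concrete, here is a toy model of how it could arise. It is only an illustration and is not taken from the OpenClaw source. The names `Connection` and `check_rpc` are hypothetical, and so is the assumption that scopes come from the device identity while the shared token only authenticates the connection.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical model of one client connection to the gateway."""
    shared_token_ok: bool
    # Assumption: scopes are granted via the paired device identity.
    # Empty when no device identity is attached (as on the loopback probe path).
    device_scopes: frozenset = frozenset()


def check_rpc(conn, required_scope):
    """Return an outcome string for one RPC call under this toy model."""
    if not conn.shared_token_ok:
        return "unauthorized"
    if required_scope is None:
        # Method that needs no operator scope in this model.
        return "ok"
    if required_scope in conn.device_scopes:
        return "ok"
    return f"missing scope: {required_scope}"


# Probe-style connection: shared token only, no device identity attached.
probe_conn = Connection(shared_token_ok=True)
# Paired CLI connection: shared token plus device scopes.
cli_conn = Connection(
    shared_token_ok=True,
    device_scopes=frozenset({"operator.read", "operator.write"}),
)

print(check_rpc(probe_conn, "operator.read"))
print(check_rpc(cli_conn, "operator.read"))
```

Under these assumptions the same gateway answers both connections, yet only the one without device identity reports a missing scope. That would match the observed mix of successes and scope errors, though the real cause may differ.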

Most likely root cause
----------------------
A regression or inconsistency in OpenClaw 2026.3.13 local loopback gateway diagnostics, likely involving one or more of:

1. Probe auth selection on loopback not consistently using the local paired CLI device token
2. Divergent behavior between probe/status aggregation and direct gateway call paths
3. Incorrect summary aggregation that marks the gateway as `unreachable` when at least some local RPC paths are working
4. Method-specific auth / handshake behavior differences (for example `system-presence` succeeds while `health` fails)

Minimal reproduction pattern
----------------------------
1. Run a local loopback gateway with:
   - gateway.mode=local
   - gateway.bind=loopback
   - gateway.auth.mode=token
2. Upgrade to 2026.3.13
3. Run:
   - `openclaw gateway status`
   - `openclaw status`
   - `openclaw gateway probe --json`
   - `openclaw gateway call health --json`
   - `openclaw gateway call system-presence --json`
4. Observe contradictory results across the commands above.
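
The commands in step 3 can be collected in one pass with a small helper along these lines. This is a convenience sketch, not part of OpenClaw. The helper name `run_all` and the 30-second timeout are arbitrary choices, and only the command lines listed above are used.

```python
import subprocess

# The five commands from step 3, exactly as listed in this report.
COMMANDS = [
    ["openclaw", "gateway", "status"],
    ["openclaw", "status"],
    ["openclaw", "gateway", "probe", "--json"],
    ["openclaw", "gateway", "call", "health", "--json"],
    ["openclaw", "gateway", "call", "system-presence", "--json"],
]


def run_all(run=subprocess.run, timeout=30):
    """Run each command and record exit code, stdout, and stderr.

    `run` is injectable so the helper can be exercised without the CLI.
    """
    results = []
    for argv in COMMANDS:
        entry = {"cmd": " ".join(argv), "exit": None, "stdout": "", "stderr": ""}
        try:
            proc = run(argv, capture_output=True, text=True, timeout=timeout)
            entry["exit"] = proc.returncode
            entry["stdout"] = proc.stdout
            entry["stderr"] = proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # CLI not installed, not executable, or the command hung.
            entry["stderr"] = repr(exc)
        results.append(entry)
    return results
```

Calling `run_all()` on the affected host and printing each entry's `cmd`, `exit`, and `stderr` should put the contradictory results side by side in a single capture.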

Representative command outputs
------------------------------
A. `openclaw gateway status`
   Observed key lines:
   - Runtime: running
   - RPC probe: ok
   - Listening: 127.0.0.1:18789

B. `openclaw status`
   Observed key line:
   - Gateway: local · ws://127.0.0.1:18789 (local loopback) · unreachable (missing scope: operator.read)

C. `openclaw gateway probe --json`
   Observed key fields:
   - ok: false
   - degraded: false
   - targets[0].connect.ok: false
   - targets[0].connect.rpcOk: false
   - targets[0].connect.error: "timeout"

D. `openclaw gateway call health --json`
   Observed failure:
   - gateway closed (1000 normal closure): no close reason

E. `openclaw gateway call system-presence --json`
   Observed success:
   - returns presence JSON successfully
   - includes probe-related CLI entries with operator role
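
The contradiction between outputs A, C, and E can be checked mechanically. The sketch below assumes the probe JSON has the shape implied by the field paths in section C (`ok`, `degraded`, `targets[0].connect.*`). The `SAMPLE` fixture is hand-built from those fields and is not a verbatim capture, and the function names are hypothetical.

```python
import json

# Hand-built fixture using only the fields listed in section C.
SAMPLE = json.dumps({
    "ok": False,
    "degraded": False,
    "targets": [{"connect": {"ok": False, "rpcOk": False, "error": "timeout"}}],
})


def summarize_probe(raw):
    """Extract the key fields from `openclaw gateway probe --json` output."""
    data = json.loads(raw)
    targets = data.get("targets") or [{}]
    connect = targets[0].get("connect", {})
    return {
        "ok": data.get("ok"),
        "degraded": data.get("degraded"),
        "connect_ok": connect.get("ok"),
        "rpc_ok": connect.get("rpcOk"),
        "error": connect.get("error"),
    }


def find_contradictions(probe, status_rpc_probe_ok, presence_call_ok):
    """List places where the probe result disagrees with other commands."""
    issues = []
    if status_rpc_probe_ok and probe["rpc_ok"] is False:
        issues.append("gateway status reports RPC probe ok, but probe JSON has rpcOk=false")
    if presence_call_ok and probe["connect_ok"] is False:
        issues.append("system-presence call succeeded, but probe JSON has connect.ok=false")
    return issues


summary = summarize_probe(SAMPLE)
for issue in find_contradictions(summary, status_rpc_probe_ok=True, presence_call_ok=True):
    print(issue)
```

With the values reported here, both checks fire, which is the inconsistency this report describes.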

Practical impact
----------------
- Misleading status reporting after upgrade
- False indication that gateway reachability is broken
- Confusing operator guidance (`Fix reachability first`) even though the product still works
- Makes debugging much harder because diagnostics contradict each other

What would likely be the correct behavior
-----------------------------------------
One of the following should happen consistently:

Option A:
- local loopback diagnostics should consistently use the proper local auth path and succeed

Option B:
- if scope is actually limited, status should report a degraded/partial state rather than `unreachable`

But the current combination of:
- RPC probe ok
- system-presence success
- probe timeout
- health close 1000
- status says unreachable / missing scope
is internally inconsistent.
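
As an illustration of Option B, an aggregation rule could look roughly like the sketch below. This is not the actual OpenClaw status logic. The enum, the function name, and the rule that any successful call or any scope error counts as proof of reachability are all assumptions made for this example.

```python
from enum import Enum


class GatewayState(Enum):
    OK = "ok"
    DEGRADED = "degraded"
    UNREACHABLE = "unreachable"


def aggregate(rpc_results):
    """Classify the gateway from per-method results.

    `rpc_results` maps a method name to None on success or to an error string.
    """
    if not rpc_results:
        return GatewayState.UNREACHABLE
    errors = [err for err in rpc_results.values() if err is not None]
    if not errors:
        return GatewayState.OK
    any_success = len(errors) < len(rpc_results)
    # A scope error is itself a reply from the gateway, so it implies reachability.
    scope_limited = any(err.startswith("missing scope") for err in errors)
    if any_success or scope_limited:
        return GatewayState.DEGRADED
    return GatewayState.UNREACHABLE


observed = {
    "system-presence": None,
    "health": "gateway closed (1000 normal closure): no close reason",
}
print(aggregate(observed).value)
```

Under this rule the results observed in this report would be summarized as degraded, and `unreachable` would be reserved for the case where no call gets any reply.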

Suggested areas to inspect
--------------------------
- loopback probe auth selection logic
- interaction between shared token vs paired device token on loopback
- status overview aggregation logic for probe results
- why `health` and `system-presence` diverge under the same environment
- whether loopback probe incorrectly disables the auth mechanism that would otherwise work

Short conclusion
----------------
This appears to be an OpenClaw 2026.3.13 local loopback diagnostic/auth regression or aggregation bug, not a genuine gateway outage.

The gateway is operational enough for real use, but the diagnostic layer reports contradictory failures and likely misclassifies the gateway as unreachable.

