[Bug]: local gateway CLI handshake fails (probe timeout / gateway call closed 1000) even though gateway is running and WS challenge is reachable #45560

@adamji1900

Bug type

Regression (worked before, now fails)

Summary

The local gateway CLI handshake fails on loopback: openclaw gateway probe times out and openclaw gateway call status closes with code 1000, even though the gateway service is running and the dashboard/HTTP endpoint is reachable.

Steps to reproduce

  1. Configure local gateway with token auth and LAN bind.
  2. Confirm gateway is running:

openclaw status

  3. Confirm the local gateway port is listening and HTTP is reachable:

ss -ltnp | grep 18789
curl -sv --max-time 5 http://127.0.0.1:18789/

  4. Run:

openclaw gateway probe --json

  5. Run:

openclaw gateway call status

  6. Retry with explicit token:

openclaw gateway probe --token "$TOKEN" --json
openclaw gateway call status --token "$TOKEN"
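
For anyone re-running the steps above, a minimal Python sketch that executes the non-secret commands in order and captures exit code and output might look like this. The helper names `run_step` and `run_repro` are made up for illustration; the commands themselves are copied from the steps. The `ss | grep` pipeline and the `--token` retries are left out so that no shell and no secret are involved.

```python
import subprocess

# Commands copied from the reproduction steps in this report.
REPRO_COMMANDS = [
    ["openclaw", "status"],
    ["curl", "-sv", "--max-time", "5", "http://127.0.0.1:18789/"],
    ["openclaw", "gateway", "probe", "--json"],
    ["openclaw", "gateway", "call", "status"],
]


def run_step(cmd, timeout=30):
    """Run one command; return its exit code and combined stdout/stderr."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
        return {"cmd": " ".join(cmd), "exit": proc.returncode, "output": proc.stdout}
    except FileNotFoundError:
        return {"cmd": " ".join(cmd), "exit": None, "output": "command not found"}
    except subprocess.TimeoutExpired:
        return {"cmd": " ".join(cmd), "exit": None, "output": "timed out"}


def run_repro(commands=REPRO_COMMANDS):
    """Run every step in order; a failing step does not stop later steps."""
    return [run_step(cmd) for cmd in commands]
```

Calling `run_repro()` and printing each result's `cmd`, `exit`, and `output` gives one transcript that can be attached to the issue.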

Expected behavior

When the local gateway service is running and token auth is configured correctly, both openclaw gateway probe and openclaw gateway call status should connect successfully over loopback and return gateway status instead of timing out or closing.

Actual behavior

• openclaw gateway probe --json reports loopback connect timeout
• openclaw gateway call status fails with:

gateway connect failed: Error: gateway closed (1000):
Gateway call failed: Error: gateway closed (1000 normal closure): no close reason

• openclaw status may still show the gateway service as running, which makes the failure appear contradictory
• HTTP dashboard remains reachable, so this does not appear to be a basic port/service outage

OpenClaw version

2026.3.12

Operating system

Ubuntu 24.04

Install method

npm global

Model

openai-codex/gpt-5.4

Provider / routing chain

openclaw local CLI -> local gateway (ws://127.0.0.1:18789)

Config file / key location

~/.openclaw/openclaw.json ; gateway.mode / gateway.bind / gateway.auth

Additional provider/model setup details

This does not appear to depend on model routing. The relevant setup is the local gateway configuration:

• gateway.mode = local
• gateway.bind = lan
• gateway.auth.mode = token
• dashboard reachable at http://:18789/
• probe target is ws://127.0.0.1:18789

Logs, screenshots, and evidence

# openclaw gateway probe --json
{
"ok": false,
"targets": [
{
"id": "localLoopback",
"kind": "localLoopback",
"url": "ws://127.0.0.1:18789",
"active": true,
"connect": {
"ok": false,
"latencyMs": null,
"error": "timeout",
"close": null
}
}
]
}

# openclaw gateway call status
gateway connect failed: Error: gateway closed (1000):
Gateway call failed: Error: gateway closed (1000 normal closure): no close reason
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/(user_dir)/.openclaw/openclaw.json
Bind: lan

# gateway service observations
ss -ltnp | grep 18789
# showed gateway listening on 0.0.0.0:18789

curl -sv --max-time 5 http://127.0.0.1:18789/
# returned HTTP 200 OK

# gateway journal
[ws] handshake timeout conn=... remote=127.0.0.1
[ws] closed before connect conn=... remote=127.0.0.1 ... code=1000 reason=n/a
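
The probe output above can also be checked mechanically. The sketch below parses the JSON exactly as pasted in this report and lists every target whose connect step failed; `failing_targets` is an illustrative helper, not part of the openclaw CLI, and the field names are only those visible in the output above.

```python
import json

# Probe output as pasted in this report.
PROBE_OUTPUT = """
{
  "ok": false,
  "targets": [
    {
      "id": "localLoopback",
      "kind": "localLoopback",
      "url": "ws://127.0.0.1:18789",
      "active": true,
      "connect": {"ok": false, "latencyMs": null, "error": "timeout", "close": null}
    }
  ]
}
"""


def failing_targets(probe_json):
    """Return (id, url, error) for every target whose connect step failed."""
    report = json.loads(probe_json)
    failures = []
    for target in report.get("targets", []):
        connect = target.get("connect") or {}
        if not connect.get("ok"):
            failures.append((target.get("id"), target.get("url"), connect.get("error")))
    return failures


for target_id, url, error in failing_targets(PROBE_OUTPUT):
    print(f"{target_id} {url}: {error}")
```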

Additional investigation:

• Explicit --token did not fix the issue
• Manual WS connection to ws://127.0.0.1:18789 did receive connect.challenge
• Temporarily moving aside local device state files did not change behavior:
  • ~/.openclaw/identity/device-auth.json
  • ~/.openclaw/devices/paired.json
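
The manual WS check mentioned above can be repeated without the CLI. This is a rough stdlib-only sketch, assuming the gateway accepts a plain RFC 6455 upgrade on path `/` and sends its first message unprompted; the payload format of connect.challenge is not shown in this report, so the code only returns the raw first frame for inspection. The function names are illustrative.

```python
import base64
import hashlib
import os
import socket

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455


def accept_for_key(key):
    """Compute the Sec-WebSocket-Accept value the server must return."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def parse_frame(data):
    """Parse one unmasked server frame; return (opcode, payload) or None if incomplete."""
    if len(data) < 2:
        return None
    opcode = data[0] & 0x0F
    length = data[1] & 0x7F
    offset = 2
    if length == 126:  # 16-bit extended length
        offset = 4
        if len(data) < offset:
            return None
        length = int.from_bytes(data[2:4], "big")
    elif length == 127:  # 64-bit extended length
        offset = 10
        if len(data) < offset:
            return None
        length = int.from_bytes(data[2:10], "big")
    if len(data) < offset + length:
        return None
    return opcode, data[offset:offset + length]


def first_frame(host="127.0.0.1", port=18789, timeout=5.0):
    """Upgrade to WebSocket and return the first frame the gateway sends."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n\r\n"
    )
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        buf = b""
        while b"\r\n\r\n" not in buf:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("closed during HTTP upgrade")
            buf += chunk
        head, _, rest = buf.partition(b"\r\n\r\n")
        if b" 101 " not in head.split(b"\r\n", 1)[0] + b" ":
            raise ConnectionError("upgrade refused: " + head.decode("latin-1"))
        if accept_for_key(key).encode("ascii") not in head:
            raise ConnectionError("unexpected Sec-WebSocket-Accept")
        frame = parse_frame(rest)
        while frame is None:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("closed before first frame")
            rest += chunk
            frame = parse_frame(rest)
        return frame
```

Running `print(first_frame())` against the local gateway should show opcode 1 (text) and a payload if the challenge is emitted; a timeout or `ConnectionError` here would point at the upgrade itself rather than the later connect step.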

Impact and severity

Affected: local gateway CLI users on Ubuntu 24.04 using token-auth local gateway
Severity: High (blocks local gateway probe/RPC diagnostics and makes status misleading)
Frequency: 100% repro in this environment
Consequence: local gateway CLI commands cannot reliably inspect or call the running gateway even though the service is up

Additional information

This looks like a regression in the CLI ↔ gateway connect/handshake path rather than a simple networking issue:

• HTTP is reachable
• WS challenge is emitted
• explicit token does not help
• clearing local pairing/device-auth state does not help

Suspicious areas based on code inspection:

• GatewayClient.sendConnect()
• post-connect.challenge handshake flow
• device signature / handshake response handling

Potential UX issue as well: failed local handshake gets surfaced as timeout / close 1000 instead of a more specific structured error, making diagnosis harder.
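
To illustrate what a more specific structured error could carry, here is a purely hypothetical sketch. The `HandshakeFailure` type, the stage labels, and `classify` do not exist in openclaw; they only show how the observations in this report (HTTP reachable, challenge received, close code 1000) could be combined into one diagnosis. The close code names follow RFC 6455.

```python
from dataclasses import dataclass
from typing import Optional

# Close code names from RFC 6455, section 7.4.1.
CLOSE_NAMES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1008: "policy violation",
    1011: "internal error",
}


@dataclass
class HandshakeFailure:
    stage: str                  # hypothetical stage label
    close_code: Optional[int]   # WebSocket close code, if any
    detail: str                 # human-readable explanation


def classify(http_ok, challenge_received, close_code=None, timed_out=False):
    """Map raw observations to the earliest stage that failed."""
    if not http_ok:
        return HandshakeFailure("transport", None, "gateway port not reachable over HTTP")
    if not challenge_received:
        return HandshakeFailure("ws-upgrade", close_code, "no connect.challenge received")
    if close_code is not None:
        name = CLOSE_NAMES.get(close_code, "unknown")
        return HandshakeFailure(
            "post-challenge",
            close_code,
            f"gateway closed ({close_code} {name}) after connect.challenge",
        )
    if timed_out:
        return HandshakeFailure(
            "post-challenge", None, "connect.challenge received but handshake timed out"
        )
    return HandshakeFailure("unknown", None, "no failure observed")


# The situation described in this report: HTTP up, challenge seen, close 1000.
print(classify(True, True, close_code=1000))
```

An error shaped like this would separate "port unreachable" from "handshake stalled after the challenge", which is the distinction the current timeout / close 1000 output hides.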

Labels

bug, regression