Gateway: stop and restart unmanaged listeners#39355
Conversation
🔒 Aisle Security AnalysisWe found 3 potential security issue(s) in this PR:
1. 🔵 Terminal/log injection via unsanitized port diagnostic errors in gateway restart health checks
Description
Vulnerable code: if (snapshot.portUsage.errors?.length) {
lines.push(`Port diagnostics errors: ${snapshot.portUsage.errors.join("; ")}`);
}RecommendationSanitize untrusted diagnostic text before rendering to terminal/log output. Use existing sanitizers (e.g. Example fix: import { sanitizeTerminalText } from "../../terminal/safe-text.js";
if (snapshot.portUsage.errors?.length) {
const safeErrors = snapshot.portUsage.errors.map((e) => sanitizeTerminalText(String(e)));
lines.push(`Port diagnostics errors: ${safeErrors.join("; ")}`);
}If multi-line error details are valuable, consider splitting into multiple sanitized lines rather than embedding raw newlines into a single log line. 2. 🔵 Untrusted env/config can select port used to signal unmanaged gateway processes (local DoS)
DescriptionThe new unmanaged stop/restart fallback can send signals to processes based on a port value derived from environment/config, which can be influenced by untrusted context. In
When the service manager is not loaded, Because the CLI loads dotenv from the current working directory (see Vulnerable logic (port resolution feeding process signaling) is introduced by the new unmanaged stop/restart behavior. RecommendationTreat the unmanaged stop/restart path as a safety-critical operation and avoid resolving the target port from ambient/untrusted configuration. Recommended changes:
Example hardening approach: import { DEFAULT_GATEWAY_PORT } from "../../config/paths.js";
import { parsePort } from "./shared.js"; // or wherever parsePort is exported
function resolveUnmanagedSignalPort(opts: { port?: string }): number {
// Only trust an explicit CLI option (not dotenv/config/service env) for signaling.
return parsePort(opts.port) ?? DEFAULT_GATEWAY_PORT;
}Then use this value for If you need to preserve configurability, consider only trusting service-managed configuration (the installed unit’s 3. 🔵 Unverified PID signaling when service manager not loaded (port-based process.kill)
DescriptionThe new unmanaged gateway stop/restart paths signal processes based on PIDs discovered by looking up listeners on a TCP port, without strong verification that the target process is actually the intended OpenClaw gateway instance. In
Security impact:
Vulnerable code: for (const pid of pids) {
process.kill(pid, "SIGTERM");
}
...
process.kill(pids[0], "SIGUSR1");RecommendationHarden unmanaged stop/restart by validating process identity immediately before signaling. Suggested defenses (can be combined):
Example approach (pseudo-code): import { inspectPortUsage } from "../../infra/ports-inspect.js";
async function safeSignalGatewayByPort(port: number, signal: NodeJS.Signals) {
const usage = await inspectPortUsage(port);
const candidates = usage.listeners
.filter(l => typeof l.pid === "number")
.filter(l => (l.commandLine ?? l.command ?? "").toLowerCase().includes("openclaw-gateway"));
// optionally also enforce same-user:
// .filter(l => l.user === os.userInfo().username)
if (candidates.length !== 1) throw new Error("expected exactly one gateway listener");
const pid = candidates[0].pid!;
// Re-verify PID still listening (repeat inspectPortUsage or /proc check) before kill
process.kill(pid, signal);
}Also consider requiring an explicit Analyzed PR: #39355 at commit Last updated on: 2026-03-08T02:27:32Z Latest run failed. Keeping previous successful results. Trace ID: Last updated on: 2026-03-08T02:28:15Z |
Greptile SummaryThis PR adds an unmanaged-listener fallback to The implementation is clean and well-scoped:
Confidence Score: 4/5
Last reviewed commit: 042d212 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 042d21214f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Addressed the unmanaged restart post-check issue. What changed:
Verification:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54e0861de5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Validated both follow-up findings on the current branch state.
Verification:
|
|
Follow-up on the latest review batch:
Verification:
|
|
Landed in main: bf9c362 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0ee2386ac2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!normalized.includes("gateway")) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Accept openclaw-gateway argv in unmanaged PID filter
The unmanaged fallback now depends on isGatewayArgv, but this guard exits early unless one argv token is exactly gateway. A process launched directly as openclaw-gateway (for example ['/usr/local/bin/openclaw-gateway', ...]) has no standalone gateway token, so its PID is filtered out and openclaw gateway stop/restart reports the service as not loaded instead of signaling the live unmanaged listener.
Useful? React with 👍 / 👎.
| if (!args || !isGatewayArgv(args)) { | ||
| throw new Error(`refusing to signal non-gateway process pid ${pid}`); | ||
| } |
There was a problem hiding this comment.
Tolerate vanished PID during unmanaged signal dispatch
After discovering listener PIDs, signalGatewayPid re-reads process args and throws when the cmdline cannot be read, which turns a normal race (PID exits between scan and signal) into a hard gateway stop/restart failed path. In busy container/supervisor environments this can make stop/restart fail even when the target process has already exited or is exiting, so the fallback should treat missing cmdline/ESRCH as best-effort instead of fatal.
Useful? React with 👍 / 👎.
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled (cherry picked from commit bf9c362)
* Daemon: allow unmanaged gateway lifecycle fallback * Status: fix service summary formatting * Changelog: note unmanaged gateway lifecycle fallback * Tests: cover unmanaged gateway lifecycle fallback * Daemon: split unmanaged restart health checks * Daemon: harden unmanaged gateway signaling * Daemon: reject unmanaged restarts when disabled (cherry picked from commit bf9c362)
Summary
Describe the problem and fix in 2–5 bullets:
openclaw gateway stopandopenclaw gateway restartbail out withservice disabledwhen the gateway is running under a container or another supervisor without a registered service manager unit.SIGTERMandSIGUSR1natively.SIGTERMfor stop, sendsSIGUSR1for a single unambiguous listener on restart, and still runs the existing restart health checks and JSON responses.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
openclaw gateway restart/stopfail in containers without systemd #36137User-visible / Behavior Changes
openclaw gateway stopnow stops unmanaged gateway listeners in container/supervisor deployments by sendingSIGTERMwhen no service manager unit is loaded.openclaw gateway restartnow gracefully restarts a single unmanaged gateway listener by sendingSIGUSR1and keeping the existing health-check/JSON response path.Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation:Repro + Verification
Environment
18789Steps
openclaw gateway stoporopenclaw gateway restart.service disabled.Expected
SIGTERMto the unmanaged gateway listener.SIGUSR1to a single unmanaged listener and still waits for restart health checks.Actual
Evidence
Attach at least one:
Human Verification (required)
What you personally verified (not just CI), and how:
pnpm build, fullpnpm check.Compatibility / Migration
Yes/No) YesYes/No) NoYes/No) NoFailure Recovery (if this breaks)
src/cli/daemon-cli/lifecycle-core.ts,src/cli/daemon-cli/lifecycle.tsRisks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write
None.status --deephint.