Skip to content

fix: ensure CLI exits after command completion#12906

Merged
steipete merged 2 commits intoopenclaw:mainfrom
DrCrinkle:fix/cli-hang-on-exit
Feb 13, 2026
Merged

fix: ensure CLI exits after command completion#12906
steipete merged 2 commits intoopenclaw:mainfrom
DrCrinkle:fix/cli-hang-on-exit

Conversation

@DrCrinkle
Copy link
Contributor

@DrCrinkle DrCrinkle commented Feb 9, 2026

Summary

  • Add explicit process.exit(process.exitCode ?? 0) after program.parseAsync() in runCli() so the CLI exits cleanly once any command finishes
  • Remove unnecessary void createDefaultDeps() from daemon-cli registration — daemon lifecycle commands don't use messaging deps

Problem

After any CLI command completes successfully, the process hangs indefinitely because:

  1. runCli() returns without calling process.exit(), and Commander.js doesn't force-exit
  2. Eagerly-imported messaging-provider modules (WhatsApp, Telegram, Discord, etc.) keep the Node.js event loop alive with persistent connections/timers

Failure paths work fine since they call defaultRuntime.exit(1) — only success paths hang.

Fixes #12904
Related: #7662

Test plan

  • vitest run src/cli/program.smoke.test.ts src/cli/daemon-cli — all 20 tests pass
  • Run openclaw gateway restart and verify the CLI exits immediately after printing the success message
  • Run openclaw daemon status and verify clean exit
  • Verify process.exitCode is respected (e.g. openclaw memory error paths still exit with code 1)

🤖 Generated with Claude Code

Greptile Overview

Greptile Summary

This PR forces the CLI process to terminate after command completion by adding an explicit drain-and-exit path (flushAndExit) after both routed CLI handling and Commander’s program.parseAsync().

It also updates the ACP gateway server entrypoint to return a Promise<void> representing an intentional shutdown, awaits that promise from the openclaw acp command action, and removes eager default dependency initialization from daemon CLI registration to avoid keeping the event loop alive unnecessarily.

Confidence Score: 3/5

  • This PR addresses a real hang, but it changes success-path lifecycle semantics in a way that can break long-running commands.
  • The forced process.exit() after routed/Commander execution will reliably fix the hang caused by lingering side-effect imports, but it will also terminate any command that intentionally returns while keeping background work alive. The change is small and localized, but the behavioral risk is non-trivial without an opt-out mechanism.
  • src/cli/run-main.ts

@openclaw-barnacle openclaw-barnacle bot added the cli CLI command changes label Feb 9, 2026
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 9, 2026

Additional Comments (1)

src/cli/run-main.ts
routed commands bypass the process.exit() call added at line 76 — they return early here and will still hang

  if (await tryRouteCli(normalizedArgv)) {
    process.exit(process.exitCode ?? 0);
  }
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/run-main.ts
Line: 36:38

Comment:
routed commands bypass the `process.exit()` call added at line 76 — they return early here and will still hang

```suggestion
  if (await tryRouteCli(normalizedArgv)) {
    process.exit(process.exitCode ?? 0);
  }
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ac878b5f0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle DrCrinkle force-pushed the fix/cli-hang-on-exit branch from 5f633ae to 04a4d3a Compare February 9, 2026 22:32
@DrCrinkle
Copy link
Contributor Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6449d4bdc3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Exit explicitly once the command action resolves. Without this,
// eager side-effect imports (e.g. messaging-provider modules) can keep
// the Node event loop alive and the CLI hangs after the command finishes.
process.exit(process.exitCode ?? 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid forcing process exit before stdout drains

Calling process.exit(process.exitCode ?? 0) immediately after program.parseAsync() can truncate piped/redirected output because explicit exit does not wait for pending stdio writes to flush. In script/CI usage (for example --json commands with large payloads), this can yield partial/corrupt output once the data exceeds pipe buffering, which is a behavior regression versus letting Node exit naturally; set process.exitCode and return (or explicitly await stream drain) instead of forcing exit here and in the route-first path.

Useful? React with 👍 / 👎.

@DrCrinkle
Copy link
Contributor Author

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle
Copy link
Contributor Author

@greptileai

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 71 to 76
await program.parseAsync(parseArgv);

// Exit explicitly once the command action resolves. Without this,
// eager side-effect imports (e.g. messaging-provider modules) can keep
// the Node event loop alive and the CLI hangs after the command finishes.
await flushAndExit(process.exitCode ?? 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force-exit breaks long-running commands

runCli() now always calls flushAndExit() after program.parseAsync() resolves. This will also terminate any command that intentionally starts background work and returns (e.g., “start”/“serve” style commands implemented via runCommandWithRuntime() that don’t await a long-running promise). Those commands will now exit immediately even though they previously kept the process alive.

This needs a guard so only non-daemonizing commands force-exit (or long-running commands explicitly opt out the way acp now does).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/run-main.ts
Line: 71:76

Comment:
**Force-exit breaks long-running commands**

`runCli()` now always calls `flushAndExit()` after `program.parseAsync()` resolves. This will also terminate any command that intentionally starts background work and returns (e.g., “start”/“serve” style commands implemented via `runCommandWithRuntime()` that don’t await a long-running promise). Those commands will now exit immediately even though they previously kept the process alive.

This needs a guard so only *non-daemonizing* commands force-exit (or long-running commands explicitly opt out the way `acp` now does).

How can I resolve this? If you propose a fix, please make it concise.

@DrCrinkle
Copy link
Contributor Author

@greptileai

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

@DrCrinkle
Copy link
Contributor Author

@greptile-apps @chatgpt-codex-connector Please re-review — all previous feedback has been addressed.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 37 to 50
const { promise: closed, resolve: onClosed, reject: onError } = Promise.withResolvers<void>();
let stopped = false;

const shutdown = () => {
if (stopped) {
return;
}
stopped = true;
gateway.stop();
onClosed();
};

process.once("SIGINT", shutdown);
process.once("SIGTERM", shutdown);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signal handlers never removed

serveAcpGateway() installs process.once("SIGINT"/"SIGTERM") handlers but never unregisters them on normal shutdown/error (onClose, onConnectError, etc.). If serveAcpGateway() is called multiple times in the same Node process (e.g., from tests or programmatic reuse), you’ll accumulate extra listeners and can end up calling gateway.stop() multiple times on a later signal. Consider cleaning up the handlers when the returned closed promise resolves/rejects.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 37:50

Comment:
**Signal handlers never removed**

`serveAcpGateway()` installs `process.once("SIGINT"/"SIGTERM")` handlers but never unregisters them on normal shutdown/error (`onClose`, `onConnectError`, etc.). If `serveAcpGateway()` is called multiple times in the same Node process (e.g., from tests or programmatic reuse), you’ll accumulate extra listeners and can end up calling `gateway.stop()` multiple times on a later signal. Consider cleaning up the handlers when the returned `closed` promise resolves/rejects.


How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 80 to 87
async function flushAndExit(code: number): Promise<never> {
process.exitCode = code;
await Promise.all([
new Promise<void>((resolve) => process.stdout.write("", resolve)),
new Promise<void>((resolve) => process.stderr.write("", resolve)),
]);
process.exit();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flushAndExit type mismatch

flushAndExit() is declared as Promise<never> but it doesn’t always exit synchronously (and could theoretically return if process.exit() is mocked/no-op in tests). This type makes callers assume the function never returns, which can hide control-flow bugs. Either change the return type to Promise<void> (and keep process.exit() as the last statement) or make the function actually return new Promise<never>(() => process.exit(code))-style.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/run-main.ts
Line: 80:87

Comment:
**`flushAndExit` type mismatch**

`flushAndExit()` is declared as `Promise<never>` but it doesn’t always exit synchronously (and could theoretically return if `process.exit()` is mocked/no-op in tests). This type makes callers assume the function never returns, which can hide control-flow bugs. Either change the return type to `Promise<void>` (and keep `process.exit()` as the last statement) or make the function actually `return new Promise<never>(() => process.exit(code))`-style.


How can I resolve this? If you propose a fix, please make it concise.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc183293ba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 69 to 70
onConnectError: (err) => {
onError(err);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Gate on initial connect before rejecting ACP bridge

Rejecting closed on every onConnectError causes openclaw acp to terminate on recoverable reconnect failures, not just startup failures. GatewayClient invokes onConnectError from sendConnect() on any failed connect request, including reconnect attempts after onClose has already scheduled retries, so a transient gateway/network blip can reject serveAcpGateway(), hit the CLI catch path, and exit the process even though the client has built-in backoff/reconnect logic.

Useful? React with 👍 / 👎.

@DrCrinkle
Copy link
Contributor Author

All three new findings addressed in the latest commit:

  1. Signal handlers never removedcleanup() now removes SIGINT/SIGTERM listeners on shutdown and connect error.
  2. flushAndExit type mismatch — Return type changed from Promise<never> to Promise<void>.
  3. Gate on initial connect before rejectingonConnectError only rejects before onHelloOk fires; after a successful connection, transient errors are left to the client's built-in reconnect.

@greptile-apps @chatgpt-codex-connector Please re-review.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 10, 2026

Additional Comments (2)

src/acp/server.ts
SIGINT handler leak

serveAcpGateway() registers process.on("SIGINT"/"SIGTERM"), but the handlers are only removed on shutdown() or initial onConnectError. If the gateway closes (or GatewayClient.stop() is invoked elsewhere) without going through shutdown(), the returned closed promise never resolves/rejects and the signal listeners remain installed for the lifetime of the process. Consider resolving/rejecting closed from onClose (only when stopped is true) and always calling cleanup() in that path so repeated invocations (tests/programmatic use) don’t accumulate listeners and hang awaiting closed.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 33:35

Comment:
**SIGINT handler leak**

`serveAcpGateway()` registers `process.on("SIGINT"/"SIGTERM")`, but the handlers are only removed on `shutdown()` or initial `onConnectError`. If the gateway closes (or `GatewayClient.stop()` is invoked elsewhere) without going through `shutdown()`, the returned `closed` promise never resolves/rejects and the signal listeners remain installed for the lifetime of the process. Consider resolving/rejecting `closed` from `onClose` (only when `stopped` is true) and always calling `cleanup()` in that path so repeated invocations (tests/programmatic use) don’t accumulate listeners and hang awaiting `closed`.

How can I resolve this? If you propose a fix, please make it concise.

src/cli/run-main.ts
Missing early return

After tryRouteCli(normalizedArgv) returns true, runCli() calls flushAndExit(...) but does not return, so the rest of the CLI initialization and program.parseAsync() will still run if process.exit() is mocked/no-op (e.g. in tests) or if future refactors remove the hard exit. Adding return await flushAndExit(...) (or await ...; return;) keeps control flow consistent with the previous behavior and avoids double-running route + commander parsing in non-exiting environments.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cli/run-main.ts
Line: 33:37

Comment:
**Missing early return**

After `tryRouteCli(normalizedArgv)` returns `true`, `runCli()` calls `flushAndExit(...)` but does not `return`, so the rest of the CLI initialization and `program.parseAsync()` will still run if `process.exit()` is mocked/no-op (e.g. in tests) or if future refactors remove the hard exit. Adding `return await flushAndExit(...)` (or `await ...; return;`) keeps control flow consistent with the previous behavior and avoids double-running route + commander parsing in non-exiting environments.

How can I resolve this? If you propose a fix, please make it concise.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7cbee0cbd2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle
Copy link
Contributor Author

Added return after flushAndExit on the routed-command path so execution can't fall through to program.parseAsync when process.exit is mocked.

The other stale comments (signal handler cleanup, gating onConnectError, flushAndExit type) were already addressed in prior commits.

@greptile-apps @chatgpt-codex-connector Please re-review.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 61 to 63
process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIGINT handler bypasses cleanup

serveAcpGateway() installs signal handlers with process.on(...) but cleanup() removes them via removeListener (not once). If another part of the process installs its own SIGINT/SIGTERM handler and calls process.exit() (or throws) before shutdown() runs, these listeners remain registered for the rest of the process lifetime (e.g., in tests/programmatic use), and shutdown() will call gateway.stop()/onClosed() unexpectedly on later signals. Consider using process.once(...) here (or ensuring the listeners are always removed when closed resolves/rejects) so they can’t accumulate across repeated serveAcpGateway() calls.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 61:63

Comment:
**SIGINT handler bypasses cleanup**

`serveAcpGateway()` installs signal handlers with `process.on(...)` but `cleanup()` removes them via `removeListener` (not `once`). If another part of the process installs its own SIGINT/SIGTERM handler and calls `process.exit()` (or throws) before `shutdown()` runs, these listeners remain registered for the rest of the process lifetime (e.g., in tests/programmatic use), and `shutdown()` will call `gateway.stop()`/`onClosed()` unexpectedly on later signals. Consider using `process.once(...)` here (or ensuring the listeners are always removed when `closed` resolves/rejects) so they can’t accumulate across repeated `serveAcpGateway()` calls.

How can I resolve this? If you propose a fix, please make it concise.

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle
Copy link
Contributor Author

Switched signal handlers to process.once() — auto-removes after first invocation, no manual cleanup needed.

@greptile-apps @chatgpt-codex-connector Please re-review.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 46 to 53
const shutdown = () => {
if (stopped) {
return;
}
stopped = true;
gateway.stop();
onClosed();
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promise never resolves on close

serveAcpGateway() returns the closed promise, but nothing ever calls onClosed() on normal gateway disconnects. The only resolver path is shutdown() (SIGINT/SIGTERM). If the gateway closes (e.g. server stops, network drop, auth failure) and the process is otherwise idle, await serveAcpGateway() in src/cli/acp-cli.ts will hang indefinitely even though the bridge is effectively disconnected (and runCli() won’t reach its exit path).

This is introduced by the new Promise<void> return + await serveAcpGateway(...) in the CLI action; you likely need to resolve/reject the promise from onClose (and/or from a specific “stopped intentionally” path) so the action can complete deterministically.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 46:53

Comment:
**Promise never resolves on close**

`serveAcpGateway()` returns the `closed` promise, but nothing ever calls `onClosed()` on normal gateway disconnects. The only resolver path is `shutdown()` (SIGINT/SIGTERM). If the gateway closes (e.g. server stops, network drop, auth failure) and the process is otherwise idle, `await serveAcpGateway()` in `src/cli/acp-cli.ts` will hang indefinitely even though the bridge is effectively disconnected (and `runCli()` won’t reach its exit path).

This is introduced by the new `Promise<void>` return + `await serveAcpGateway(...)` in the CLI action; you likely need to resolve/reject the promise from `onClose` (and/or from a specific “stopped intentionally” path) so the action can complete deterministically.

How can I resolve this? If you propose a fix, please make it concise.

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle
Copy link
Contributor Author

Resolved the "promise never resolves on close" issue: moved onClosed() into the onClose callback, gated on the stopped flag. Now:

  • Intentional shutdown (SIGINT/SIGTERM → gateway.stop()) → onClose fires with stopped=true → promise resolves
  • Transient disconnects → onClose fires with stopped=false → client auto-reconnects, promise stays pending

@greptile-apps @chatgpt-codex-connector Please re-review.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 54 to 56
process.once("SIGINT", shutdown);
process.once("SIGTERM", shutdown);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signal listeners leak

serveAcpGateway() registers process.once("SIGINT"/"SIGTERM", shutdown) but never removes those listeners when the gateway stops or when closed rejects. If serveAcpGateway() is invoked multiple times in the same process (tests/programmatic use), the old listeners persist and later signals will invoke multiple stale shutdown() closures. Consider removing the listeners in both the onClose (when stopped) and onConnectError (when rejecting) paths once closed settles.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 54:56

Comment:
**Signal listeners leak**

`serveAcpGateway()` registers `process.once("SIGINT"/"SIGTERM", shutdown)` but never removes those listeners when the gateway stops or when `closed` rejects. If `serveAcpGateway()` is invoked multiple times in the same process (tests/programmatic use), the old listeners persist and later signals will invoke multiple stale `shutdown()` closures. Consider removing the listeners in both the `onClose` (when `stopped`) and `onConnectError` (when rejecting) paths once `closed` settles.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 46 to 52
const shutdown = () => {
if (stopped) {
return;
}
stopped = true;
gateway.stop();
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rejected promise leaves gateway running

On initial connect failure you call onError(err) but don’t stop the GatewayClient or mark stopped=true. If the underlying WS connection was created successfully and later fails during connect (or in ws.on("open") TLS fingerprint validation), the client can still be alive/reconnecting even though callers now see serveAcpGateway() rejected. It should either stop/close the client when rejecting, or otherwise ensure no background timers/sockets remain after a rejected startup.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/acp/server.ts
Line: 46:52

Comment:
**Rejected promise leaves gateway running**

On initial connect failure you call `onError(err)` but don’t stop the `GatewayClient` or mark `stopped=true`. If the underlying WS connection was created successfully and later fails during `connect` (or in `ws.on("open")` TLS fingerprint validation), the client can still be alive/reconnecting even though callers now see `serveAcpGateway()` rejected. It should either stop/close the client when rejecting, or otherwise ensure no background timers/sockets remain after a rejected startup.

How can I resolve this? If you propose a fix, please make it concise.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 40aead2b52

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 84 to 85
if (!connected) {
onError(err);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep ACP bridge alive across initial dial failures

Rejecting the serveAcpGateway() promise on onConnectError before the first hello-ok makes openclaw acp exit immediately when the gateway is temporarily unavailable at startup (for example, gateway restarting or a brief network flap). GatewayClient already schedules reconnects on close, but this rejection is caught by acp-cli and calls defaultRuntime.exit(1), so reconnect attempts never get a chance to recover.

Useful? React with 👍 / 👎.

@DrCrinkle
Copy link
Contributor Author

Latest push addresses the remaining feedback:

  1. "Keep ACP bridge alive across initial dial failures" (Codex P1) — Removed onConnectError rejection entirely. GatewayClient handles reconnection with backoff, so the bridge now survives a temporarily-unavailable gateway at startup. For true config errors (TLS fingerprint on non-wss URL), start() returns without creating a WebSocket or reconnect timer, so the event loop drains and flushAndExit handles the exit.

  2. "Routed commands now force-exit" and "Unconditional exit breaks daemons" (Greptile) — Verified: all routed commands (health, status, sessions) are short-lived await-and-return. All long-running commands use async actions that block parseAsync. No commands are affected.

@greptile-apps @chatgpt-codex-connector Please re-review.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c056155b89

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@DrCrinkle
Copy link
Contributor Author

Latest push fixes both new comments:

  1. TDZ risk — Moved signal handler registration after new GatewayClient(), so shutdown can never reference gateway before initialization.
  2. Signal ignored when no WS activeshutdown() now calls onClosed() directly in addition to the onClose callback. If no WebSocket is active (between reconnects), gateway.stop() won't fire onClose, but the direct call ensures the promise resolves. Duplicate resolve is a no-op.

@greptile-apps @chatgpt-codex-connector Please re-review.

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

DrCrinkle and others added 2 commits February 14, 2026 00:23
The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully.  Two root causes:

1. `runCli()` returned without calling `process.exit()` after
   `program.parseAsync()` resolved, and Commander.js does not
   force-exit the process.

2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
   which imported all messaging-provider modules, creating persistent
   event-loop handles that prevented natural Node exit.

Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
  `process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
  `program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
  registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
  intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
  `parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
  point to avoid unhandled rejections.

Fixes openclaw#12904

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@steipete steipete force-pushed the fix/cli-hang-on-exit branch from c89fb1e to a85f645 Compare February 13, 2026 23:34
@steipete steipete merged commit 874ff70 into openclaw:main Feb 13, 2026
10 checks passed
@steipete
Copy link
Contributor

Landed via temp rebase onto main.

  • Gate: pnpm check && pnpm build && pnpm test
  • Land commit: a85f645
  • Merge commit: 874ff70

Thanks @DrCrinkle!

steipete added a commit to azade-c/openclaw that referenced this pull request Feb 14, 2026
* fix: ensure CLI exits after command completion

The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully.  Two root causes:

1. `runCli()` returned without calling `process.exit()` after
   `program.parseAsync()` resolved, and Commander.js does not
   force-exit the process.

2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
   which imported all messaging-provider modules, creating persistent
   event-loop handles that prevented natural Node exit.

Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
  `process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
  `program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
  registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
  intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
  `parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
  point to avoid unhandled rejections.

Fixes openclaw#12904

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix: refactor CLI lifecycle and lazy outbound deps (openclaw#12906) (thanks @DrCrinkle)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
mverrilli pushed a commit to mverrilli/openclaw that referenced this pull request Feb 14, 2026
* fix: ensure CLI exits after command completion

The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully.  Two root causes:

1. `runCli()` returned without calling `process.exit()` after
   `program.parseAsync()` resolved, and Commander.js does not
   force-exit the process.

2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
   which imported all messaging-provider modules, creating persistent
   event-loop handles that prevented natural Node exit.

Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
  `process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
  `program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
  registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
  intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
  `parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
  point to avoid unhandled rejections.

Fixes openclaw#12904

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix: refactor CLI lifecycle and lazy outbound deps (openclaw#12906) (thanks @DrCrinkle)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
GwonHyeok pushed a commit to learners-superpumped/openclaw that referenced this pull request Feb 15, 2026
* fix: ensure CLI exits after command completion

The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully.  Two root causes:

1. `runCli()` returned without calling `process.exit()` after
   `program.parseAsync()` resolved, and Commander.js does not
   force-exit the process.

2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
   which imported all messaging-provider modules, creating persistent
   event-loop handles that prevented natural Node exit.

Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
  `process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
  `program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
  registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
  intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
  `parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
  point to avoid unhandled rejections.

Fixes openclaw#12904

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix: refactor CLI lifecycle and lazy outbound deps (openclaw#12906) (thanks @DrCrinkle)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
cloud-neutral pushed a commit to cloud-neutral-toolkit/openclawbot.svc.plus that referenced this pull request Feb 15, 2026
* fix: ensure CLI exits after command completion

The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully.  Two root causes:

1. `runCli()` returned without calling `process.exit()` after
   `program.parseAsync()` resolved, and Commander.js does not
   force-exit the process.

2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
   which imported all messaging-provider modules, creating persistent
   event-loop handles that prevented natural Node exit.

Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
  `process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
  `program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
  registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
  intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
  `parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
  point to avoid unhandled rejections.

Fixes openclaw#12904

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix: refactor CLI lifecycle and lazy outbound deps (openclaw#12906) (thanks @DrCrinkle)

---------

Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli CLI command changes size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: CLI commands hang after completion (process never exits)

2 participants

Comments