Skip to content

decode Windows console output (GBK/CP936)#43611

Open
iready wants to merge 1 commit intoopenclaw:mainfrom
iready:gbkexec
Open

decode Windows console output (GBK/CP936)#43611
iready wants to merge 1 commit intoopenclaw:mainfrom
iready:gbkexec

Conversation

@iready
Copy link
Copy Markdown

@iready iready commented Mar 12, 2026

Add a shared Windows console encoding helper that detects the active code page via chcp and decodes captured output with TextDecoder (e.g. CP936/GBK, GB18030). Use that decoder for streamed stdout/stderr in the supervisor child adapter so Windows commands no longer appear garbled when they emit non‑UTF8 text. Apply the same decoding in runCommandWithTimeout() for non‑supervisor spawn paths. Refactor node-host to reuse the shared decoder and update tests accordingly. Harden Windows spawning in scripts/ui.js by avoiding shell: true for resolved absolute paths and routing .cmd/.bat through an explicit cmd.exe /c wrapper.

Add a shared Windows console encoding helper that detects the active code page via chcp and decodes captured output with TextDecoder (e.g. CP936/GBK, GB18030).
Use that decoder for streamed stdout/stderr in the supervisor child adapter so Windows commands no longer appear garbled when they emit non‑UTF8 text.
Apply the same decoding in runCommandWithTimeout() for non‑supervisor spawn paths.
Refactor node-host to reuse the shared decoder and update tests accordingly.
Harden Windows spawning in scripts/ui.js by avoiding shell: true for resolved absolute paths and routing .cmd/.bat through an explicit cmd.exe /c wrapper.
@openclaw-barnacle openclaw-barnacle bot added scripts Repository scripts size: M labels Mar 12, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR adds Windows GBK/CP936 console output decoding by extracting a shared windows-console-encoding module, wiring separate TextDecoder instances into the supervisor child adapter and runCommandWithTimeout, and hardening scripts/ui.js spawning to avoid shell: true for resolved absolute paths.

Key changes:

  • New src/process/windows-console-encoding.ts cleanly centralises code-page detection (chcp) and buffer decoding; the cache and regex are correct.
  • Separate TextDecoder instances per stream (stdout/stderr) are correctly created for streaming state isolation.
  • scripts/ui.js correctly replaces shell: true with an explicit cmd.exe /c wrapper for .cmd/.bat files.

Issues found:

  • Shell injection regression in scripts/ui.js: normalizeWindowsCommand builds a cmd.exe /c command line via args.join(" ") but no longer calls assertSafeWindowsShellArgs — which was previously invoked by createSpawnOptions for every .cmd/.bat command. User-controlled rest args from process.argv can contain &, |, <, >, etc. that cmd.exe will interpret as metacharacters.
  • TextDecoder not flushed on stream close: Both runCommandWithTimeout (exec.ts) and createChildAdapter (child.ts) create streaming decoders with { stream: true } but never call decoder.decode() (without stream: true) after the final data chunk, potentially silently dropping the tail of a split multi-byte character.

Confidence Score: 3/5

  • Safe to merge on non-Windows paths; the shell-injection regression in scripts/ui.js should be fixed before merging on Windows environments where untrusted CLI args are possible.
  • The core encoding logic is sound and the refactoring is clean. However, the removal of assertSafeWindowsShellArgs from the cmd.exe code path in scripts/ui.js is a security regression for Windows users, and the missing TextDecoder flush could cause data loss at stream boundaries.
  • scripts/ui.js (missing arg validation before cmd.exe command line construction) and src/process/exec.ts / src/process/supervisor/adapters/child.ts (TextDecoder not flushed on stream close).
Prompt To Fix All With AI
This is a comment left during a code review.
Path: scripts/ui.js
Line: 99-115

Comment:
**`assertSafeWindowsShellArgs` no longer called before cmd.exe command line construction**

The old `createSpawnOptions` called `assertSafeWindowsShellArgs(args)` when `shouldUseShellForCommand` returned `true` (which it did for any `.cmd`/`.bat`/`.com` file). The new `normalizeWindowsCommand` function builds a cmd.exe `/c` command line by simply joining args with spaces — but never validates those args against the `WINDOWS_UNSAFE_SHELL_ARG_PATTERN`.

Because `run()` passes `["install", ...rest]` or `["run", script, ...rest]` where `rest` comes directly from `process.argv`, a user-supplied argument containing `&`, `|`, `<`, `>`, `^`, `%`, or `!` will be interpreted as cmd.exe metacharacters when executed:

```
node scripts/ui.js dev "myarg & evil-command"
```

…would produce `commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command'` which cmd.exe would split at `&` and execute both parts.

`assertSafeWindowsShellArgs` should be called before joining args into the command line string:

```js
function normalizeWindowsCommand(cmd, args) {
  if (process.platform !== "win32") {
    return { command: cmd, commandArgs: args };
  }
  const extension = path.extname(cmd).toLowerCase();
  if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
    return { command: cmd, commandArgs: args };
  }

  assertSafeWindowsShellArgs(args); // add this validation

  const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
  const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
  const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;

  const command = process.env.comspec || "cmd.exe";
  const commandArgs = ["/d", "/s", "/c", commandLine];
  return { command, commandArgs };
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/process/exec.ts
Line: 302-310

Comment:
**TextDecoder not flushed on stream close**

`TextDecoder` instances created with `{ stream: true }` maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to `decoder.decode(new Uint8Array(0))` (i.e., without `stream: true`) after the last data event.

For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The `close` handler should flush both decoders and append any remaining bytes to `stdout`/`stderr`:

```js
child.on("close", (code, signal) => {
  if (settled) return;
  // flush any buffered multi-byte tail
  if (stdoutDecoder) stdout += stdoutDecoder.decode();
  if (stderrDecoder) stderr += stderrDecoder.decode();
  settled = true;
  // …rest of existing close handler…
```

The same pattern applies to the `onStdout`/`onStderr` handlers in `src/process/supervisor/adapters/child.ts` (lines 114–132) — a corresponding flush should be added when the `child` emits `close`.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: ff49a6c

Comment on lines +99 to +115
function normalizeWindowsCommand(cmd, args) {
if (process.platform !== "win32") {
return { command: cmd, commandArgs: args };
}
const extension = path.extname(cmd).toLowerCase();
if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
return { command: cmd, commandArgs: args };
}

const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;

const command = process.env.comspec || "cmd.exe";
const commandArgs = ["/d", "/s", "/c", commandLine];
return { command, commandArgs };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertSafeWindowsShellArgs no longer called before cmd.exe command line construction

The old createSpawnOptions called assertSafeWindowsShellArgs(args) when shouldUseShellForCommand returned true (which it did for any .cmd/.bat/.com file). The new normalizeWindowsCommand function builds a cmd.exe /c command line by simply joining args with spaces — but never validates those args against the WINDOWS_UNSAFE_SHELL_ARG_PATTERN.

Because run() passes ["install", ...rest] or ["run", script, ...rest] where rest comes directly from process.argv, a user-supplied argument containing &, |, <, >, ^, %, or ! will be interpreted as cmd.exe metacharacters when executed:

node scripts/ui.js dev "myarg & evil-command"

…would produce commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command' which cmd.exe would split at & and execute both parts.

assertSafeWindowsShellArgs should be called before joining args into the command line string:

function normalizeWindowsCommand(cmd, args) {
  if (process.platform !== "win32") {
    return { command: cmd, commandArgs: args };
  }
  const extension = path.extname(cmd).toLowerCase();
  if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
    return { command: cmd, commandArgs: args };
  }

  assertSafeWindowsShellArgs(args); // add this validation

  const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
  const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
  const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;

  const command = process.env.comspec || "cmd.exe";
  const commandArgs = ["/d", "/s", "/c", commandLine];
  return { command, commandArgs };
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: scripts/ui.js
Line: 99-115

Comment:
**`assertSafeWindowsShellArgs` no longer called before cmd.exe command line construction**

The old `createSpawnOptions` called `assertSafeWindowsShellArgs(args)` when `shouldUseShellForCommand` returned `true` (which it did for any `.cmd`/`.bat`/`.com` file). The new `normalizeWindowsCommand` function builds a cmd.exe `/c` command line by simply joining args with spaces — but never validates those args against the `WINDOWS_UNSAFE_SHELL_ARG_PATTERN`.

Because `run()` passes `["install", ...rest]` or `["run", script, ...rest]` where `rest` comes directly from `process.argv`, a user-supplied argument containing `&`, `|`, `<`, `>`, `^`, `%`, or `!` will be interpreted as cmd.exe metacharacters when executed:

```
node scripts/ui.js dev "myarg & evil-command"
```

…would produce `commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command'` which cmd.exe would split at `&` and execute both parts.

`assertSafeWindowsShellArgs` should be called before joining args into the command line string:

```js
function normalizeWindowsCommand(cmd, args) {
  if (process.platform !== "win32") {
    return { command: cmd, commandArgs: args };
  }
  const extension = path.extname(cmd).toLowerCase();
  if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
    return { command: cmd, commandArgs: args };
  }

  assertSafeWindowsShellArgs(args); // add this validation

  const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
  const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
  const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;

  const command = process.env.comspec || "cmd.exe";
  const commandArgs = ["/d", "/s", "/c", commandLine];
  return { command, commandArgs };
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 302 to 310
child.stdout?.on("data", (d) => {
stdout += d.toString();
const chunk = d as Buffer;
stdout += stdoutDecoder ? stdoutDecoder.decode(chunk, { stream: true }) : chunk.toString();
armNoOutputTimer();
});
child.stderr?.on("data", (d) => {
stderr += d.toString();
const chunk = d as Buffer;
stderr += stderrDecoder ? stderrDecoder.decode(chunk, { stream: true }) : chunk.toString();
armNoOutputTimer();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TextDecoder not flushed on stream close

TextDecoder instances created with { stream: true } maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to decoder.decode(new Uint8Array(0)) (i.e., without stream: true) after the last data event.

For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The close handler should flush both decoders and append any remaining bytes to stdout/stderr:

child.on("close", (code, signal) => {
  if (settled) return;
  // flush any buffered multi-byte tail
  if (stdoutDecoder) stdout += stdoutDecoder.decode();
  if (stderrDecoder) stderr += stderrDecoder.decode();
  settled = true;
  // …rest of existing close handler…

The same pattern applies to the onStdout/onStderr handlers in src/process/supervisor/adapters/child.ts (lines 114–132) — a corresponding flush should be added when the child emits close.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/process/exec.ts
Line: 302-310

Comment:
**TextDecoder not flushed on stream close**

`TextDecoder` instances created with `{ stream: true }` maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to `decoder.decode(new Uint8Array(0))` (i.e., without `stream: true`) after the last data event.

For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The `close` handler should flush both decoders and append any remaining bytes to `stdout`/`stderr`:

```js
child.on("close", (code, signal) => {
  if (settled) return;
  // flush any buffered multi-byte tail
  if (stdoutDecoder) stdout += stdoutDecoder.decode();
  if (stderrDecoder) stderr += stderrDecoder.decode();
  settled = true;
  // …rest of existing close handler…
```

The same pattern applies to the `onStdout`/`onStderr` handlers in `src/process/supervisor/adapters/child.ts` (lines 114–132) — a corresponding flush should be added when the `child` emits `close`.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff49a6caaf

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Escape args when composing cmd.exe /c command

On Windows, this builds the cmd.exe /c command line with args.join(" "), so cmd.exe reparses unescaped user arguments instead of receiving the original argv values. Any forwarded arg containing spaces (e.g. --grep "foo bar") is split into multiple args, and metacharacters like &/| are interpreted by cmd.exe, which changes behavior and can execute unintended subcommands. This regresses the previous guarded shell path for .cmd/.bat execution.

Useful? React with 👍 / 👎.

@easyteacher
Copy link
Copy Markdown

Can confirm this works

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

scripts Repository scripts size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants