decode Windows console output (GBK/CP936)#43611
Conversation
Add a shared Windows console encoding helper that detects the active code page via chcp and decodes captured output with TextDecoder (e.g. CP936/GBK, GB18030). Use that decoder for streamed stdout/stderr in the supervisor child adapter so Windows commands no longer appear garbled when they emit non‑UTF8 text. Apply the same decoding in runCommandWithTimeout() for non‑supervisor spawn paths. Refactor node-host to reuse the shared decoder and update tests accordingly. Harden Windows spawning in scripts/ui.js by avoiding shell: true for resolved absolute paths and routing .cmd/.bat through an explicit cmd.exe /c wrapper.
Greptile SummaryThis PR adds Windows GBK/CP936 console output decoding by extracting a shared Key changes:
Issues found:
Confidence Score: 3/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: scripts/ui.js
Line: 99-115
Comment:
**`assertSafeWindowsShellArgs` no longer called before cmd.exe command line construction**
The old `createSpawnOptions` called `assertSafeWindowsShellArgs(args)` when `shouldUseShellForCommand` returned `true` (which it did for any `.cmd`/`.bat`/`.com` file). The new `normalizeWindowsCommand` function builds a cmd.exe `/c` command line by simply joining args with spaces — but never validates those args against the `WINDOWS_UNSAFE_SHELL_ARG_PATTERN`.
Because `run()` passes `["install", ...rest]` or `["run", script, ...rest]` where `rest` comes directly from `process.argv`, a user-supplied argument containing `&`, `|`, `<`, `>`, `^`, `%`, or `!` will be interpreted as cmd.exe metacharacters when executed:
```
node scripts/ui.js dev "myarg & evil-command"
```
…would produce `commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command'` which cmd.exe would split at `&` and execute both parts.
`assertSafeWindowsShellArgs` should be called before joining args into the command line string:
```js
function normalizeWindowsCommand(cmd, args) {
if (process.platform !== "win32") {
return { command: cmd, commandArgs: args };
}
const extension = path.extname(cmd).toLowerCase();
if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
return { command: cmd, commandArgs: args };
}
assertSafeWindowsShellArgs(args); // add this validation
const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;
const command = process.env.comspec || "cmd.exe";
const commandArgs = ["/d", "/s", "/c", commandLine];
return { command, commandArgs };
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/process/exec.ts
Line: 302-310
Comment:
**TextDecoder not flushed on stream close**
`TextDecoder` instances created with `{ stream: true }` maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to `decoder.decode(new Uint8Array(0))` (i.e., without `stream: true`) after the last data event.
For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The `close` handler should flush both decoders and append any remaining bytes to `stdout`/`stderr`:
```js
child.on("close", (code, signal) => {
if (settled) return;
// flush any buffered multi-byte tail
if (stdoutDecoder) stdout += stdoutDecoder.decode();
if (stderrDecoder) stderr += stderrDecoder.decode();
settled = true;
// …rest of existing close handler…
```
The same pattern applies to the `onStdout`/`onStderr` handlers in `src/process/supervisor/adapters/child.ts` (lines 114–132) — a corresponding flush should be added when the `child` emits `close`.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: ff49a6c |
| function normalizeWindowsCommand(cmd, args) { | ||
| if (process.platform !== "win32") { | ||
| return { command: cmd, commandArgs: args }; | ||
| } | ||
| const extension = path.extname(cmd).toLowerCase(); | ||
| if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) { | ||
| return { command: cmd, commandArgs: args }; | ||
| } | ||
|
|
||
| const hasPathSeparator = cmd.includes("\\") || cmd.includes("/"); | ||
| const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd; | ||
| const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd; | ||
|
|
||
| const command = process.env.comspec || "cmd.exe"; | ||
| const commandArgs = ["/d", "/s", "/c", commandLine]; | ||
| return { command, commandArgs }; | ||
| } |
There was a problem hiding this comment.
assertSafeWindowsShellArgs no longer called before cmd.exe command line construction
The old createSpawnOptions called assertSafeWindowsShellArgs(args) when shouldUseShellForCommand returned true (which it did for any .cmd/.bat/.com file). The new normalizeWindowsCommand function builds a cmd.exe /c command line by simply joining args with spaces — but never validates those args against the WINDOWS_UNSAFE_SHELL_ARG_PATTERN.
Because run() passes ["install", ...rest] or ["run", script, ...rest] where rest comes directly from process.argv, a user-supplied argument containing &, |, <, >, ^, %, or ! will be interpreted as cmd.exe metacharacters when executed:
node scripts/ui.js dev "myarg & evil-command"
…would produce commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command' which cmd.exe would split at & and execute both parts.
assertSafeWindowsShellArgs should be called before joining args into the command line string:
function normalizeWindowsCommand(cmd, args) {
if (process.platform !== "win32") {
return { command: cmd, commandArgs: args };
}
const extension = path.extname(cmd).toLowerCase();
if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
return { command: cmd, commandArgs: args };
}
assertSafeWindowsShellArgs(args); // add this validation
const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;
const command = process.env.comspec || "cmd.exe";
const commandArgs = ["/d", "/s", "/c", commandLine];
return { command, commandArgs };
}Prompt To Fix With AI
This is a comment left during a code review.
Path: scripts/ui.js
Line: 99-115
Comment:
**`assertSafeWindowsShellArgs` no longer called before cmd.exe command line construction**
The old `createSpawnOptions` called `assertSafeWindowsShellArgs(args)` when `shouldUseShellForCommand` returned `true` (which it did for any `.cmd`/`.bat`/`.com` file). The new `normalizeWindowsCommand` function builds a cmd.exe `/c` command line by simply joining args with spaces — but never validates those args against the `WINDOWS_UNSAFE_SHELL_ARG_PATTERN`.
Because `run()` passes `["install", ...rest]` or `["run", script, ...rest]` where `rest` comes directly from `process.argv`, a user-supplied argument containing `&`, `|`, `<`, `>`, `^`, `%`, or `!` will be interpreted as cmd.exe metacharacters when executed:
```
node scripts/ui.js dev "myarg & evil-command"
```
…would produce `commandLine = '"C:\\...\\pnpm.CMD" myarg & evil-command'` which cmd.exe would split at `&` and execute both parts.
`assertSafeWindowsShellArgs` should be called before joining args into the command line string:
```js
function normalizeWindowsCommand(cmd, args) {
if (process.platform !== "win32") {
return { command: cmd, commandArgs: args };
}
const extension = path.extname(cmd).toLowerCase();
if (!WINDOWS_SHELL_EXTENSIONS.has(extension)) {
return { command: cmd, commandArgs: args };
}
assertSafeWindowsShellArgs(args); // add this validation
const hasPathSeparator = cmd.includes("\\") || cmd.includes("/");
const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd;
const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd;
const command = process.env.comspec || "cmd.exe";
const commandArgs = ["/d", "/s", "/c", commandLine];
return { command, commandArgs };
}
```
How can I resolve this? If you propose a fix, please make it concise.| child.stdout?.on("data", (d) => { | ||
| stdout += d.toString(); | ||
| const chunk = d as Buffer; | ||
| stdout += stdoutDecoder ? stdoutDecoder.decode(chunk, { stream: true }) : chunk.toString(); | ||
| armNoOutputTimer(); | ||
| }); | ||
| child.stderr?.on("data", (d) => { | ||
| stderr += d.toString(); | ||
| const chunk = d as Buffer; | ||
| stderr += stderrDecoder ? stderrDecoder.decode(chunk, { stream: true }) : chunk.toString(); | ||
| armNoOutputTimer(); |
There was a problem hiding this comment.
TextDecoder not flushed on stream close
TextDecoder instances created with { stream: true } maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to decoder.decode(new Uint8Array(0)) (i.e., without stream: true) after the last data event.
For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The close handler should flush both decoders and append any remaining bytes to stdout/stderr:
child.on("close", (code, signal) => {
if (settled) return;
// flush any buffered multi-byte tail
if (stdoutDecoder) stdout += stdoutDecoder.decode();
if (stderrDecoder) stderr += stderrDecoder.decode();
settled = true;
// …rest of existing close handler…The same pattern applies to the onStdout/onStderr handlers in src/process/supervisor/adapters/child.ts (lines 114–132) — a corresponding flush should be added when the child emits close.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/process/exec.ts
Line: 302-310
Comment:
**TextDecoder not flushed on stream close**
`TextDecoder` instances created with `{ stream: true }` maintain an internal buffer for incomplete multi-byte sequences that span chunk boundaries. That buffer is never flushed — there is no final call to `decoder.decode(new Uint8Array(0))` (i.e., without `stream: true`) after the last data event.
For GBK/GB18030 output, if the last data event ends on the first byte of a two-byte character, that byte will silently be dropped. The `close` handler should flush both decoders and append any remaining bytes to `stdout`/`stderr`:
```js
child.on("close", (code, signal) => {
if (settled) return;
// flush any buffered multi-byte tail
if (stdoutDecoder) stdout += stdoutDecoder.decode();
if (stderrDecoder) stderr += stderrDecoder.decode();
settled = true;
// …rest of existing close handler…
```
The same pattern applies to the `onStdout`/`onStderr` handlers in `src/process/supervisor/adapters/child.ts` (lines 114–132) — a corresponding flush should be added when the `child` emits `close`.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ff49a6caaf
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| const hasPathSeparator = cmd.includes("\\") || cmd.includes("/"); | ||
| const quotedCmd = hasPathSeparator ? `"${cmd}"` : cmd; | ||
| const commandLine = args.length > 0 ? `${quotedCmd} ${args.join(" ")}` : quotedCmd; |
There was a problem hiding this comment.
Escape args when composing cmd.exe /c command
On Windows, this builds the cmd.exe /c command line with args.join(" "), so cmd.exe reparses unescaped user arguments instead of receiving the original argv values. Any forwarded arg containing spaces (e.g. --grep "foo bar") is split into multiple args, and metacharacters like &/| are interpreted by cmd.exe, which changes behavior and can execute unintended subcommands. This regresses the previous guarded shell path for .cmd/.bat execution.
Useful? React with 👍 / 👎.
|
Can confirm this works |
Add a shared Windows console encoding helper that detects the active code page via chcp and decodes captured output with TextDecoder (e.g. CP936/GBK, GB18030). Use that decoder for streamed stdout/stderr in the supervisor child adapter so Windows commands no longer appear garbled when they emit non‑UTF8 text. Apply the same decoding in runCommandWithTimeout() for non‑supervisor spawn paths. Refactor node-host to reuse the shared decoder and update tests accordingly. Harden Windows spawning in scripts/ui.js by avoiding shell: true for resolved absolute paths and routing .cmd/.bat through an explicit cmd.exe /c wrapper.