fix: add encoding="utf-8" to prompt file open() calls in script_runner (Windows CP950) #607

Merged
sergio-sisternes-epam merged 3 commits into microsoft:main from edenfunf:fix/cp950-file-encoding
Apr 14, 2026

@edenfunf edenfunf commented Apr 7, 2026

Summary

Fixes a UnicodeDecodeError crash when reading or writing .prompt.md files on Windows systems with a non-UTF-8 locale encoding.

Closes #604.

Root Cause

Three open() calls in script_runner.py rely on the platform default encoding. On Windows with a CP950/CP936/CP932 locale, Python decodes the file with that code page instead of UTF-8, so UTF-8 multi-byte sequences (CJK characters, emoji, or other non-ASCII content in a prompt file) can raise UnicodeDecodeError.

Changes

src/apm_cli/core/script_runner.py — 3 lines changed

| Location | Before | After |
| --- | --- | --- |
| ScriptRunner._execute_script (read compiled file) | `open(compiled_path, "r")` | `open(compiled_path, "r", encoding="utf-8")` |
| PromptCompiler.compile (read source prompt) | `open(prompt_path, "r")` | `open(prompt_path, "r", encoding="utf-8")` |
| PromptCompiler.compile (write compiled output) | `open(output_path, "w")` | `open(output_path, "w", encoding="utf-8")` |

All other open() and read_text() calls in the codebase already carry an explicit encoding parameter; this brings the remaining three into line.
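
A claim like this can be checked mechanically. The sketch below is one possible approach and is not part of this PR: `find_text_open_without_encoding` is a hypothetical helper that walks a module's AST and reports builtin open() calls with no encoding keyword, skipping calls whose literal mode is binary. It does not cover `Path.read_text()` or `Path.open()`. Python 3.10+ also has an opt-in EncodingWarning (`-X warn_default_encoding`) that flags the same pattern at runtime.

```python
import ast


def find_text_open_without_encoding(source: str) -> list[int]:
    """Return line numbers of builtin open() calls lacking encoding=.

    Calls whose literal mode contains "b" are skipped, because binary
    mode does not accept an encoding.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        is_open_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
        )
        if not is_open_call:
            continue
        if any(kw.arg == "encoding" for kw in node.keywords):
            continue
        # The mode is either the second positional argument or mode=...
        if len(node.args) > 1:
            mode = node.args[1]
        else:
            mode = next(
                (kw.value for kw in node.keywords if kw.arg == "mode"), None
            )
        if (
            isinstance(mode, ast.Constant)
            and isinstance(mode.value, str)
            and "b" in mode.value
        ):
            continue
        hits.append(node.lineno)
    return sorted(hits)


EXAMPLE = '''
with open(path, "r") as f:
    a = f.read()
with open(path, "r", encoding="utf-8") as f:
    b = f.read()
with open(path, "rb") as f:
    c = f.read()
'''

print(find_text_open_without_encoding(EXAMPLE))  # → [2]
```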

Test Plan

  • uv run pytest tests/unit/test_script_runner.py -x -v — all tests pass
  • On Windows CP950: create a .prompt.md containing CJK characters, run apm run start — no UnicodeDecodeError
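
Before running the manual check, it may help to confirm that the machine actually uses a non-UTF-8 default. A small sketch, assuming only the standard library; `default_text_encoding` is a hypothetical helper name:

```python
import locale


def default_text_encoding() -> str:
    """Return the encoding open() falls back to when encoding= is omitted."""
    # Passing False avoids changing the process locale as a side effect.
    return locale.getpreferredencoding(False)


print(default_text_encoding())
```

On a Windows machine set to a Traditional Chinese locale this would be expected to report a CP950 name rather than UTF-8; if it reports UTF-8 (for example because Python's UTF-8 mode is enabled), the original crash will not reproduce there.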

…r (Windows CP950)

PromptCompiler.compile() and _resolve_prompt_file() open .prompt.md files
with plain open() which defaults to the system locale encoding. On Windows
systems set to CP950/CP936/CP932 (Traditional Chinese/Simplified Chinese/Japanese), a UTF-8 encoded
prompt file containing any multibyte character causes:

  UnicodeDecodeError: 'cp950' codec can't decode byte 0x8b in position 12

Three open() calls were missing explicit encoding:
- compiled_path read in ScriptRunner._execute_script
- prompt_path read in PromptCompiler.compile
- output_path write in PromptCompiler.compile

Adding encoding="utf-8" to all three matches the behaviour of the rest of
the codebase and fixes the crash for any non-UTF-8 Windows locale.
@edenfunf edenfunf requested a review from danielmeppiel as a code owner April 7, 2026 06:03
@sergio-sisternes-epam sergio-sisternes-epam merged commit 84d0401 into microsoft:main Apr 14, 2026
6 checks passed
danielmeppiel added a commit that referenced this pull request Apr 30, 2026
…1068)

* fix: add explicit UTF-8 encoding to open() calls across 5 modules

Successor to #635 (closed-no-CLA). #604 was fixed by #607 in
script_runner.py only. This extends the same fix to the remaining
files where Windows cp950 / cp1252 locales would crash on UTF-8
content:

- src/apm_cli/config.py: global config read/write
- src/apm_cli/marketplace/client.py: marketplace cache I/O
- src/apm_cli/marketplace/registry.py: registry persistence
- src/apm_cli/models/plugin.py: plugin metadata read
- src/apm_cli/runtime/copilot_runtime.py: MCP config read

Adds round-trip unit tests with non-ASCII content for each module
to catch regressions.

Refs #604 (closed). Closes #635 (closed-no-CLA, re-derived in microsoft/apm).

Co-authored-by: Copilot <[email protected]>

* test: use unicode escapes instead of raw non-ASCII literals

Addresses Copilot review on #1068. The repo's encoding rule requires
Python source files to stay within printable ASCII. Replace the CJK /
accented literals introduced for round-trip tests with \uXXXX escapes
so runtime values remain non-ASCII (still exercising the UTF-8
encoding fix) without putting non-ASCII bytes in .py source.

Affects 5 test files; no behavioral change.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
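
The round-trip tests described in the referenced commit are not shown on this page. The following is a minimal sketch of what such a test could look like, assuming a pytest-style test function; `round_trip`, `NON_ASCII_TEXT`, and the file name are hypothetical. The `\uXXXX` escapes keep the source file within ASCII while the runtime string still contains accented and CJK characters.

```python
import tempfile
from pathlib import Path

# Escapes keep this source file ASCII-only; the runtime value is not.
NON_ASCII_TEXT = "caf\u00e9 \u4e2d\u6587"


def round_trip(text: str, directory: Path) -> str:
    """Write text as UTF-8, then read it back with the same encoding."""
    target = directory / "sample.prompt.md"  # hypothetical file name
    with open(target, "w", encoding="utf-8") as fh:
        fh.write(text)
    with open(target, "r", encoding="utf-8") as fh:
        return fh.read()


def test_round_trip_non_ascii() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        assert round_trip(NON_ASCII_TEXT, Path(tmp)) == NON_ASCII_TEXT


test_round_trip_non_ascii()
```

Because both the write and the read pin `encoding="utf-8"`, the test gives the same result regardless of the locale the test runner uses.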
Development

Successfully merging this pull request may close these issues.

bug: UnicodeDecodeError reading .prompt.md on Windows CP950 — open() missing encoding parameter
