fix: add ensure_ascii=False to json.dumps in MCP server #425
barry3406 wants to merge 1 commit into MemPalace:develop
json.dumps without ensure_ascii=False escapes non-ASCII characters to \uXXXX sequences, which breaks when the input contains lone surrogate characters from certain encoding round-trips (e.g. Chinese text through some MCP clients). The serialization crashes instead of returning valid JSON.

Applied to both the tool result serialization and the main loop stdout writer. Added tests verifying non-ASCII text survives the round-trip.

Fixes MemPalace#359
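To make the difference between the two modes concrete, here is a minimal sketch using only the standard library. The payload is sample data chosen for illustration; it is not taken from the MemPalace code base.

```python
import json

# Sample CJK payload, chosen only for illustration.
payload = {"content": "记忆宫殿"}

escaped = json.dumps(payload)                  # default: ensure_ascii=True
raw = json.dumps(payload, ensure_ascii=False)  # characters are kept as-is

print(escaped.isascii())        # True: every non-ASCII character became \uXXXX
print(raw.isascii())            # False: the original characters are still there
print(len(escaped) > len(raw))  # True: each escape takes six characters
```

Both strings are valid JSON for the same object; only the text representation differs.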
web3guru888 left a comment
Clean fix, +1. We use ensure_ascii=False in our integration for exactly this reason — scientific terms with Greek letters (α, β, γ), mathematical symbols (∑, ∇), and CJK references get mangled into \uXXXX escape sequences without it.
The change covers both of the right spots: the tool result serialization in handle_request and the stdout writer in main(). Good that the tests cover round-trip serialization with actual CJK characters rather than just asserting no exception.
One consideration for downstream: any MCP client that was parsing the escaped \uXXXX sequences will now see raw UTF-8 bytes instead. This is correct per the JSON spec (both are valid), but if anyone was doing naive string matching on the escaped form, they'd need to update. Probably not a concern in practice.
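The downstream concern can be illustrated with a short sketch. The message below is made-up sample data, and the substring check stands in for whatever naive matching a client might be doing.

```python
import json

# Made-up sample message mixing Greek, math, and CJK characters.
message = {"result": "α-decay ∑ 记忆"}

escaped_line = json.dumps(message)                  # what the server emitted before
raw_line = json.dumps(message, ensure_ascii=False)  # what it emits after the change

# Naive substring matching depends on which form the server emitted.
print("α" in escaped_line)  # False
print("α" in raw_line)      # True

# Parsing first gives the same object either way.
print(json.loads(escaped_line) == json.loads(raw_line))  # True
```

A client that parses each line with a JSON parser before inspecting it should not notice the change.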
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
This PR addresses Windows encoding issues and improves non-ASCII character handling throughout the MCP server.

**Changes:**

1. **stdin/stdout UTF-8 enforcement (MemPalace#503)**
   - Windows defaults to cp936/GBK, causing crashes with CJK content
   - Wrapped stdin with 'surrogateescape' error handler (tolerant input)
   - Wrapped stdout with 'replace' error handler (safe output)
   - Added hasattr() checks for edge cases (IDE/test environments)
2. **MCP tool result serialization**
   - Added ensure_ascii=False to tool result json.dumps() (line 907)
   - Preserves Unicode characters in tool responses
3. **JSON-RPC response serialization (MemPalace#359)**
   - Added ensure_ascii=False to main loop json.dumps() (line 949)
   - Prevents Chinese characters from being escaped to \uXXXX

**Technical Details:**

- stdin uses 'surrogateescape': handles malformed input gracefully
- stdout uses 'replace': prevents lone surrogates in JSON-RPC responses (per recommendation from MemPalace#503 discussion to avoid breaking client parsers)
- All json.dumps() calls now preserve Unicode with ensure_ascii=False

Fixes MemPalace#503
Related to MemPalace#359 (already has PR MemPalace#425 with tests)

Made-with: Cursor
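The stream handling described in this commit message could look roughly like the sketch below. The helper name `wrap_utf8_streams` is hypothetical, and using `io.TextIOWrapper` over the `.buffer` attribute is an assumption about how the wrapping was done. The actual commit may differ.

```python
import io


def wrap_utf8_streams(stdin, stdout):
    """Return (stdin, stdout) re-wrapped as UTF-8 text streams where possible.

    Hypothetical helper: streams without a .buffer attribute (for example
    io.StringIO in tests or some IDE consoles) are returned unchanged.
    """
    if hasattr(stdin, "buffer"):
        # Undecodable input bytes become lone surrogates instead of raising.
        stdin = io.TextIOWrapper(
            stdin.buffer, encoding="utf-8", errors="surrogateescape"
        )
    if hasattr(stdout, "buffer"):
        # Characters that cannot be encoded (lone surrogates) are written as "?".
        stdout = io.TextIOWrapper(
            stdout.buffer, encoding="utf-8", errors="replace", line_buffering=True
        )
    return stdin, stdout


# Possible use at server start-up (requires `import sys`):
# sys.stdin, sys.stdout = wrap_utf8_streams(sys.stdin, sys.stdout)
```

The 'replace' handler on output matters once ensure_ascii=False is in use: a lone surrogate that reaches the writer cannot be encoded as strict UTF-8, so without a tolerant handler the write would raise UnicodeEncodeError.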
Fixes #359

`json.dumps` in the MCP server didn't set `ensure_ascii=False`, so non-ASCII text (Chinese, Japanese, etc.) would either get escaped to `\uXXXX` or crash outright when lone surrogate characters showed up from certain MCP client encoding paths.

Before: Chinese text → `UnicodeEncodeError` or garbled `\uXXXX` output.

After: Chinese text → preserved as-is in the JSON output.

Added two tests in `test_mcp_server.py`:

- `test_tool_result_preserves_unicode`: end-to-end through `handle_request` with Chinese text
- `test_main_loop_writes_unicode`: verifies the serialization doesn't crash

All 35 tests pass.
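For readers without access to the diff, a round-trip test in this spirit might look like the sketch below. It is not the PR's actual test code: `write_response` is a hypothetical stand-in for the server's stdout writer, and the response shape is sample data.

```python
import io
import json


def write_response(response, stream):
    """Hypothetical stand-in for the server's stdout writer."""
    stream.write(json.dumps(response, ensure_ascii=False) + "\n")
    stream.flush()


def test_writer_preserves_unicode():
    out = io.StringIO()
    # Sample JSON-RPC style response carrying Chinese text.
    response = {"jsonrpc": "2.0", "id": 1, "result": {"text": "你好，世界"}}

    write_response(response, out)
    line = out.getvalue()

    assert "你好，世界" in line           # text is present unescaped
    assert "\\u" not in line             # no \uXXXX escapes were produced
    assert json.loads(line) == response  # the line still parses to the same object
```

Asserting on the parsed result as well as the raw line checks both that the text is preserved and that the output remains valid JSON.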