fix: replace Unicode checkmark with ASCII for Windows encoding (#535)#681
fix: replace Unicode checkmark with ASCII for Windows encoding (#535)#681bensig merged 2 commits intoMemPalace:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent Windows console encoding crashes during progress output by replacing the Unicode checkmark (✓) with an ASCII alternative in mining/splitting scripts.
Changes:
- Replace
✓with+inconvo_miner.pyprogress output. - Replace
✓with+insplit_mega_files.pyper-chunk write output.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| mempalace/convo_miner.py | Swaps the per-file progress marker from Unicode to ASCII. |
| mempalace/split_mega_files.py | Swaps the per-chunk success marker from Unicode to ASCII. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| total_drawers += drawers_added | ||
| print(f" ✓ [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers_added}") | ||
| print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers_added}") |
There was a problem hiding this comment.
This change removes the checkmark here, but the same Unicode ✓ progress marker still exists in mempalace/miner.py (used by the main mempalace mine flow). If the goal is to avoid Windows cp1251/cp1252 UnicodeEncodeErrors during progress output, that remaining print will still crash—please update it (or centralize the status marker) so all mining commands are consistent.
There was a problem hiding this comment.
Fixed — added miner.py checkmark and all box-drawing/arrow chars (─, →) across both files. 5044464
| print(f" [{i + 1}/{len(boundaries) - 1}] {name} ({len(chunk)} lines)") | ||
| else: | ||
| out_path.write_text("".join(chunk), encoding="utf-8") | ||
| print(f" ✓ {name} ({len(chunk)} lines)") | ||
| print(f" + {name} ({len(chunk)} lines)") | ||
|
|
There was a problem hiding this comment.
Replacing ✓ with '+' helps, but this script still prints other non-ASCII characters (e.g., the box-drawing '─' separators and the '→' arrow). On Windows consoles/codepages like cp1251/cp1252 those can also raise UnicodeEncodeError, so the encoding crash may persist even after this change. Consider switching those to ASCII equivalents (e.g., '-' and '->') or explicitly ensuring stdout is UTF-8 / using a safe write strategy.
There was a problem hiding this comment.
Fixed — replaced ─ with - and → with -> in split_mega_files.py. All non-ASCII progress markers are now ASCII. 5044464
…emPalace#681-684 new Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ged, add MemPalace#681, bump test count to 715 Co-Authored-By: Claude Opus 4.6 <[email protected]>
5044464 to
d092918
Compare
…Palace#535) Windows terminals using cp1251/cp1252 crash on the Unicode ✓ (U+2713) in progress output. Replace with ASCII + in convo_miner.py and split_mega_files.py. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Also fix miner.py checkmark and box-drawing/arrow chars (─, →) in both miner.py and split_mega_files.py that would crash on cp1251/cp1252. Co-Authored-By: Claude Opus 4.6 <[email protected]>
586d302 to
15ea385
Compare
fork-direction.md and TODO-fork-improvements.md were scattered strategic thinking that belongs in the front-door README: competitive landscape, roadmap (P0-P6), and open problems. Merges them in and deletes the standalone files. PR-status tables (README + CLAUDE.md) were stale after today's rebases of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect current mergeable state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863. test_readme_claims.py skips MCP-tool-table and dialect-reference checks when absent — our slimmed fork README doesn't reproduce upstream's tool table structure. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
|
Friendly ping on review — this is a one-character change swapping the Unicode ✓ for ASCII to fix Windows GBK / cp1252 encoding errors (#535). Clean merge against current |
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. 🤖 Generated with Claude Code |
…emPalace#1036 Overnight/morning: - MemPalace#681, MemPalace#1000, MemPalace#1023 merged — moved from "Still ahead" to "Merged upstream (post-v3.3.1)" - bensig reviewed MemPalace#659 (wing_ prefix + agent filter) and MemPalace#1021 (silent_guard default) — both addressed on their PR branches - MemPalace#673 needed re-rebase after overnight develop merges; done - MemPalace#1036 filed: paginate miner.status(), closes upstream MemPalace#802 and MemPalace#1015 Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Version bumps across pyproject.toml, mempalace/version.py, README badge, uv.lock, and plugin manifests (.claude-plugin/*, .codex-plugin/*). CHANGELOG aligned with main (post-3.3.1) and a new [3.3.2] section added covering the 11 PRs merged on develop since v3.3.1 — silent-transcript-drop fix + tandem sweeper (#998), None-metadata guards (#999, #1013), chromadb ≥1.5.4 for Py 3.13/3.14 (#1010), Windows Unicode (#681), HNSW quarantine recovery (#1000), PID stacking guard (#1023), doc-path cleanup (#996, #1012), and RFC 001/002 internal scaffolding (#995, #1014, #990).
Non-ASCII glyphs (regression of the #681 class of Windows UnicodeEncodeError): - mempalace/cli.py: "✗" → "ERROR:", "⚠" → "WARNING:", em dash → "-" - mempalace/sweeper.py: "⚠" → "WARNING:" Backend arg validation: - mempalace/backends/chroma.py: `_normalize_get_collection_args` now raises TypeError on unexpected trailing positional args instead of silently dropping them — surfaces call-site bugs early. Docs site: - website/.vitepress/config.mts: gate Google Analytics scripts behind MEMPALACE_DOCS_GA_ID env var (default off). Self-hosters no longer get GA injected unconditionally. Landing page SPA hygiene: - website/.vitepress/theme/landing/useLandingEffects.js: collect all IntersectionObserver disconnects and removeEventListener thunks in a shared `cleanups` registry; drain it in `onBeforeUnmount` so observers and form/replay listeners don't leak across SPA navigations.
Resolutions: - `.claude-plugin/.mcp.json`, `plugin.json` — adopt upstream's `mempalace-mcp` console-script command (added via upstream MemPalace#340 for pipx/uv). Run `pip install -e .` in plugin venv after merge to install the entry point. - `.claude-plugin/hooks/*.sh`, `hooks/*.sh` — adopt upstream's console-command resolution order (`mempalace` script → `python3 -m mempalace` → `python`). `MEMPAL_PYTHON` override still works inside `hooks_cli.py`. - `mempalace/hooks_cli.py`, `tests/test_hooks_cli.py` — keep fork's `_mempalace_python()` helper (fork-ahead #4); upstream only had `sys.executable`, which loses MEMPAL_PYTHON override. - `mempalace/miner.py` — keep fork's concurrent mining path (fork-ahead #1), apply upstream's unicode-`✓` → ASCII-`+` fix (MemPalace#681) to both paths. - `mempalace/backends/chroma.py` — take upstream's refined `quarantine_stale_hnsw` docstring (it's the version merged via our own MemPalace#1000). Brought in: 33 upstream commits including Belarusian/Chinese/German/Spanish/French entity detection, console-script entry points, hook plugin-root space quoting, and v3.3.2 tag (which contains our MemPalace#681/MemPalace#1000/MemPalace#1023). Tests: 1096 passed, 106 deselected (benchmarks). Ruff clean.
…#1023 merged, MemPalace#673 rebased - Bump fork-ahead section header to "after v3.3.2" - Strike #11 (quarantine_stale_hnsw) and MemPalace#18 (PID file guard) as merged-into-upstream-via-v3.3.2, keep entries for traceability - Add v3.3.2-shipped items to "Merged into upstream (post-v3.3.1)" - Rebuild PR table: 10 merged / 7 open / 7 closed; add MemPalace#1024 row, reclassify MemPalace#681/MemPalace#1000/MemPalace#1023 as merged, note MemPalace#673 rebased 2026-04-21 - Annotate MemPalace#661 status with the GitHub review-state machine caveat (CHANGES_REQUESTED persists until reviewer dismisses, not owed) - Bump test count 1063 → 1096 post-merge
…unts - Upstream released v3.3.2 on 2026-04-21 (our MemPalace#681/MemPalace#1000/MemPalace#1023) - Drawer count 152,682 → 165,632; wings 23 → 28; tests 1063 → 1096 - MemPalace#673 now MERGEABLE after fresh rebase + squash to 1 commit - MemPalace#661 status clarified: CHANGES_REQUESTED persists until reviewer dismisses, not an outstanding owe - Merged-upstream section split out v3.3.2 release group
…ace#1034) GBK consoles (Windows PowerShell/CMD default) cannot encode U+2713 (✓), U+2717 (✗), and U+2014 (—). The same class of UnicodeEncodeError fixed in miner.py via MemPalace#681 affects closet_llm.py and cli.py. Replace with ASCII equivalents: [OK], [FAIL], [!], and hyphen.
…ace#1034) GBK consoles (Windows PowerShell/CMD default) cannot encode U+2713 (✓), U+2717 (✗), and U+2014 (—). The same class of UnicodeEncodeError fixed in miner.py via MemPalace#681 affects closet_llm.py and cli.py. Replace with ASCII equivalents: [OK], [FAIL], [!], and hyphen.
Summary
+inconvo_miner.pyandsplit_mega_files.pyFiles changed
mempalace/convo_miner.py— line 359mempalace/split_mega_files.py— line 227Test plan
🤖 Generated with Claude Code