fix(cli): paginate miner.status() to remove 10K drawer truncation #1
Merged
jphein merged 1 commit into jphein:feat/mcp-hooks-export on Apr 10, 2026
The CLI `mempalace status` command still capped at 10,000 drawers after this PR's MCP server fix, because miner.status() retained the original `col.get(limit=10000, include=["metadatas"])` call. On palaces larger than 10K, it printed a wrong total and silently dropped every wing past the cutoff (tested on a 14,902-drawer / 17-wing palace: CLI reported "10000 drawers" and listed only 11 wings). Replace the single bounded call with the same paginated offset loop the new `_fetch_all_metadata()` helper uses in mcp_server.py (1,000-item batches), and report the true total via `col.count()`. Closes the CLI half of MemPalace#478.
jphein pushed a commit that referenced this pull request on Apr 10, 2026:
…silon, add schema bounds
- Paginate miner.status() past 10K drawer limit (fixes PR #1 from psaghelyi)
- Use math.isclose(abs_tol=0.001) instead of abs() < 0.01 for mtime dedup
- Add minimum/maximum bounds to min_similarity in MCP search tool schema
- Add debug logging to bare except in file_already_mined()
https://claude.ai/code/session_01LAi5NQmr4KKyx6QNwZrRXq
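The mtime comparison in that commit can be illustrated with a minimal sketch. The helper name `mtimes_match` and the sample timestamps are hypothetical; only `math.isclose` and its `rel_tol` / `abs_tol` parameters come from the standard library. The sketch assumes the intent is a purely absolute tolerance, so it sets `rel_tol=0.0` explicitly; whether the fork's code does the same is not stated here.

```python
import math


def mtimes_match(stored, current, epsilon=0.001):
    """Return True when two mtimes differ by at most `epsilon` seconds.

    Hypothetical helper for illustration. rel_tol=0.0 keeps the check
    purely absolute, so the tolerance does not scale with the size of
    the timestamps being compared.
    """
    return math.isclose(stored, current, rel_tol=0.0, abs_tol=epsilon)


stored = 1_700_000_000.0      # mtime recorded when the file was mined
drifted = stored + 0.0005     # sub-millisecond float round-trip noise
touched = stored + 0.5        # file rewritten half a second later

print(mtimes_match(stored, drifted))
print(mtimes_match(stored, touched))
# Same comparison, but leaving rel_tol at its default of 1e-09:
print(math.isclose(stored, touched, abs_tol=0.001))
```

One design note: `math.isclose` accepts a pair when the difference is within the larger of `abs_tol` and `rel_tol * max(|a|, |b|)`. For epoch timestamps around 1.7e9 the default `rel_tol` of 1e-09 alone permits roughly 1.7 seconds of difference, so the last call above treats the half-second change as a match.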
jphein added a commit that referenced this pull request on Apr 11, 2026:
Hybrid search (TODO #1): when vector results are poor (best distance > 1.0), automatically falls back to keyword text-match via ChromaDB where_document.$contains. Extracts most distinctive non-stopword token from query, or accepts explicit keyword param. Results merged, deduped, sorted by distance. MCP server exposes new keyword parameter.
Wing fix (TODO #0): _ingest_transcript() now derives project wing from Claude Code transcript path (-Projects-<name>/) instead of hardcoding "sessions". Per-project search now finds auto-mined content.
692 tests pass (22 new).
Co-Authored-By: Claude Opus 4.6 <[email protected]>
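The hybrid-search fallback described in the commit above might look roughly like the following sketch. Everything here is an assumption made for illustration: the function names, the small stopword set, the use of token length as a stand-in for "most distinctive", and the shape of a hit (a dict with `id` and `distance`). The actual ChromaDB query calls are left out.

```python
import re

# Illustrative stopword list; the real list is not shown in the commit.
STOPWORDS = {"a", "an", "and", "for", "how", "in", "is", "of", "the", "to", "what", "where"}


def needs_keyword_fallback(vector_hits, threshold=1.0):
    """True when there are no vector hits or the best distance exceeds the threshold."""
    return not vector_hits or min(h["distance"] for h in vector_hits) > threshold


def pick_keyword(query):
    """Pick one non-stopword token; longest token is an assumed proxy for 'distinctive'."""
    tokens = re.findall(r"[A-Za-z0-9_]+", query.lower())
    candidates = [t for t in tokens if t not in STOPWORDS]
    return max(candidates, key=len) if candidates else None


def merge_results(vector_hits, keyword_hits):
    """Merge both hit lists, keep the closest copy of each id, sort by distance."""
    best = {}
    for hit in [*vector_hits, *keyword_hits]:
        previous = best.get(hit["id"])
        if previous is None or hit["distance"] < previous["distance"]:
            best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["distance"])


vector_hits = [{"id": "d1", "distance": 1.4}, {"id": "d2", "distance": 1.2}]
keyword_hits = [{"id": "d2", "distance": 0.9}, {"id": "d3", "distance": 1.1}]

if needs_keyword_fallback(vector_hits):
    print(pick_keyword("where is the quarantine docstring"))
    print([h["id"] for h in merge_results(vector_hits, keyword_hits)])
```

In a real implementation the chosen keyword would presumably be passed to a `where_document={"$contains": keyword}` filter, as the commit message indicates.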
jphein added a commit that referenced this pull request on Apr 11, 2026:
Co-Authored-By: Claude Opus 4.6 <[email protected]>
jphein added a commit that referenced this pull request on Apr 19, 2026:
Two items in "Fork Changes (still ahead of upstream after v3.3.1 merge)" were never, or are no longer, fork-only. Demote both:
1. Epsilon mtime comparison (palace.py). Upstream merged Arnold Wender's equivalent fix as PR MemPalace#610 on 2026-04-12 (commit bb7ed80). Their threshold is 0.001 vs our fork's 0.01, but abs(stored - current) < epsilon is semantically identical. Moved to "Merged into upstream (post-v3.3.1)".
2. ".jsonl exempt from JUNK_FILE_SIZE cap". The description was wrong. The actual change (commit 560fdbd) adds ".jsonl" to READABLE_EXTENSIONS in miner.py: a whitelist addition, not a size-cap exemption. And it was authored by MSL (upstream maintainer) at the same SHA on upstream/develop. Never was a fork contribution. Moved to "Pulled in from upstream/develop". Related: upstream also raised MAX_FILE_SIZE 10MB → 500MB in d137d12 (the actual size-cap fix, separate concern).
Clarified that item now at #1 (bulk_check_mined) is fork-only and independent of the mtime comparison fix. Renumbered remaining "still ahead" items 1-18.
Co-Authored-By: Claude Opus 4.7 <[email protected]>
jphein added a commit that referenced this pull request on Apr 21, 2026:
Resolutions:
- `.claude-plugin/.mcp.json`, `plugin.json`: adopt upstream's `mempalace-mcp` console-script command (added via upstream MemPalace#340 for pipx/uv). Run `pip install -e .` in plugin venv after merge to install the entry point.
- `.claude-plugin/hooks/*.sh`, `hooks/*.sh`: adopt upstream's console-command resolution order (`mempalace` script → `python3 -m mempalace` → `python`). `MEMPAL_PYTHON` override still works inside `hooks_cli.py`.
- `mempalace/hooks_cli.py`, `tests/test_hooks_cli.py`: keep fork's `_mempalace_python()` helper (fork-ahead #4); upstream only had `sys.executable`, which loses MEMPAL_PYTHON override.
- `mempalace/miner.py`: keep fork's concurrent mining path (fork-ahead #1), apply upstream's unicode-`✓` → ASCII-`+` fix (MemPalace#681) to both paths.
- `mempalace/backends/chroma.py`: take upstream's refined `quarantine_stale_hnsw` docstring (it's the version merged via our own MemPalace#1000).
Brought in: 33 upstream commits including Belarusian/Chinese/German/Spanish/French entity detection, console-script entry points, hook plugin-root space quoting, and v3.3.2 tag (which contains our MemPalace#681/MemPalace#1000/MemPalace#1023).
Tests: 1096 passed, 106 deselected (benchmarks). Ruff clean.
jphein added a commit that referenced this pull request on Apr 24, 2026:
…ostgres
Three things merged into one README pass:
1. Badge: link version-3.3.4 to jphein/mempalace/releases (the v3.3.4 tag we just pushed) and add an upstream-3.3.3 secondary badge so readers can tell fork vs upstream version at a glance. Was sitting uncommitted from earlier today.
2. Multi-client coordination section: replaced the three-fix v3.3.4 summary with a four-fix one. Added @felipetruman's MemPalace#976 num_threads pin (cherry-picked at 552a0d7) as fix #1, the actual root-cause fix. Reframed our MemPalace#1171/MemPalace#1173/MemPalace#1177 as defense-in-depth around symptoms. Walked back palace-daemon from "primary concurrency story in progress" to "deferred pending observation": with MemPalace#976's fix in place, the daemon's same-machine value drops; multi-machine and Windows remain its differentiators but neither is current pain.
3. Postgres + pgvector: walked back from "parallel track" framing to "long-term option, no immediate move" for the same reason. Migration cost stays real, current pain is mitigated, decision deferred until v3.3.4 stack is observed in production or TS rewrite ships.
Removed two stale paragraphs that were left over from the previous "daemon as primary" framing.
jphein pushed a commit that referenced this pull request on May 3, 2026:
The MCP `mempalace_get_drawer` tool returned the entire raw drawer metadata blob to any connected client, and the `source_file` field in that blob is the absolute filesystem path written by the miners (`miner.py`, `convo_miner.py`: `source_file = str(filepath)`). On a single-user local deployment this is self-disclosure, but in nested-agent or multi-server MCP topologies the client is a separate trust domain and the host's directory layout has no documented client-side use.
Mirror the mitigation that `searcher.search_memories()` already applies on its own return path: reduce `source_file` to its basename via `Path(source_file).name` before handing the metadata to the client. Citations still work; the directory layout does not leak.
Companion to #1 (omit palace_path from tool_status). Same threat class, different surface:
- mempalace_status: palace dir path → fixed in #1
- mempalace_get_drawer: per-drawer source_file path → this PR
Other read tools were audited and do not leak host paths:
- mempalace_search: already basenames source_file
- mempalace_list_drawers: returns wing/room/preview only
- mempalace_diary_read: date/timestamp/topic/content only
- mempalace_reconnect: success/message/drawers only
- mempalace_kg_*: entity/predicate strings, counts
- mempalace_check_duplicate: wing/room/preview only
Changes:
- mempalace/mcp_server.py: tool_get_drawer() now basenames metadata.source_file
- tests/test_mcp_server.py: regression test asserting the absolute path and its parent directory do not appear anywhere in the response
- website/reference/mcp-tools.md: clarify the documented return shape
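The basename mitigation in that commit can be sketched as follows. The helper name `redact_source_file` and the sample metadata are hypothetical; only the `Path(source_file).name` reduction is taken from the commit message, and the real `tool_get_drawer()` may be structured differently.

```python
from pathlib import Path


def redact_source_file(metadata):
    """Return a copy of the metadata with source_file reduced to its basename.

    Hypothetical helper: the input dict is left untouched so the stored
    record keeps the full path while the client-facing copy does not.
    """
    sanitized = dict(metadata)
    source_file = sanitized.get("source_file")
    if source_file:
        sanitized["source_file"] = Path(source_file).name
    return sanitized


# Made-up drawer metadata for illustration.
drawer_meta = {"wing": "notes", "source_file": "/home/alice/projects/notes/todo.md"}
print(redact_source_file(drawer_meta))
```

Returning a copy rather than mutating in place is a design choice of this sketch, made so that the same metadata object could still be used server-side with its full path.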
Follow-up to the MemPalace#493 MCP fix — closes the CLI half of milla-jovovich/mempalace#478.
Context
@web3guru888 asked me to roll the CLI fix into this branch before `feat/mcp-hooks-export` lands, after I reported that MemPalace#493 left `miner.status()` still capped at 10,000 drawers. Detailed test report from a real 14,902-drawer palace: MemPalace#493 (comment)
Change
`miner.py:850-876`: replace the single `col.get(limit=10000, include=["metadatas"])` call with the same paginated offset loop the new `_fetch_all_metadata()` helper uses in `mcp_server.py`:
- Fetch metadata in 1,000-item batches
- Report the true total via `col.count()` (instead of `len(metas)`, which topped out at 10,000)
Same fix pattern as `palace_graph.py:49-51` and the server-side helper in this PR, so the whole codebase now has consistent pagination semantics on ChromaDB reads.
Test plan
Verified against a real 14,902-drawer / 17-wing palace on mempalace 3.1.0 + chromadb 0.6.3:
Before (this PR's parent `feat/mcp-hooks-export` @ 548abd6):
After (this PR):
No new tests added: the change is mechanical and matches the existing `_fetch_all_metadata()` pattern already covered by MemPalace#493's MCP tests.
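For readers without the diff at hand, the paginated offset loop this PR describes can be sketched as below. This is not the code from `miner.py` or `mcp_server.py`: the function names, the `wing` metadata key, and the `FakeCollection` stand-in are assumptions made so the example runs without ChromaDB. The only collection calls it relies on are `count()` and `get(limit=..., offset=..., include=[...])`.

```python
from collections import Counter


def fetch_all_metadata(col, batch_size=1000):
    """Collect every metadata dict from a collection in fixed-size batches."""
    metas = []
    offset = 0
    while True:
        batch = col.get(limit=batch_size, offset=offset, include=["metadatas"])
        batch_metas = batch.get("metadatas") or []
        if not batch_metas:
            break  # an empty page means the whole collection has been read
        metas.extend(batch_metas)
        offset += len(batch_metas)
    return metas


def summarize(col):
    """Return the true drawer total plus a per-wing count."""
    total = col.count()  # authoritative total, independent of page size
    wings = Counter((m or {}).get("wing", "?") for m in fetch_all_metadata(col))
    return total, wings


class FakeCollection:
    """In-memory stand-in (hypothetical) exposing only count() and get()."""

    def __init__(self, metadatas):
        self._metadatas = list(metadatas)

    def count(self):
        return len(self._metadatas)

    def get(self, limit=None, offset=0, include=None):
        end = None if limit is None else offset + limit
        return {"metadatas": self._metadatas[offset:end]}


# Mirrors the size of the palace from the test plan: 14,902 drawers, 17 wings.
col = FakeCollection({"wing": f"wing{i % 17}"} for i in range(14902))
total, wings = summarize(col)
print(total, len(wings))
```

Taking the total from `col.count()` rather than from the length of the fetched list keeps the reported number correct even if a read is ever bounded again.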