fix(status): paginate metadata fetch to support large palaces #851
Conversation
`col.get(limit=total)` causes SQLite "too many SQL variables" on palaces with >10k drawers (MemPalace#802), and on older versions the hardcoded `limit=10000` silently truncated the count (MemPalace#850). Paginate in 5k batches using `offset` and aggregate wing/room counts incrementally. Also use `col.count()` for the header instead of `len(metas)` so the displayed total is always correct. Tested on a 122,686-drawer palace.

Fixes MemPalace#850
Related: MemPalace#802, MemPalace#723
CI note (run 24379493758): test-linux failed during the install step. Action: re-run failed jobs or the full workflow; if the same install step keeps failing the same way, it may be more than an install/network flake.
Code review
Found 1 issue:
Lines 834 to 837 in c924d07
@vnguyen-lexipol two small items before this can merge: (1) the pagination loop is missing the `m = m or {}` null-metadata guard that upstream develop added in feba7e8, and (2) the branch currently shows CONFLICTING against develop.
Upstream develop commit feba7e8 (2026-04-18) added `m = m or {}` to the single-shot `for m in metas:` loop after this branch already rewrote status() to paginate. Without porting the guard forward, merging this PR would silently drop jp's fix and crash again on palaces with null-metadata drawers. Addresses bensig's review on MemPalace#851.
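For concreteness, a minimal sketch (not the PR's actual diff) of the guard carried into the paginated loop — the chromadb setup, the batch size constant, and the `wing_counts` name are illustrative assumptions:

```python
# Sketch: upstream's null-metadata guard (feba7e8) ported into the paginated
# fetch. `col` here is a throwaway chromadb collection; only the `m = m or {}`
# line is the guard under discussion, everything else is scaffolding.
from collections import Counter
import chromadb

col = chromadb.Client().get_or_create_collection("drawers")

BATCH = 5_000
wing_counts: Counter[str] = Counter()
offset = 0
while True:
    page = col.get(limit=BATCH, offset=offset, include=["metadatas"])
    metas = page.get("metadatas") or []
    if not metas:
        break  # collection drained
    for m in metas:
        m = m or {}  # jp's guard: a drawer with null metadata must not crash the loop
        wing_counts[m.get("wing", "(none)")] += 1
    offset += len(metas)
```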
Good catch, thanks. You're right — I didn't do a full rebase onto develop; this branch's base is a root commit in my worktree, so the rebase produced 77 add/add conflicts across unrelated files. Happy to re-do this as a clean branch off develop if you'd prefer that over the surgical port.
Also — on item (2), the branch does show CONFLICTING against develop. Happy to resolve conflicts and push, but wanted to check: do you prefer a clean conflict-free branch, or are you fine squashing and resolving at merge time? Either works on my end.
…ate-large-palaces
# Conflicts:
#   mempalace/miner.py
Went ahead and resolved the conflict — just one hunk, in `mempalace/miner.py`.
… open PRs, not 8) MemPalace#851 (vnguyen-lexipol) was approved by bensig on 2026-04-15 and already fixes the miner.status() SQLITE_MAX_VARIABLE_NUMBER crash plus the MemPalace#850 silent truncation. I filed MemPalace#1036 this morning without that context, despite having triaged MemPalace#851 the prior day. Closed MemPalace#1036 in favor of MemPalace#851; updated the lead-paragraph count, the Still-ahead row, and the Open-PRs table accordingly.
Field note from the fork side: a semantically equivalent paginated `status()` (10K batch size) already ships on our branch.
Bumping this with production data from the last ~20 hours on v3.3.2:
All three hit the same "too many SQL variables" crash. Happy to help land this however useful — rebase, additional reviewer ping, splitting scope, anything.
… (2026-04-22)

Ben's batched queue-clear pass merged four PRs at 00:38 UTC: graph cache (MemPalace#661), deterministic hook saves (MemPalace#673), Claude Code 2.1.114 hook stdout + silent_save guard (MemPalace#1021), and upstream's own MemPalace#851 pagination fix (closing MemPalace#1036 as superseded).

Moved four rows out of the "Fork Changes" / "Fork change queue" tables into their respective merged-upstream history sections. Intro-sentence PR count reduced from 7 → 4 open.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Restore-integrity release. Unbreaks fresh `pip install mempalace` from v3.3.2 by re-tagging current develop, which carries both the plugin.json consumer (shipped in 3.3.2) and the matching mempalace-mcp entry point in pyproject.toml (added on develop ~10h after the 3.3.2 tag via MemPalace#340 by @messelink). MemPalace#1093 diagnosed by @jphein.

Bumps (all 5 sources agree per Version Guard / CLAUDE.md):
- mempalace/version.py 3.3.2 → 3.3.3
- pyproject.toml 3.3.2 → 3.3.3
- .claude-plugin/plugin.json 3.3.2 → 3.3.3
- .claude-plugin/marketplace.json 3.3.2 → 3.3.3
- .codex-plugin/plugin.json 3.3.2 → 3.3.3
- CHANGELOG.md new [3.3.3] entry

No code changes. The fix for MemPalace#1093 is already on develop via merged PRs MemPalace#340, MemPalace#1021, MemPalace#851, MemPalace#942, MemPalace#833, MemPalace#673, MemPalace#661, MemPalace#659, MemPalace#1097, MemPalace#1051, MemPalace#1001, MemPalace#945.

Branch name intentionally outside the `release/*` ruleset so follow-up CI-fix commits aren't gated behind a nested PR. (Supersedes MemPalace#1143 — closed for exactly that reason after it missed 3 of 5 version files.)

Smoke-tested locally from a fresh develop clone:
- `grep mempalace-mcp pyproject.toml .claude-plugin/plugin.json` — both ✓
- `python -m build --wheel` — ✓
- `pip install …-py3-none-any.whl` — ✓
- `which mempalace-mcp` — ✓
- `mempalace-mcp --help` — ✓
…safe graph cache

31 commits including:
- v3.3.3 release (restores pip install integrity post-v3.3.2 MemPalace#1093 regression)
- MemPalace#1097 empty-string filter normalization in mempalace_search
- MemPalace#659 diary wing parameter (our fork PR, now upstream)
- MemPalace#851/MemPalace#1097/MemPalace#1021/MemPalace#661/MemPalace#673 incorporated
- website Crystal Lattice brand refresh
- thread-safe graph cache in palace_graph.py

Conflict resolutions (10 files):
- README.md: keep fork version; bump badge 3.3.1 → 3.3.3 for test compat
- hooks/README.md: keep fork silent/block architecture docs; keep MEMPAL_PYTHON (correct for the legacy hook, upstream's rename is stale)
- examples/HOOKS_TUTORIAL.md: same treatment
- mcp_server.py: take upstream's sanitize_name(wing) — strictly better than our crude lowercase+underscore normalization
- miner.py: keep fork 10K batch size + comma formatting on status; adopt upstream's pagination rationale comment
- palace_graph.py: take upstream entirely — thread-safety improvements layered on top of our MemPalace#661 (which upstream already merged)
- hooks_cli.py: take upstream (Windows path-separator compat in _wing_from_transcript_path, Codex CLI format in _extract_recent_messages), then re-apply fork-ahead: use _wing_from_transcript_path in _ingest_transcript instead of hardcoded "sessions" — keeps transcript mining coherent with the diary wing derivation from MemPalace#659
- tests/test_hooks_cli.py: take upstream's updated wing-kwarg assertions and new test_stop_hook_derives_wing_from_transcript_path; take upstream's mock-based security test (simpler than our three-way assertion, same property tested)

Post-merge test state:
- 1096 passed, 10 failed in tests/test_claude_plugin_hook_wrappers.py
- The 10 failures are the fork-ahead MemPalace#19 divergence already documented in CLAUDE.md: our venv-aware hooks use `dirname`/`cat`, which the test's scrubbed-PATH environment doesn't provide. Same class that correctly caught MemPalace#1115 and led us to withdraw it pending MemPalace#1069 arbitration. Expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
… state

Three stale sections updated:
- Fork change queue: row 8 (.blob_seq_ids_migrated marker) struck through → FILED as MemPalace#1177. Two new rows added for segfault fixes discovered today (MemPalace#1171 concurrent-write lock, MemPalace#1173 quarantine in make_client) that weren't in the queue because the bugs surfaced today, not during the original 2026-04-21 triage.
- Open upstream PRs: was showing 3 of 10 PRs. Now shows all 10 with current CI/review state. All rebased onto current upstream/develop and MERGEABLE as of today.
- Merged since v3.3.1: added v3.3.3 release (2026-04-24) with its constituent merges — MemPalace#942, MemPalace#833, MemPalace#1097, MemPalace#1145, MemPalace#1147, MemPalace#1148/1150/1157 entity-detection overhaul (via @igorls's MemPalace#1175 stacked-PR rescue), MemPalace#1166 palace-path security, MemPalace#340/MemPalace#1093 install regression, plus MemPalace#851 from the 2026-04-22 batch.
Mirror the pagination pattern PR MemPalace#851 landed in miner.py:status(). A single drawers_col.get(limit=total, ...) on palaces larger than SQLite's SQLITE_MAX_VARIABLE_NUMBER (32766) crashes inside chromadb. Fetch drawers in batch_size=5000 chunks, stepping offset until the collection is drained. by_source aggregation semantics are preserved exactly — grouping, wing filter, meta capture all unchanged. Closes MemPalace#1073. Related: MemPalace#802, MemPalace#850, MemPalace#1016.
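To illustrate the "aggregation semantics preserved" claim, a hedged sketch of the mirrored pattern — the `where` filter value, the `source` metadata key, and the `by_source` shape are assumptions for demonstration, not the repository's exact code:

```python
# Sketch: the same offset-stepped pagination applied to the drawers fetch,
# with per-source grouping done incrementally per batch instead of once over
# a single oversized result set. Filter and key names are hypothetical.
from collections import defaultdict
import chromadb

drawers_col = chromadb.Client().get_or_create_collection("drawers")

BATCH = 5_000
wing_filter = {"wing": "diary"}  # hypothetical wing filter, applied to every batch
by_source: dict[str, list[dict]] = defaultdict(list)

offset = 0
while True:
    page = drawers_col.get(where=wing_filter, limit=BATCH, offset=offset,
                           include=["metadatas"])
    metas = page.get("metadatas") or []
    if not metas:
        break  # collection drained
    for m in metas:
        m = m or {}
        by_source[m.get("source", "unknown")].append(m)  # grouping unchanged
    offset += len(metas)
```

Because the wing filter is re-applied on every batch and each group is extended per page, the grouping result is the same as the old single-shot call — only the query size per round-trip changes.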
Summary
- Paginate `col.get()` in `miner.py:status()` using 5k batches with `offset` instead of a single `col.get(limit=total)` call
- Use `col.count()` for the header total instead of `len(metas)`

Problem
On palaces with >10k drawers:
- Older versions (hardcoded `limit=10000`): silently displays "10000 drawers" regardless of actual count (mempalace status silently truncates at 10,000 drawers (incorrect count displayed) #850)
- Current versions (`limit=total`): crashes with SQLite "too many SQL variables" (`mempalace status` crashes with "too many SQL variables" on large palaces #802)

Fix

Fetch metadata in 5k batches, stepping `offset` until the collection is drained, aggregate wing/room counts incrementally, and take the header total from `col.count()` (sketched below).
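A minimal sketch of the fix, assuming `col` is the chromadb drawers collection; variable names are illustrative rather than copied from the diff:

```python
# Sketch: offset-stepped pagination keeps every individual query under
# SQLite's bound-variable limit, and col.count() gives the header total
# independently of how many metadata rows the loop actually saw.
from collections import Counter
import chromadb

col = chromadb.Client().get_or_create_collection("drawers")

BATCH = 5_000
total = col.count()  # header total comes from the collection, not len(metas)
wings: Counter[str] = Counter()
rooms: Counter[str] = Counter()

offset = 0
while offset < total:
    page = col.get(limit=BATCH, offset=offset, include=["metadatas"])
    metas = page.get("metadatas") or []
    if not metas:
        break  # collection drained (or shrank mid-scan)
    for m in metas:
        m = m or {}
        wings[m.get("wing", "(none)")] += 1
        rooms[m.get("room", "(none)")] += 1
    offset += len(metas)

print(f"{total:,} drawers")  # e.g. "122,686 drawers" on the tested palace
```

Stepping `offset` by the returned page length rather than a fixed `BATCH` drains the collection cleanly even when the final page comes back short.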
Testing
Tested on a 122,686-drawer palace — correctly displays full count and complete wing/room breakdown with no SQLite errors.
Fixes #850
Related: #802, #723
🤖 Generated with Claude Code