fix: honor configured drawer collection everywhere#423
fix: honor configured drawer collection everywhere#423gut-puncture wants to merge 1 commit intoMemPalace:developfrom
Conversation
web3guru888
left a comment
There was a problem hiding this comment.
Review: Honor Configured Drawer Collection Everywhere
This is a significant correctness fix. The core problem — different code paths silently using different collections — is exactly the kind of bug that's invisible in testing but devastating in production. We hit a variant of this when our integration's knowledge_graph_bridge.py was writing to one collection while the MCP search read from another.
Architecture
The new resolve_drawer_context() + get_drawer_collection() + open_collection_on_client() factoring in palace.py is clean. Single source of truth for collection resolution, config-aware, with proper create flag semantics.
iter_collection_metadatas() — The paginated iteration is the right fix for the limit=10000 hard cap. The offset-based pagination pattern is correct:
offset += batch_count
if batch_count < batch_size:
breakThis handles the "last batch smaller than page size" termination correctly.
Specific observations
✅ ~ expansion from config/env — _expand_path() is a good single-point normalization. Previously MEMPALACE_PALACE_PATH=~/palace would silently work differently depending on whether the shell expanded ~ before Python saw it. Now it's consistent.
✅ MCP cache rekeying — The _collection_cache_key = (palace_path, collection_name) tuple ensures that changing either dimension invalidates the cache. The test_get_collection_rekeys_when_context_changes test covers path change, name change, and both change. Thorough.
✅ create=False doesn't materialize empty palaces — get_drawer_collection(create=False) returns None if the directory is missing. This prevents mempalace status from creating empty palace directories as a side effect. We added the same guard in our integration and it eliminated a confusing class of "phantom palace" bugs.
cmd_repair():
palace_path, collection_name = resolve_drawer_context(palace_path=palace_arg, config=cfg)Then later:
col = get_drawer_collection(palace_path=palace_path, create=False, config=cfg)Then:
client = chromadb.PersistentClient(path=palace_path)The repair flow creates a new client after already getting a collection through get_drawer_collection. This means two separate PersistentClient instances pointing at the same directory. ChromaDB's locking should handle this, but it's a bit fragile. Consider getting the client from the same path as the collection, or restructuring to use a single client.
Test coverage — Excellent. The regression tests for tilde expansion, configured collection mining, paged status, and cache rekeying are well-targeted. The _PagedCollection helper in tests is a clean way to test pagination without real ChromaDB overhead.
Scope
16 files changed, 611 additions, 222 deletions — this is wide-ranging but every change is consistent with the stated goal. No scope creep. The "What this does not change" section in the description is appreciated.
Strong PR. Addresses a real trust problem in the tool.
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
|
Quick merge-order note after comparing the overlap with #413: I think Reason:
So they do not look like competing PRs to me; they look ordered. If If In other words: Not a blocker for the design in |
|
Restacking this onto current |
|
Restack is now live on this PR branch. I rebuilt it on top of current
Result: I also folded in one small cleanup from review while restacking: |
1db9a36 to
2e30664
Compare
|
@skuznetsov @web3guru888 restack is now live on this PR branch. This is now rebuilt on top of current main, so the earlier merge-order concern should be resolved. I kept the same scope and reran the focused validation lane for this change: ruff check mempalace tests |
web3guru888
left a comment
There was a problem hiding this comment.
Took another pass on the restacked diff — the merge-order concern is resolved.
The collection-access seam changes that were in conflict with #413 are now on main, and your restack picks them up cleanly. The mcp_server.py diff in particular shrank by ~100 lines (net -44) which is exactly what you'd expect after the upstream abstraction landed — less bespoke resolution code needed here.
162 passing on the focused lane (config, palace, searcher, layers, mcp_server, miner, convo_miner, cli) and ruff clean is the right validation set for this change.
The scope is still what I'd want: consistent collection resolution, ~ expansion from config/env, no create-on-read false positives, paginated metadata reads for large palaces, and keyed MCP cache. No scope creep from the restack.
LGTM — approving.
|
Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution! |
Summary
This fixes a trust problem in the current tool: different parts of MemPalace could look at different drawer collections, so a user could mine memories successfully and then have search, wake-up, status, repair, or MCP read from somewhere else.
In simple terms, this PR makes the tool use the same default drawer collection everywhere it already promises to use it.
What changed
palace_pathfrom config and env now expands~python -m mempalace.mcp_server --palace ~/...now resolves~correctly(palace_path, collection_name)so context changes do not go staleWhat this does not change
Tests
~expansion from config and envValidation run locally:
.venv/bin/pytest tests/test_config.py tests/test_palace.py tests/test_miner.py tests/test_convo_miner.py tests/test_searcher.py tests/test_layers.py tests/test_mcp_server.py tests/test_cli.py.venv/bin/ruff check mempalace tests.venv/bin/python -m pytest