
feat(cli): add clean subcommand to remove wings and rooms#548

Open
brandonhon wants to merge 3 commits into MemPalace:develop from brandonhon:feat/clean-subcommand

Conversation


@brandonhon commented Apr 10, 2026

Summary

Adds a new mempalace clean subcommand for reclaiming palace space and starting projects over. It removes drawers at two granularities — an entire wing, or a specific room within a wing — and keeps the mempalace_drawers and mempalace_compressed collections in lockstep so compressed copies never go stale after a cleanup.

Motivation

Today there is no first-class way to remove memories once a project is done. Users either re-mine forever or hand-edit ChromaDB. The new subcommand gives a safe, auditable way to scrub scoped memory without touching the rest of the palace.

UX

mempalace clean --wing my_app                    # remove all drawers in a wing
mempalace clean --wing my_app --room costs       # remove all drawers in one room
mempalace clean --wing my_app --dry-run          # preview counts, no changes
mempalace clean --wing my_app --yes              # skip interactive confirmation

Sample output:

=======================================================
  MemPalace Clean
=======================================================

  Palace: /home/user/.mempalace/palace
  Target: wing 'my_app', room 'backend'
  Drawers to remove:    12
  Compressed to remove: 12

  Delete 12 drawer(s) from wing 'my_app', room 'backend'? [y/N] y

  Removed 12 drawer(s) from wing 'my_app', room 'backend'.
  Removed 12 compressed drawer(s).

  Note: the knowledge graph (~/.mempalace/knowledge_graph.sqlite3) is not touched by clean.

Design decisions

  • --wing is required. No implicit "clean everything" — refuses to run without a wing filter. This is intentional: the cost of an accidental full wipe is very high.
  • Single flag per scope. Only --wing and --room. I considered splitting into --room (clear contents) vs --del-room (remove the room itself), but rooms and wings are metadata-only in mempalace today — they exist only as fields on drawer metadata, not as first-class records. Once the last drawer in a room is deleted the room disappears from list_rooms automatically, so "clear contents" and "delete room" would produce identical observable state. Adding a registry was deemed scope creep.
  • Knowledge graph left untouched. KG facts in ~/.mempalace/knowledge_graph.sqlite3 do not carry a wing field, so they can't be filtered cleanly by wing without false positives. The --help text explicitly notes this and the runtime output reminds the user after every cleanup.
  • Compressed collection cleaned in lockstep. mempalace_compressed shares drawer IDs with mempalace_drawers. Leaving compressed copies behind after a wing/room cleanup would be a correctness bug, so both collections are filtered and deleted in the same operation.
  • Single WAL entry per cleanup. A clean_wing or clean_room record is appended to ~/.mempalace/wal/write_log.jsonl with {wing, room, drawers_deleted, compressed_deleted}. Per-drawer logging was rejected as noisy for bulk ops.
  • Interactive confirmation by default. Prompts [y/N] with a count summary before deleting. --yes skips the prompt for scripting. --dry-run shows counts and exits without touching anything.
  • Safe degraded operation. If mempalace_compressed doesn't exist (palace was never compressed), the cleanup proceeds against mempalace_drawers only without error.
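To make the WAL design concrete, here is a hypothetical sketch of what the single summary record for a room-scoped clean could look like, based on the fields listed above. The "op" field name and the exact key ordering are assumptions for illustration, not the PR's actual serialization code.

```python
import json

# Hypothetical example of the single summary WAL record appended by a
# room-scoped clean. Field names follow the {wing, room, drawers_deleted,
# compressed_deleted} description in this PR; "op" is an assumed name
# for the operation tag.
entry = {
    "op": "clean_room",
    "wing": "my_app",
    "room": "backend",
    "drawers_deleted": 12,
    "compressed_deleted": 12,
}

# write_log.jsonl is JSONL: one JSON object appended per clean operation.
line = json.dumps(entry)
print(line)
```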

Files changed

  • mempalace/palace.py — added _build_where, count_drawers, delete_drawers helpers (+34 lines). Keeps the CLI layer thin and makes the primitives reusable for a future MCP clean tool.
  • mempalace/cli.py — added cmd_clean, the clean subparser, and a dispatch entry. Docstring updated with the new command in the help epilog (+142 lines).
  • tests/test_clean.py — 15 new tests (+221 lines).

Tests

Added tests/test_clean.py with 15 tests covering:

Primitive layer (palace.count_drawers / delete_drawers):

  • Counting by wing
  • Counting by wing + room
  • Deleting by wing leaves other wings untouched
  • Deleting by room leaves sibling rooms untouched
  • Nothing-to-delete returns zero

CLI layer (cli.cmd_clean):

  • Cleaning a wing removes matching drawers from both mempalace_drawers and mempalace_compressed
  • Cleaning a room scopes correctly and preserves sibling rooms
  • --dry-run makes zero changes
  • No-match scenario prints "Nothing to clean" and exits 0
  • Interactive n at the prompt cancels the delete
  • Interactive y at the prompt proceeds
  • Works without a compressed collection present
  • Writes a clean_room WAL entry with the expected counts
  • Missing --wing flag exits non-zero
  • Missing palace directory exits non-zero with a clear message
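For readers curious how the prompt cancel/confirm cases can be tested without a real terminal, here is one common pattern: inject the input function. `confirm_delete` below is a hypothetical stand-in for the prompt logic in cmd_clean, not the PR's actual test or implementation code.

```python
# Sketch of testing an interactive [y/N] confirmation by substituting
# the input function. confirm_delete is a hypothetical stand-in for
# cmd_clean's prompt logic; a real pytest suite would typically use
# monkeypatch to swap builtins.input instead.

def confirm_delete(count, wing, room, input_fn=input):
    answer = input_fn(
        f"Delete {count} drawer(s) from wing '{wing}', room '{room}'? [y/N] "
    )
    # Default is "no": anything other than an explicit y/Y cancels.
    return answer.strip().lower() == "y"

# Simulate a user answering "n" (cancel) and then "y" (proceed).
cancelled = not confirm_delete(12, "my_app", "backend", input_fn=lambda _: "n")
proceeded = confirm_delete(12, "my_app", "backend", input_fn=lambda _: "y")
print(cancelled, proceeded)
```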

Test results

$ python3 -m pytest -x -q
549 passed, 106 deselected in 8.83s

All 549 tests in the suite pass (15 new, 534 existing). No regressions.

Test plan

  • python3 -m pytest tests/test_clean.py -v passes locally
  • python3 -m pytest -q — full suite still green
  • python3 -m mempalace clean --help shows the expected usage
  • Dry-run against a real palace: mempalace clean --wing <some-wing> --dry-run reports accurate counts
  • Actual cleanup against a disposable palace, then verify via mempalace status or mempalace_list_rooms that the wing/room is gone
  • Confirm a WAL entry landed in ~/.mempalace/wal/write_log.jsonl with operation: clean_wing (or clean_room) and correct counts
  • If the palace has compressed drawers, confirm mempalace_compressed was cleaned in lockstep

Notes for reviewers

  • The "knowledge graph not touched" behavior is called out in both --help text and runtime output so users aren't surprised. If you want KG cleanup later, it would need a new wing tag on KG triples — out of scope for this PR.
  • delete_drawers uses ChromaDB's native delete(where=...). The _build_where helper produces either {"wing": X} or {"$and": [{"wing": X}, {"room": Y}]} depending on scope.
  • Commit is authored via the brandonhon GitHub noreply email for email-privacy compliance.
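For reference, a minimal sketch of the two filter shapes `_build_where` is described as producing (this is an illustration of the stated behavior, not the PR's actual code):

```python
def build_where(wing, room=None):
    """Sketch of the _build_where helper described in this PR: build a
    ChromaDB metadata filter scoped to a wing, optionally narrowed to
    one room within it."""
    if room is None:
        return {"wing": wing}
    # ChromaDB requires multiple metadata conditions to be combined
    # under an explicit $and operator.
    return {"$and": [{"wing": wing}, {"room": room}]}

print(build_where("my_app"))
print(build_where("my_app", "costs"))
```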

Adds `mempalace clean --wing X [--room Y]` for reclaiming palace space
and starting projects over. Removes drawers from both
`mempalace_drawers` and `mempalace_compressed` in lockstep, with an
interactive confirmation prompt (skippable via `--yes`) and `--dry-run`
for previews. Each clean operation appends a single summary entry to
the WAL for audit. The knowledge graph is intentionally left untouched
and the help text calls this out.

@web3guru888 left a comment


This fills a real gap — there's currently no first-class way to remove a wing once a project is done. Good motivation and clean scope.

What works well:

  • Wing/room granularity is the right split. Wing-level clean covers the "project is done" case; room-level covers "re-mine after restructuring."
  • drawers + compressed kept in lockstep. This is important — without it, stale compressed entries would survive a clean.
  • WAL logging for the operation. Audit trail is the right default for a destructive command.
  • Test coverage is thorough: dry-run, prompt cancel/confirm, no-compressed-collection graceful path, WAL verification. 15 tests is solid.
  • --yes flag for scripted use.

One issue with count_drawers + delete_drawers on large palaces:

Both functions call collection.get(where=...) to count first, then collection.delete(where=...). On a 10K+ drawer palace this means two full ChromaDB scans. The delete path could do delete(where=...) directly and get the count from the returned deletion count if ChromaDB exposes it — or just accept a single-pass "delete and count" by calling get with include=[] once and immediately deleting the returned IDs.

This is a performance concern, not a correctness concern. For typical palace sizes (hundreds to low thousands) it's not a problem. But worth noting.
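A self-contained sketch of the single-pass pattern suggested here — fetch matching IDs once with `include=[]`, then delete by ID. `FakeCollection` is a stand-in so the example runs without ChromaDB; against a real collection the calls would be `collection.get(where=..., include=[])` and `collection.delete(ids=...)`.

```python
# Sketch of the "fetch IDs once, delete by ID" pattern. FakeCollection
# mimics just enough of a ChromaDB collection (simple equality filters
# only) to demonstrate that the data is scanned exactly once.

class FakeCollection:
    def __init__(self, rows):
        self.rows = rows          # {drawer_id: metadata}
        self.get_calls = 0        # counts metadata scans

    def get(self, where=None, include=None):
        self.get_calls += 1
        ids = [i for i, m in self.rows.items()
               if all(m.get(k) == v for k, v in (where or {}).items())]
        return {"ids": ids}

    def delete(self, ids):
        for i in ids:
            self.rows.pop(i, None)

def clean(col, where):
    """One metadata scan, then an ID-based delete."""
    ids = col.get(where=where, include=[])["ids"]
    if ids:  # guard: some ChromaDB versions reject delete() with no args
        col.delete(ids=ids)
    return len(ids)

col = FakeCollection({
    "d1": {"wing": "my_app"},
    "d2": {"wing": "my_app"},
    "d3": {"wing": "other"},
})
deleted = clean(col, {"wing": "my_app"})
print(deleted, col.get_calls)
```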

Knowledge graph note:

The clean command correctly leaves the KG untouched and documents this explicitly. Worth considering whether --with-kg or a follow-up mempalace clean-kg --wing ... makes sense as a companion command — once you've deleted all drawers from a wing, the KG entities that only had provenance from those drawers become orphaned. Not blocking, just flagging for follow-on.

Minor:

The WAL entry uses datetime.now() (local time). Other WAL entries in the codebase use UTC (datetime.utcnow() or datetime.now(timezone.utc)). Worth being consistent.

Otherwise clean implementation. LGTM with those minor notes.

@brandonhon
Author

@web3guru888
Thanks for the careful review — all three points are fair. I'll work on this over the weekend:

  • Perf: I'll collapse the count-then-delete into a single get(where=..., include=[]) pass and delete by
    the returned IDs so large palaces only scan once.
  • KG orphans: good call on the follow-up — I'll look at whether mempalace clean-kg --wing ... is
    feasible given KG triples don't carry a wing field today, or if --with-kg makes more sense. Might land
    as a separate PR.
  • WAL timestamp: easy fix, I'll switch to datetime.now(timezone.utc) to match the rest of the codebase.

Will push updates to this branch over the weekend.

@web3guru888

@brandonhon — all three resolutions sound right:

  • Single-pass scan is the correct approach — returns IDs at minimal cost, avoids the double-scan. For very large palaces with thousands of drawers, this will matter.
  • KG orphan cleanup as a follow-on makes sense given the triple-level scoping issue. --with-kg is probably the right flag name if KG triples don't carry wing metadata, since it's an opt-in behavior with its own semantics rather than a scope filter.
  • UTC timestamp is a one-liner but it matters for log consistency when the palace spans timezones.

No rush — enjoy the weekend. The PR is solid as-is; these are all incremental improvements on an already correct implementation.

…tion

Addresses PR #548 review feedback about scan amplification on large
palaces. The previous implementation made up to six ChromaDB scans per
clean operation:

  1. count_drawers(drawers_col)       — scan #1
  2. count_drawers(compressed_col)    — scan #2
  3. delete_drawers(drawers_col)      — internal get(where=...) scan #3
  4. delete_drawers(drawers_col)      — delete(where=...) scan #4
  5. delete_drawers(compressed_col)   — scan #5
  6. delete_drawers(compressed_col)   — scan #6

Each call was doing its own metadata filter over the collection,
meaning a 100K-drawer palace paid the filter cost six times for a
single cleanup. This refactor drops it to exactly two scans — one per
collection — regardless of palace size.

Changes:

  * palace.py: introduce `find_drawer_ids(col, wing, room)` which
    returns the matching ID list in a single `get(where=..., include=[])`
    call. ChromaDB fetches only IDs — no documents, embeddings, or
    metadatas — so the scan is as cheap as ChromaDB can make it.
  * palace.py: `count_drawers` is now a thin wrapper around
    `find_drawer_ids`. The old standalone `delete_drawers` helper is
    removed because its count-then-delete pattern is exactly what we
    are trying to avoid.
  * cli.py::cmd_clean: call `find_drawer_ids` once per collection,
    reuse the returned ID lists for both the preview counts and the
    subsequent delete. Deletes go through `col.delete(ids=[...])`
    which is an O(n) primary-key delete, not another metadata scan.
    Empty lists are guarded to stay compatible with ChromaDB versions
    that reject `delete()` with no ids or where filter.

Why two scans and not one:

ChromaDB collections are independent — there is no cross-collection
query. `mempalace_drawers` and `mempalace_compressed` must each be
filtered separately. A single-scan variant would have to assume that
compressed IDs are a strict subset of drawer IDs and delete compressed
by the drawer IDs we already found, but `tool_delete_drawer` in
mcp_server.py does not cascade to compressed, so real palaces can
contain orphaned compressed rows whose drawer is already gone. Going
to one scan would silently leak those orphans. Two scans is the
minimum that preserves correctness.

Tests:

  * New `test_find_drawer_ids_*` unit tests cover wing-only, wing+room,
    and no-match cases.
  * New `test_find_drawer_ids_single_scan` monkey-patches
    `collection.get` to assert exactly one call.
  * New `test_clean_scans_each_collection_exactly_once` is a
    regression test that wraps `chromadb.PersistentClient` and counts
    `where`-filtered `get()` calls per collection during a full
    `cmd_clean` invocation, failing if either collection is scanned
    more than once.
  * The existing 15 CLI black-box tests stay identical — the behavior
    is unchanged, only the scan count dropped.

Full suite: 551 passed (was 549, +2 new perf regression tests).
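The call-counting regression tests described above boil down to wrapping a collection's `get` so every call is recorded. A hedged sketch of that idea, with a stub in place of a real ChromaDB collection (the actual tests wrap `chromadb.PersistentClient`):

```python
# Sketch of the "exactly one scan" regression-test idea: replace a
# collection's get() with a counting wrapper, run the code under test,
# then assert on the call count. StubCollection is a stand-in.

class StubCollection:
    def get(self, where=None, include=None):
        return {"ids": ["d1", "d2"]}

def count_calls(obj, method_name):
    """Swap obj.<method_name> for a wrapper that records each call;
    return the list whose length is the call count."""
    calls = []
    original = getattr(obj, method_name)
    def wrapper(*args, **kwargs):
        calls.append((args, kwargs))
        return original(*args, **kwargs)
    setattr(obj, method_name, wrapper)
    return calls

col = StubCollection()
calls = count_calls(col, "get")

# Code under test: should scan the collection exactly once.
ids = col.get(where={"wing": "my_app"}, include=[])["ids"]

print(len(calls), len(ids))
```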

Also in this commit: `-V` / `--version` flags, because every good app
needs a version flag and we somehow shipped three minor releases
without one. The installed version is now embedded in the `-h`
description line too, so `mempalace -h` answers "what am I running?"
without a separate invocation.
@bensig bensig changed the base branch from main to develop April 11, 2026 22:21
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:21
@brandonhon
Author

@web3guru888
Thanks again for the review! I looked into this and the codebase actually uses naive local-time datetime.now() throughout — a grep for utcnow or timezone.utc returns zero matches. So this WAL entry is consistent with the existing convention rather than diverging from it. Happy to revisit if we want to migrate the whole codebase to UTC, but that feels out of scope for this PR.


Labels

area/cli (CLI commands), enhancement (New feature or request)
