Skip to content

feat: repair nuke-rebuild, purge command, --version flag#632

Closed
jphein wants to merge 4 commits intoMemPalace:developfrom
jphein:pr/maintenance
Closed

feat: repair nuke-rebuild, purge command, --version flag#632
jphein wants to merge 4 commits intoMemPalace:developfrom
jphein:pr/maintenance

Conversation

@jphein
Copy link
Copy Markdown
Collaborator

@jphein jphein commented Apr 11, 2026

Summary

Palace maintenance improvements. Split from #562 per maintainer request (PR 4 of 6).

Both repair and purge release the ChromaDB client before shutil.rmtree() to avoid file handle conflicts on Windows/some Linux FS, and use hnsw:space=cosine for the new collection.

Closes

Related

Test plan

  • 40 existing CLI tests pass (including 6 repair tests)
  • mempalace --version prints version
  • mempalace -v prints version
  • mempalace repair on a palace with data: backs up, nukes, rebuilds, verifies index
  • mempalace purge --wing conversations -y: removes matching drawers, rebuilds with remaining
  • mempalace purge --room backend: purges room across all wings
  • mempalace purge with no args: shows error

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves MemPalace’s maintenance CLI by making index rebuilds more reliable under ChromaDB HNSW corruption scenarios, adding a targeted purge workflow, exposing a standard --version/-v flag, and updating the ChromaDB dependency range to a newer major line.

Changes:

  • Update mempalace repair to rebuild by extracting drawers, deleting the palace directory, recreating a fresh PersistentClient/collection (cosine), and performing a post-rebuild verification query.
  • Add mempalace purge --wing <w> [--room <r>] [-y] to remove drawers by metadata filter via extract-keep → nuke → reinsert into a fresh collection.
  • Add --version/-v flag and bump chromadb dependency to >=1.5.4,<2.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

File Description
pyproject.toml Bumps the ChromaDB dependency range to the 1.x line.
mempalace/cli.py Adds purge, changes repair to directory-nuke rebuild + verification, and introduces --version/-v.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/cli.py
Comment on lines +258 to +263
# Nuke and rebuild with clean HNSW index
palace_path = palace_path.rstrip(os.sep)
print(" Rebuilding palace...")
shutil.rmtree(palace_path)
os.makedirs(palace_path, mode=0o700)

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd_purge deletes the entire palace_path with shutil.rmtree() but never creates the backup directory described in the docstring. This makes the command irreversibly destructive if the rebuild/add step fails partway through; consider creating a backup (e.g., copy/move to palace_path + ".backup") before deleting, and printing the backup location for recovery.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — purge should create a backup before nuking, like repair does. Will add.

Comment thread mempalace/cli.py Outdated
del col, client

# Nuke and rebuild with clean HNSW index
palace_path = palace_path.rstrip(os.sep)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

palace_path = palace_path.rstrip(os.sep) can turn a root path into an unsafe target (e.g., '/' -> '' on POSIX, 'C:\\' -> 'C:' on Windows). Because this is followed by shutil.rmtree(palace_path), add a defensive validation step (resolve to an absolute path and reject filesystem roots / empty paths) before deleting anything.

Suggested change
palace_path = palace_path.rstrip(os.sep)
palace_path = palace_path.rstrip(os.sep)
palace_path = os.path.abspath(palace_path)
palace_root = str(Path(palace_path).anchor or os.path.abspath(os.sep))
if not palace_path or palace_path == palace_root:
raise ValueError(f"Refusing to delete unsafe palace path: {palace_path!r}")

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — will add a guard rejecting paths that resolve to filesystem root.

Comment thread mempalace/cli.py Outdated
Comment on lines +190 to +191
except Exception:
print(f"\n No palace found at {palace_path}")
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The except Exception: branch reports "No palace found" for any failure when opening ChromaDB / the collection (including permission errors or a corrupt DB). Include the exception message (or catch a narrower exception) so users can distinguish "missing" from "unreadable/corrupt" palaces.

Suggested change
except Exception:
print(f"\n No palace found at {palace_path}")
except Exception as exc:
print(f"\n Failed to open palace at {palace_path}: {exc}")

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will include the exception message in the error output.

Comment thread mempalace/cli.py Outdated
Comment on lines +179 to +181
Not idempotent — running purge twice on the same criteria will fail the
second time if the first run completed (nothing left to match). The
backup directory is preserved for recovery.
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cmd_purge docstring says the operation is "Not idempotent" and "will fail" on a second run, but the implementation returns cleanly with a "No drawers found" message when nothing matches. Update the docstring to match the actual behavior (or adjust behavior to match the docstring).

Suggested change
Not idempotentrunning purge twice on the same criteria will fail the
second time if the first run completed (nothing left to match). The
backup directory is preserved for recovery.
Safe to re-run with the same criteria: if no drawers match, the command
exits cleanly after reporting that nothing was found. The backup
directory is preserved for recovery.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update the docstring — a second run returns cleanly, it doesn't fail.

Comment thread mempalace/cli.py Outdated
Comment on lines +344 to +355
palace_path = palace_path.rstrip(os.sep)
backup_path = palace_path + ".backup"
if os.path.exists(backup_path):
shutil.rmtree(backup_path)
print(f" Backing up to {backup_path}...")
shutil.copytree(palace_path, backup_path)

print(" Rebuilding collection...")
client.delete_collection("mempalace_drawers")
new_col = client.create_collection("mempalace_drawers")
# Nuke and recreate — fresh PersistentClient gets a clean HNSW index
print(" Rebuilding from scratch...")
shutil.rmtree(palace_path)
os.makedirs(palace_path, mode=0o700)

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd_repair also uses palace_path.rstrip(os.sep) followed by shutil.rmtree(palace_path), which can become dangerous if the user passes a filesystem root (or Windows drive root). Add a guard that rejects empty/root paths after normalization before deleting the directory.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix will apply to both repair and purge.

Comment thread mempalace/cli.py
print(" Index verification: OK")
except Exception as e:
print(f" Index verification FAILED: {e}")
print(f" Restore from backup: mv {backup_path} {palace_path}")
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On index verification failure, the suggested restore command uses mv, which won’t work on Windows and may not be available in all environments. Consider printing a cross-platform restore instruction (e.g., using Python/shutil terminology) or detect the OS when formatting the guidance.

Suggested change
print(f" Restore from backup: mv {backup_path} {palace_path}")
restore_cmd = (
"python -c "
+ shlex.quote(
"from pathlib import Path; import shutil; "
f"shutil.move(str(Path({str(backup_path)!r})), str(Path({str(palace_path)!r})))"
)
)
print(f" Restore from backup: {restore_cmd}")

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, though this is a CLI tool primarily used on Unix. Will note both mv and manual copy as restore options.

Comment thread pyproject.toml
]
dependencies = [
"chromadb>=0.5.0,<0.7",
"chromadb>=1.5.4,<2",
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency constraint is bumped to chromadb>=1.5.4,<2, but the repository’s committed lockfile (uv.lock) still pins ChromaDB 0.6.3. If contributors rely on uv.lock for reproducible environments, regenerate/update it so it matches this new constraint.

Suggested change
"chromadb>=1.5.4,<2",
"chromadb==0.6.3",

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update uv.lock to match the new pin.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — regenerated uv.lock with uv lock. chromadb 0.6.3 → 1.5.7. 332f8ac

Comment thread mempalace/cli.py
Comment on lines +547 to +549
parser.add_argument(
"--version", "-v", action="version", version=f"mempalace {__version__}"
)
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new --version / -v flag added here doesn’t appear to have a corresponding CLI test (e.g., asserting the printed version string and exit code), even though other main() behaviors are covered in tests/test_cli.py. Add a test that invokes main() with --version and asserts the output includes the package version.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a test for --version output.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added test_version_flag_prints_version that verifies __version__ appears in output and exit code is 0. 332f8ac

Comment thread mempalace/cli.py
Comment on lines +701 to +705
p_purge = sub.add_parser("purge", help="Delete drawers by wing and/or room (rebuilds index)")
p_purge.add_argument("--wing", help="Wing to purge")
p_purge.add_argument("--room", help="Room to purge (without --wing, purges across ALL wings)")
p_purge.add_argument("--yes", "-y", action="store_true", help="Skip confirmation prompt")

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new purge subcommand wiring added here isn’t covered by existing CLI tests (there are dispatch tests for other subcommands in tests/test_cli.py). Add tests for mempalace purge argument validation (no args → error) and dispatch/execution wiring.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add purge CLI tests.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added 3 purge tests: test_cmd_purge_no_args_prints_error, test_cmd_purge_no_matching_drawers, test_cmd_purge_no_palace. 332f8ac

jphein and others added 3 commits April 13, 2026 07:20
- repair: nukes palace directory and recreates PersistentClient instead
  of same-process delete_collection+create_collection, which reuses
  corrupted HNSW state. Adds index verification after rebuild.
- purge: new command to delete drawers by wing/room with index rebuild.
  Extracts keepers, nukes palace, re-inserts into clean ChromaDB.
- --version / -v flag shows mempalace version
- chromadb pin bumped to >=1.5.4,<2

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add backup before nuke in purge (was missing, unlike repair)
- Guard against rstrip(os.sep) resolving to filesystem root
- Include exception message when palace can't be opened
- Fix docstring: second run returns cleanly, doesn't fail

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…tests

- Regenerate uv.lock: chromadb 0.6.3 → 1.5.7 to match pyproject.toml pin
- Add --version flag test: verifies version string in output
- Add purge CLI tests: no args error, no matching drawers, no palace

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Upstream bumped pyproject.toml to 3.2.0 in MemPalace#761 but missed version.py.
Aligns the two so test_version_consistency passes.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@jphein
Copy link
Copy Markdown
Collaborator Author

jphein commented Apr 16, 2026

Closing this as superseded — all three proposed features shipped in upstream v3.3.0:

  • mempalace --version / -v works (returns mempalace 3.3.0)
  • mempalace purge --wing --room --yes works with the same flag set
  • mempalace repair (nuke-rebuild) is live

Thanks to whoever did the parallel work in this area. Closing to keep the PR list focused on what's actually outstanding.

@jphein jphein closed this Apr 16, 2026
jphein added a commit to jphein/mempalace that referenced this pull request Apr 16, 2026
MemPalace#629 and MemPalace#632 were closed upstream as fully superseded — all three
features in MemPalace#632 (`--version`, `purge`, `repair`) shipped in v3.3.0;
MemPalace#629's core (batching, file locking, skip patterns, stopwords) landed
independently via upstream's own paths. Move both to the closed list
and update the live/closed counts.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cli CLI commands area/install pip/uv/pipx/plugin install and packaging enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants