Skip to content

Feat/bulk delete and search drawer#310

Closed
emirbartu wants to merge 4 commits intoMemPalace:developfrom
emirbartu:feat/bulk-delete-and-search-drawer-id
Closed

Feat/bulk delete and search drawer#310
emirbartu wants to merge 4 commits intoMemPalace:developfrom
emirbartu:feat/bulk-delete-and-search-drawer-id

Conversation

@emirbartu
Copy link
Copy Markdown

What does this PR do?

This PR implements bulk deletion capabilities and adds drawer_id to search results, fixing three open issues:

  1. Search Results (Fixes Stale drawer retrieval can inject contradictory memory into live agent context; no official sync/update workflow exists #224)

    • Results now include an id field containing the drawer_id
    • Enables delete/update workflow after finding content via search
  2. MCP Bulk Delete Tools (Fixes Feature Request / Bug: No Storage Limit Handling or Disk-Full Graceful Degradation #237, Index does not respect the config. Room removal does not remove already indexed files, and mining should respect project .gitignore #209)

    • mempdelete_wing: Delete all drawers in a wing
    • mempdelete_room: Delete all drawers in a specific room
    • Uses ChromaDB where filters (wing, room fields) for efficient bulk operations
    • Returns deleted count for verification
  3. CLI Commands (Fixes Feature Request / Bug: No Storage Limit Handling or Disk-Full Graceful Degradation #237)

    • mempalace delete-wing <wing> – Delete entire wing
    • mempalace delete-room <wing> <room> – Delete specific room
    • Interactive confirmation showing the number of drawers to be deleted
    • --force flag to skip confirmation
  4. Documentation

    • Updated tool count from 19 to 21 (added: mempdelete_wing, mempdelete_room)
    • Added new tools to MCP tools table
    • Added CLI commands to reference
    • Noted drawer_id in search results

How to test

# Run all tests
python -m pytest tests/ -v

# Test search drawer_id
python -m pytest tests/test_searcher.py::TestSearchMemories::test_result_includes_drawer_id -v

# Test MCP bulk delete
python -m pytest tests/test_mcp_server.py::TestWriteTools::test_delete_wing -v
python -m pytest tests/test_mcp_server.py::TestWriteTools::test_delete_room -v

# Test CLI (manual)
mempalace delete-wing test-wing --force
mempalace delete-room test-wing test-room --force

## Checklist

- [x] Tests pass (122/122)
- [x] No hardcoded paths
- [x] Linter passes (`ruff check .`)

Add 'id' field to search results containing the drawer ID.
This enables the delete/update workflow after finding content via search.

- Extract drawer IDs from results["ids"][0] (index-aligned with documents)
- Include 'id' field in each hit dictionary
- Add test to verify drawer_id is present in search results

Fixes milla-jovovich#224
Implement bulk deletion capabilities via MCP tools:
- mempdelete_wing: Delete all drawers in a wing using where filter
- mempdelete_room: Delete specific room in a wing using \ filter

Both tools:
- Check existence before deletion
- Return deleted_count in response
- Log deletion for audit trail
- Include comprehensive tests

Fixes MemPalace#237
Fixes MemPalace#209
Add CLI commands for bulk deletion:
- mempalace delete-wing <wing>: Delete entire wing with confirmation
- mempalace delete-room <wing> <room>: Delete specific room

Features:
- Shows drawer count before deletion
- Interactive confirmation prompt
- --force flag to skip confirmation
- Proper error handling for non-existent wings/rooms

Fixes MemPalace#237
Update documentation to reflect new features:
- Update tool count from 19 to 21
- Add mempdelete_wing and mempdelete_room to MCP tools table
- Add delete-wing and delete-room CLI commands
- Note that search results now include 'id' field
- Update plugin READMEs with new tool count
@emirbartu emirbartu marked this pull request as ready for review April 8, 2026 22:12
@bgauryy
Copy link
Copy Markdown

bgauryy commented Apr 8, 2026

PR Review: Feat/bulk delete and search drawer

Executive Summary

Aspect Value
PR Goal Add bulk delete (wing/room) to CLI + MCP, and include drawer_id in search results
Files Changed 8
Risk Level 🟡 MEDIUM - Irreversible bulk deletes + architectural rule violation in CLI
Review Effort 2 - Straightforward feature with clear scope
Recommendation 🔄 REQUEST_CHANGES

Affected Areas: mempalace/cli.py, mempalace/mcp_server.py, mempalace/searcher.py, tests/test_mcp_server.py, tests/test_searcher.py, READMEs

Business Impact: Enables users and AI agents to bulk-delete entire wings or rooms, and allows search→delete workflows by exposing drawer IDs in search results.

Flow Changes: Search results now include an id field. Two new destructive operations added to both CLI and MCP interfaces.

Ratings

Aspect Score
Correctness 3/5
Security 3/5
Performance 4/5
Maintainability 2/5

PR Health

High Priority Issues

🏗️ #1: cli.py bypasses palace_db.py singleton — violates project architecture rule

Location: mempalace/cli.py:364-393 (cmd_delete_wing) and mempalace/cli.py:396-427 (cmd_delete_room) | Confidence: ✅ HIGH

AGENTS.md states: "All ChromaDB access goes through palace_db.py — never import chromadb directly elsewhere." Both new CLI commands do import chromadb and create their own chromadb.PersistentClient(path=palace_path), bypassing the singleton client cache and collection cache in palace_db.py. The same file already uses from .palace_db import get_collection at line 131 for other commands.

This creates duplicate ChromaDB client instances and diverges from the pattern every other command in the file follows.

- import chromadb
- client = chromadb.PersistentClient(path=palace_path)
- col = client.get_collection(config.collection_name)
+ from .palace_db import get_collection, build_where_filter
+ col = get_collection(palace_path=palace_path)
+ if not col:
+     print("No palace found. Run: mempalace init <dir> && mempalace mine <dir>")
+     return 1

🏗️ #2: Manual where-filter construction bypasses build_where_filter()

Location: mempalace/cli.py:408 and mempalace/mcp_server.py:356 | Confidence: ✅ HIGH

palace_db.py defines build_where_filter() as "THE one place this logic lives" (line 46). Both cmd_delete_room() and tool_delete_room() manually construct {"$and": [{"wing": wing}, {"room": room}]} instead of calling build_where_filter(wing, room).

If the where-filter format ever changes (e.g., ChromaDB API update), these manual constructions will silently diverge.

- where = {"$and": [{"wing": args.wing}, {"room": args.room}]}
+ where = build_where_filter(args.wing, args.room)

Same fix needed in mcp_server.py:

- where = {"$and": [{"wing": wing}, {"room": room}]}
+ from .palace_db import build_where_filter
+ where = build_where_filter(wing, room)

Note: build_where_filter is already imported in mcp_server.py indirectly via palace_db — just add it to the existing import line.


Medium Priority Issues

🐛 #3: TOCTOU race between count and delete

Location: mempalace/mcp_server.py:337-348 and mempalace/cli.py:375-388 | Confidence: ⚠️ MED

Both delete functions first col.get(where=..., include=[]) to count drawers, then separately col.delete(where=...). If drawers are added or removed between these two calls, the reported deleted_count will be inaccurate.

For a single-user local tool this is low-risk, but the MCP server could be called concurrently by multiple agents. Consider performing the delete first and reporting "deleted wing X" without a pre-counted guarantee, or accepting the minor inconsistency with a note.


🚨 #4: MCP bulk delete tools have no safety guardrail

Location: mempalace/mcp_server.py:333-370 | Confidence: ⚠️ MED

The CLI commands properly implement --force / [y/N] confirmation prompts. The MCP tools have no equivalent protection — an AI agent can delete an entire wing in a single call with no confirmation. The docstring says "Irreversible" which is good, but consider:

  • Adding a dry_run parameter that returns the count without deleting
  • Or returning a preview response with {"action": "confirm_delete", "count": N} requiring a second call

This is especially relevant since this tool will be called by AI assistants that may misinterpret user intent.


Low Priority Issues

🎨 #5: No CLI tests for delete commands

Location: tests/test_cli.py (not modified) | Confidence: ✅ HIGH

MCP tools have 4 tests (positive + negative for wing and room). Searcher has a drawer_id test. But cmd_delete_wing() and cmd_delete_room() in cli.py have no test coverage. These are destructive operations that deserve at minimum:

  • Happy path test
  • Not-found test
  • Confirmation prompt abort test

🎨 #6: query_palace() doesn't explicitly request ids in include list

Location: mempalace/palace_db.py:92 | Confidence: ⚠️ MED

query_palace() uses include=["documents", "metadatas", "distances"]. The PR's searcher.py change accesses results["ids"][0] which works because ChromaDB always returns ids regardless of the include list. However, this implicit behavior could confuse future maintainers who look at query_palace() and don't see ids in the include spec.

- "include": ["documents", "metadatas", "distances"],
+ "include": ["documents", "metadatas", "distances", "ids"],

Note: Adding "ids" is a no-op for ChromaDB but makes the data contract explicit.


Flow Impact Analysis

Before PR:
  search() → results: [{ text, wing, room, source_file, similarity }]
  delete: single drawer by ID only

After PR:
  search() → results: [{ id, text, wing, room, source_file, similarity }]
  delete: single drawer by ID, bulk by wing, bulk by room

New flow enabled:
  search("auth") → get id → delete_drawer(id)     ← was impossible before
  delete_wing("old-project")                        ← new bulk operation
  delete_room("project", "deprecated")              ← new bulk operation

Existing callers of search_memories() gain a new id field in each result dict. This is additive and non-breaking.


Created by Octocode MCP https://octocode.ai 🔍🐙

Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The code properly implements the requested bulk deletion features (wings and rooms) with proper confirmation prompts when not forced. Returning the DB IDs for search results is useful too. I verified the test coverage ensures wing/room filters and ID return work as expected.

Minor heads up for the future since stability is a focus rn: it might be valuable to extract ChromaDB specific interaction out of cli.py into a separate DB layer at some point so cli.py handles exclusively the commands/formatting part.

Otherwise looks clean and ready!

@bensig
Copy link
Copy Markdown
Collaborator

bensig commented Apr 12, 2026

Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants