Background
packages/memtomem/src/memtomem/web/routes/tags.py only has two
routes today: GET /tags (list with counts) and POST /tags/auto
(extract tags). There is no way to rename, delete, or merge tags
through the API or the UI — every fix-up requires either editing
chunks one by one in the Search-tab detail panel or re-running
auto_tag with overwrite=true, both of which are error-prone for
even small taxonomy clean-ups.
A reviewer flagged this in a recent UX walk-through (Tags tab → "no
manage UI"). The Tier-1/Tier-2 polish PRs (#682–#687) intentionally
deferred this because it's a backend-shaped change with
retrieval-correctness implications, not a paint job. Filing as an RFC so we can
align on shape before any code lands.
Goals
- Rename one tag globally: old → new, applied to every chunk
that carries old. Idempotent if new already coexists on a chunk.
- Delete one tag globally: drop it from every chunk that carries
it. Chunks that end up tag-less stay indexed; we don't re-tag.
- Merge N tags → 1: same as rename but for a set, with the
resulting chunk-tag list deduplicated.
Out of scope for this RFC: tag taxonomies / hierarchies, synonyms,
per-namespace tag scopes, undo history.
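
As a minimal sketch of the three goals at the level of a single chunk's tag list (the function names are illustrative, not existing memtomem helpers), rename can be treated as a one-source merge, and merge as "map, then deduplicate in order":

```python
from typing import Iterable


def merge_tags(tags: list[str], sources: Iterable[str], target: str) -> list[str]:
    """Map every tag in sources to target, then deduplicate, keeping order."""
    source_set = set(sources)
    seen: set[str] = set()
    result: list[str] = []
    for tag in tags:
        mapped = target if tag in source_set else tag
        if mapped not in seen:
            seen.add(mapped)
            result.append(mapped)
    return result


def rename_tag(tags: list[str], old: str, new: str) -> list[str]:
    """Rename is a merge with a single source tag."""
    return merge_tags(tags, [old], new)


def delete_tag(tags: list[str], name: str) -> list[str]:
    """Drop every occurrence of name; an empty result is allowed."""
    return [t for t in tags if t != name]
```

Applying rename_tag twice gives the same list as applying it once, which is the idempotence property the rename goal asks for.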
Proposed surface
Backend
PUT    /api/tags/{name}   # rename; body { new_name: str }
DELETE /api/tags/{name}   # delete tag from all chunks
POST   /api/tags/merge    # merge; body { sources: [str], target: str }
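All three return:
{
  "tag": "<resolved name>",
  "affected_chunks": <int>,
  "dry_run": <bool>
}
With dry_run=true (URL query parameter), no writes happen; the route only
returns the count and sample chunk ids (capped at 10) so the UI can show a
confirmation modal.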
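
One way the shared response could be assembled, as a rough sketch: the RFC does not name the field that carries the sample ids, so sample_chunk_ids below is an assumed name, and build_response is a hypothetical helper rather than existing code.

```python
from typing import TypedDict

SAMPLE_CAP = 10  # "cap 10" from the RFC text


class TagOpResponse(TypedDict, total=False):
    tag: str
    affected_chunks: int
    dry_run: bool
    sample_chunk_ids: list[str]  # assumed field name, not specified in the RFC


def build_response(tag: str, chunk_ids: list[str], dry_run: bool) -> TagOpResponse:
    """Build the body returned by rename / delete / merge."""
    response: TagOpResponse = {
        "tag": tag,
        "affected_chunks": len(chunk_ids),
        "dry_run": dry_run,
    }
    if dry_run:
        # Only the preview needs sample ids for the confirmation modal.
        response["sample_chunk_ids"] = chunk_ids[:SAMPLE_CAP]
    return response
```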
Implementation notes:
- Iterate matching chunks via storage.list_chunks_by_tag(tag)
(add it if it doesn't exist yet) and rebuild
ChunkMetadata.tags per chunk.
- Single transaction per request (atomic across the
upsert_chunks batch) so partial failure can't strand a tag in
a half-renamed state.
- Embeddings stay valid: tag changes touch metadata.tags only,
not content, so BM25 / dense indexes don't need a rebuild.
(Worth pinning with a test that asserts embedding,
content_hash, and created_at survive a rename.)
- Reject system / reserved prefixes (validity:, system:,
whatever the canonical list is) at the route layer with a 400.
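
To make the transaction and "embeddings stay valid" notes concrete, here is a self-contained sketch against an in-memory SQLite table. The schema, the JSON-encoded tags column, and the helper names are assumptions for illustration only; the real storage layer and upsert_chunks may look quite different, and the reserved-prefix tuple is a placeholder since no canonical list exists yet.

```python
import json
import sqlite3

# Placeholder only: the RFC notes there is no canonical reserved list yet.
RESERVED_PREFIXES = ("validity:", "system:")


def check_not_reserved(name: str) -> None:
    """Raise ValueError for reserved names; a route could map this to a 400."""
    if name.startswith(RESERVED_PREFIXES):
        raise ValueError(f"reserved tag prefix: {name!r}")


def rename_tag_atomic(conn: sqlite3.Connection, old: str, new: str) -> int:
    """Rewrite the tags of every chunk carrying old, inside one transaction."""
    check_not_reserved(old)
    check_not_reserved(new)
    affected = 0
    with conn:  # commits on success, rolls back if anything raises
        rows = conn.execute("SELECT id, tags FROM chunks").fetchall()
        for chunk_id, raw in rows:
            tags = json.loads(raw)
            if old not in tags:
                continue
            # Map old -> new and deduplicate while preserving order.
            rebuilt = list(dict.fromkeys(new if t == old else t for t in tags))
            # Only the tags column is written; embedding, content_hash and
            # created_at are never touched.
            conn.execute(
                "UPDATE chunks SET tags = ? WHERE id = ?",
                (json.dumps(rebuilt), chunk_id),
            )
            affected += 1
    return affected


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table standing in for chunk storage."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chunks (id TEXT PRIMARY KEY, content_hash TEXT,"
        " embedding BLOB, created_at TEXT, tags TEXT)"
    )
    conn.executemany(
        "INSERT INTO chunks VALUES (?, ?, ?, ?, ?)",
        [
            ("c1", "h1", b"\x01\x02", "2024-01-01", json.dumps(["ml", "notes"])),
            ("c2", "h2", b"\x03\x04", "2024-01-02", json.dumps(["ml", "ai"])),
            ("c3", "h3", b"\x05\x06", "2024-01-03", json.dumps(["misc"])),
        ],
    )
    conn.commit()
    return conn
```

A pinning test along the lines suggested above would snapshot embedding, content_hash, and created_at before the rename and compare them afterwards.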
UI (Tags tab)
Each row in the tag list / cloud gets a hover-revealed action menu
with Rename, Merge into…, Delete. All three open a confirm
modal that:
- Calls the route with dry_run=true first
- Shows "This will affect N chunks (sample: …)"
- Re-calls without dry_run only after the user confirms
The active-filter chip pattern from PR #684 is the closest visual
cousin — same accent / muted colour split, same "you are about to
do something irreversible" weight.
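
The dry-run-then-confirm sequence used by the modal can be sketched independently of any UI framework. Both callables here are hypothetical seams: call_route stands in for the HTTP request and confirm stands in for the modal.

```python
from typing import Callable, Optional


def confirm_then_apply(
    call_route: Callable[[bool], dict],
    confirm: Callable[[dict], bool],
) -> Optional[dict]:
    """Run the preview, ask the user, and only then perform the real call."""
    preview = call_route(True)   # step 1: dry run, no writes
    if not confirm(preview):     # step 2: "This will affect N chunks ..."
        return None              # cancelled: the write call is never made
    return call_route(False)     # step 3: the real, writing call
```

Keeping the write call behind the confirmation means a cancelled modal never reaches the mutating route.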
Risks / open questions
- Concurrent auto_tag: if the Auto-Tag form is mid-run when a
rename fires, the rename can race against newly written tags.
Cheapest answer: hold the same per-storage write lock the
auto-tag path already uses, and let the second caller block.
Worth a test that pins the lock invariant.
- Reserved tags: we don't have a canonical list. Surveying
validity:*, system:*, archive:* callsites is part of the
RFC, not the implementation PR.
- Empty-tag chunks after delete: behaviour is "stay indexed,
no re-tag." Is that the right call? Alternative: queue an
auto_tag pass with sample_limit=0 for the affected chunks,
but that mixes two features and breaks the "this op is fast and
reversible at the metadata layer" property.
- Audit / history: do we want a write-ahead log of
rename/delete/merge ops in storage, or is the chunk
updated_at bump enough? Default to the latter for v1.
- CLI parity: the mm CLI doesn't have mm tags rename
either. Probably worth shipping CLI + Web in the same release so
scripts and the UI stay symmetric (see
feedback_mcp_cli_sibling_gate_parity.md-style invariant).
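
For the concurrent auto_tag question, the "share one write lock and let the second caller block" idea can be sketched with a toy in-memory store. TagStore and write_lock are illustrative names, not the actual storage class or the lock the auto-tag path uses today.

```python
import threading


class TagStore:
    """Toy stand-in for storage: chunk id -> list of tags."""

    def __init__(self) -> None:
        self.write_lock = threading.Lock()  # assumed per-storage write lock
        self.chunks: dict[str, list[str]] = {"c1": ["ml"], "c2": ["ml", "notes"]}

    def auto_tag(self, chunk_id: str, new_tags: list[str]) -> None:
        """Add tags to a chunk while holding the write lock."""
        with self.write_lock:
            current = self.chunks.get(chunk_id, [])
            self.chunks[chunk_id] = list(dict.fromkeys(current + new_tags))

    def rename(self, old: str, new: str) -> int:
        """Rename a tag everywhere; blocks while another writer holds the lock."""
        with self.write_lock:
            affected = 0
            for chunk_id, tags in self.chunks.items():
                if old in tags:
                    self.chunks[chunk_id] = list(
                        dict.fromkeys(new if t == old else t for t in tags)
                    )
                    affected += 1
            return affected
```

A test pinning the invariant can hold write_lock from one thread, start a rename in another, and assert that the rename does not complete until the lock is released.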
Suggested split
If this gets a green light, three small PRs feel cleaner than one:
- Backend routes + storage helpers + tests (no UI).
- UI confirm modal + hover actions + i18n.
- CLI commands (mm tags rename, mm tags delete,
mm tags merge) with shared service-layer code from PR 1.
Happy to take the first PR if there's appetite. Comment with
agreement on the surface or push back on any of the five risks
above.
🤖 Generated with Claude Code