feat(mem_add,mem_edit): canonical writer + mem_edit header preservation by memtomem · Pull Request #464 · memtomem/memtomem

memtomem · 2026-04-24T23:53:49Z

Follow-up to #463 (chunker reader). Two coupled writer-side changes plus the e2e round-trip the chunker enabled.

Summary

Writer canonical form — append_entry now emits > tags: ["a", "b"] (explicit > prefix on every line, JSON-quoted) instead of the lazy-continuation form tags: ['a', 'b'] (Python repr()). Chunker reader (PR feat(chunking): promote per-entry blockquote tags to ChunkMetadata.tags #463) already accepts both shapes, so old files still parse.
mem_edit header preservation — new replace_chunk_body helper keeps the heading + > created: / > tags: blockquote intact when the caller supplies body-only new_content. Prefix new_content with ## to bypass preservation and revert to full-replacement (pre-RFC semantic).
mem_search(tag_filter=...) round-trip — multi-agent e2e now asserts mem_search(tag_filter=f"shared-from={src}") returns the share copy. Closes the gap discovered in PR test(multi-agent): end-to-end integration coverage #462.

Why two writer changes in one PR

The reviewer flagged on #463 that mem_edit's existing replace_lines call wasn't a new regression but became a much louder footgun once the chunker strips the blockquote from chunk content. Post-PR-A, a caller who reads chunk.content and feeds it back through mem_edit would supply body-only and silently erase the metadata header. Fixing that in the same PR as the canonical writer keeps the related contract change coherent.

What changed

packages/memtomem/src/memtomem/tools/memory_writer.py
- append_entry switches to > tags: {json.dumps(list(tags))}.
- New _find_body_start_index(chunk_lines) walks a chunk's line range and returns the index where the body starts, after heading + leading blockquote group (with lazy-continuation tolerance for legacy on-disk files).
- New replace_chunk_body(file_path, start_line, end_line, new_content) preserves heading + leading blockquote unless new_content starts with ## (full replace).
packages/memtomem/src/memtomem/server/tools/memory_crud.py
- mem_edit calls replace_chunk_body instead of replace_lines.
packages/memtomem/tests/test_memory_writer.py — existing tag-format pin updated to canonical form.
New tests/test_memory_writer_tag_format.py — 5 cases pinning the canonical output (explicit > prefix, JSON serialization, round-trip via json.loads, empty-tag omission, > created: prefix preserved).
New tests/test_mem_edit_header_preservation.py — 6 cases (canonical / legacy lazy / no-header / explicit ## full-replace / non-first sub-chunk of oversized section / trailing-newline round-trip).
tests/test_chunking_blockquote_tags.py — adds block-list-shape case (> tags:\n> - a\n> - b). Writer never emits this, but the shared _parse_tags_value helper supports it and we lock that in.
tests/test_multi_agent_integration.py test_share_copies_chunk_with_audit_tag — adds forward-direction tag_filter round-trip assertion.

Test plan

Full suite: uv run pytest -m "not ollama" → 2402 passed, 0 failed (PR-A baseline 2390 + 12 new).
uv run ruff check && uv run ruff format --check (full repo).
uv run mypy on changed source files — clean.

Out of scope (still tracked)

Three remaining follow-ups noted in #463 review, all in planning/mem-add-tags-blockquote-promote-rfc.md §Follow-ups:

_split_section line-number drift after blockquote strip (small follow-up PR; mem_edit / UI only, search unaffected).
add_entries_batch broadcast-tag cleanup (memory_crud.py:392–401); orthogonal cleanup PR.
chunk_links table to permanently close the shared-from=<old-uuid> audit-chain gap on reindex UUID churn — separate RFC mem-agent-share-chunk-links-rfc.md.

🤖 Generated with Claude Code

Stacks on PR #463 (chunker reader). Two coupled writer-side changes: 1. append_entry emits canonical blockquote tags > tags: ["a", "b"] # explicit `> ` prefix, JSON-quoted instead of the legacy lazy-continuation form tags: ['a', 'b'] # Python repr(), no `> ` prefix Lazy continuation is well-defined CommonMark but easy to break (a single blank line ends the blockquote), and Python repr() is not valid YAML. The chunker's section-leading parser already accepts both shapes, so existing files still parse — the writer just stops producing the fragile form on new entries. New test_memory_writer_tag_format.py pins the canonical output so a future regression that reverts to lazy / Python-repr fails fast. 2. mem_edit preserves the per-entry header Pre-PR, mem_edit called replace_lines(start_line, end_line, new_content), which replaced the entire chunk range — heading, blockquote header, body — with whatever the caller supplied. That was already wide pre-RFC, but post-PR-A the chunker strips the blockquote from chunk content, so a caller working from chunk.content would supply body-only and silently erase the metadata header on disk. New replace_chunk_body helper detects the heading + section- leading blockquote group at the top of the chunk's line range and keeps them intact while replacing only the body. If new_content itself starts with `## `, the call reverts to a full replacement (preserving the pre-RFC mem_edit semantic for users who want to override the heading explicitly). New test_mem_edit_header_preservation.py covers six cases (canonical, legacy lazy, no header, full-replace via `## `, non-first sub-chunk of an oversized section, trailing-newline round-trip). Round-trip and follow-ups: - test_multi_agent_integration test_share_copies_chunk_with_audit_tag now also asserts the forward direction: mem_search(tag_filter="shared-from=<src>") returns the share copy and only the share copy. Closes the e2e gap discovered during PR #462 that drove RFC mem-add-tags-blockquote-promote-rfc.md. - test_chunking_blockquote_tags adds a block-list-shape case (`> tags:\n> - a\n> - b`). The current writer never emits this shape, but the shared _parse_tags_value helper supports it and this locks the support in. Implements RFC §Migration & tests #2 + #3 and the writer-PR scope addition (mem_edit header preservation, ③-1) noted in PR #463 review. Co-Authored-By: Claude <[email protected]>

Reviewer caught that web/routes/chunks.py:edit_chunk still called replace_lines directly, which bypasses the header-preservation logic mem_edit gained in this PR. Same footgun, different surface — the Web UI editor surfaces chunk.content (already header-stripped by the chunker), so a Save from the browser would feed body-only content into replace_lines and silently erase the heading + > created: + > tags: on disk. Swap the call to replace_chunk_body so mem_edit and the Web UI route share the same preservation contract. Add a regression test (test_edit_chunk_preserves_blockquote_header) that drives the route end-to-end with a real tmp-file source and asserts the metadata header survives a body-only PATCH. Also folded in two reviewer nits: - mem_edit MCP-tool docstring now mentions the body-only / `## ` bypass policy (was only documented on the underlying helper). - replace_chunk_body has a one-line comment noting that ``## `` is the full-replace trigger because append_entry always emits H2; other heading levels in user input are body content. grep across src/ confirms replace_lines now has zero non-test callers in the codebase. Whether to deprecate it or keep it as a low-level escape hatch is tracked as a follow-up; not closing that question here. Co-Authored-By: Claude <[email protected]>

pandas-studio and others added 2 commits April 25, 2026 08:53

memtomem merged commit f215428 into main Apr 25, 2026
7 checks passed

github-actions Bot locked and limited conversation to collaborators Apr 25, 2026

memtomem deleted the feat/blockquote-tags-writer branch April 27, 2026 14:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mem_add,mem_edit): canonical writer + mem_edit header preservation#464

feat(mem_add,mem_edit): canonical writer + mem_edit header preservation#464
memtomem merged 2 commits intomainfrom
feat/blockquote-tags-writer

memtomem commented Apr 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

memtomem commented Apr 24, 2026

Summary

Why two writer changes in one PR

What changed

Test plan

Out of scope (still tracked)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants