Skip to content

feat: add brief retrieval mode for wing/room overviews#601

Open
mvanhorn wants to merge 4 commits intoMemPalace:developfrom
mvanhorn:feat/73-brief-retrieval
Open

feat: add brief retrieval mode for wing/room overviews#601
mvanhorn wants to merge 4 commits intoMemPalace:developfrom
mvanhorn:feat/73-brief-retrieval

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

Reopens #174 (closed for merge conflicts) — rebased onto current main and conflicts resolved.

What does this PR do?

Adds a brief() retrieval mode that returns a deduplicated, truncated overview of a wing or room's contents. Complements the existing recall() (raw L2 retrieval) and search() (semantic L3 search) with a "catch me up" mode.

How it works:

  • Fetches top 20 drawers via L2
  • Deduplicates using 70% word-overlap threshold
  • Caps output at ~2000 chars
  • Formats with topic count header

Available via:

  • MemoryStack.brief(wing=..., room=...) in Python
  • searcher.brief() wrapper
  • mempalace brief --wing X --room Y CLI subcommand

Closes #73.

Rebase notes

Conflicts resolved in two files:

  • mempalace/cli.py — dispatch table (kept both mcp and new brief entries)
  • tests/test_searcher.py — kept all the new TestSearchCLI tests from main and added test_brief to TestSearchMemories

How to test

# CLI
mempalace brief --wing my_project

# Python
from mempalace.layers import MemoryStack
stack = MemoryStack()
print(stack.brief(wing="my_project"))

Or run the test:

pytest tests/test_searcher.py::TestSearchMemories::test_brief -v

Checklist

  • Tests pass (pytest tests/ → 568 passed)
  • No hardcoded paths
  • Linter passes (ruff check .)

This contribution was developed with AI assistance (Codex).

mvanhorn and others added 3 commits April 11, 2026 01:46
Add MemoryStack.brief() that returns a deduplicated, truncated overview
of a wing/room's content. Uses L2 retrieval with deduplication (70%
word overlap threshold) and a 2000-char cap.

Also adds:
- searcher.brief() wrapper function
- `mempalace brief` CLI subcommand
- Test in test_searcher.py

Closes MemPalace#73.
Adds guard for "Retrieval error" prefix alongside existing "No " check
so L2 errors don't get parsed as content lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good rebase — this feature has been wanted for a while and the implementation is clean. A few observations:

What works well

  • The 70% word-overlap dedup threshold is reasonable for catching near-duplicate drawer summaries
  • 2000-char cap is a sensible default for a "catch me up" context injection
  • Dispatching to L2 for the underlying retrieval is correct (fast, no embedding cost)
  • CLI integration follows the existing pattern exactly

One correctness concern — dedup math

The overlap check:

if len(words & prev_words) > 0.7 * max(len(words), len(prev_words), 1):

This is asymmetric in a way that can produce false positives. A short line (3 words) that shares 3 words with a long line (20 words) will pass because 3 > 0.7 * 3, even though the long line has lots of unique content. Consider Jaccard similarity instead:

intersection = len(words & prev_words)
union = len(words | prev_words)
if union > 0 and intersection / union > 0.6:  # Jaccard threshold

Or keep the current approach but use min() instead of max():

if len(words & prev_words) > 0.7 * min(len(words), len(prev_words), 1):

Minor: parsing L2 output

brief() parses the string output of l2.retrieve() to get individual lines — this creates a text→parse→text pipeline that's fragile if L2's output format changes. If l2.retrieve() can return raw drawer data (or if there's a retrieve_raw() method), using that would be more robust. Not a blocker, just worth considering for the long term.

Tests

test_brief checks the output contains "Brief" and "topics" — that's the minimum needed to pass. A test for the dedup behavior (e.g., two near-identical drawers → only one in output) would give more confidence. Also worth a test for the empty-palace case since brief() has a special code path for raw.startswith("No ").

568 tests passing is solid. LGTM with the minor dedup note — would be comfortable approving after the math is verified.

Adds two tests to back the brief() retrieval mode:
- test_brief_empty_palace: verifies brief() surfaces L2's "no drawers"
  message instead of crashing when the palace has no data
- test_brief_dedupes_near_identical_drawers: verifies the >70% word
  overlap dedup actually collapses duplicates while preserving distinct
  content
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Thanks for the careful read @web3guru888.

On the dedup math: I traced through the example you raised. With len(words & prev_words) > 0.7 * max(len(words), len(prev_words), 1), the 3-word/20-word case gives intersection=3, max=20, threshold=14, so 3 > 14 is false and both lines are kept. The current max() formulation is intentional — it's the conservative direction (favors keeping content over collapsing). Switching to min() would be the version that produces false positives on short-vs-long. I'd rather leave the formula as-is.

Good points on the test coverage though. Added two tests in f4b507e:

  • test_brief_empty_palace — covers the empty-palace short-circuit path
  • test_brief_dedupes_near_identical_drawers — verifies the >70% overlap actually collapses duplicates while preserving distinct content

On the L2 string-parsing point: agreed it's slightly fragile, but adding a retrieve_raw() API to L2 felt out of scope for this PR — happy to follow up in a separate one if maintainers want it.

570 tests passing.

@web3guru888
Copy link
Copy Markdown

Thanks for the trace-through on the dedup math — you're right, I had it backwards. max() as the denominator prevents the short-vs-long false positive (3 > 14 → keep both), whereas min() would cause it (3 > 2.1 → collapse). The conservative direction is exactly correct here.

The new tests look solid — test_brief_empty_palace and test_brief_dedupes_near_identical_drawers cover the two cases I'd flagged. And the note on the L2 string-parsing is fair — a retrieve_raw() API would be a separate PR anyway.

570 passing + these additions. LGTM from me.

Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All feedback addressed — dedup math confirmed correct (max() is intentional and right), test gaps filled with two new tests, L2 string parsing noted for follow-up PR. 570 passing.

@bensig bensig changed the base branch from main to develop April 11, 2026 22:21
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:21
@igorls igorls added area/cli CLI commands area/search Search and retrieval enhancement New feature or request labels Apr 14, 2026
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Thanks @web3guru888 — appreciate the close read. Agree retrieve_raw is a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cli CLI commands area/search Search and retrieval enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature idea: separate recall vs brief retrieval modes

3 participants