feat: batch writes, concurrent mining, MCP tools, hooks, export, search improvements by jphein · Pull Request #562 · MemPalace/mempalace

jphein · 2026-04-10T19:00:48Z

Summary

Consolidated PR covering performance, reliability, and feature improvements across the codebase. Supersedes closed PRs #492, #493, #556 — all Copilot review feedback has been incorporated.

Performance

Batch ChromaDB writes — one upsert per file instead of per chunk in both miners
Concurrent mining — ThreadPoolExecutor for parallel file processing
Bulk mtime pre-fetch — bulk_check_mined() avoids per-file DB queries during mining

Reliability

Epsilon mtime comparison — abs() < 0.01 instead of == for float mtime dedup
Entity detector STOPWORDS — 73 technical terms added (Handler, Node, Service, etc.)
Search limit capped [1,100] — schema bounds enforced
Status/taxonomy tools paginated past 10K drawers
Save marker race fix — only advances after confirmed successful save
MCP notification handling — notifications/* catch-all returns None per JSON-RPC spec
Method guard — handles None/missing method without crash
Corrupt checkpoint cleanup — bad last_checkpoint file unlinked on parse error
Venv python resolution — background subprocesses find the correct interpreter (Hook scripts should use venv python in it's hooks #545)
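
The notification catch-all and method guard listed above can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not the PR's actual server code: `handle_request` and the `ping` entry in `HANDLERS` are hypothetical names, and the error codes are the standard JSON-RPC 2.0 ones (-32600 Invalid Request, -32601 Method not found).

```python
from typing import Optional

# Hypothetical handler table; the real server registers its own methods.
HANDLERS = {"ping": lambda params: {}}


def handle_request(request: dict) -> Optional[dict]:
    """Return a JSON-RPC response dict, or None when no reply must be sent."""
    method = request.get("method")  # may be None or missing entirely
    req_id = request.get("id")

    # Method guard: a non-string method must not crash startswith() below.
    if not isinstance(method, str):
        if req_id is None:
            return None
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32600, "message": "Invalid Request"}}

    # Catch-all: notifications never get a response.
    if method.startswith("notifications/"):
        return None

    handler = HANDLERS.get(method)
    if handler is None:
        # An unknown method without an id is itself a notification: stay silent.
        if req_id is None:
            return None
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32601, "message": f"Method not found: {method}"}}

    return {"jsonrpc": "2.0", "id": req_id, "result": handler(request.get("params"))}
```

Returning `None` rather than an empty response lets the transport layer skip writing anything to stdout for notifications.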

Bug fixes from upstream issues

Cosine distance default — new collections use hnsw:space=cosine instead of L2, fixing negative similarity scores (fix: default collection to cosine distance and clamp similarity scores #568)
Emotion regex — removed overly broad \*[^*]+\* from EMOTION_MARKERS that routed all markdown bold into emotional room (--extract general dumps Markdown-bold content into emotional room via overly broad \*[^*]+\* regex #536)
Unicode checkmark — replaced U+2713 with ASCII in progress output, fixing crash on Windows cp1251/cp1252 (UnicodeEncodeError in miner.py / convo_miner.py / split_mega_files.py on Windows cp1251/cp1252 (incomplete fix of #47) #535)
KG path mismatch — MCP server always uses palace_path/knowledge_graph.sqlite3 instead of diverging default path (MCP server stdio transport: add_drawer and kg_add write to WAL but not to ChromaDB/SQLite #538)
MCP arg filtering — unexpected tool args filtered before dispatch, preventing TypeError crash (fix: ignore unexpected MCP tool args during handler dispatch #572)
WAL rotation — rotate write_log.jsonl at 10 MB with one backup (fix: rotate WAL file at 10 MB to prevent unbounded growth #573)
Compress dict keys — aligned with compression_stats() return values (fix: align cmd_compress dict keys with compression_stats() return values #569)
Spellcheck registry — reads from people key, uses dict keys as canonical names (fix: use correct registry key in spellcheck entity lookup #570)
Skill docs — mempalace status instead of nonexistent --version, --yes in init instructions (SKILL.md uses nonexistent mempalace --version; init instructions omit --yes so agents hit EOFError #534)
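
As an illustration of the WAL rotation item above (rotate at 10 MB, keep one backup), here is a minimal sketch. The function name `append_wal` and the `.1` backup suffix are assumptions made for the example; only the 10 MB threshold and the single-backup policy come from the PR description.

```python
import os
from pathlib import Path

MAX_WAL_BYTES = 10 * 1024 * 1024  # 10 MB threshold, as stated in the PR


def append_wal(wal_path: Path, line: str, max_bytes: int = MAX_WAL_BYTES) -> None:
    """Append one line, rotating to a single backup once the file is too big."""
    if wal_path.exists() and wal_path.stat().st_size >= max_bytes:
        backup = wal_path.with_name(wal_path.name + ".1")  # assumed backup name
        os.replace(wal_path, backup)  # overwrites any previous backup
    with wal_path.open("a", encoding="utf-8") as fh:
        fh.write(line.rstrip("\n") + "\n")
```

Because `os.replace` overwrites the destination, at most two files exist at any time, which bounds disk usage at roughly twice the threshold.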

Features

Silent stop hook — saves directly via Python API with systemMessage notification instead of blocking MCP calls (UX: Stop hook MCP tool calls clutter terminal every 15 messages #554)
Theme extraction — word frequency analysis surfaces conversation topics in notifications
Precompact hook — emergency save before context compaction
Configurable hook settings — mempalace_hook_settings MCP tool for silent_save/desktop_toast
mempalace export — markdown backup of the entire palace
New MCP tools — get/list/update drawer, memories_filed_away, hook_settings
max_distance parameter — renamed from min_similarity with backwards compat shim
Configurable chunk size/overlap in mining

Palace maintenance

Junk file filter — SKIP_PATTERNS skips minified JS/CSS, bundles, source maps, lockfiles; JUNK_FILE_SIZE (500KB) skips large generated text files
mempalace purge --wing X --room Y — selective batch deletion with confirmation prompt, batched in groups of 10K
Status 10K cap fix — uses col.count() + paginated metadata fetch with thousands-separator formatting (fix: replace hardcoded limit=10000 with dynamic col.count() in navigation tools #563)
Repair command fixed — nukes palace directory and creates fresh PersistentClient to eliminate HNSW ghost entries from bulk deletes (hnswlib updatePoint race on modified-file re-mine: EXC_BAD_ACCESS in repairConnectionsForUpdate (macOS ARM64, Python 3.13, chromadb 0.6.3) #521)
MCP stale cache detection — tracks chroma.sqlite3 inode to auto-reconnect after palace rebuild
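
The stale cache detection item above relies on the fact that a rebuilt `chroma.sqlite3` is a new file with a new inode. A minimal sketch of that check follows; the class name `InodeWatcher` is hypothetical, and what the real server does on a change (for example dropping its cached client) is not shown.

```python
import os
from typing import Optional


class InodeWatcher:
    """Remember a file's inode and report when the file has been replaced."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._inode: Optional[int] = self._current_inode()

    def _current_inode(self) -> Optional[int]:
        try:
            return os.stat(self.path).st_ino
        except FileNotFoundError:
            return None

    def changed(self) -> bool:
        """True if the inode differs from the last observed one; then re-arm."""
        inode = self._current_inode()
        if inode != self._inode:
            self._inode = inode
            return True
        return False
```

A caller would test `changed()` before serving a request and reconnect when it returns `True`.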

Stats

Test plan

Float equality on mtime fails due to JSON round-trip precision loss, causing every file to be re-mined on each run. Use epsilon < 0.01. Also adds bulk_check_mined() for fetching all source_file/mtime pairs in paginated batches — turns 25K individual DB queries into ~5 fetches. Fixes MemPalace#475 Co-Authored-By: Claude Opus 4.6 <[email protected]>
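
A minimal sketch of the epsilon comparison described above. The helper name `mtime_matches` is hypothetical; the 0.01 tolerance is the value given in the commit message, and treating unparseable stored values as "not mined" is an assumption for the example.

```python
MTIME_EPSILON = 0.01  # tolerance from the commit message


def mtime_matches(stored, current: float, eps: float = MTIME_EPSILON) -> bool:
    """Compare a stored mtime (possibly a string or None) with the file's mtime."""
    try:
        return abs(float(stored) - current) < eps
    except (TypeError, ValueError):
        # Unparseable metadata is treated as "not mined yet".
        return False
```

With this check, a value that lost a few digits in a JSON round-trip still counts as the same mtime, so the file is not re-mined.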

…decls
- Clamp tool_search limit to [1, 100] to prevent memory exhaustion
- Replace hardcoded limit=10000 in status/taxonomy tools with paginated _fetch_all_metadata() helper (matches palace_graph.py pattern)
- Remove duplicate _client_cache/_collection_cache declarations
Fixes MemPalace#477, MemPalace#478, MemPalace#479
Co-Authored-By: Claude Opus 4.6 <[email protected]>

Accumulate all chunks for a file into lists, then issue a single collection.upsert() (miner) or collection.add() (convo_miner) call. Reduces 125K-375K individual DB round-trips to ~25K batched calls. Co-Authored-By: Claude Opus 4.6 <[email protected]>
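
The accumulate-then-write pattern described above can be sketched as follows. This is an illustration only: `RecordingCollection` is a stand-in that records calls instead of talking to ChromaDB, `upsert_file_chunks` is a hypothetical helper, and the `source_file#index` ID scheme is an assumption. The keyword arguments `ids`, `documents` and `metadatas` match the ChromaDB collection `upsert()` call.

```python
from typing import Iterable


class RecordingCollection:
    """Stand-in for a ChromaDB collection that only records upsert calls."""

    def __init__(self) -> None:
        self.calls = []

    def upsert(self, ids, documents, metadatas) -> None:
        self.calls.append((list(ids), list(documents), list(metadatas)))


def upsert_file_chunks(collection, source_file: str, chunks: Iterable[str]) -> int:
    """Accumulate every chunk of one file, then write them in a single call."""
    ids, documents, metadatas = [], [], []
    for index, text in enumerate(chunks):
        ids.append(f"{source_file}#{index}")  # hypothetical ID scheme
        documents.append(text)
        metadatas.append({"source_file": source_file, "chunk_index": index})
    if ids:  # avoid an upsert call with empty lists
        collection.upsert(ids=ids, documents=documents, metadatas=metadatas)
    return len(ids)
```

The figures in the commit message are consistent with this shape: 125K to 375K per-chunk writes collapsing to about 25K per-file writes corresponds to roughly 5 to 15 chunks per file.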

Prevents false positives like Handler, Node, Service, Manager, Client being flagged as project/person entities in code-heavy directories. Fixes MemPalace#476 Co-Authored-By: Claude Opus 4.6 <[email protected]>

Adds min_similarity parameter (L2 distance cutoff) to search_memories() and MCP tool_search (default 1.5). Filters out clearly irrelevant results instead of always returning top-N regardless of quality. Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Updated STOP_BLOCK_REASON to instruct AI to use mempalace_diary_write and mempalace_add_drawer instead of generic "memory system"
- Updated PRECOMPACT_BLOCK_REASON with same MCP tool instructions
- Added _ingest_transcript() to mine Claude Code JSONL transcripts into the palace automatically on stop/precompact triggers
- Transcript goes into a "sessions" wing via convo_miner
Co-Authored-By: Claude Opus 4.6 <[email protected]>

Documents fork relationship, key files, development workflow, fork changes, upstream PRs, and integration details. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Mining:
- Added _prepare_file() for thread-safe file processing (read/chunk/route)
- mine() now supports --workers flag (default: min(8, cpu_count))
- Concurrent path: bulk mtime pre-fetch, parallel _prepare_file(), serialized ChromaDB writes in batches of 100. Sequential path unchanged (workers=1).
Room routing:
- Priority 1: exact folder match only (no substring)
- Priority 2: exact filename match only
- Content scan increased from 2KB to 5KB (full file if <10KB)
- Keyword scoring uses word-boundary regex instead of substring count
- Added 13 unit tests for detect_room covering all priority paths
Co-Authored-By: Claude Opus 4.6 <[email protected]>
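
The concurrent path described above (parallel preparation, serialized writes in batches) can be sketched like this. It is a simplified illustration, not the PR's `mine()`: `prepare_file` stands in for `_prepare_file()`, `write_batch` is a hypothetical callback that would wrap the ChromaDB write, and the per-future `try/except` reflects the review feedback that one failing file should not abort the run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import os


def prepare_file(path: str) -> list:
    """Hypothetical stand-in for _prepare_file(): read and chunk, no DB access."""
    if path.endswith(".bad"):
        raise ValueError(f"cannot read {path}")
    return [f"{path}:chunk0", f"{path}:chunk1"]


def mine_concurrently(paths, write_batch, workers=None, batch_size=100):
    """Prepare files in parallel; perform writes serially on the calling thread."""
    workers = workers or min(8, os.cpu_count() or 1)
    failed, pending = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(prepare_file, p): p for p in paths}
        for future in as_completed(futures):
            try:
                pending.extend(future.result())
            except Exception:
                # One bad file must not abort the whole run.
                failed.append(futures[future])
                continue
            # Flush full batches as results arrive to keep memory bounded.
            while len(pending) >= batch_size:
                write_batch(pending[:batch_size])
                pending = pending[batch_size:]
    if pending:
        write_batch(pending)
    return failed
```

Keeping every write on the calling thread means the database client is never shared across worker threads; only the read and chunk steps run in parallel.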

New exporter.py: paginates all drawers, groups by wing/room, writes browsable markdown tree with index.md table of contents. Each drawer becomes a blockquoted section with metadata table. Usage: mempalace export -o ./palace-export Also fixes test_cli.py for new --workers arg on mine subparser. Co-Authored-By: Claude Opus 4.6 <[email protected]>

… cache
I7: Three new MCP tools: get_drawer, list_drawers (paginated), update_drawer (with WAL audit logging and input sanitization).
I8: WAL file chmod(0o600) now only runs on file creation instead of every write call.
I9: 5-second TTL metadata cache for status/wings/taxonomy tools. Eliminates redundant full-palace pagination when tools are called in quick succession.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
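
A short sketch of a 5-second TTL cache like the one described in I9. The class name `TTLCache` and the injectable `clock` parameter are assumptions made so the example can be tested deterministically; the `invalidate()` method mirrors the idea of dropping cached metadata after a write.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one computed value for a short time-to-live."""

    def __init__(self, ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self, compute: Callable[[], Any]) -> Any:
        """Return the cached value, recomputing it once the TTL has elapsed."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = compute()
            self._expires_at = now + self.ttl
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value, e.g. after a drawer update."""
        self._expires_at = None
```

Using a monotonic clock avoids spurious expiry or staleness when the system wall clock is adjusted.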

I6: Chunk size/overlap/min now configurable via ~/.mempalace/config.json instead of hardcoded constants. Wired through mine() → process_file() → chunk_text().
I11: Layer1.generate() capped at MAX_SCAN=2000 drawers (was unbounded). Reduces wake-up from 250+ ChromaDB round-trips to 4 max.
I12: Extracted _build_where_filter() helper in searcher.py, replaced 5 duplicate where-filter blocks across searcher.py and layers.py.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
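
To make the configurable chunking in I6 concrete, here is a minimal character-window sketch. It is not the project's `chunk_text()`: the default values (800/100/50) are placeholders chosen for the example, and the real function may split on different boundaries. The overlap validation follows the rule stated elsewhere in this PR (overlap must be >= 0 and < chunk_size).

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 100,
               min_chunk_size: int = 50) -> list:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and < chunk_size")

    text = text.strip()
    step = chunk_size - chunk_overlap  # always >= 1 after validation
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        # Skip fragments below the minimum, but never drop a short document.
        if piece and (len(piece) >= min_chunk_size or not chunks):
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Rejecting `chunk_overlap >= chunk_size` up front matters because the window would otherwise never advance.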

I10: 10 unit tests for chunk_text() covering boundaries, overlap, indices, empty/whitespace input, content preservation.
I13: KG query_entity default direction aligned from "outgoing" to "both" to match the MCP schema default.
I14: Plugin versions synced to 3.1.0 in both .claude-plugin/ and .codex-plugin/.
Co-Authored-By: Claude Opus 4.6 <[email protected]>

Address web3guru888's review feedback across PRs MemPalace#492 and MemPalace#493:
- palace.py: remove unused filepaths param from bulk_check_mined(), replace bare except with logger.warning for partial fetch visibility
- miner.py: wrap future.result() in try/except so one file failure doesn't abort the entire concurrent mining run
- exporter.py: stream drawers in batches instead of loading entire palace into memory; keeps memory bounded for large palaces
- searcher.py: document min_similarity as L2 distance (not cosine) with typical range guidance in docstring
Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Rename _build_where_filter → build_where_filter (public cross-module API)
- Add float() cast + TypeError/ValueError handling in _is_already_mined
- Add chunk_overlap validation (must be >= 0 and < chunk_size)
- Batch convo_miner adds to 100 docs per call (avoid SQLite limits)
- Stream miner writes as futures complete (bounded memory)
- Remove unused palace_path in hooks_cli
- Remove unused chromadb import in test_exporter
- Sanitize wing/room as path components in exporter (prevent traversal)
- Filter on raw distance before rounding in searcher
- Clamp negative offset in tool_list_drawers
- No-op early return + cache invalidation in tool_update_drawer
- Add min/max schema bounds for search limit and list_drawers limit/offset
- Update CLAUDE.md test count (534 → 562)
- Improve chunk coverage test with position-unique tokens
Co-Authored-By: Claude Opus 4.6 <[email protected]>

Stop and precompact hooks used bare `python3` which resolves to system Python in sessions outside the memorypalace project directory, causing `No module named mempalace` errors. Now uses the venv's Python with fallback to system python3. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Replace hardcoded venv path with a resolution chain:
1. MEMPALACE_PYTHON env var (user override)
2. Plugin root's own venv (development installs)
3. System python3 (pip/pipx installs)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
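
The three-step resolution chain above can be sketched in Python as follows. The hook scripts themselves may be shell; this is only an illustration of the ordering. The function name `resolve_python` and the `.venv` directory name are assumptions, while `MEMPALACE_PYTHON` is the variable named in the commit message.

```python
import os
import shutil
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_python(plugin_root: Path, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the interpreter for hook subprocesses, most specific source first."""
    env = os.environ if env is None else env

    # 1. Explicit user override.
    override = env.get("MEMPALACE_PYTHON")
    if override:
        return override

    # 2. The plugin root's own venv (POSIX and Windows layouts; assumed paths).
    for candidate in (plugin_root / ".venv" / "bin" / "python",
                      plugin_root / ".venv" / "Scripts" / "python.exe"):
        if candidate.is_file():
            return str(candidate)

    # 3. System interpreter, falling back to the one running this code.
    return shutil.which("python3") or sys.executable
```

Putting the environment variable first gives users an escape hatch when neither the venv nor the system interpreter has the package installed.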

Covers basic CRUD, filtering, pagination, negative offset clamping, not-found errors, and no-op update detection. Addresses review comment on PR MemPalace#493 requesting coverage for the new drawer tools. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Single source of truth for the limit ceiling (100) so operators can adjust without hunting through multiple clamp sites. Co-Authored-By: Claude Opus 4.6 <[email protected]>
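
A minimal sketch of a shared clamp built on such a constant. `_MAX_RESULTS` is the name from the commit title; the helper `clamp_limit` and its default of 5 are assumptions made for the example.

```python
_MAX_RESULTS = 100  # single ceiling shared by every clamp site


def clamp_limit(value, default: int = 5) -> int:
    """Coerce a client-supplied limit into the range [1, _MAX_RESULTS]."""
    try:
        limit = int(value)
    except (TypeError, ValueError):
        limit = default
    return max(1, min(limit, _MAX_RESULTS))
```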

Co-Authored-By: Claude Opus 4.6 <[email protected]>

…king MCP Stop hook no longer blocks Claude with MCP tool call instructions every 15 messages. Instead it saves a diary checkpoint directly via the Python API and shows a single-line terminal notification + desktop toast. Fixes MemPalace#554 Co-Authored-By: Claude Opus 4.6 <[email protected]>

Add hooks.silent_save and hooks.desktop_toast to config.json, readable via new mempalace_hook_settings MCP tool (get/set). Stop hook checks config to decide between silent direct save vs legacy blocking MCP. Restore STOP_BLOCK_REASON for legacy mode. Toast is opt-in via config. Co-Authored-By: Claude Opus 4.6 <[email protected]>

stderr from hook subprocesses doesn't reach the Claude Code terminal. Block with a one-liner notification after the direct save completes — save already happened, Claude just continues. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Claude Code shows all hook blocks as "Stop hook error:" with no info level available. Return {} for truly invisible saves. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Hook saves directly, then blocks asking Claude to call mempalace_checkpoint_ack — a zero-param tool returning one line like "✦ Journal entry filed — 30 messages tucked into drawers". Replaces both the verbose MCP diary/drawer calls and the invisible silent mode with a single clean terminal line. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Claude Code labels all hook blocks as "Stop hook error:" with no way to customize. Go fully silent instead — save happens invisibly. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Stop hook now outputs {"systemMessage": "✦ N messages filed away"} which Claude Code renders as a visible one-line terminal notification — no MCP tool call needed. Also renames checkpoint_ack → memories_filed_away and fixes MCP server to silently ignore all notifications/ methods per spec. Co-Authored-By: Claude Opus 4.6 <[email protected]>

Copilot review caught that hook_stop() updated the last-save marker before _save_diary_direct() ran. If save failed, the marker would still advance and skip the next checkpoint. Move marker write after save confirms success. Also updates CLAUDE.md test count and hook docs. Co-Authored-By: Claude Opus 4.6 <[email protected]>
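
The ordering fix described above can be illustrated with a small sketch. The function name `checkpoint` and the marker file format (a plain message count) are assumptions; the point is only that the marker is written after the save reports success, never before.

```python
from pathlib import Path
from typing import Callable


def checkpoint(marker: Path, message_count: int, save: Callable[[], bool]) -> bool:
    """Run save() first; advance the last-save marker only if it succeeded."""
    try:
        ok = save()
    except Exception:
        ok = False
    if ok:
        marker.write_text(str(message_count), encoding="utf-8")
    return ok
```

If the save fails, the marker keeps its old value, so the next hook invocation retries the checkpoint instead of skipping it.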

Stop hook now extracts topic keywords from recent messages and displays them in the notification: "✦ 10 memories woven into the palace — hooks, notifications, MCP". Stopword filtering keeps only distinctive terms. Co-Authored-By: Claude Opus 4.6 <[email protected]>
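
A minimal sketch of the word-frequency approach described above. The stopword list here is a tiny placeholder and `extract_themes` is a hypothetical name; the real hook's tokenization and stopword set are not shown in this PR text.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it",
             "for", "on", "with", "that", "this", "we", "i", "you"}


def extract_themes(messages, top_n: int = 3, min_length: int = 3) -> list:
    """Return the most frequent non-stopword terms across the messages."""
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[a-z][a-z0-9_]*", message.lower()):
            if len(word) >= min_length and word not in STOPWORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]
```

The returned terms could then be joined with commas and appended to the notification line.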

- Rename min_similarity → max_distance (searcher + MCP schema), keep backwards compat alias in MCP tool handler
- Fix ingest comment accuracy (async/best-effort, not guaranteed)
- Add notification protocol tests (all notifications/* return None, unknown methods without id return None)
- 578 tests passing
Co-Authored-By: Claude Opus 4.6 <[email protected]>
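
The rename with a backwards-compatible alias, together with filtering on the raw distance before rounding, can be sketched like this. Both function names are hypothetical, the default of 1.5 is the value quoted for the original min_similarity parameter, and the real handler may resolve the alias differently.

```python
def resolve_max_distance(args: dict, default: float = 1.5) -> float:
    """Prefer max_distance; accept the old min_similarity name as an alias."""
    if args.get("max_distance") is not None:
        return float(args["max_distance"])
    if args.get("min_similarity") is not None:  # legacy name, same meaning
        return float(args["min_similarity"])
    return default


def filter_hits(hits, max_distance: float, ndigits: int = 3) -> list:
    """Drop hits beyond the cutoff using raw distances, then round for display."""
    kept = [(doc, dist) for doc, dist in hits if dist <= max_distance]
    return [(doc, round(dist, ndigits)) for doc, dist in kept]
```

Filtering first matters at the boundary: a raw distance of 0.5004 would round to 0.5 and slip through a 0.5 cutoff if rounding happened before the comparison.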

Both save and precompact hooks now auto-mine the JSONL transcript directly into the palace before blocking the AI. This captures raw tool output (Bash results, search findings, build errors) that the AI would otherwise summarize away during its save cycle.
- Add MP_PYTHON auto-detection (MEMPAL_PYTHON env → repo venv → system)
- Add inline normalize → chunk → upsert pipeline to both hooks
- Skip file_already_mined: transcript grows, upsert is idempotent
- Update block reason messages to explicitly request verbatim tool output
- Precompact hook: parse transcript_path from input, fallback to session_id lookup
- Update README with two-layer capture docs and MEMPAL_PYTHON config
Co-Authored-By: Claude Opus 4.6 <[email protected]>

…b upgrade
- CLAUDE.md: test count 648, fork changes 8-13, PR MemPalace#562, hook descriptions
- README.md: hooks section reflects two-layer capture, chromadb >=1.5.4, normalize.py captures tool blocks, MEMPAL_PYTHON env var documented
- AGENTS.md: add normalize.py and hooks to key files section
- HOOKS_TUTORIAL.md: new sections for two-layer capture and configuration
- mempalace/README.md: normalize.py description updated, version.py added
- normalize.py: docstring updated for tool_use/tool_result capture
Co-Authored-By: Claude Opus 4.6 <[email protected]>

jphein · 2026-04-11T02:53:18Z

Two more commits:

17518c3 — feat: hooks auto-mine transcript for tool output capture

Both save and precompact hooks now auto-mine the JSONL transcript directly into the palace before blocking the AI
Two-layer capture: auto-mine (deterministic, runs normalize pipeline) + updated block reason (tells AI to save tool output verbatim)
MP_PYTHON auto-detection for venv (chromadb isn't on system python)
Precompact hook finds transcript via input JSON or session_id fallback
Closes the live-session gap from issue #590 ("Claude Code JSONL mining silently drops all tool output (49% content loss)")

bddc087 — docs: update all docs for tool output mining, hook auto-mine, chromadb upgrade

CLAUDE.md: test count 648, fork changes 8-13, PR #562 (feat: batch writes, concurrent mining, MCP tools, hooks, export, search improvements)
README.md: hooks section reflects two-layer capture, chromadb >=1.5.4, MEMPAL_PYTHON
AGENTS.md, HOOKS_TUTORIAL.md, mempalace/README.md, normalize.py docstring all updated

648 tests pass. This PR now addresses 18+ upstream issues with tool output capture both post-hoc and live.

web3guru888 · 2026-04-11T03:01:20Z

The two-layer capture approach (auto-mine + updated block reason) is the right architecture for this. A few observations on the implementation:

Auto-mine in hooks: Running normalize() → chunk_exchanges() → upsert() inline in both save and precompact hooks ensures tool output is captured regardless of what the AI does with the block reason. The idempotent upsert with deterministic IDs means no double-store even if the transcript grows incrementally during a session. Clean.

MEMPAL_PYTHON env var chain: The MEMPAL_PYTHON → repo venv → system python3 fallback is exactly right — this is the exact failure mode we hit early in our setup where chromadb was installed in the venv but the hook fired using system python. The auto-detection saves future integrators from that silent failure.

Precompact transcript fallback: The find by session_id fallback is a good safety net, though worth noting it will fail in multi-project setups where the same session_id appears in multiple project directories (unlikely but possible). A -maxdepth 3 or sorting by mtime would make the fallback more deterministic.
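
The suggestion above (limit the search depth and prefer the newest match) could look like the following sketch. It assumes transcripts are stored as `<session_id>.jsonl` somewhere under a root directory; that file naming, the function name `find_transcript`, and the depth default are all assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def find_transcript(root: Path, session_id: str, max_depth: int = 3) -> Optional[Path]:
    """Return the most recently modified '<session_id>.jsonl' under root, if any."""
    matches = [
        p for p in root.rglob(f"{session_id}.jsonl")
        if p.is_file() and len(p.relative_to(root).parts) <= max_depth
    ]
    # Newest mtime wins, so a stale copy in another project is not chosen.
    return max(matches, key=lambda p: p.stat().st_mtime, default=None)
```

Sorting by mtime makes the result deterministic even when the same session_id appears in more than one project directory.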

This PR is in excellent shape — 648 tests, 18+ upstream issues addressed, the chromadb upgrade, and now live-session capture closed. Full LGTM. Ready for maintainer merge.

bensig · 2026-04-11T05:19:11Z

There's good work in here but 32 files covering batch writes, MCP tools, hooks, export, and search improvements is too much for one PR. Could you split this into separate PRs by feature? We'd merge the individual pieces much faster.

web3guru888 · 2026-04-11T05:47:33Z

@jphein — having followed this PR closely I can suggest a logical split that maps to the natural seams in the codebase:

PR 1 — Bug fixes (mergeable today, low risk)
The standalone bug fixes with no cross-dependencies: cosine distance default (#568), emotion regex fix (#536), Unicode checkmark (#535), KG path mismatch (#538), MCP arg filtering (#572), compress dict keys (#569), spellcheck registry (#570), skill docs (#534). These are all single-file patches with existing test coverage. Maintainers should be able to review each in isolation quickly.

PR 2 — Performance (batch writes + concurrent mining)
Batch ChromaDB writes, ThreadPoolExecutor mining, bulk mtime pre-fetch. These three features are tightly coupled — they interact around the batching contract — so they belong together. The epsilon mtime comparison and entity detector STOPWORDS also fit here since they directly affect what gets batched.

PR 3 — Reliability / MCP protocol
MCP notification handling, method guard, save marker race fix, MCP stale cache detection, WAL rotation, corrupt checkpoint cleanup, venv python resolution. These all touch the server reliability surface without adding new tools.

PR 4 — Palace maintenance
Junk file filter, mempalace purge, status 10K cap fix, repair command fix. These are user-facing maintenance features that stand alone.

PR 5 — Hooks
Silent stop hook, precompact hook, configurable hook settings, theme extraction. These are all connected through the hook lifecycle so they ship best as one unit.

PR 6 — New MCP tools + export
mempalace export, get/list/update drawer tools, memories_filed_away, hook_settings MCP tool, max_distance rename. These are additive feature additions.

Splitting this way, PRs 1–3 could likely land in the next release cycle. PR 1 especially — those bug fixes close 8 upstream issues and have minimal review surface.

jphein · 2026-04-11T13:53:31Z

@bensig — Fair point, will split. @web3guru888's breakdown is solid and maps to the natural seams.

I'll close this mega-PR and open the individual ones. Starting with PR 1 (bug fixes) since those are the lowest risk and close 8 issues immediately. Will reference this PR in each for context.

Thanks for the review time on this — the feedback from both of you made the code better.

Formatted · 2026-04-11T13:56:21Z

On the split — worth noting that the status 10K cap fix (proposed PR 4) overlaps with #482, which is already open and addresses it via a shared iter_metadatas(col, where=None, batch=500) generator in palace.py. That PR also consolidates the inline loops in mcp_server.py and palace_graph.py onto the same helper. Might be worth checking whether the pagination piece from #562 can defer to #482 rather than duplicating it in a new split PR.
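
For readers unfamiliar with the shared-generator approach mentioned above, here is a rough sketch of what a helper with the signature `iter_metadatas(col, where=None, batch=500)` might look like. This is not the code from #482; it only assumes that the collection's `get()` accepts `where`, `include`, `limit` and `offset`, as ChromaDB collections do. `FakeCollection` is a stand-in used purely for demonstration.

```python
from typing import Iterator, Optional


def iter_metadatas(col, where: Optional[dict] = None, batch: int = 500) -> Iterator[dict]:
    """Yield every metadata dict in the collection, one page at a time."""
    offset = 0
    while True:
        page = col.get(where=where, include=["metadatas"], limit=batch, offset=offset)
        metadatas = page.get("metadatas") or []
        if not metadatas:
            return
        yield from metadatas
        if len(metadatas) < batch:
            return  # short page means the collection is exhausted
        offset += len(metadatas)


class FakeCollection:
    """Minimal stand-in that mimics paginated get() for demonstration."""

    def __init__(self, metadatas):
        self._metadatas = list(metadatas)

    def get(self, where=None, include=None, limit=None, offset=0):
        return {"metadatas": self._metadatas[offset:offset + limit]}
```

Because it is a generator, callers such as status or taxonomy tools can aggregate counts without holding the whole palace in memory.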

…ORDS
Split from MemPalace#562 per maintainer request.
- palace.py: epsilon mtime comparison for float dedup, cosine distance space, bulk_check_mined() pre-fetch to eliminate N+1 queries
- miner.py: batch ChromaDB upserts (one per file instead of per chunk), ThreadPoolExecutor concurrent mining with --workers flag, _prepare_file() helper for thread-safe read/chunk/route, junk file filter (SKIP_PATTERNS + JUNK_FILE_SIZE), word-boundary keyword matching in detect_room(), configurable chunk params, paginated status() past 10K drawers
- convo_miner.py: batch upserts in groups of 100 instead of per-chunk writes
- entity_detector.py: 73 technical STOPWORDS (Handler, Node, Service, etc.) to reduce false-positive entity detection
- cli.py: --workers argument for parallel mining

jphein · 2026-04-11T14:17:56Z

@web3guru888 Good catch on the #482 overlap. I've reviewed that PR — its iter_metadatas() generator in palace.py is the right long-term approach (one shared helper for all pagination consumers). Our _fetch_all_metadata() in mcp_server.py solves the same problem but is scoped to just the MCP server.

Plan for the split: PR 4 (palace maintenance) will not include the status 10K pagination fix — we'll defer that piece to #482's iter_metadatas() landing. PR 4 will focus on: repair nuke-rebuild, purge command, --version flag, and chromadb pin bump. The pagination improvement will come naturally when #482 merges.

Also just merged upstream/main into our fork — query sanitizer (#385) and pagination (#371) are now integrated. Starting the split PRs now.

- Batch upserts (100 docs/call) in both general miner and convo_miner instead of one-at-a-time writes, 3-5x faster on large projects
- _prepare_file() separates read/chunk (thread-safe) from write, enabling ThreadPoolExecutor concurrency with --workers flag
- bulk_check_mined() pre-fetches all source_file/mtime pairs in one paginated scan instead of per-file ChromaDB queries
- Epsilon mtime comparison (abs < 0.01) instead of float == (MemPalace#483)
- Cosine distance default for new collections (hnsw:space=cosine)
- SKIP_PATTERNS/JUNK_FILE_SIZE filter minified JS, lockfiles, large dumps
- detect_room() uses word-boundary regex and exact-match priority
- chunk_text() accepts configurable size/overlap/min params
- Entity detector STOPWORDS expanded (73 technical terms)
- Configurable chunk_size/chunk_overlap/min_chunk_size in palace config
Split from MemPalace#562 per maintainer request. Closes MemPalace#483, MemPalace#568.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
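
The word-boundary matching mentioned for detect_room() can be illustrated with a short sketch. `keyword_score` is a hypothetical helper, not the project's function; it shows why a `\b`-anchored regex avoids the false hits that a plain substring count produces.

```python
import re


def keyword_score(text: str, keywords) -> int:
    """Count whole-word, case-insensitive keyword occurrences in text."""
    score = 0
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        score += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return score
```

For example, in "The api gateway is rapid" a substring count finds "api" twice (once inside "rapid"), while the word-boundary version counts it once.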

jphein · 2026-04-11T20:55:18Z

Update on chromadb 0.6.3 → 1.5.x migration: The migration is not seamless — there's a latent data corruption bug.

Bug

ChromaDB 0.6.x stored seq_id values as big-endian 8-byte BLOBs in the embeddings and max_seq_id tables. ChromaDB 1.5.x expects these to be INTEGER. The auto-migration does not convert existing BLOB seq_id values — it only writes new rows as INTEGER, leaving old rows as BLOBs.

The palace appears to work initially (reads from WAL/cache succeed), but the Rust compactor eventually hits:

InternalError: Error executing plan: Error sending backfill request to compactor:
Error reading from metadata segment reader: error occurred while decoding column 0:
mismatched types; Rust type `u64` (as SQL type `INTEGER`) is not compatible with SQL type `BLOB`

At that point, all ChromaDB API calls fail — count(), get(), query() — the palace is bricked until the BLOBs are fixed.

Impact

Any palace created with ChromaDB 0.6.x and then opened with 1.5.x will hit this. Our 50K palace had 50,796 BLOB seq_id rows in embeddings and 1 in max_seq_id. The initial "648 tests pass" was misleading — tests use fresh databases, never migrated ones.

Fix

Direct SQLite conversion of BLOB seq_ids to INTEGER:

import sqlite3

conn = sqlite3.connect("path/to/chroma.sqlite3")
try:
    # Both tables stored seq_id as a big-endian 8-byte BLOB under ChromaDB 0.6.x
    for table in ("embeddings", "max_seq_id"):
        rows = conn.execute(
            f"SELECT rowid, seq_id FROM {table} WHERE typeof(seq_id) = 'blob'"
        ).fetchall()
        # Decode each BLOB into the INTEGER that ChromaDB 1.5.x expects
        updates = [(int.from_bytes(blob, byteorder="big"), rowid) for rowid, blob in rows]
        conn.executemany(f"UPDATE {table} SET seq_id = ? WHERE rowid = ?", updates)
    # Commit once so both tables are converted together
    conn.commit()
finally:
    conn.close()

Should this be a mempalace repair step? cc @web3guru888

jphein and others added 30 commits April 9, 2026 19:15

fix: add 73 technical terms to entity detector STOPWORDS

86eadc7

Prevents false positives like Handler, Node, Service, Manager, Client being flagged as project/person entities in code-heavy directories. Fixes MemPalace#476 Co-Authored-By: Claude Opus 4.6 <[email protected]>

docs: add CLAUDE.md for project context

fccb705

Documents fork relationship, key files, development workflow, fork changes, upstream PRs, and integration details. Co-Authored-By: Claude Opus 4.6 <[email protected]>

fix: portable Python resolution in hook scripts

d16ac4b

Replace hardcoded venv path with a resolution chain:
1. MEMPALACE_PYTHON env var (user override)
2. Plugin root's own venv (development installs)
3. System python3 (pip/pipx installs)
Co-Authored-By: Claude Opus 4.6 <[email protected]>

docs: update expected test count to 573

d8c423b

Co-Authored-By: Claude Opus 4.6 <[email protected]>

refactor: extract _MAX_RESULTS constant for search/list limit cap

6dc8891

Single source of truth for the limit ceiling (100) so operators can adjust without hunting through multiple clamp sites. Co-Authored-By: Claude Opus 4.6 <[email protected]>

docs: update expected test count to 573

2da4a65

Co-Authored-By: Claude Opus 4.6 <[email protected]>

fix: make silent stop hook fully silent — no block, no error label

5ebfcd4

Claude Code shows all hook blocks as "Stop hook error:" with no info level available. Return {} for truly invisible saves. Co-Authored-By: Claude Opus 4.6 <[email protected]>

fix: fully silent stop hook — no block, no error label

995aed3

Claude Code labels all hook blocks as "Stop hook error:" with no way to customize. Go fully silent instead — save happens invisibly. Co-Authored-By: Claude Opus 4.6 <[email protected]>

jphein and others added 2 commits April 10, 2026 19:43

bensig closed this Apr 11, 2026

bensig mentioned this pull request Apr 11, 2026

fix: batch ChromaDB upserts and parallelize file processing for Windo… #298

Closed

3 tasks

web3guru888 mentioned this pull request Apr 11, 2026

feat: add mine_epoch tracking and revision snapshots #607

Open

3 tasks

This was referenced Apr 11, 2026

fix: use epsilon comparison for mtime dedup + add bulk pre-fetch #483

Closed

fix: cap search limit, paginate status tools, remove duplicate cache decls #484

Closed

This was referenced Apr 11, 2026

fix: standalone bug fixes (#534 #535 #536 #538 #570 #572 #637) #626

Closed

feat: palace maintenance — junk filter, purge, status pagination, repair (#478 #586 #587 #581) #627

Closed

jphein mentioned this pull request Apr 11, 2026

perf: batch writes, bulk mtime dedup, concurrent mining, entity STOPWORDS #628

Closed

6 tasks

jphein mentioned this pull request Apr 11, 2026

perf: batch writes, concurrent mining, bulk mtime pre-fetch #629

Closed

4 tasks

Formatted mentioned this pull request Apr 11, 2026

refactor: consolidate pagination into shared iter_metadatas() (#171) #482

Closed

3 tasks

z3tz3r0 mentioned this pull request Apr 12, 2026

Add lock/kill-switch guidance for autosave and MCP hook wrappers #504

Open

jphein wants to merge 51 commits into MemPalace:main from jphein:main

5 participants
