fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR#1231
fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR#1231
Conversation
The Stop and PreCompact hooks spawn `mempalace mine <dir>` with no `--mode` flag, which defaults to `projects` in cli.py. When MEMPAL_DIR is unset, _get_mine_dir falls back to the parent of the transcript JSONL — and miner.py's READABLE_EXTENSIONS includes `.jsonl`, so the projects miner happily ingests Claude Code session JSONL as if it were source code instead of conversation. Make _get_mine_dir return (dir, mode): MEMPAL_DIR keeps `projects`, the JSONL fallback yields `convos`. Both _maybe_auto_ingest and _mine_sync now thread the mode into the spawned command.
- Normalize MEMPAL_DIR via Path.expanduser().resolve() so ~/proj paths are correctly accepted instead of falling through to transcript fallback - Replace bare Path.expanduser().is_file() transcript check with the existing _validate_transcript_path() which adds .resolve(), enforces .jsonl/.json extension, and rejects '..' path-traversal components - Update tests to compare resolved paths (cross-platform correctness) - Add tests for tilde expansion, path-traversal rejection, and non-jsonl extension rejection in _get_mine_dir Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/f69176c7-d752-40ef-ba71-d0e4adc3a689 Co-authored-by: igorls <[email protected]>
…MEMPAL_DIR #1230 fixed --mode convos for the case where MEMPAL_DIR was unset, but left two configurations broken: - MEMPAL_DIR set to a project dir: convos never mined (MEMPAL_DIR overrode the transcript path); only project files were ingested. - MEMPAL_DIR set to a conversations dir per the old hooks/README: the projects miner ran on JSONL — same wrong-miner behaviour. The shell hooks (mempal_save_hook.sh, mempal_precompact_hook.sh) had the same MEMPAL_DIR-overrides-transcript bug AND were missing --mode on every spawned `mempalace mine` call. Make the auto-ingest *additive*. _get_mine_dir → _get_mine_targets, returning a list of (dir, mode) pairs: - MEMPAL_DIR (when valid) contributes (dir, "projects") - A valid transcript JSONL contributes (parent, "convos") - Both can appear together; the hook spawns one ingest per target Same change applied to the shell save and precompact hooks. Precompact also gained transcript_path parsing so it can run the convos mine synchronously before context is compressed. hooks/README.md updated to describe MEMPAL_DIR as a project-files target, never a convos target.
There was a problem hiding this comment.
Pull request overview
This PR fixes MemPalace Claude Code/Codex hooks so the active transcript is always mined as conversations (--mode convos) while MEMPAL_DIR (when set) becomes an additional project-files mining target (--mode projects), rather than overriding transcript ingestion.
Changes:
- Refactors Python hook auto-ingest targeting from a single directory to a list of
(dir, mode)targets and spawns one mine per target. - Updates shell save/precompact hooks to mine transcript parent dir with
--mode convosand mineMEMPAL_DIRwith--mode projects. - Updates tests and hook documentation to reflect the additive targeting semantics.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
mempalace/hooks_cli.py |
Switches auto-ingest to multi-target (dir, mode) and spawns mines per target. |
hooks/mempal_save_hook.sh |
Mines transcript dir as convos and MEMPAL_DIR as projects (both optional/additive). |
hooks/mempal_precompact_hook.sh |
Adds transcript_path parsing and mines transcript synchronously as convos before compaction. |
hooks/README.md |
Clarifies MEMPAL_DIR is for project files only; transcript mining is automatic. |
tests/test_hooks_cli.py |
Adds/updates tests for multi-target spawning and mode correctness. |
tests/test_save_hook_mines.py |
Updates assertions to require transcript-driven --mode convos mining in the save hook. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if mempal_dir: | ||
| resolved = Path(mempal_dir).expanduser().resolve() | ||
| if resolved.is_dir(): | ||
| targets.append((str(resolved), "projects")) | ||
| path = _validate_transcript_path(transcript_path) | ||
| if path is not None and path.is_file(): | ||
| targets.append((str(path.parent), "convos")) | ||
| return targets |
There was a problem hiding this comment.
_get_mine_targets() adds a convos target for the transcript directory, but hook_stop()/hook_precompact() already call _ingest_transcript(transcript_path), which spawns a convos mine as well (with a different --wing). This can result in duplicate conversation ingestion (and potentially into different wings), plus extra load on every save/precompact. Consider consolidating to a single transcript-ingest path (either remove the transcript-derived target here and leave transcript mining to _ingest_transcript, or remove/adjust _ingest_transcript and ensure the spawned mine uses the intended --wing consistently).
| def _maybe_auto_ingest(transcript_path: str = ""): | ||
| """Run mempalace mine in background if a mine directory is available.""" | ||
| mine_dir = _get_mine_dir(transcript_path) | ||
| if not mine_dir: | ||
| """Run mempalace mine in background for every available target.""" | ||
| targets = _get_mine_targets(transcript_path) | ||
| if not targets: | ||
| return | ||
| if _mine_already_running(): | ||
| _log("Skipping auto-ingest: mine already running") | ||
| return | ||
| try: | ||
| _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir]) | ||
| except OSError: | ||
| pass | ||
| for mine_dir, mode in targets: | ||
| try: | ||
| _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir, "--mode", mode]) | ||
| except OSError: | ||
| pass |
There was a problem hiding this comment.
_maybe_auto_ingest() now spawns multiple background mempalace mine processes when both targets are present, but _spawn_mine() writes a single PID to _MINE_PID_FILE each time. The second spawn overwrites the first PID, so _mine_already_running() can return false while the first mine is still running, allowing overlapping mines on later hook fires. Consider tracking multiple PIDs (e.g., one PID file per target/mode) or switching to a filesystem lock (flock/portalocker) rather than a single-PID sentinel when spawning more than one process.
| for mine_dir, mode in targets: | ||
| try: | ||
| _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir, "--mode", mode]) | ||
| except OSError: |
There was a problem hiding this comment.
The subprocesses here are spawned via sys.executable, but the module-level _mempalace_python() docstring explicitly calls out that sys.executable may be a system Python without chromadb when invoked by hooks. To avoid background/sync mines failing silently in those environments, consider using _mempalace_python() for the interpreter in these spawned mempalace mine commands.
| def _mine_sync(transcript_path: str = ""): | ||
| """Run mempalace mine synchronously (for precompact -- data must land first).""" | ||
| mine_dir = _get_mine_dir(transcript_path) | ||
| if not mine_dir: | ||
| """Run mempalace mine synchronously for every target (precompact).""" | ||
| targets = _get_mine_targets(transcript_path) | ||
| if not targets: | ||
| return | ||
| try: | ||
| STATE_DIR.mkdir(parents=True, exist_ok=True) | ||
| log_path = STATE_DIR / "hook.log" | ||
| with open(log_path, "a") as log_f: | ||
| subprocess.run( | ||
| [sys.executable, "-m", "mempalace", "mine", mine_dir], | ||
| stdout=log_f, | ||
| stderr=log_f, | ||
| timeout=60, | ||
| ) | ||
| except (OSError, subprocess.TimeoutExpired): | ||
| pass | ||
| STATE_DIR.mkdir(parents=True, exist_ok=True) | ||
| log_path = STATE_DIR / "hook.log" | ||
| for mine_dir, mode in targets: | ||
| try: | ||
| with open(log_path, "a") as log_f: | ||
| subprocess.run( | ||
| [ | ||
| sys.executable, | ||
| "-m", | ||
| "mempalace", | ||
| "mine", | ||
| mine_dir, | ||
| "--mode", | ||
| mode, | ||
| ], | ||
| stdout=log_f, | ||
| stderr=log_f, | ||
| timeout=60, | ||
| ) | ||
| except (OSError, subprocess.TimeoutExpired): |
There was a problem hiding this comment.
_mine_sync() now potentially runs two separate mempalace mine commands sequentially (projects + convos), each with timeout=60. In the “both set” case this can push total hook runtime well past common harness hook timeouts (e.g., 30s), and if it times out mid-way the more important transcript ingest might not run. Consider prioritizing the transcript/convos target first (or running the projects ingest async / with a smaller timeout) so precompact reliably captures the active transcript before compaction proceeds.
| def test_get_mine_targets_transcript_path_traversal_rejected(tmp_path): | ||
| """Transcript paths with '..' components are rejected (no convos target).""" | ||
| with patch.dict("os.environ", {}, clear=True): | ||
| targets = _get_mine_targets("../../etc/passwd") |
There was a problem hiding this comment.
This test claims to verify '..' path traversal rejection, but the provided input ("../../etc/passwd") lacks a .json/.jsonl suffix, so it is rejected by the extension check instead and doesn't actually exercise the traversal guard. To test the traversal behavior specifically, use a .jsonl/.json path that includes '..' components (e.g. "../t.jsonl") and assert that it yields no convos target.
| targets = _get_mine_targets("../../etc/passwd") | |
| targets = _get_mine_targets("../t.jsonl") |
| # interpolating into shell — same contract as mempal_save_hook.sh. | ||
| eval $(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c " | ||
| import sys, json, re | ||
| data = json.load(sys.stdin) | ||
| sid = data.get('session_id', 'unknown') | ||
| tp = data.get('transcript_path', '') | ||
| safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s)) | ||
| print(f'SESSION_ID=\"{safe(sid)}\"') | ||
| print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"') | ||
| " 2>/dev/null) |
There was a problem hiding this comment.
Using eval $(...) here increases the blast radius of any parsing/sanitization mistake (it executes code, not just data assignment). Since you only need two scalar values, prefer a non-eval approach (e.g., have Python print NUL/line-delimited values and read them into variables) so the shell never executes generated code.
| # interpolating into shell — same contract as mempal_save_hook.sh. | |
| eval $(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c " | |
| import sys, json, re | |
| data = json.load(sys.stdin) | |
| sid = data.get('session_id', 'unknown') | |
| tp = data.get('transcript_path', '') | |
| safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s)) | |
| print(f'SESSION_ID=\"{safe(sid)}\"') | |
| print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"') | |
| " 2>/dev/null) | |
| # assigning in shell — same contract as mempal_save_hook.sh. | |
| mapfile -t _mempal_parsed < <(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c " | |
| import sys, json, re | |
| data = json.load(sys.stdin) | |
| sid = data.get('session_id', 'unknown') | |
| tp = data.get('transcript_path', '') | |
| safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s)) | |
| print(safe(sid)) | |
| print(safe(tp)) | |
| " 2>/dev/null) | |
| SESSION_ID="${_mempal_parsed[0]:-unknown}" | |
| TRANSCRIPT_PATH="${_mempal_parsed[1]:-}" |
| # Run ingest synchronously so memories land before compaction. Two | ||
| # independent targets — both run if both are set: | ||
| # 1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos | ||
| # 2. MEMPAL_DIR → --mode projects | ||
| if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then |
There was a problem hiding this comment.
Before mining dirname "$TRANSCRIPT_PATH", consider validating that TRANSCRIPT_PATH is actually a transcript file (e.g., require .jsonl/.json) and reject traversal segments like ... The Python hook’s _validate_transcript_path enforces these constraints, but the shell hook currently mines any existing file path the harness provides.
| # Run ingest synchronously so memories land before compaction. Two | |
| # independent targets — both run if both are set: | |
| # 1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos | |
| # 2. MEMPAL_DIR → --mode projects | |
| if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then | |
| is_valid_transcript_path() { | |
| local path="$1" | |
| [ -n "$path" ] || return 1 | |
| case "$path" in | |
| *.json|*.jsonl) ;; | |
| *) return 1 ;; | |
| esac | |
| case "/$path/" in | |
| */../*|*/./*) return 1 ;; | |
| esac | |
| return 0 | |
| } | |
| # Run ingest synchronously so memories land before compaction. Two | |
| # independent targets — both run if both are set: | |
| # 1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos | |
| # 2. MEMPAL_DIR → --mode projects | |
| if is_valid_transcript_path "$TRANSCRIPT_PATH" && [ -f "$TRANSCRIPT_PATH" ]; then |
| mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \ | ||
| >> "$STATE_DIR/hook.log" 2>&1 & |
There was a problem hiding this comment.
The transcript-derived mine target only checks -f "$TRANSCRIPT_PATH" before running, so any existing file path (including ones with .. segments) would be accepted and its parent directory mined. Consider adding lightweight validation (e.g., require .jsonl/.json and reject .. path components) before invoking mempalace mine "$(dirname ...)" --mode convos, to align with the Python hook’s transcript_path validation.
| mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \ | |
| >> "$STATE_DIR/hook.log" 2>&1 & | |
| case "$TRANSCRIPT_PATH" in | |
| *.json|*.jsonl) | |
| case "$TRANSCRIPT_PATH" in | |
| ..|../*|*/../*|*/..) | |
| echo "[$(date '+%H:%M:%S')] Skipping invalid transcript path: $TRANSCRIPT_PATH" >> "$STATE_DIR/hook.log" | |
| ;; | |
| *) | |
| mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \ | |
| >> "$STATE_DIR/hook.log" 2>&1 & | |
| ;; | |
| esac | |
| ;; | |
| *) | |
| echo "[$(date '+%H:%M:%S')] Skipping non-transcript path: $TRANSCRIPT_PATH" >> "$STATE_DIR/hook.log" | |
| ;; | |
| esac |
…review) Address Copilot review on #1231: 1. Stop double-mining the transcript on the Python side. ``_get_mine_targets`` now returns only the ``MEMPAL_DIR`` projects target — the convos target for the transcript dir is dropped because ``_ingest_transcript`` already handles it on every hook fire. The duplicate spawn was using ``sys.executable`` (vs ``_mempalace_python()``) and a different ``--wing``, so each Stop/PreCompact event was writing the same transcript into two wings under asymmetric interpreters and overwriting the single ``_MINE_PID_FILE`` lock. 2. ``_maybe_auto_ingest`` and ``_mine_sync`` now spawn via ``_mempalace_python()`` so the resolved interpreter matches the venv that owns mempalace (matters under GUI-launched harnesses where ``sys.executable`` may resolve to a system Python without chromadb). 3. Replace ``eval $(...)`` in both shell hooks with a ``mapfile``-based reader. Sanitized values are still emitted by the same Python parser, but the shell now does plain variable assignment instead of executing the parser's stdout — smaller blast radius if the sanitizer is ever bypassed. 4. Mirror ``_validate_transcript_path`` in the shell hooks via a ``is_valid_transcript_path`` helper — extension + traversal-segment rejection, parity with the Python validator. The convos mine in each shell hook is now gated on the validator instead of bare ``-f``. 5. Tighten the ``..`` traversal test that previously exercised the suffix gate by mistake (``../../etc/passwd`` lacks ``.json[l]``). Use ``.jsonl`` paths with traversal segments to actually hit the ``..`` rejection branch. 6. README: add a one-liner pointing at ``mempalace sweep`` for users who want per-message recall on top of the file-level chunks the hooks produce. The sweeper was undiscoverable previously. Tests: 1418 passed, 1 skipped (full suite minus benchmarks).
|
Pushed
The shell-side additive structure (separate Tests: |
`bash` on the Windows CI runner resolves to `wsl.exe` which fails with "Windows Subsystem for Linux has no installed distributions." The shell hooks themselves are POSIX-only — Windows users use the Python entry point — so the bash-subprocess exercise is non-applicable on win32. The static-grep validator tests still run on every platform, so the shell-side validation is still asserted under Windows; only the live bash invocation is skipped.
Bumps every version source from 3.3.3 to 3.3.4: - pyproject.toml - mempalace/version.py (canonical) - .claude-plugin/plugin.json - .claude-plugin/marketplace.json - .codex-plugin/plugin.json - README.md badge Dates the CHANGELOG section and adds entries for the bug fixes that landed this cycle (#1135, #1191, #1230, #1231) plus expands the #1194 entry to credit the lookup-side recovery path from #1197. Pre-tag verification: - 1441 passed, 1 skipped (full suite minus benchmarks, all platforms) - ruff check + format clean - 44/44 in test_version_consistency + test_readme_claims (6-file sync) - JPH invariant: pyproject.toml + .claude-plugin/plugin.json both reference mempalace-mcp - Wheel build + fresh-venv install: mempalace --version reports 3.3.4, mempalace-mcp --help works (catches the v3.3.2-class regression)
Previously the save and precompact shell hooks only auto-ingested when
MEMPAL_DIR was set, and even then ran the project-files miner against
the JSONL transcript. Result: the most common configuration in the wild
(MEMPAL_DIR set to a project dir) silently skipped the conversation
mine — the hook would block, the AI would call mempalace_diary_write,
but the raw transcript chunks never reached the palace.
Make the auto-ingest additive instead:
- TRANSCRIPT_PATH (always present from Claude Code) → parent dir,
--mode convos. Always runs when the path passes validation.
- MEMPAL_DIR (user-configured) → --mode projects. Independent target,
runs in addition to (not in place of) the transcript mine.
Adds an `is_valid_transcript_path` helper to both hooks (extension
allowlist + traversal-segment rejection) so junk paths bail before
reaching dirname/the miner — defence-in-depth over the existing
``[ -f ... ]`` check. Also tightens precompact_hook to sanitize
SESSION_ID and read transcript_path the same way the save hook does.
Adapted from upstream MemPalace eb4de04 (PR MemPalace#1231 by @igorls) +
fe56797 (PR MemPalace#1231 review). The Python ``hooks_cli.py`` portion of
those commits is N/A here — this fork ships hooks via Docker Compose,
not the marketplace flow, and the eval-removal in the shell parser
also doesn't apply because the fork already uses three separate
``python3 -c`` calls instead of a single ``eval $(...)``. The
behaviour change (always-mine-transcript) and the validator are the
load-bearing portable parts.
Co-Authored-By: Igor Lins e Silva <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
22 commits from upstream/develop, including: - HNSW capacity divergence detection + BM25-only sqlite fallback when vector layer is unloadable (MemPalace#1222 / MemPalace#1227 — vector_disabled kwarg routes search through chromadb-bypass path on segfault risk). - HNSW index bloat prevention via batch_size + sync_threshold metadata on collection create (MemPalace#1191). - Hooks: always mine the active transcript as convos, additive to MEMPAL_DIR (MemPalace#1230 / MemPalace#1231). Restructured into `_get_mine_targets()` list approach + separate `_ingest_transcript()`. Daemon-strict gate preserved on each entry point. - Wing-name normalization for hyphenated dirs across miner / convo_miner / palace_graph (MemPalace#1194 / MemPalace#1195 / MemPalace#1197). - `narrow _fix_blob_seq_ids` shim + `repair --mode max-seq-id` for legacy 0.6.x BLOB-poisoned palaces (MemPalace#1135). Conflict resolution: - README.md: kept fork-shaped narrative, dropped upstream's sweep tip injection (per fork-readme handling memory). - hooks/README.md: adopted upstream's accurate `MEMPAL_DIR` additive description; kept `MEMPAL_PYTHON` env-var name (matches actual `hooks/*.sh` scripts in both forks). - mempalace/cli.py: consolidated duplicate `--mode` argparse declarations into one with all 4 choices (rebuild/legacy/reorganize/max-seq-id). - mempalace/hooks_cli.py: adopted upstream's `_get_mine_targets()` + `_ingest_transcript()` shape; added `_daemon_strict()` guard at entry of `_maybe_auto_ingest`, `_mine_sync`, and `_ingest_transcript` so the daemon-strict architecture still skips local writes. - mempalace/mcp_server.py: kept both `kind=` and `vector_disabled=` kwargs on the `search_memories` call. - mempalace/searcher.py: kept fork's `_count_in_scope`, `_sqlite_fallback_and_scope`, `_apply_kind_text_filter` AND upstream's `_bm25_only_via_sqlite`. Both `kind` and `vector_disabled` parameters on `search_memories`. Tests: 1510 passing (up from 1366 — upstream brought new test suites).
…alace#1231 review) Address Copilot review on MemPalace#1231: 1. Stop double-mining the transcript on the Python side. ``_get_mine_targets`` now returns only the ``MEMPAL_DIR`` projects target — the convos target for the transcript dir is dropped because ``_ingest_transcript`` already handles it on every hook fire. The duplicate spawn was using ``sys.executable`` (vs ``_mempalace_python()``) and a different ``--wing``, so each Stop/PreCompact event was writing the same transcript into two wings under asymmetric interpreters and overwriting the single ``_MINE_PID_FILE`` lock. 2. ``_maybe_auto_ingest`` and ``_mine_sync`` now spawn via ``_mempalace_python()`` so the resolved interpreter matches the venv that owns mempalace (matters under GUI-launched harnesses where ``sys.executable`` may resolve to a system Python without chromadb). 3. Replace ``eval $(...)`` in both shell hooks with a ``mapfile``-based reader. Sanitized values are still emitted by the same Python parser, but the shell now does plain variable assignment instead of executing the parser's stdout — smaller blast radius if the sanitizer is ever bypassed. 4. Mirror ``_validate_transcript_path`` in the shell hooks via a ``is_valid_transcript_path`` helper — extension + traversal-segment rejection, parity with the Python validator. The convos mine in each shell hook is now gated on the validator instead of bare ``-f``. 5. Tighten the ``..`` traversal test that previously exercised the suffix gate by mistake (``../../etc/passwd`` lacks ``.json[l]``). Use ``.jsonl`` paths with traversal segments to actually hit the ``..`` rejection branch. 6. README: add a one-liner pointing at ``mempalace sweep`` for users who want per-message recall on top of the file-level chunks the hooks produce. The sweeper was undiscoverable previously. Tests: 1418 passed, 1 skipped (full suite minus benchmarks).
Catches up xdev-patches with 112 commits from MemPalace/develop, including: - v3.3.4 release - MemPalace#1262/MemPalace#1289 ChromaDB collection-reopen crash fix (relevant to long-running MCP server & mempalace-api) - MemPalace#1287 HNSW divergence floor fix - MemPalace#1288 BLOB seq_id decode in repair - MemPalace#1180 cross-wing tunnels by shared topics - MemPalace#1194 wing-slug normalization for hyphenated dirs Conflict resolution: hooks_cli.py and mcp_server.py both had local patches (6ef44cb route CC transcripts via convo_miner; 3fad61d allow leading dash) that overlap with upstream fixes (MemPalace#1231, MemPalace#1194). Took upstream entirely on those two files — upstream's version handles separate transcript/project ingest, uses _mempalace_python(), and adds _pin_hnsw_threads. The local config.py regex relaxation auto-merged cleanly and is preserved. Safety tag: pre-upstream-merge-20260501-091227 (rollback target). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Catches up on a month of upstream work. Highlights pulled in: - MemPalace#1306 searcher: hybrid candidate union (vector ∪ BM25 reranking pool) - MemPalace#1325 mcp security: omit absolute paths from tool_get_drawer / tool_status - MemPalace#1322 chroma: wire quarantine_stale_hnsw to prevent SIGSEGV on stale HNSW - MemPalace#1320 mcp: forward valid_to / source params in kg_add / kg_invalidate - MemPalace#1321 cli: honor --palace flag in cmd_init - MemPalace#1314 kg temporal params fix - MemPalace#1244 cli: cmd_compress writes to mempalace_closets so palace can read - MemPalace#1243 mcp: case-insensitive agent name in diary_write/diary_read - MemPalace#1303 mcp_server: pass embedding_function= on collection reopen - MemPalace#1076/MemPalace#1077 hooks: quote CLAUDE_PLUGIN_ROOT / CODEX_PLUGIN_ROOT in hooks.json - Various ruff format passes on touched files Conflict resolution (CHANGELOG.md only — code files all auto-merged): - 3.3.5 unreleased section header from upstream kept above 3.3.4 - 3.3.4 section: kept our 2026-04-30 release date; merged upstream's new MemPalace#1299 SIGSEGV-on-default-EF entry in alongside our existing topic-tunnels (MemPalace#1194/MemPalace#1195/MemPalace#1197), HNSW-bloat (MemPalace#1191), max_seq_id (MemPalace#1135), and auto-ingest (MemPalace#1230/MemPalace#1231) entries. Kept our richer topic-tunnels detail (upstream's version was a strict subset). xdev patches preserved (still on this branch, untouched by merge): - 6ef44cb fix(hooks): route CC transcripts via convo_miner with cwd-based wings - 3fad61d fix(config): allow leading dash in wing names Not pushed to origin — run tests locally and decide when to push. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Summary
Follow-up to #1230. That PR fixed
--mode convosfor one specific case: a Python hook user withMEMPAL_DIRunset. Two configurations were still silently broken:MEMPAL_DIRset to a project directory — convos never mined; the hook ran the projects miner onMEMPAL_DIRand ignored the transcript path entirely.MEMPAL_DIRset to a conversations directory (per the oldhooks/README.mdrecommendation) — the projects miner ran on JSONL, ingesting it as if it were source code.Both shell hooks (
hooks/mempal_save_hook.sh,hooks/mempal_precompact_hook.sh) had the sameMEMPAL_DIR-overrides-TRANSCRIPT_PATHbug and were missing--modeon every spawnedmempalace mine. The PreCompact shell hook didn't even readtranscript_pathfrom stdin.This PR closes all of those.
Approach
Make the auto-ingest additive, not "MEMPAL_DIR overrides transcript":
_get_mine_diris renamed to_get_mine_targetsand now returnslist[tuple[str, str]].MEMPAL_DIR(when valid) contributes(dir, "projects").(parent_dir, "convos").The shell save and precompact hooks get the same dual-target structure (each writes one independent
mempalace mine ... --mode <mode>call). The precompact shell hook also gainedtranscript_pathparsing so it can run the convos mine synchronously before context is compressed.hooks/README.mdupdated to describeMEMPAL_DIRaccurately: it is a project files target, never a convos target. The active conversation transcript is mined automatically regardless ofMEMPAL_DIR.Why this exists
Milla flagged that users had been told their conversations were stored verbatim, but with the original
MEMPAL_DIR-overrides-transcript semantics + the missing--modeflag, that was only true for one of four common configurations. After this PR, the auto-mine path produces verbatim convo drawers for every supported hook configuration.Stack
This branch was cut from
fix/hooks-convos-mode-1232(#1230). The two are textually adjacent but logically independent — either can land first; if #1230 lands first, this rebases cleanly.Test plan
_get_mine_targetscovers all four cases (none / env-only / transcript-only / both)test_maybe_auto_ingest_with_both_setasserts twoPopenspawns with the right--modeper spawntest_mine_sync_with_both_setasserts the same for the precompact-sync pathtest_hook_mines_transcript_pathupdated to assertdirname "$TRANSCRIPT_PATH"+--mode convosare present in the shell hook (the previous assertion requiredMINE_DIR, which no longer exists)1416 passed, 1 skippedruff check+ruff format --checkclean