fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR by igorls · Pull Request #1231 · MemPalace/mempalace

igorls · 2026-04-27T03:33:01Z

Summary

Follow-up to #1230. That PR fixed --mode convos for one specific case: a Python hook user with MEMPAL_DIR unset. Two configurations were still silently broken:

MEMPAL_DIR set to a project directory — convos never mined; the hook ran the projects miner on MEMPAL_DIR and ignored the transcript path entirely.
MEMPAL_DIR set to a conversations directory (per the old hooks/README.md recommendation) — the projects miner ran on JSONL, ingesting it as if it were source code.

Both shell hooks (hooks/mempal_save_hook.sh, hooks/mempal_precompact_hook.sh) had the same MEMPAL_DIR-overrides-TRANSCRIPT_PATH bug and were missing --mode on every spawned mempalace mine. The PreCompact shell hook didn't even read transcript_path from stdin.

This PR closes all of those.

Approach

Make the auto-ingest additive, not "MEMPAL_DIR overrides transcript":

_get_mine_dir is renamed to _get_mine_targets and now returns list[tuple[str, str]].
MEMPAL_DIR (when valid) contributes (dir, "projects").
A valid transcript JSONL contributes (parent_dir, "convos").
Both can appear together — the hook then spawns one ingest per target.

The shell save and precompact hooks get the same dual-target structure (each writes one independent mempalace mine ... --mode <mode> call). The precompact shell hook also gained transcript_path parsing so it can run the convos mine synchronously before context is compressed.

hooks/README.md updated to describe MEMPAL_DIR accurately: it is a project files target, never a convos target. The active conversation transcript is mined automatically regardless of MEMPAL_DIR.

Why this exists

Milla flagged that users had been told their conversations were stored verbatim, but with the original MEMPAL_DIR-overrides-transcript semantics + the missing --mode flag, that was only true for one of four common configurations. After this PR, the auto-mine path produces verbatim convo drawers for every supported hook configuration.

Stack

This branch was cut from fix/hooks-convos-mode-1232 (#1230). The two are textually adjacent but logically independent — either can land first; if #1230 lands first, this rebases cleanly.

Test plan

_get_mine_targets covers all four cases (none / env-only / transcript-only / both)
New test_maybe_auto_ingest_with_both_set asserts two Popen spawns with the right --mode per spawn
New test_mine_sync_with_both_set asserts the same for the precompact-sync path
test_hook_mines_transcript_path updated to assert dirname "$TRANSCRIPT_PATH" + --mode convos are present in the shell hook (the previous assertion required MINE_DIR, which no longer exists)
Full suite green: 1416 passed, 1 skipped
ruff check + ruff format --check clean

The Stop and PreCompact hooks spawn `mempalace mine <dir>` with no `--mode` flag, which defaults to `projects` in cli.py. When MEMPAL_DIR is unset, _get_mine_dir falls back to the parent of the transcript JSONL — and miner.py's READABLE_EXTENSIONS includes `.jsonl`, so the projects miner happily ingests Claude Code session JSONL as if it were source code instead of conversation. Make _get_mine_dir return (dir, mode): MEMPAL_DIR keeps `projects`, the JSONL fallback yields `convos`. Both _maybe_auto_ingest and _mine_sync now thread the mode into the spawned command.

- Normalize MEMPAL_DIR via Path.expanduser().resolve() so ~/proj paths are correctly accepted instead of falling through to transcript fallback - Replace bare Path.expanduser().is_file() transcript check with the existing _validate_transcript_path() which adds .resolve(), enforces .jsonl/.json extension, and rejects '..' path-traversal components - Update tests to compare resolved paths (cross-platform correctness) - Add tests for tilde expansion, path-traversal rejection, and non-jsonl extension rejection in _get_mine_dir Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/f69176c7-d752-40ef-ba71-d0e4adc3a689 Co-authored-by: igorls <[email protected]>

…MEMPAL_DIR #1230 fixed --mode convos for the case where MEMPAL_DIR was unset, but left two configurations broken: - MEMPAL_DIR set to a project dir: convos never mined (MEMPAL_DIR overrode the transcript path); only project files were ingested. - MEMPAL_DIR set to a conversations dir per the old hooks/README: the projects miner ran on JSONL — same wrong-miner behaviour. The shell hooks (mempal_save_hook.sh, mempal_precompact_hook.sh) had the same MEMPAL_DIR-overrides-transcript bug AND were missing --mode on every spawned `mempalace mine` call. Make the auto-ingest *additive*. _get_mine_dir → _get_mine_targets, returning a list of (dir, mode) pairs: - MEMPAL_DIR (when valid) contributes (dir, "projects") - A valid transcript JSONL contributes (parent, "convos") - Both can appear together; the hook spawns one ingest per target Same change applied to the shell save and precompact hooks. Precompact also gained transcript_path parsing so it can run the convos mine synchronously before context is compressed. hooks/README.md updated to describe MEMPAL_DIR as a project-files target, never a convos target.

Copilot

Pull request overview

This PR fixes MemPalace Claude Code/Codex hooks so the active transcript is always mined as conversations (--mode convos) while MEMPAL_DIR (when set) becomes an additional project-files mining target (--mode projects), rather than overriding transcript ingestion.

Changes:

Refactors Python hook auto-ingest targeting from a single directory to a list of (dir, mode) targets and spawns one mine per target.
Updates shell save/precompact hooks to mine transcript parent dir with --mode convos and mine MEMPAL_DIR with --mode projects.
Updates tests and hook documentation to reflect the additive targeting semantics.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`mempalace/hooks_cli.py`	Switches auto-ingest to multi-target `(dir, mode)` and spawns mines per target.
`hooks/mempal_save_hook.sh`	Mines transcript dir as convos and `MEMPAL_DIR` as projects (both optional/additive).
`hooks/mempal_precompact_hook.sh`	Adds transcript_path parsing and mines transcript synchronously as convos before compaction.
`hooks/README.md`	Clarifies `MEMPAL_DIR` is for project files only; transcript mining is automatic.
`tests/test_hooks_cli.py`	Adds/updates tests for multi-target spawning and mode correctness.
`tests/test_save_hook_mines.py`	Updates assertions to require transcript-driven `--mode convos` mining in the save hook.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T03:39:06Z

+    if mempal_dir:
+        resolved = Path(mempal_dir).expanduser().resolve()
+        if resolved.is_dir():
+            targets.append((str(resolved), "projects"))
+    path = _validate_transcript_path(transcript_path)
+    if path is not None and path.is_file():
+        targets.append((str(path.parent), "convos"))
+    return targets


_get_mine_targets() adds a convos target for the transcript directory, but hook_stop()/hook_precompact() already call _ingest_transcript(transcript_path), which spawns a convos mine as well (with a different --wing). This can result in duplicate conversation ingestion (and potentially into different wings), plus extra load on every save/precompact. Consider consolidating to a single transcript-ingest path (either remove the transcript-derived target here and leave transcript mining to _ingest_transcript, or remove/adjust _ingest_transcript and ensure the spawned mine uses the intended --wing consistently).

Copilot · 2026-04-27T03:39:06Z

 def _maybe_auto_ingest(transcript_path: str = ""):
-    """Run mempalace mine in background if a mine directory is available."""
-    mine_dir = _get_mine_dir(transcript_path)
-    if not mine_dir:
+    """Run mempalace mine in background for every available target."""
+    targets = _get_mine_targets(transcript_path)
+    if not targets:
        return
    if _mine_already_running():
        _log("Skipping auto-ingest: mine already running")
        return
-    try:
-        _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir])
-    except OSError:
-        pass
+    for mine_dir, mode in targets:
+        try:
+            _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir, "--mode", mode])
+        except OSError:
+            pass


_maybe_auto_ingest() now spawns multiple background mempalace mine processes when both targets are present, but _spawn_mine() writes a single PID to _MINE_PID_FILE each time. The second spawn overwrites the first PID, so _mine_already_running() can return false while the first mine is still running, allowing overlapping mines on later hook fires. Consider tracking multiple PIDs (e.g., one PID file per target/mode) or switching to a filesystem lock (flock/portalocker) rather than a single-PID sentinel when spawning more than one process.

Copilot · 2026-04-27T03:39:06Z

+    for mine_dir, mode in targets:
+        try:
+            _spawn_mine([sys.executable, "-m", "mempalace", "mine", mine_dir, "--mode", mode])
+        except OSError:


The subprocesses here are spawned via sys.executable, but the module-level _mempalace_python() docstring explicitly calls out that sys.executable may be a system Python without chromadb when invoked by hooks. To avoid background/sync mines failing silently in those environments, consider using _mempalace_python() for the interpreter in these spawned mempalace mine commands.

Copilot · 2026-04-27T03:39:06Z

 def _mine_sync(transcript_path: str = ""):
-    """Run mempalace mine synchronously (for precompact -- data must land first)."""
-    mine_dir = _get_mine_dir(transcript_path)
-    if not mine_dir:
+    """Run mempalace mine synchronously for every target (precompact)."""
+    targets = _get_mine_targets(transcript_path)
+    if not targets:
        return
-    try:
-        STATE_DIR.mkdir(parents=True, exist_ok=True)
-        log_path = STATE_DIR / "hook.log"
-        with open(log_path, "a") as log_f:
-            subprocess.run(
-                [sys.executable, "-m", "mempalace", "mine", mine_dir],
-                stdout=log_f,
-                stderr=log_f,
-                timeout=60,
-            )
-    except (OSError, subprocess.TimeoutExpired):
-        pass
+    STATE_DIR.mkdir(parents=True, exist_ok=True)
+    log_path = STATE_DIR / "hook.log"
+    for mine_dir, mode in targets:
+        try:
+            with open(log_path, "a") as log_f:
+                subprocess.run(
+                    [
+                        sys.executable,
+                        "-m",
+                        "mempalace",
+                        "mine",
+                        mine_dir,
+                        "--mode",
+                        mode,
+                    ],
+                    stdout=log_f,
+                    stderr=log_f,
+                    timeout=60,
+                )
+        except (OSError, subprocess.TimeoutExpired):


_mine_sync() now potentially runs two separate mempalace mine commands sequentially (projects + convos), each with timeout=60. In the “both set” case this can push total hook runtime well past common harness hook timeouts (e.g., 30s), and if it times out mid-way the more important transcript ingest might not run. Consider prioritizing the transcript/convos target first (or running the projects ingest async / with a smaller timeout) so precompact reliably captures the active transcript before compaction proceeds.

Copilot · 2026-04-27T03:39:07Z

+def test_get_mine_targets_transcript_path_traversal_rejected(tmp_path):
+    """Transcript paths with '..' components are rejected (no convos target)."""
+    with patch.dict("os.environ", {}, clear=True):
+        targets = _get_mine_targets("../../etc/passwd")


This test claims to verify '..' path traversal rejection, but the provided input ("../../etc/passwd") lacks a .json/.jsonl suffix, so it is rejected by the extension check instead and doesn't actually exercise the traversal guard. To test the traversal behavior specifically, use a .jsonl/.json path that includes '..' components (e.g. "../t.jsonl") and assert that it yields no convos target.

Suggested change

targets = _get_mine_targets("../../etc/passwd")

targets = _get_mine_targets("../t.jsonl")

Copilot · 2026-04-27T03:39:07Z

+# interpolating into shell — same contract as mempal_save_hook.sh.
+eval $(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c "
+import sys, json, re
+data = json.load(sys.stdin)
+sid = data.get('session_id', 'unknown')
+tp = data.get('transcript_path', '')
+safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s))
+print(f'SESSION_ID=\"{safe(sid)}\"')
+print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"')
+" 2>/dev/null)


Using eval $(...) here increases the blast radius of any parsing/sanitization mistake (it executes code, not just data assignment). Since you only need two scalar values, prefer a non-eval approach (e.g., have Python print NUL/line-delimited values and read them into variables) so the shell never executes generated code.

Suggested change

# interpolating into shell — same contract as mempal_save_hook.sh.

eval $(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c "

import sys, json, re

data = json.load(sys.stdin)

sid = data.get('session_id', 'unknown')

tp = data.get('transcript_path', '')

safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s))

print(f'SESSION_ID=\"{safe(sid)}\"')

print(f'TRANSCRIPT_PATH=\"{safe(tp)}\"')

" 2>/dev/null)

# assigning in shell — same contract as mempal_save_hook.sh.

mapfile -t _mempal_parsed < <(echo "$INPUT" | "$MEMPAL_PYTHON_BIN" -c "

import sys, json, re

data = json.load(sys.stdin)

sid = data.get('session_id', 'unknown')

tp = data.get('transcript_path', '')

safe = lambda s: re.sub(r'[^a-zA-Z0-9_/.\-~]', '', str(s))

print(safe(sid))

print(safe(tp))

" 2>/dev/null)

SESSION_ID="${_mempal_parsed[0]:-unknown}"

TRANSCRIPT_PATH="${_mempal_parsed[1]:-}"

Copilot · 2026-04-27T03:39:07Z

+# Run ingest synchronously so memories land before compaction. Two
+# independent targets — both run if both are set:
+#   1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos
+#   2. MEMPAL_DIR → --mode projects
+if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then


Before mining dirname "$TRANSCRIPT_PATH", consider validating that TRANSCRIPT_PATH is actually a transcript file (e.g., require .jsonl/.json) and reject traversal segments like ... The Python hook’s _validate_transcript_path enforces these constraints, but the shell hook currently mines any existing file path the harness provides.

Suggested change

# Run ingest synchronously so memories land before compaction. Two

# independent targets — both run if both are set:

# 1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos

# 2. MEMPAL_DIR → --mode projects

if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then

is_valid_transcript_path() {

local path="$1"

[ -n "$path" ] || return 1

case "$path" in

*.json|*.jsonl) ;;

*) return 1 ;;

esac

case "/$path/" in

*/../*|*/./*) return 1 ;;

esac

return 0

}

# Run ingest synchronously so memories land before compaction. Two

# independent targets — both run if both are set:

# 1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos

# 2. MEMPAL_DIR → --mode projects

if is_valid_transcript_path "$TRANSCRIPT_PATH" && [ -f "$TRANSCRIPT_PATH" ]; then

Copilot · 2026-04-27T03:39:08Z

+        mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \
+            >> "$STATE_DIR/hook.log" 2>&1 &


The transcript-derived mine target only checks -f "$TRANSCRIPT_PATH" before running, so any existing file path (including ones with .. segments) would be accepted and its parent directory mined. Consider adding lightweight validation (e.g., require .jsonl/.json and reject .. path components) before invoking mempalace mine "$(dirname ...)" --mode convos, to align with the Python hook’s transcript_path validation.

Suggested change

mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \

>> "$STATE_DIR/hook.log" 2>&1 &

case "$TRANSCRIPT_PATH" in

*.json|*.jsonl)

case "$TRANSCRIPT_PATH" in

..|../*|*/../*|*/..)

echo "[$(date '+%H:%M:%S')] Skipping invalid transcript path: $TRANSCRIPT_PATH" >> "$STATE_DIR/hook.log"

;;

*)

mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \

>> "$STATE_DIR/hook.log" 2>&1 &

;;

esac

;;

*)

echo "[$(date '+%H:%M:%S')] Skipping non-transcript path: $TRANSCRIPT_PATH" >> "$STATE_DIR/hook.log"

;;

esac

…review) Address Copilot review on #1231: 1. Stop double-mining the transcript on the Python side. ``_get_mine_targets`` now returns only the ``MEMPAL_DIR`` projects target — the convos target for the transcript dir is dropped because ``_ingest_transcript`` already handles it on every hook fire. The duplicate spawn was using ``sys.executable`` (vs ``_mempalace_python()``) and a different ``--wing``, so each Stop/PreCompact event was writing the same transcript into two wings under asymmetric interpreters and overwriting the single ``_MINE_PID_FILE`` lock. 2. ``_maybe_auto_ingest`` and ``_mine_sync`` now spawn via ``_mempalace_python()`` so the resolved interpreter matches the venv that owns mempalace (matters under GUI-launched harnesses where ``sys.executable`` may resolve to a system Python without chromadb). 3. Replace ``eval $(...)`` in both shell hooks with a ``mapfile``-based reader. Sanitized values are still emitted by the same Python parser, but the shell now does plain variable assignment instead of executing the parser's stdout — smaller blast radius if the sanitizer is ever bypassed. 4. Mirror ``_validate_transcript_path`` in the shell hooks via a ``is_valid_transcript_path`` helper — extension + traversal-segment rejection, parity with the Python validator. The convos mine in each shell hook is now gated on the validator instead of bare ``-f``. 5. Tighten the ``..`` traversal test that previously exercised the suffix gate by mistake (``../../etc/passwd`` lacks ``.json[l]``). Use ``.jsonl`` paths with traversal segments to actually hit the ``..`` rejection branch. 6. README: add a one-liner pointing at ``mempalace sweep`` for users who want per-message recall on top of the file-level chunks the hooks produce. The sweeper was undiscoverable previously. Tests: 1418 passed, 1 skipped (full suite minus benchmarks).

igorls · 2026-04-27T05:27:10Z

Pushed fe56797 addressing the Copilot review:

Removed the duplicate transcript target on the Python side. _get_mine_targets now returns only the MEMPAL_DIR projects target. Transcript ingestion stays in _ingest_transcript, which was already firing on every hook event. Side effect: the PID-file overwrite (Copilot Integrating MemPalace with SoulForge's code intelligence system #2) and the 60s+60s sync timeout stack (Copilot Using AAAK as language for agents #4) are gone — Python path now spawns at most one _spawn_mine per call.
Switched the spawned mempalace mine interpreter to _mempalace_python() in both _maybe_auto_ingest and _mine_sync (Copilot feat: Hermes memory provider integration #3).
Replaced eval $(...) in both shell hooks with a mapfile reader against one-value-per-line sanitized stdout (Copilot fix: handle Windows file-lock errors in test cleanup #6).
Mirrored _validate_transcript_path in the shell hooks as is_valid_transcript_path and gated the convos mine on it (Copilot I see in the doc it explains how to collaborate with others people. But how to set this between people with different devices #7, Non-interactive mode for all commands (agent-friendly) #8).
Tightened the traversal test to use a .jsonl path with .. segments so the traversal branch is actually exercised (Copilot feat: add OpenAI Codex CLI JSONL normalizer #5).

The shell-side additive structure (separate MEMPAL_DIR and transcript mines) is preserved — the shell hooks have no _ingest_transcript equivalent, so the additive shape is correct there.

Tests: 1418 passed, 1 skipped (full suite minus benchmarks); ruff check + format clean.

`bash` on the Windows CI runner resolves to `wsl.exe` which fails with "Windows Subsystem for Linux has no installed distributions." The shell hooks themselves are POSIX-only — Windows users use the Python entry point — so the bash-subprocess exercise is non-applicable on win32. The static-grep validator tests still run on every platform, so the shell-side validation is still asserted under Windows; only the live bash invocation is skipped.

Bumps every version source from 3.3.3 to 3.3.4: - pyproject.toml - mempalace/version.py (canonical) - .claude-plugin/plugin.json - .claude-plugin/marketplace.json - .codex-plugin/plugin.json - README.md badge Dates the CHANGELOG section and adds entries for the bug fixes that landed this cycle (#1135, #1191, #1230, #1231) plus expands the #1194 entry to credit the lookup-side recovery path from #1197. Pre-tag verification: - 1441 passed, 1 skipped (full suite minus benchmarks, all platforms) - ruff check + format clean - 44/44 in test_version_consistency + test_readme_claims (6-file sync) - JPH invariant: pyproject.toml + .claude-plugin/plugin.json both reference mempalace-mcp - Wheel build + fresh-venv install: mempalace --version reports 3.3.4, mempalace-mcp --help works (catches the v3.3.2-class regression)

@igorls

Previously the save and precompact shell hooks only auto-ingested when MEMPAL_DIR was set, and even then ran the project-files miner against the JSONL transcript. Result: the most common configuration in the wild (MEMPAL_DIR set to a project dir) silently skipped the conversation mine — the hook would block, the AI would call mempalace_diary_write, but the raw transcript chunks never reached the palace. Make the auto-ingest additive instead: - TRANSCRIPT_PATH (always present from Claude Code) → parent dir, --mode convos. Always runs when the path passes validation. - MEMPAL_DIR (user-configured) → --mode projects. Independent target, runs in addition to (not in place of) the transcript mine. Adds an `is_valid_transcript_path` helper to both hooks (extension allowlist + traversal-segment rejection) so junk paths bail before reaching dirname/the miner — defence-in-depth over the existing ``[ -f ... ]`` check. Also tightens precompact_hook to sanitize SESSION_ID and read transcript_path the same way the save hook does. Adapted from upstream MemPalace eb4de04 (PR MemPalace#1231 by @igorls) + fe56797 (PR MemPalace#1231 review). The Python ``hooks_cli.py`` portion of those commits is N/A here — this fork ships hooks via Docker Compose, not the marketplace flow, and the eval-removal in the shell parser also doesn't apply because the fork already uses three separate ``python3 -c`` calls instead of a single ``eval $(...)``. The behaviour change (always-mine-transcript) and the validator are the load-bearing portable parts. Co-Authored-By: Igor Lins e Silva <[email protected]> Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

22 commits from upstream/develop, including: - HNSW capacity divergence detection + BM25-only sqlite fallback when vector layer is unloadable (MemPalace#1222 / MemPalace#1227 — vector_disabled kwarg routes search through chromadb-bypass path on segfault risk). - HNSW index bloat prevention via batch_size + sync_threshold metadata on collection create (MemPalace#1191). - Hooks: always mine the active transcript as convos, additive to MEMPAL_DIR (MemPalace#1230 / MemPalace#1231). Restructured into `_get_mine_targets()` list approach + separate `_ingest_transcript()`. Daemon-strict gate preserved on each entry point. - Wing-name normalization for hyphenated dirs across miner / convo_miner / palace_graph (MemPalace#1194 / MemPalace#1195 / MemPalace#1197). - `narrow _fix_blob_seq_ids` shim + `repair --mode max-seq-id` for legacy 0.6.x BLOB-poisoned palaces (MemPalace#1135). Conflict resolution: - README.md: kept fork-shaped narrative, dropped upstream's sweep tip injection (per fork-readme handling memory). - hooks/README.md: adopted upstream's accurate `MEMPAL_DIR` additive description; kept `MEMPAL_PYTHON` env-var name (matches actual `hooks/*.sh` scripts in both forks). - mempalace/cli.py: consolidated duplicate `--mode` argparse declarations into one with all 4 choices (rebuild/legacy/reorganize/max-seq-id). - mempalace/hooks_cli.py: adopted upstream's `_get_mine_targets()` + `_ingest_transcript()` shape; added `_daemon_strict()` guard at entry of `_maybe_auto_ingest`, `_mine_sync`, and `_ingest_transcript` so the daemon-strict architecture still skips local writes. - mempalace/mcp_server.py: kept both `kind=` and `vector_disabled=` kwargs on the `search_memories` call. - mempalace/searcher.py: kept fork's `_count_in_scope`, `_sqlite_fallback_and_scope`, `_apply_kind_text_filter` AND upstream's `_bm25_only_via_sqlite`. Both `kind` and `vector_disabled` parameters on `search_memories`. Tests: 1510 passing (up from 1366 — upstream brought new test suites).

…alace#1231 review) Address Copilot review on MemPalace#1231: 1. Stop double-mining the transcript on the Python side. ``_get_mine_targets`` now returns only the ``MEMPAL_DIR`` projects target — the convos target for the transcript dir is dropped because ``_ingest_transcript`` already handles it on every hook fire. The duplicate spawn was using ``sys.executable`` (vs ``_mempalace_python()``) and a different ``--wing``, so each Stop/PreCompact event was writing the same transcript into two wings under asymmetric interpreters and overwriting the single ``_MINE_PID_FILE`` lock. 2. ``_maybe_auto_ingest`` and ``_mine_sync`` now spawn via ``_mempalace_python()`` so the resolved interpreter matches the venv that owns mempalace (matters under GUI-launched harnesses where ``sys.executable`` may resolve to a system Python without chromadb). 3. Replace ``eval $(...)`` in both shell hooks with a ``mapfile``-based reader. Sanitized values are still emitted by the same Python parser, but the shell now does plain variable assignment instead of executing the parser's stdout — smaller blast radius if the sanitizer is ever bypassed. 4. Mirror ``_validate_transcript_path`` in the shell hooks via a ``is_valid_transcript_path`` helper — extension + traversal-segment rejection, parity with the Python validator. The convos mine in each shell hook is now gated on the validator instead of bare ``-f``. 5. Tighten the ``..`` traversal test that previously exercised the suffix gate by mistake (``../../etc/passwd`` lacks ``.json[l]``). Use ``.jsonl`` paths with traversal segments to actually hit the ``..`` rejection branch. 6. README: add a one-liner pointing at ``mempalace sweep`` for users who want per-message recall on top of the file-level chunks the hooks produce. The sweeper was undiscoverable previously. Tests: 1418 passed, 1 skipped (full suite minus benchmarks).

Catches up xdev-patches with 112 commits from MemPalace/develop, including: - v3.3.4 release - MemPalace#1262/MemPalace#1289 ChromaDB collection-reopen crash fix (relevant to long-running MCP server & mempalace-api) - MemPalace#1287 HNSW divergence floor fix - MemPalace#1288 BLOB seq_id decode in repair - MemPalace#1180 cross-wing tunnels by shared topics - MemPalace#1194 wing-slug normalization for hyphenated dirs Conflict resolution: hooks_cli.py and mcp_server.py both had local patches (6ef44cb route CC transcripts via convo_miner; 3fad61d allow leading dash) that overlap with upstream fixes (MemPalace#1231, MemPalace#1194). Took upstream entirely on those two files — upstream's version handles separate transcript/project ingest, uses _mempalace_python(), and adds _pin_hnsw_threads. The local config.py regex relaxation auto-merged cleanly and is preserved. Safety tag: pre-upstream-merge-20260501-091227 (rollback target). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Catches up on a month of upstream work. Highlights pulled in: - MemPalace#1306 searcher: hybrid candidate union (vector ∪ BM25 reranking pool) - MemPalace#1325 mcp security: omit absolute paths from tool_get_drawer / tool_status - MemPalace#1322 chroma: wire quarantine_stale_hnsw to prevent SIGSEGV on stale HNSW - MemPalace#1320 mcp: forward valid_to / source params in kg_add / kg_invalidate - MemPalace#1321 cli: honor --palace flag in cmd_init - MemPalace#1314 kg temporal params fix - MemPalace#1244 cli: cmd_compress writes to mempalace_closets so palace can read - MemPalace#1243 mcp: case-insensitive agent name in diary_write/diary_read - MemPalace#1303 mcp_server: pass embedding_function= on collection reopen - MemPalace#1076/MemPalace#1077 hooks: quote CLAUDE_PLUGIN_ROOT / CODEX_PLUGIN_ROOT in hooks.json - Various ruff format passes on touched files Conflict resolution (CHANGELOG.md only — code files all auto-merged): - 3.3.5 unreleased section header from upstream kept above 3.3.4 - 3.3.4 section: kept our 2026-04-30 release date; merged upstream's new MemPalace#1299 SIGSEGV-on-default-EF entry in alongside our existing topic-tunnels (MemPalace#1194/MemPalace#1195/MemPalace#1197), HNSW-bloat (MemPalace#1191), max_seq_id (MemPalace#1135), and auto-ingest (MemPalace#1230/MemPalace#1231) entries. Kept our richer topic-tunnels detail (upstream's version was a strict subset). xdev patches preserved (still on this branch, untouched by merge): - 6ef44cb fix(hooks): route CC transcripts via convo_miner with cwd-based wings - 3fad61d fix(config): allow leading dash in wing names Not pushed to origin — run tests locally and decide when to push. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

igorls and others added 3 commits April 26, 2026 23:25

Copilot AI review requested due to automatic review settings April 27, 2026 03:33

igorls requested review from bensig and milla-jovovich as code owners April 27, 2026 03:33

Copilot started reviewing on behalf of igorls April 27, 2026 03:33 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

igorls merged commit bc5d3fa into develop Apr 27, 2026
6 checks passed

igorls mentioned this pull request Apr 27, 2026

chore(release): v3.3.4 #1232

Merged

This was referenced May 2, 2026

fix(hooks): drop transcript-parent fallback that caused unbounded ingest #1199

Closed

Stop + PreCompact hooks auto-run mempalace mine on chat transcript parent with default flags → polluted mega-wing, no opt-out #1083

Closed

rmdes mentioned this pull request May 3, 2026

Stop hook: 1.9 TB palace bloat + ChromaDB Rust bindings segfault despite #1231 fix #1329

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR#1231

fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR#1231
igorls merged 5 commits intodevelopfrom
fix/hooks-convos-additive-mining

igorls commented Apr 27, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

igorls commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	targets = _get_mine_targets("../../etc/passwd")
	targets = _get_mine_targets("../t.jsonl")

-# Run ingest synchronously so memories land before compaction. Two
-# independent targets — both run if both are set:
-#   1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos
-#   2. MEMPAL_DIR → --mode projects
-if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then
+is_valid_transcript_path() {
+    local path="$1"
+    [ -n "$path" ] || return 1
+    case "$path" in
+        *.json|*.jsonl) ;;
+        *) return 1 ;;
+    esac
+    case "/$path/" in
+        */../*|*/./*) return 1 ;;
+    esac
+    return 0
+}
+# Run ingest synchronously so memories land before compaction. Two
+# independent targets — both run if both are set:
+#   1. TRANSCRIPT_PATH (from Claude Code) → parent dir, --mode convos
+#   2. MEMPAL_DIR → --mode projects
+if is_valid_transcript_path "$TRANSCRIPT_PATH" && [ -f "$TRANSCRIPT_PATH" ]; then

		mempalace mine "$(dirname "$TRANSCRIPT_PATH")" --mode convos \
		>> "$STATE_DIR/hook.log" 2>&1 &

Conversation

igorls commented Apr 27, 2026

Summary

Approach

Why this exists

Stack

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

igorls commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants