Skip to content

chore(release): v3.3.4#1232

Merged
milla-jovovich merged 122 commits intomainfrom
chore/release-3.3.4-prep
May 1, 2026
Merged

chore(release): v3.3.4#1232
milla-jovovich merged 122 commits intomainfrom
chore/release-3.3.4-prep

Conversation

@igorls
Copy link
Copy Markdown
Member

@igorls igorls commented Apr 27, 2026

Release prep for v3.3.4. Bumps every version source from 3.3.3 to 3.3.4, dates the CHANGELOG, and adds entries for the bug fixes that landed this cycle.

Version sync

Six hand-edited locations + uv.lock auto-synced from pyproject.toml:

  • pyproject.toml
  • mempalace/version.py (canonical)
  • .claude-plugin/plugin.json
  • .claude-plugin/marketplace.json
  • .codex-plugin/plugin.json
  • README.md badge
  • uv.lock (auto-synced from pyproject.toml)

test_version_consistency + test_readme_claims cover the 6-file invariant — both green.

CHANGELOG additions this cycle

The pre-existing 3.3.4 draft already documented the larger items (init/mine prompt, topic tunnels, corpus-origin, CLI search rerank, graceful Ctrl-C, idempotent init). This PR adds Bug Fixes entries for what landed since:

Pre-tag verification

  • Full suite: 1441 passed, 1 skipped (Linux 3.9/3.11/3.13, Windows, macOS, all green on develop)
  • ruff check + ruff format --check clean
  • 44/44 in test_version_consistency + test_readme_claims
  • JPH invariant: pyproject.toml and .claude-plugin/plugin.json both reference mempalace-mcp
  • Wheel build + fresh-venv install: mempalace --version3.3.4, mempalace-mcp --help works (catches the v3.3.2-class regression class)

After merge

Standard release flow:

  1. Tag v3.3.4 on main (not develop) — GitHub Releases shows it as Latest
  2. python -m buildtwine checktwine upload
  3. gh release create v3.3.4 --notes-from-tag --latest
  4. Verify: fresh venv, pip install mempalace==3.3.4, mempalace-mcp --help

shaun0927 and others added 30 commits April 16, 2026 12:11
- repair.py: wrap upsert loop in try/except; restore from backup on
  failure instead of leaving a partially rebuilt collection
- migrate.py: replace non-atomic rmtree+move with rename-aside swap
  so a crash between the two calls does not destroy both copies
- cli.py: use offset += len(batch["ids"]) with empty-batch guard
  instead of fixed offset += batch_size to prevent skipping drawers
agent_name and entry are validated via sanitize_name/sanitize_content,
but topic is stored raw into ChromaDB metadata. Apply the same
sanitize_name guard to reject null bytes, path traversal, and
oversized payloads.
- repair.py: define backup_path before the conditional block so it is
  always in scope when the except handler references it
- migrate.py: restore old palace from .old if both os.rename and
  shutil.move fail during the swap step
Bumps [actions/deploy-pages](https://github.com/actions/deploy-pages) from 4 to 5.
- [Release notes](https://github.com/actions/deploy-pages/releases)
- [Commits](actions/deploy-pages@v4...v5)

---
updated-dependencies:
- dependency-name: actions/deploy-pages
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [actions/upload-pages-artifact](https://github.com/actions/upload-pages-artifact) from 3 to 5.
- [Release notes](https://github.com/actions/upload-pages-artifact/releases)
- [Commits](actions/upload-pages-artifact@v3...v5)

---
updated-dependencies:
- dependency-name: actions/upload-pages-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Fixes four issues causing silent hook failures:

1. **Relative paths** → Absolute paths (/absolute/path/to/hooks/...)
   Claude Code resolves hooks from working directory, not repo root.

2. **Wrong matcher** → Stop uses *, PreCompact has no matcher
   PreCompact doesn't use matcher (only Stop hooks do).

3. **Missing timeout** → Added timeout: 30 to both hooks
   Matches hooks/README.md specification.

4. **Ambiguous target** → Specified ~/.claude/settings.local.json
   Clarified global vs project-scoped config.

Also added executable chmod instructions and path replacement note.

Fixes #1037
shutil.move() can partially create palace_path before raising, which would
trip a bare os.replace(stale_path, palace_path) rollback (dest exists).

- Switch the primary swap to os.replace so same-filesystem moves stay atomic
- Branch on errno.EXDEV before falling back to shutil.move, so real errors
  (permissions, EIO) surface instead of silently attempting a slow copy
- Extract rollback into _restore_stale_palace which clears any partial
  destination and, if the restore itself fails, logs both stale_path and
  palace_path so the operator can recover by hand

Adds three regression tests covering clean rollback, partial-copy cleanup,
and logged failure on rollback-failure.

Flagged by the Qodo reviewer on #935.
Enable setup-python's built-in pip cache on all CI jobs to avoid
re-downloading ~300 MB of dependencies (chromadb, onnxruntime, hnswlib)
on every run.

Bump macOS and Windows from Python 3.9 to 3.11 -- Linux matrix already
covers 3.9 compatibility, and 3.11 is faster on these platforms.
~/.mempalace/tunnels.json (introduced in #790) was created via plain
open(..., "w") with no chmod, and its parent dir via os.makedirs()
without mode=0o700. On Linux with default umask 022 both end up
world-readable (0o644 / 0o755).

Tunnels reveal cross-wing connections — which projects, people, and
rooms the user has explicitly linked — so they are sensitive metadata
that should not be readable by other local users on shared systems.

Apply the same 0o700 / 0o600 pattern that #814 established for the
other sensitive palace files. Chmod calls are wrapped in try/except
(OSError, NotImplementedError) for Windows / unsupported-filesystem
compatibility.

Closes #1165
The miner upserted one drawer per ChromaDB call, paying tokenizer +
ONNX session setup per chunk. The embedding device was CPU-only because
no EmbeddingFunction was ever wired through the backend.

Two changes, each a speedup in its own right; stacked they give ~10x
end-to-end on a medium corpus (20 files, 568 drawers):

1. Batched upsert. `process_file` and `_file_chunks_locked` now collect
   all chunks of a file into a single `collection.upsert(...)` so the
   embedding model runs one forward pass per file instead of N.

2. Hardware-accelerated embedding function. New `mempalace/embedding.py`
   wraps `ONNXMiniLM_L6_V2` with configurable `preferred_providers`.
   `MEMPALACE_EMBEDDING_DEVICE` (or `embedding_device` in config.json)
   selects auto / cpu / cuda / coreml / dml. Unavailable accelerators
   log a warning and fall back to CPU.

   The factory subclasses `ONNXMiniLM_L6_V2` and spoofs its `name()` to
   `"default"` so the persisted EF identity matches existing palaces
   created with ChromaDB's bare `DefaultEmbeddingFunction` -- same
   model, same 384-dim vectors, no rebuild needed when turning GPU on.

   `ChromaBackend.get_collection` / `create_collection` now pass the
   resolved EF on every call so miner writes and searcher reads agree.

Benchmarks (i9-12900KF + RTX 3090, medium scenario, 568 drawers):

  per-chunk + CPU   19.77s ·  29 drw/s   (baseline)
  batched   + CPU    8.07s ·  70 drw/s   (2.4x)
  batched   + CUDA   2.15s · 264 drw/s   (9.2x)

Reproducible via `benchmarks/mine_bench.py`.

Install paths:
  pip install mempalace[gpu]       # NVIDIA CUDA
  pip install mempalace[dml]       # DirectML (Windows)
  pip install mempalace[coreml]    # macOS Neural Engine

Mine header now prints `Device: cpu|cuda|...` so users can confirm the
accelerator engaged.
perf(mining): batch per-chunk upserts + optional GPU acceleration
When two wings have one or more confirmed TOPIC labels in common, the
miner now drops a symmetric tunnel between them at mine time so the
palace graph reflects shared themes (frameworks, vendors, recurring
concepts).

- llm_refine: TOPIC label routes to a dedicated `topics` bucket so the
  signal survives confirmation instead of getting collapsed into
  `uncertain` and dropped.
- entity_detector / project_scanner: bucket plumbed through the
  detection pipeline; `confirm_entities` returns confirmed topics
  alongside people/projects.
- miner.add_to_known_entities: optional `wing` parameter records the
  confirmed topics under `topics_by_wing` in
  `~/.mempalace/known_entities.json`. Wing names do NOT leak into the
  flat known-name set used by drawer-tagging.
- palace_graph: `compute_topic_tunnels` and `topic_tunnels_for_wing`
  create symmetric tunnels via the existing `create_tunnel` API so they
  share dedup and persistence with explicit tunnels.
- miner.mine: post-file-loop pass calls `topic_tunnels_for_wing` for
  the freshly-mined wing. Failures are logged but never abort the mine.
- config: `topic_tunnel_min_count` knob (env
  `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT` or `~/.mempalace/config.json`),
  default 1.

Tests cover topic persistence through init->mine, tunnel creation when
wings share a topic, no tunnel below threshold, cross-wing tunnel
retrieval via `list_tunnels`, dedup on recompute, case-insensitive
overlap, and the end-to-end mine-time wiring.

Out of scope for this PR (called out in the PR body): manifest-
dependency overlap, per-topic allow/deny lists, search-result surfacing.
… field

Previously a cross-wing topic tunnel for "Angular" stored the room as
"Angular" — colliding with a wing's literal folder-derived "Angular" room
at follow_tunnels/list_tunnels read time, and exposing raw topic strings
(which may contain characters rejected by sanitize_name) to the MCP
surface.

Topic tunnels now store their room as "topic:<original-casing>" and carry
kind="topic" on the stored dict. Explicit tunnels get kind="explicit"
(default). follow_tunnels("wing", "Angular") on a literal Angular room
no longer surfaces topic connections for the same name, and any LLM
scanning list_tunnels has a visible discriminator.
feat(graph): cross-wing tunnels by shared topics (#1180)
chore: add OpenArena owner claim verification file
Three tightly-coupled search-quality fixes for v3.3.3:

1. CLI `mempalace search` now routes through the same `_hybrid_rank`
   the MCP path already used. Drawers whose text contains every query
   term but embed as file-tree noise (directory listings, diffs, log
   fragments) were scoring cosine distance >= 1.0 — the display formula
   `max(0, 1 - dist)` then floored every result to `Match: 0.0`, with
   no way for the user to tell a lexical match from a total miss. BM25
   catches these cleanly; the display surfaces both `cosine=` and
   `bm25=` so users see which component is firing.

2. Legacy-palace distance-metric warning. Palaces created before
   `hnsw:space=cosine` was consistently set silently use ChromaDB's
   default L2 metric, which breaks the cosine-similarity formula (L2
   distances routinely exceed 1.0 on normalized 384-dim vectors). The
   search path now detects this at query time and prints a one-line
   notice pointing at `mempalace repair`. Only fires for legacy
   palaces; new palaces already set cosine correctly.

3. Invariant tests pinning `hnsw:space=cosine` on every collection-
   creation path — legacy `get_or_create_collection`, legacy
   `create_collection`, RFC 001 `get_collection(create=True)`, the
   public `palace.get_collection`, and a round-trip through reopen.
   Locks down the correctness that new-user palaces already have so a
   future refactor can't silently regress it.

Also adds a `metadata` property to `ChromaCollection` so callers can
read the underlying hnsw:space without reaching into `_collection`.

Tests:
- New regression: simulate three candidates at distance 1.5 (cosine=0),
  one containing query terms — must rank first with non-zero bm25.
- New: legacy metric (empty or non-cosine) produces stderr warning.
- New: correctly-configured palace produces no warning.
- New: all five creation paths pin cosine metadata.

All existing tests still pass.
`test_fresh_palace_via_full_stack_gets_cosine` used `tempfile.Temporary-
Directory()` as a context manager, which tries to delete the temp path
on exit. On Windows, ChromaDB still holds SQLite file handles to
`chroma.sqlite3` when the context closes, producing:

    PermissionError: [WinError 32] The process cannot access the file
    because it is being used by another process: '...\\chroma.sqlite3'
    NotADirectoryError: [WinError 267] The directory name is invalid

Other tests in the same file use pytest's `tmp_path` fixture, which
defers cleanup to session end (when the process is exiting and the
file-lock contention is moot). Align this one with the rest of the
file.

CLAUDE.md already documents the 80% Windows coverage allowance due to
"ChromaDB file lock cleanup" — the fix is to stop fighting the lock.
fix(search): CLI hybrid rerank, legacy-metric warning, invariant tests (3.3.4)
`mempalace init` now ends with a `Mine this directory now? [Y/n]`
prompt and runs `mine()` in-process when accepted; `--yes` skips the
prompt and auto-mines for non-interactive callers. Declining prints
the resume command. Removes the "remember to type the next command"
friction since rooms + entities just got set up.

`mempalace mine` now wraps its main loop in `try / except
KeyboardInterrupt` and prints `files_processed`, `drawers_filed`, and
`last_file` before exiting with code 130 on Ctrl-C. Re-mining is safe
because deterministic drawer IDs make the upsert idempotent. The
hooks PID lock at `~/.mempalace/hook_state/mine.pid` is now actively
removed in a `finally` when its entry points at us, on clean exit,
error, or interrupt — preventing the next hook fire from briefly
waiting on a stale PID.

Closes #1181, #1182.
…ore mine prompt

Reviewer feedback on the previous commit flagged two real problems:

1. Overloading --yes to also auto-mine was a silent behaviour change for
   scripted callers. Today --yes only auto-accepts entities — making it
   ALSO trigger a multi-minute ChromaDB write breaks every script that
   currently runs `mempalace init --yes <dir>` for the fast non-interactive
   entity path. Add a separate `--auto-mine` flag instead. Combinations:

     mempalace init --yes <dir>              # entities auto, STILL prompt mine
     mempalace init --auto-mine <dir>        # prompt entities, skip mine prompt
     mempalace init --yes --auto-mine <dir>  # fully non-interactive

   --yes behaviour is now identical to pre-PR.

2. The mine prompt was firing without telling the user how big the job
   was. On a real corpus mine takes minutes-to-tens-of-minutes; hitting
   Enter on default-Y with no size cue is a footgun. Show a one-line
   estimate computed from scan_project (the same walk we hand into mine)
   BEFORE the prompt:

     ~423 files (~12 MB) would be mined into this palace.
     Mine this directory now? [Y/n]

   The estimate uses a single corpus walk: scan_project's output is
   passed into mine() via a new optional files= kwarg, so we never walk
   the tree twice.

Tests: replaced the old "--yes auto-mines" assertion with a regression
guard that --yes alone STILL prompts; added coverage for --auto-mine
alone, --yes --auto-mine together, and the pre-prompt estimate line.
The "Skipped. Run mempalace mine <dir>" hint after declining the init
prompt and the "Re-run mempalace mine <dir> to resume" hint after a
Ctrl-C interruption both interpolated project_dir without shell-quoting.
A path containing spaces or metacharacters produced a copy-paste-broken
command.

Both spots now use shlex.quote(project_dir). Adds regression tests
covering each hint with a path that contains a space.
The pre-existing test_maybe_run_mine_prompt_declined_prints_hint
asserted the bare unquoted form `mempalace mine {tmp_path}`. After
the production code switched to shlex.quote on the resume hint, this
passed on Linux/macOS (POSIX paths have no characters that trigger
quoting) but failed on Windows where backslashes always get wrapped
in single quotes.

Mirror the production code in the assertion via shlex.quote so it's
portable across platforms; do the same for the two new
spaces-in-path tests for consistency.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Release prep for v3.3.4, syncing version sources and updating release notes while bundling a large set of reliability/features work around mining, search, hooks, tunnels, repair, and MCP safety.

Changes:

  • Bump package/version references to 3.3.4 and date/update the CHANGELOG for the release.
  • Improve operational robustness: per-palace mine locking, HNSW divergence probing + sqlite fallback, safer migration rollback, hook behavior fixes/validation, and search hybrid rerank + legacy-metric warning.
  • Add/expand features and supporting tests: cross-wing topic tunnels, embedding-device selection (GPU/CoreML/DML), corpus-origin detection scaffolding/tests, plus several new invariants/regression suites.

Reviewed changes

Copilot reviewed 60 out of 61 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/test_searcher.py Adds regression tests for BM25 None docs + CLI hybrid rerank + legacy-metric warning.
tests/test_save_hook_mines.py Tightens shell hook transcript mining expectations and validates transcript-path guards.
tests/test_repair.py Adds truncation-safety guard tests and max-seq-id repair coverage.
tests/test_project_scanner.py Updates detected-entity dict shape to always include topics.
tests/test_palace_locks.py New cross-process tests for per-palace non-blocking mine lock + back-compat alias.
tests/test_palace_graph_tunnels.py Adds permissions regression + topic tunnels + wing normalization tests for tunnel helpers.
tests/test_palace_graph.py Ensures graph build skips None metadata safely.
tests/test_miner.py Adds wing normalization test, bounded upsert batching tests, topic tunnel E2E tests, and Ctrl‑C handling tests.
tests/test_migrate.py Adds tests for robust rollback helper when migration swap fails.
tests/test_llm_refine.py Updates expectations for TOPIC routing into a dedicated topics bucket.
tests/test_llm_client.py Adds tests for provider endpoint “external service” heuristic + Tailscale CGNAT handling.
tests/test_known_entities_registry.py Adds topics_by_wing persistence/behavior tests for tunnel signal source.
tests/test_hooks_cli.py Updates hook CLI tests for new mine-target semantics and transcript validation.
tests/test_hnsw_capacity.py New suite for HNSW capacity probe + BM25-only sqlite fallback + status output behavior.
tests/test_entity_detector.py Updates detector empty/missing-file shape to include topics.
tests/test_embedding.py New tests for embedding provider resolution, warning behavior, and EF caching.
tests/test_corpus_origin.py New TDD suite for corpus-origin heuristic + LLM-confirmation parsing behavior.
tests/test_convo_miner_unit.py Adds test coverage for bounded upsert batching in convo mining.
tests/test_config.py Adds config tests for embedding device normalization and wing normalization helper.
tests/test_collection_metric_invariant.py New invariant tests ensuring all collection creation paths set hnsw:space=cosine.
tests/test_cli.py Adds tests for wing normalization at init and init→mine prompt/flags behavior.
tests/conftest.py Resets ChromaBackend quarantine state between tests to avoid leakage.
pyproject.toml Bumps version to 3.3.4 and adds optional deps for GPU/CoreML/DML onnxruntime variants.
openarena-claim.txt Adds OpenArena owner-claim verification token.
mempalace/version.py Bumps canonical __version__ to 3.3.4.
mempalace/searcher.py Adds None-safe tokenization, CLI hybrid BM25 rerank, legacy metric warning, and sqlite BM25 fallback.
mempalace/room_detector_local.py Routes wing slug creation through normalize_wing_name.
mempalace/project_scanner.py Adds topics bucket to detected shape and threads corpus-origin context into discovery/refine.
mempalace/palace_graph.py Adds wing normalization in tunnel helpers, logging warnings, topic tunnels, and restricts tunnels.json permissions.
mempalace/palace.py Introduces mine_palace_lock non-blocking per-palace lock + mine_global_lock alias.
mempalace/migrate.py Makes palace swap safer (rename aside + rollback helper) with EXDEV handling.
mempalace/mcp_server.py Adds HNSW probe-driven vector-disable flag, sqlite status fallback, and threads HNSW tuning into collection creation.
mempalace/llm_refine.py Routes TOPICs into topics bucket and adds corpus-origin prompt preamble support.
mempalace/llm_client.py Adds local-vs-external endpoint heuristic used for privacy warnings.
mempalace/hooks_cli.py Refactors auto-ingest into explicit (dir, mode) targets and uses the correct interpreter for spawned mines.
mempalace/entity_detector.py Adds topics bucket, corpus-origin persona reclassification, and updates confirm_entities to return topics.
mempalace/embedding.py New embedding function factory supporting hardware acceleration with provider-based caching.
mempalace/convo_miner.py Adds bounded upsert batching and wing normalization for convo mines.
mempalace/config.py Adds normalize_wing_name, embedding_device, and topic-tunnel threshold config plumbing.
hooks/mempal_save_hook.sh Makes transcript mining explicit (--mode convos), makes MEMPAL_DIR additive, adds transcript validator, removes eval-based parsing.
hooks/mempal_precompact_hook.sh Same as save hook: transcript parsing/validation and dual-target mining with correct modes.
hooks/README.md Documents new additive MEMPAL_DIR semantics and always-on transcript mining.
examples/HOOKS_TUTORIAL.md Updates Claude Code hook config guidance and executable-path expectations.
benchmarks/mine_bench.py New benchmark script comparing unbatched vs batched mining and device impact.
README.md Updates version badge and adds note about periodic sweep for per-message recall.
CLAUDE.md Clarifies “local-first” stance as “no external API by default” with BYOK support.
CHANGELOG.md Adds dated 3.3.4 section with added items and bug-fix notes.
.github/workflows/version-guard.yml Updates checkout action major version reference.
.github/workflows/deploy-docs.yml Updates checkout/pages actions major versions.
.github/workflows/ci.yml Adds pip caching and updates action versions + windows/macos python versions.
.codex-plugin/plugin.json Bumps plugin version to 3.3.4.
.claude-plugin/plugin.json Bumps plugin version to 3.3.4.
.claude-plugin/marketplace.json Bumps marketplace version to 3.3.4.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/searcher.py
Comment on lines 333 to +360
@@ -281,33 +344,215 @@ def search(query: str, palace_path: str, wing: str = None, room: str = None, n_r
print(f" Room: {room}")
print(f"{'=' * 60}\n")

for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1):
similarity = round(max(0.0, 1 - dist), 3)
meta = meta or {}
for i, hit in enumerate(hits, 1):
vec_sim = round(max(0.0, 1 - hit["distance"]), 3)
bm25 = hit.get("bm25_score", 0.0)
meta = hit["metadata"]
source = Path(meta.get("source_file", "?")).name
wing_name = meta.get("wing", "?")
room_name = meta.get("room", "?")

print(f" [{i}] {wing_name} / {room_name}")
print(f" Source: {source}")
print(f" Match: {similarity}")
print(f" Match: cosine={vec_sim} bm25={bm25}")
print()
# Print the verbatim text, indented
for line in doc.strip().split("\n"):
for line in hit["text"].strip().split("\n"):
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search() now tolerates None docs during BM25 scoring, but it can still crash when printing results: hit["text"].strip() will raise if Chroma returns a None document (the same production scenario mentioned in the new BM25 regression tests). Normalize doc to an empty string when building hits (or when printing) so reranked results are always renderable.

Copilot uses AI. Check for mistakes.
Comment thread mempalace/mcp_server.py
Comment on lines +124 to +130
_vector_disabled = False
_vector_disabled_reason = ""
# Optional[dict] (not ``dict | None``) keeps Python 3.9 import-time
# parsing happy — PEP 604 unions in annotations only became unconditional
# at module-eval time in 3.10.
_vector_capacity_status = None # type: Optional[dict]

Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_vector_capacity_status = None # type: Optional[dict] uses Optional without importing it. Ruff/pyflakes typically flags undefined names inside type comments, and type checkers will too. Either import Optional from typing at the top of the module, or switch to a real annotation that doesn’t require a type comment.

Copilot uses AI. Check for mistakes.
Comment on lines 427 to 435
Returns:
{
"people": [...entity dicts...],
"projects": [...entity dicts...],
"uncertain":[...entity dicts...],
# Only present when corpus_origin reclassifies at least one
# candidate as an agent persona:
"agent_personas": [...entity dicts...],
}
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detect_entities now always returns a topics bucket (even in the empty-candidates path), but the docstring’s documented return shape doesn’t mention it. Please update the Returns section to include "topics": [...] so downstream callers/tests relying on the stable dict shape have an accurate contract.

Copilot uses AI. Check for mistakes.
Comment thread mempalace/palace_graph.py
Comment on lines +39 to +50
def _normalize_wing(wing: str | None) -> str | None:
"""Normalize a wing name for consistent lookup.

``init`` stores wing names with hyphens and spaces replaced by underscores
(e.g. ``mempalace_public``). Callers that pass the raw directory name
(``mempalace-public``) would silently miss. This helper aligns the lookup
key with the stored metadata.
"""
if wing is None:
return None
return wing.lower().replace(" ", "_").replace("-", "_")

Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

palace_graph._normalize_wing() duplicates the normalization rule that was just centralized as config.normalize_wing_name(). Keeping two implementations risks future drift (and _normalize_wing also doesn’t strip whitespace). Prefer calling the shared helper (and handling None/empty) so init/mine/tunnel lookup all stay aligned by construction.

Copilot uses AI. Check for mistakes.
Comment thread mempalace/mcp_server.py
Comment on lines 634 to +640
def tool_check_duplicate(content: str, threshold: float = 0.9):
col = _get_collection()
if not col:
return _no_palace()
if _vector_disabled:
# Without a usable HNSW we can't compute cosine similarity for
# near-duplicate detection. Report the limitation rather than
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the #1222 divergence mode, opening a Chroma PersistentClient/collection can segfault. tool_check_duplicate() calls _get_collection() before checking _vector_disabled, so it can still crash the server in exactly the scenario this flag is meant to protect. Probe/check _vector_disabled first (via _refresh_vector_disabled_flag()) and short-circuit without touching chromadb when vector search is disabled.

Copilot uses AI. Check for mistakes.
milla-jovovich and others added 7 commits April 27, 2026 00:44
Adds api_key_source provenance ('flag' | 'env' | None) to LLMProvider
so cmd_init can distinguish a key passed via --llm-api-key (explicit
opt-in) from one silently picked up via OPENAI_API_KEY / ANTHROPIC_API_KEY
shell env (stray credential).

When the endpoint is external AND api_key_source == 'env', init now
prints a blocking [y/N] prompt before any data is sent. Anything other
than 'y' drops the LLM and falls back to heuristics-only.

Adds --accept-external-llm flag for CI / non-interactive bypass.

Completes the UX gap in #1224: the URL-based warning was informational
and init kept running, so a user who didn't notice the line had already
leaked. The consent prompt is the actual gate; explicit flag-passed keys
remain treated as already-consented.
Adds a fifth format adapter to mempalace.normalize alongside the
existing Claude Code, Codex, Claude.ai, ChatGPT, and Slack parsers.
After this lands, mempalace mine --mode convos ingests Gemini CLI
session history without manual export.

Why now: Claude Code and Codex CLI are already supported by convo_miner;
adding Gemini closes the major-CLI-tool coverage gap. After this lands,
the README's "verbatim conversation history" promise is honestly
delivered for all three top-tier API-keyed coding CLIs (Claude Code,
Codex CLI, Gemini CLI), not just two of them. This is the third leg
of the trio Aya pushed for so the public claim matches the actual
ingest pipeline.

Gemini CLI stores sessions at ~/.gemini/tmp/<project_hash>/chats/ as
JSONL. The on-disk schema (per google-gemini/gemini-cli#15292):

    {"type":"session_metadata","sessionId":"...","projectHash":"...",...}
    {"type":"user","id":"msg1","content":[{"text":"Hello"}]}
    {"type":"gemini","id":"msg2","content":[{"text":"Hi"}]}
    {"type":"message_update","id":"msg2","tokens":{"input":10,"output":5}}

The new _try_gemini_jsonl parser:

  - requires a session_metadata record so it does not false-positive
    against Claude Code or Codex JSONL passing through the dispatch
    chain in _try_normalize_json
  - extracts user/gemini message text from each entry's content array
    of {"text": "..."} blocks, joining multiple blocks per message
    in order
  - skips message_update entries (token-count deltas with no message
    text) and any other unknown record types
  - returns None when fewer than two conversational messages are
    present, mirroring the codex parser's >=2-message guard

Test coverage: 9 new unit tests in tests/test_normalize.py mirroring
the codex test pattern - happy path, multi-turn, missing session
metadata, message_update skip, single-message rejection, multi-block
content concatenation, empty content skip, malformed-line resilience,
and explicit no-match against codex JSONL fixtures. Schema-level only;
real Gemini CLI session fixtures are a follow-up once a real user file
is available.

Closes part of #59 (the Gemini CLI portion of the umbrella request).
feat(privacy): blocking consent gate for env-fallback LLM API keys
feat(normalize): Gemini CLI session JSONL adapter
@igorls
Copy link
Copy Markdown
Member Author

igorls commented Apr 28, 2026

@copilot apply changes based on the comments in this thread

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 28, 2026

@copilot apply changes based on the comments in this thread

Applied the requested review-thread changes in 4cfbd60. Focused regression tests, ruff checks, Code Review, and CodeQL all pass.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • chroma-onnx-models.s3.amazonaws.com
    • Triggering command: /usr/bin/python python -m pytest tests/test_searcher.py tests/test_empty_chromadb_results.py tests/test_mcp_server.py tests/test_entity_detector.py tests/test_palace_graph.py tests/test_palace_graph_tunnels.py -q (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

igorls and others added 15 commits April 28, 2026 21:52
The fix landed this cycle and is documented under 3.3.4. The 3.3.0
Bug Fixes block is shipped history and shouldn't grow new entries
retroactively.
The PR documenting the fix is #1232; referencing it from inside its
own changelog entry is circular.
The capacity probe added in #1227 hardcoded a 2,000-row floor for the
"diverged" decision. The comment justifying that number explicitly tied
it to chromadb's *default* sync_threshold of 1,000 — "Two synchronization
windows worth (2 × sync_threshold = 2000) is a safe steady-state ceiling".

#1191 then bumped sync_threshold to 50,000 via _HNSW_BLOAT_GUARD without
updating the floor. Result: any palace created with the bloat guard
flips between OK and DIVERGED on every flush cycle. Steady-state
divergence sits at 0–50K (the natural queue depth), and the 2,000 floor
trips the guardrail the moment the queue exceeds 10% of sqlite_count.
The MCP server then routes search to BM25-only and disables duplicate
detection for ~80% of the write cycle on actively-mined ≥100K palaces,
even though chromadb is behaving correctly.

This change reads the configured `hnsw:sync_threshold` from
`collection_metadata` per palace and scales the floor to 2 × that value.
The 10% relative term and the original #1222 detection capability are
unchanged — a 91%-missing-of-192K palace (the actual #1222 reproducer)
still trips, regardless of whether the collection was created with
sync_threshold=1000 or 50000.

Behavior summary:

| Collection's sync_threshold | New floor | Old floor |
|---|---|---|
| Missing (legacy palace)     | 2000      | 2000 (unchanged)
| 1000 (chromadb default)     | 2000      | 2000 (unchanged)
| 50000 (#1191 bloat guard)   | 100000    | 2000 (the bug)

Tests:
- test_capacity_status_tolerates_lag_under_large_sync_threshold (regression
  for the #1191/#1227 conflict — 100K sqlite + 50K HNSW + sync=50K → OK)
- test_capacity_status_still_flags_real_corruption_under_large_sync (#1222
  shape with bloat-guard collection — still detects corruption)
- test_capacity_status_default_threshold_when_no_sync_metadata (legacy
  palaces without the metadata row use the 2000 fallback floor)
- test_unflushed_path_also_uses_dynamic_floor (the never-flushed branch
  scales too — 30K under sync_threshold=50000 is no longer flagged)

All 18 pre-existing tests in tests/test_hnsw_capacity.py and 45 tests
in tests/test_backends.py still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…1254)

`_compute_heuristic_seq_id` ran `int(row[0])` directly on the result
of `MAX(e.seq_id)`. On palaces where chromadb 1.5.x has been writing
seq_ids natively (8-byte big-endian uint64 BLOB), that raises
`ValueError: invalid literal for int() with base 10: b'...'` before
the dry-run can print, leaving users with no path through the
recovery feature added in #1135 — the only documented un-poison
route for palaces hit by the original PR #664 shim bug.

Decode BLOB return values via `int.from_bytes(val, "big")` and
keep the existing `int(val)` path for INTEGER rows. Regression
test seeds a BLOB row in `embeddings.seq_id` and asserts the
heuristic surfaces the correct integer.
…uristic

fix(repair): decode BLOB embeddings.seq_id in max-seq-id heuristic (#1254)
…th-sync-threshold

fix(repair): scale HNSW divergence floor with hnsw:sync_threshold
fix(storage): stop ChromaDB from crashing when reopening an existing …
#1262)

#1262 split `get_or_create_collection` into `get_collection` + fallback
`create_collection` inside `ChromaBackend.get_collection`, fixing the
chromadb 1.5.x Rust-binding SIGSEGV that fires when stored collection
metadata differs from the call-site's `_HNSW_BLOAT_GUARD` payload.

The MCP server's `_get_collection(create=True)` carries the same metadata
payload at `mcp_server.py:287` and routes through chromadb's Python
client directly, bypassing the backend layer. Both `tool_add_drawer`
and `tool_diary_write` reach this site on every invocation, and the
Stop hook fires `mempalace_diary_write` at session end — which was
exactly the crash path #1089 named.

Apply the same try/except split here so legacy palaces whose stored
metadata predates the bloat-guard expansion no longer crash on the
MCP-server reopen path. Regression test patches
`get_or_create_collection` at the chromadb client class level (not the
instance — chromadb's mtime-change detection rebuilds the client between
calls, so an instance-level spy doesn't survive) and asserts the second
`_get_collection(create=True)` call never reaches it.
…pen-crash

fix(mcp_server): split get_or_create_collection on reopen (follow-up to #1262)
Three fixes landed on develop after the initial release-prep cut and
were brought in via the develop merge. Document them in the 3.3.4
Bug Fixes section so the release notes reflect what users will
actually receive.

- #1287 - HNSW divergence floor scales with hnsw:sync_threshold
  (resolves a silent-fallback regression introduced by the
  interaction between #1191 and #1227 in this release)
- #1262 - ChromaBackend get_or_create_collection split, fixing the
  stop-hook SIGSEGV class on legacy palaces with mismatched stored
  metadata (#1089)
- #1288 / #1254 - repair --mode max-seq-id heuristic now decodes
  BLOB-typed embeddings.seq_id, restoring the un-poison path added
  in #1135 for palaces where chromadb 1.5.x writes seq_ids natively
The release was originally cut on 2026-04-27 but did not tag that day.
Three additional bug fixes have been folded in since then (#1262,
#1287, #1288) and the actual tag will happen on 2026-04-30. Update
the header date to match.
The same try/except split that #1262 applied at the backend layer
(ChromaBackend.get_collection) was needed at the parallel call site
in mcp_server._get_collection(create=True), which carries the same
metadata payload directly to chromadb's Python client. Both reopen
paths in mempalace now bypass get_or_create_collection on existing
collections, closing the SIGSEGV class for both tool_add_drawer
and tool_diary_write (the Stop hook's path).
@milla-jovovich milla-jovovich merged commit d53129e into main May 1, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.