BUG: mempalace_search returns stale results after mid-session CLI mine — cached HNSW client not invalidated

## Summary

After a live Claude Code session has connected to the MCP server, running `mempalace mine` from another process (e.g., the CLI) does not make the newly filed drawers visible to `mempalace_search`. The MCP server's `mempalace_status` *does* reflect the new `total_drawers`, so the failure is subtle and easy to miss.

## Reproduction

1. Start a Claude Code session with the mempalace plugin active (MCP server running).
2. Call `mempalace_status` → note `total_drawers` (e.g., 18,941).
3. Call `mempalace_search "some query"` → note top 5 results.
4. From a bash tool in the same session, run `py -3.10 -m mempalace mine /path/to/new/content --wing X`.
5. Call `mempalace_status` again → `total_drawers` is correctly updated (e.g., 19,208). **Good.**
6. Call `mempalace_search "some query"` with the **same** query → results are unchanged from step 3, even when the newly-mined content should rank in the top-N.

## Symptom signature

Comparing MCP search to a fresh `chromadb.PersistentClient` opened via the CLI against the same palace, for identical queries:

| # | Fresh client (CLI/REPL) | MCP server (cached) |
|---|---|---|
| 1 | 0.291 `agent-aa9b45c` | 0.291 `agent-aa9b45c` ✓ |
| 2 | 0.271 `agent-a6570a…` | 0.271 `agent-a6570a…` ✓ |
| 3 | 0.233 `agent-af34ef…` | 0.233 `agent-af34ef…` ✓ |
| 4 | **0.231 `tender-shimmying-glade.md`** (newly mined) | 0.155 `agent-a6570a…` |
| 5 | **0.181 `tender-shimmying-glade.md`** (newly mined) | 0.134 `agent-a35abd…` |

The top positions tend to match because dominant older content outranks the new drawers anyway. From the first rank where newly mined content should appear (position 4 in the run above), the MCP server quietly returns lower-scoring older drawers instead. This is easy to miss unless you A/B against a fresh process.
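
For anyone repeating the A/B check, a small helper can locate the first rank where the two result lists diverge. This is only a sketch: `first_divergence` and the `Hit` alias are illustrative names, not part of mempalace, and the values are copied from the table above with the truncated ids left as-is.

```python
from typing import List, Optional, Tuple

Hit = Tuple[float, str]  # (similarity score, source id); hypothetical alias


def first_divergence(fresh: List[Hit], cached: List[Hit]) -> Optional[int]:
    """Return the 1-based rank where the two lists first differ, or None."""
    for rank, (a, b) in enumerate(zip(fresh, cached), start=1):
        if a != b:
            return rank
    if len(fresh) != len(cached):
        # One list is a strict prefix of the other.
        return min(len(fresh), len(cached)) + 1
    return None


# Values copied from the comparison table.
fresh = [
    (0.291, "agent-aa9b45c"),
    (0.271, "agent-a6570a…"),
    (0.233, "agent-af34ef…"),
    (0.231, "tender-shimmying-glade.md"),
    (0.181, "tender-shimmying-glade.md"),
]
cached = [
    (0.291, "agent-aa9b45c"),
    (0.271, "agent-a6570a…"),
    (0.233, "agent-af34ef…"),
    (0.155, "agent-a6570a…"),
    (0.134, "agent-a35abd…"),
]

rank = first_divergence(fresh, cached)
# Sources the fresh client returned that the cached server never surfaced.
missing = sorted({src for _, src in fresh} - {src for _, src in cached})
print(rank, missing)
```

Any source that appears only in the fresh list is a candidate for a drawer the cached index has not seen.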

## Root cause

Since #135, `mcp_server.py` caches the ChromaDB client and collection as module globals, set once on first use:

https://github.com/milla-jovovich/mempalace/blob/main/mempalace/mcp_server.py#L103-L126

```python
_client_cache = None
_collection_cache = None

def _get_client():
    global _client_cache
    if _client_cache is None:
        _client_cache = chromadb.PersistentClient(path=_config.palace_path)
    return _client_cache
```

Two compounding factors:

1. `_client_cache` / `_collection_cache` are never invalidated.
2. ChromaDB's `PersistentClient` additionally holds a per-path `SharedSystemClient` singleton with an in-memory HNSW index frozen at client creation. Even if you re-instantiate `PersistentClient` without clearing `SharedSystemClient`, you get the same stale index back.

Why `mempalace_status` still looks correct: `col.count()` reads SQLite directly, and external writes land in SQLite. The query path uses the in-memory HNSW index, which external writes never touch. Hence: the count moves, the results don't.
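
The split between the two read paths can be illustrated with a toy model. This is deliberately not ChromaDB code: `ToyStore` and `ToyClient` are made-up stand-ins, assuming only the behavior described above (count reads shared storage, query reads an index snapshotted at client creation).

```python
class ToyStore:
    """Stand-in for the on-disk palace: every process sees the same rows."""

    def __init__(self):
        self.rows = {}  # doc id -> score against one fixed query


class ToyClient:
    """Stand-in for a cached client with an index frozen at creation."""

    def __init__(self, store):
        self._store = store
        self._index = dict(store.rows)  # snapshot, never refreshed

    def count(self):
        # Reads the shared store, so external writes are visible.
        return len(self._store.rows)

    def query(self, n):
        # Reads the frozen snapshot, so external writes are invisible.
        ranked = sorted(self._index.items(), key=lambda kv: kv[1], reverse=True)
        return [doc_id for doc_id, _ in ranked[:n]]


store = ToyStore()
store.rows.update({"old-a": 0.29, "old-b": 0.15})
cached_client = ToyClient(store)   # the long-lived MCP server client
store.rows["new-c"] = 0.23         # an external writer, e.g. the CLI

count_after = cached_client.count()
stale_hits = cached_client.query(2)
fresh_hits = ToyClient(store).query(2)  # a brand-new client sees the new row
print(count_after, stale_hits, fresh_hits)
```

The cached client reports three rows but still ranks only the two it saw at creation, which matches the "count moves, results don't" signature.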

## Proposed fix

Invalidate the caches when the mtime of `palace/chroma.sqlite3` changes, checking at most once every 2s, with a defensive fallback in case `SharedSystemClient.clear_system_cache` moves in a future chromadb release. Tested locally against chromadb 0.6.3 on Windows.

```python
import os
import time

_client_cache = None
_collection_cache = None
_cache_sqlite_mtime = 0.0
_cache_last_check = 0.0
_CACHE_CHECK_INTERVAL = 2.0


def _sqlite_mtime():
    try:
        return os.path.getmtime(os.path.join(_config.palace_path, "chroma.sqlite3"))
    except OSError:
        return 0.0


def _maybe_invalidate_cache():
    global _client_cache, _collection_cache, _cache_sqlite_mtime, _cache_last_check
    now = time.monotonic()
    if now - _cache_last_check < _CACHE_CHECK_INTERVAL:
        return
    _cache_last_check = now
    current = _sqlite_mtime()
    if _cache_sqlite_mtime == 0.0:
        _cache_sqlite_mtime = current
        return
    if current > _cache_sqlite_mtime:
        logger.info("palace mtime changed; clearing chromadb client cache")
        try:
            from chromadb.api.client import SharedSystemClient
            SharedSystemClient.clear_system_cache()
        except Exception as e:
            logger.warning(f"clear_system_cache failed: {e}")
        _client_cache = None
        _collection_cache = None
        _cache_sqlite_mtime = current


def _get_client():
    global _client_cache
    _maybe_invalidate_cache()
    if _client_cache is None:
        _client_cache = chromadb.PersistentClient(path=_config.palace_path)
    return _client_cache
```

Design notes:
- Rate-limited `stat()` (2s) to avoid filesystem hammering on every tool call.
- First call records the initial mtime without invalidating (preserves cold-start behavior).
- `clear_system_cache` is imported lazily and wrapped in try/except — a chromadb API change won't crash the MCP server; it'll log a warning and fall through to the current stale-cache behavior.
- Writes that go through the MCP server itself (`_get_collection(create=True)`) continue to work because they share the same client.
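
If this goes to a PR with tests, the rate limit and first-call behavior are easier to exercise when the clock and the stat call are injectable. The sketch below is one possible shape, not the proposed patch: `MtimeWatcher` and its parameters are hypothetical names. It mirrors the logic above, except that it uses `None` for "never checked" so a fake clock starting at 0 works.

```python
import os
import time


class MtimeWatcher:
    """Rate-limited mtime check with injectable clock and stat (hypothetical)."""

    def __init__(self, path, interval=2.0, clock=time.monotonic,
                 stat=os.path.getmtime):
        self._path = path
        self._interval = interval
        self._clock = clock
        self._stat = stat
        self._last_check = None  # None means no check has run yet
        self._last_mtime = 0.0   # 0.0 means no baseline recorded yet

    def _mtime(self):
        try:
            return self._stat(self._path)
        except OSError:
            return 0.0

    def changed(self):
        """Return True when the file's mtime advanced since the baseline."""
        now = self._clock()
        if self._last_check is not None and now - self._last_check < self._interval:
            return False  # rate limited: skip the stat entirely
        self._last_check = now
        current = self._mtime()
        if self._last_mtime == 0.0:
            self._last_mtime = current  # first sighting: record, don't invalidate
            return False
        if current > self._last_mtime:
            self._last_mtime = current
            return True
        return False


# Drive the watcher with a fake clock and a fake mtime; no real file is touched.
fake_now = [0.0]
fake_mtime = [100.0]
watcher = MtimeWatcher("unused", clock=lambda: fake_now[0],
                       stat=lambda _path: fake_mtime[0])

results = [watcher.changed()]   # first call records the baseline
fake_mtime[0] = 101.0           # an external write happens
fake_now[0] = 1.0
results.append(watcher.changed())  # still inside the 2s window
fake_now[0] = 2.5
results.append(watcher.changed())  # window elapsed, change detected
fake_now[0] = 5.0
results.append(watcher.changed())  # no further change
print(results)
```

In the MCP server, a `True` result would be the point at which the system cache is cleared and the two module globals are reset.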

## Alternatives considered

- **New `mempalace_refresh` MCP tool that nulls the caches on demand.** Smaller surface, but requires the caller (Claude) to know when to call it — exactly the situation where silent staleness bites.
- **Fresh client per call (revert #135).** Correct but throws away the perf win from #135 on every tool call.
- **File-watch (watchdog/inotify).** Cleaner, but adds a dep and a background thread for what is effectively a cheap stat.

## Environment

- mempalace 3.1.0 (installed via pip)
- chromadb 0.6.3
- Python 3.10, Windows 11
- Claude Code plugin `milla-jovovich/mempalace` v3.0.14

Happy to turn this into a PR with tests if the mtime-stat approach is the direction you'd like to take.
