perf: cache ChromaDB PersistentClient instead of re-instantiating per call#135
perf: cache ChromaDB PersistentClient instead of re-instantiating per call#135bensig merged 2 commits intoMemPalace:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR reduces MCP tool-call latency by caching the ChromaDB PersistentClient/collection in the MCP server process, and updates the project’s packaging/test setup alongside a version bump.
Changes:
- Cache
chromadb.PersistentClientand the target collection inmempalace/mcp_server.pyto avoid re-opening SQLite + reloading indexes on every request. - Add/expand pytest coverage for MCP server handlers, the programmatic search API, the dialect, and the temporal knowledge graph (with shared isolation fixtures).
- Update packaging metadata/build configuration (move to
hatchling, add uv-style dependency groups), removerequirements.txt, and bump__version__to 3.0.0.
Reviewed changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
mempalace/mcp_server.py |
Adds module-level caching for ChromaDB client/collection used by MCP tools. |
mempalace/__init__.py |
Updates package version constant to 3.0.0. |
pyproject.toml |
Switches build backend to hatchling and updates dev dependency definitions. |
requirements.txt |
Removed in favor of pyproject.toml / uv workflow. |
tests/conftest.py |
Adds isolation fixtures (HOME redirection) and MCP cache reset for test independence. |
tests/test_mcp_server.py |
Adds unit/integration tests for MCP protocol dispatch + tool handlers. |
tests/test_searcher.py |
Adds tests for search_memories programmatic API behavior. |
tests/test_knowledge_graph.py |
Adds tests for KnowledgeGraph CRUD + temporal queries + stats/timeline. |
tests/test_dialect.py |
Adds tests for dialect compression/encoding/decoding utilities. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """MemPalace — Give your AI a memory. No API key required.""" | ||
|
|
||
| __version__ = "2.0.0" | ||
| __version__ = "3.0.0" | ||
|
|
||
| from .cli import main |
There was a problem hiding this comment.
__version__ is bumped to 3.0.0 here, but the MCP protocol serverInfo.version reported by mempalace.mcp_server.handle_request() is still hardcoded to 2.0.0. Consider deriving the MCP server version from mempalace.__version__ (or importlib.metadata.version("mempalace")) so clients see a consistent version.
| yield | ||
| try: | ||
| from mempalace import mcp_server | ||
| mcp_server._client_cache = None | ||
| mcp_server._collection_cache = None | ||
| except (ImportError, AttributeError): | ||
| pass |
There was a problem hiding this comment.
_reset_mcp_cache only clears the MCP server caches in teardown. If a previous test populated the cache, the next test will still start with whatever state is left until teardown runs again. Resetting the cache in the fixture setup (before yield) as well would make isolation deterministic and reduce the chance of temp-dir cleanup issues due to lingering open ChromaDB handles.
| yield | |
| try: | |
| from mempalace import mcp_server | |
| mcp_server._client_cache = None | |
| mcp_server._collection_cache = None | |
| except (ImportError, AttributeError): | |
| pass | |
| def _clear_cache(): | |
| try: | |
| from mempalace import mcp_server | |
| mcp_server._client_cache = None | |
| mcp_server._collection_cache = None | |
| except (ImportError, AttributeError): | |
| pass | |
| _clear_cache() | |
| yield | |
| _clear_cache() |
| def test_no_palace_returns_error(self): | ||
| result = search_memories("anything", "/nonexistent/path") |
There was a problem hiding this comment.
This test uses a hard-coded absolute path (/nonexistent/path) to simulate a missing palace. That can behave differently across OSes (and could even be creatable in privileged CI), making the test flaky and potentially leaving junk on the filesystem. Prefer using a per-test temp path (e.g. tmp_path / "missing") that is guaranteed not to exist.
| def test_no_palace_returns_error(self): | |
| result = search_memories("anything", "/nonexistent/path") | |
| def test_no_palace_returns_error(self, tmp_path): | |
| missing_palace_path = tmp_path / "missing" | |
| result = search_memories("anything", str(missing_palace_path)) |
| def test_no_palace_returns_error(self, monkeypatch, config, kg): | ||
| _patch_mcp_server(monkeypatch, config, "/nonexistent/path", kg) | ||
| from mempalace.mcp_server import tool_status | ||
|
|
||
| result = tool_status() | ||
| assert "error" in result | ||
|
|
There was a problem hiding this comment.
This test uses a hard-coded absolute path (/nonexistent/path) to simulate a missing palace. That can behave differently across OSes (and could even be creatable in privileged CI), making the test flaky and potentially leaving junk on the filesystem. Prefer using a per-test temp path (e.g. tmp_path / "missing") that is guaranteed not to exist.
|
Hey Igor — CI is failing, looks like it needs a rebase on main (we've merged a bunch of changes recently). Can you rebase and push? |
07f1f86 to
e4d938c
Compare
|
Merged your other 5 PRs — this one now has conflicts from those merges. Can you rebase on main one more time? |
… call The MCP server previously created a new PersistentClient on every tool call via _get_collection(). This incurs HNSW index loading overhead on each request. Cache the client and collection at module level. The cache resets naturally on process restart (MCP runs as a subprocess). Also adds a _reset_mcp_cache fixture to conftest.py for test isolation. Includes test infrastructure from PR MemPalace#131. 92 tests pass.
…ion and portability
e4d938c to
47696be
Compare
Audited PR MemPalace#135 (ChromaDB singleton cache): none of its 3 commits port to this fork. The client-caching perf gain is already achieved via the module-level get_db() singleton in db.py, and the companion test fixes target upstream-only test files (test_mcp_server.py, test_searcher.py).
Problem
_get_collection()creates a newchromadb.PersistentClient(path=...)on every MCP tool call. This re-opens the SQLite metadata DB and reloads the HNSW index each time, adding unnecessary latency.Finding:
HIGH — PersistentClient re-instantiation
Fix
Cache the client and collection at module level:
The MCP server runs as a subprocess (one per session), so the cache lifetime matches the session. No concurrency concerns — MCP is single-threaded stdin/stdout.
92 tests pass (includes test infrastructure from PR #131 + a
_reset_mcp_cachefixture for test isolation).