test: expand coverage from 20 to 92 tests, migrate to uv #131

bensig merged 3 commits into MemPalace:main from
Conversation
Pull request overview
This PR significantly expands the project’s test coverage (focused on the MCP server, search API, knowledge graph, and AAAK dialect) and updates packaging/tooling to use uv-friendly dependency management and hatchling, including aligning the library __version__ with pyproject.toml.
Changes:
- Added new pytest suite + shared fixtures, increasing coverage across multiple previously untested modules.
- Migrated packaging to hatchling and moved dev dependencies to `[dependency-groups]`; removed `requirements.txt`.
- Fixed `mempalace.__version__` mismatch.
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/conftest.py | Adds shared fixtures for isolated palace and KG instances. |
| tests/test_searcher.py | Adds tests for programmatic search_memories API. |
| tests/test_mcp_server.py | Adds tool-handler and protocol dispatch tests for MCP server. |
| tests/test_knowledge_graph.py | Adds temporal KG tests (CRUD, temporal queries, invalidation, timeline, stats). |
| tests/test_dialect.py | Adds tests for AAAK dialect compression/encoding/decoding and helpers. |
| pyproject.toml | Switches build backend to hatchling and moves dev deps to dependency-groups. |
| requirements.txt | Removes legacy requirements file. |
| mempalace/__init__.py | Updates `__version__` to 3.0.0. |
```python
import chromadb
```

chromadb is imported but never used in this test module. With Ruff’s F rules enabled, this will be flagged as an unused import; please remove it (the fixtures already set up ChromaDB via `seeded_collection`).

Suggested change: remove the `import chromadb` line.
```python
def _patch_mcp_server(monkeypatch, config, palace_path, kg):
    """Patch the mcp_server module globals to use test fixtures."""
    from mempalace import mcp_server
    import chromadb
```

Importing mempalace.mcp_server inside tests triggers module-level initialization (`_kg = KnowledgeGraph()`) which creates/initializes ~/.mempalace/knowledge_graph.sqlite3 on import. This means the test suite can write to a developer’s real home directory even though the fixtures aim to isolate state; consider setting HOME (and/or patching mempalace.knowledge_graph.DEFAULT_KG_PATH) to a temp dir before importing the module, or reworking tests to import after applying such a patch.
```python
import pytest
```

pytest is imported but not referenced anywhere in this file (fixtures don’t require an explicit import). With Ruff F401 enabled, this will be flagged as an unused import; please remove it.

Suggested change: remove the `import pytest` line.
```python
def _patch_mcp_server(monkeypatch, config, palace_path, kg):
    """Patch the mcp_server module globals to use test fixtures."""
    from mempalace import mcp_server
    import chromadb

    monkeypatch.setattr(mcp_server, "_config", config)
    monkeypatch.setattr(mcp_server, "_kg", kg)
```

chromadb is imported inside `_patch_mcp_server` but not used, and the `palace_path` parameter is currently unused. This is dead code and (for the import) will be flagged by Ruff; please remove the unused import and either use or drop the unused parameter to keep the helper minimal and clear.
```python
from mempalace.config import MempalaceConfig
from mempalace.knowledge_graph import KnowledgeGraph
```

Tests currently don’t globally isolate the user home directory. Since parts of the library (e.g., `KnowledgeGraph()`'s default path via `os.path.expanduser('~/.mempalace/...')`) can create files on import/use, consider adding an `autouse=True` fixture here that sets HOME (and Windows equivalents) to a temp dir for the duration of the test session to guarantee no writes to a real user profile.

Suggested change (note: this also requires `os`, `shutil`, `tempfile`, and `pytest` imports at the top of conftest.py):

```python
import os
import shutil
import tempfile

import pytest

_TEST_HOME_DIR = tempfile.mkdtemp(prefix="mempalace_home_")
_ORIGINAL_HOME_ENV = {
    "HOME": os.environ.get("HOME"),
    "USERPROFILE": os.environ.get("USERPROFILE"),
    "HOMEDRIVE": os.environ.get("HOMEDRIVE"),
    "HOMEPATH": os.environ.get("HOMEPATH"),
}
os.environ["HOME"] = _TEST_HOME_DIR
os.environ["USERPROFILE"] = _TEST_HOME_DIR
drive, path = os.path.splitdrive(_TEST_HOME_DIR)
os.environ["HOMEDRIVE"] = drive or ""
os.environ["HOMEPATH"] = path or _TEST_HOME_DIR

from mempalace.config import MempalaceConfig
from mempalace.knowledge_graph import KnowledgeGraph


@pytest.fixture(scope="session", autouse=True)
def isolated_home_dir():
    """Redirect user-home expansion to a temp directory for the test session."""
    yield _TEST_HOME_DIR
    for env_name, original_value in _ORIGINAL_HOME_ENV.items():
        if original_value is None:
            os.environ.pop(env_name, None)
        else:
            os.environ[env_name] = original_value
    shutil.rmtree(_TEST_HOME_DIR, ignore_errors=True)
```
```diff
 [project.scripts]
 mempalace = "mempalace:main"

-[project.optional-dependencies]
-dev = ["pytest>=7.0", "build>=1.0", "twine>=4.0"]
+[dependency-groups]
+dev = ["pytest>=7.0", "ruff>=0.4.0"]

 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"
```
Dropping `[project.optional-dependencies].dev` in favor of `[dependency-groups].dev` breaks the documented contributor install command `pip install -e ".[dev]"` (see CONTRIBUTING.md:10). Consider either updating the docs or keeping a dev extra (possibly mirroring the dependency-group) so non-uv workflows still work.
Security:
- Filter tool arguments against input_schema before dispatch (prevents arbitrary kwargs injection via **tool_args unpacking)
- Add length validation to all string parameters (content, query, wing, room, agent_name, entity, subject, predicate, object)
- Clamp integer parameters (limit, last_n) to [1, 100]

Constants:
- MAX_CONTENT_LENGTH = 50,000 chars
- MAX_QUERY_LENGTH = 2,000 chars
- MAX_FIELD_LENGTH = 200 chars
- MAX_RESULTS_LIMIT = 100

Includes test infrastructure from PR MemPalace#131 (uv migration + expanded tests). New validation tests: extra kwargs filtering, oversized input rejection, limit clamping. 96 tests pass.
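The filtering and clamping described above can be sketched as follows. This is a minimal illustration, not the actual mempalace implementation; the function names `filter_tool_args` and `clamp_limit` are hypothetical, while the constant values come from the commit message:

```python
# Constants from the commit message; helper names are illustrative.
MAX_CONTENT_LENGTH = 50_000
MAX_QUERY_LENGTH = 2_000
MAX_FIELD_LENGTH = 200
MAX_RESULTS_LIMIT = 100


def filter_tool_args(tool_args: dict, input_schema: dict) -> dict:
    """Drop any kwargs not declared in the tool's input_schema,
    so **tool_args unpacking cannot inject arbitrary parameters."""
    allowed = set(input_schema.get("properties", {}))
    return {k: v for k, v in tool_args.items() if k in allowed}


def clamp_limit(value: int, lo: int = 1, hi: int = MAX_RESULTS_LIMIT) -> int:
    """Clamp integer parameters such as `limit`/`last_n` into [1, 100]."""
    return max(lo, min(hi, value))
```

Filtering before dispatch means an unexpected key is silently dropped rather than raising a `TypeError` inside the handler.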
… call

The MCP server previously created a new PersistentClient on every tool call via _get_collection(). This incurs HNSW index loading overhead on each request. Cache the client and collection at module level. The cache resets naturally on process restart (MCP runs as a subprocess).

Also adds a _reset_mcp_cache fixture to conftest.py for test isolation.

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
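The module-level cache pattern described in this commit looks roughly like the sketch below. It deliberately avoids importing chromadb: `make_client` stands in for `chromadb.PersistentClient(path=...)`, and `reset_cache` plays the role of the `_reset_mcp_cache` fixture; both names are illustrative:

```python
# Module-level cache: the expensive client is built once per process.
_client = None
_collection = None


def get_collection(make_client, name="palace"):
    """Return the cached collection, constructing the client only on first call."""
    global _client, _collection
    if _collection is None:
        _client = make_client()  # e.g. chromadb.PersistentClient(path=...)
        _collection = _client.get_or_create_collection(name)
    return _collection


def reset_cache():
    """Clear the cache (what a _reset_mcp_cache test fixture would do)."""
    global _client, _collection
    _client = None
    _collection = None
```

Since the MCP server runs as a subprocess, the cache lifetime naturally matches the session; only tests need the explicit reset.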
- Enable WAL journal mode in _conn() for better concurrent read performance and reduced SQLITE_BUSY risk
- Add LIMIT 100 to entity-filtered timeline query (was unbounded, while global timeline already had LIMIT 100)

Findings: MemPalace#8 (HIGH — no WAL mode), MemPalace#22 (LOW — inconsistent limits)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
Prevents OOM when the palace grows large. The following unbounded metadata fetches now have a safety cap:
- tool_status: col.get(include=['metadatas'], limit=10000)
- tool_list_wings: same
- tool_list_rooms: same (including wing-filtered variant)
- tool_get_taxonomy: same
- Layer1.generate: col.get(include=['documents','metadatas'], limit=10000)

Layer2 already had a limit parameter — no change needed.

Finding: MemPalace#3 (CRITICAL — unbounded data fetching causes OOM)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
- Remove palace_path from _no_palace() error response (prevents leaking filesystem paths to the LLM)
- Replace str(e) with generic 'Internal tool error' in MCP dispatch catch block (full error is still logged server-side via stderr)
- Replace sys.exit(1) with return in searcher.search() CLI function (prevents process termination if called from library context)
- Remove unused sys import from searcher.py

Findings: MemPalace#12 (HIGH), MemPalace#5 (MEDIUM), MemPalace#15 (LOW)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
MCP tool_add_drawer:
- Make drawer_id content-based: hash full content instead of content[:100] + timestamp. Same content → same ID, eliminating TOCTOU race conditions
- Switch from col.add() to col.upsert() so re-filing with updated content updates the existing drawer

miner.add_drawer:
- Switch from collection.add() to collection.upsert() so re-mining a modified file updates instead of silently failing
- Remove the try/except catching 'already exists' — upsert handles this naturally

Findings: MemPalace#11 (HIGH — add ignores updates), MemPalace#6 (MEDIUM — TOCTOU), MemPalace#13 (MEDIUM — non-deterministic IDs)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
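A content-based ID as described above can be derived with a hash of the full content. This sketch is illustrative only: the `drawer_` prefix, the 16-character truncation, and the function name are assumptions, not mempalace's actual scheme:

```python
import hashlib


def drawer_id_for(content: str) -> str:
    """Deterministic drawer ID: same content always yields the same ID.

    Hashing the full content (rather than content[:100] + a timestamp)
    removes the time-of-check/time-of-use window, and pairing it with
    upsert() makes re-filing the same content idempotent.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return "drawer_" + digest[:16]  # prefix/truncation are illustrative
```

With deterministic IDs, `upsert()` turns "add or update" into a single operation, which is why the old 'already exists' try/except becomes unnecessary.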
The save hook uses SESSION_ID in file paths (state_dir/). A crafted session_id value like '../../etc/cron.d/evil' could write state files outside the intended directory.

Strip everything except [a-zA-Z0-9_-] from SESSION_ID, defaulting to 'unknown' if empty after sanitization.

Finding: MemPalace#4 (HIGH — path traversal via SESSION_ID)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
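The sanitization described in this commit amounts to a single regex substitution with a fallback. A minimal sketch (the function name is illustrative; the character class and the 'unknown' default are from the commit message):

```python
import re


def sanitize_session_id(session_id: str) -> str:
    """Keep only [a-zA-Z0-9_-]; fall back to 'unknown' if nothing survives.

    Removing '.', '/', and '\\' outright (rather than trying to detect
    '..' sequences) makes path traversal structurally impossible.
    """
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "", session_id)
    return cleaned or "unknown"
```

An allow-list like this is safer than a deny-list of known-bad sequences, since there is no encoding trick that reintroduces a path separator.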
- Tighten chromadb dependency from >=0.4.0,<1 to >=0.5.0,<0.7 (the collection API changed significantly across majors; this pins to the tested range)
- Add optional 'spellcheck' extras for the undeclared autocorrect dependency used in spellcheck.py
- Add PEP 561 py.typed marker for type checker support

Findings: MemPalace#10 (HIGH — chromadb range too wide), MemPalace#30 (LOW — undeclared autocorrect), MemPalace#32 (LOW — missing py.typed)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
Hey Igor — CI is failing, looks like it needs a rebase on main (we've merged a bunch of changes recently). Can you rebase and push?
- Migrate from setuptools to hatchling build backend
- Add dependency-groups (PEP 735) for dev tooling (pytest, ruff)
- Remove redundant requirements.txt in favor of uv.lock
- Fix __version__ mismatch (2.0.0 -> 3.0.0 to match pyproject.toml)

New test files:
- conftest.py: shared fixtures (isolated palace, KG, ChromaDB collection)
- test_knowledge_graph.py: 17 tests (entity CRUD, temporal queries, timeline)
- test_mcp_server.py: 25 tests (protocol dispatch, read/write/KG/diary tools)
- test_searcher.py: 7 tests (search_memories API, filters, error handling)
- test_dialect.py: 13 tests (AAAK compression, entity/emotion detection, zettel encoding)

All 92 tests pass on Python 3.13 with chromadb 0.6.3.
…tests, restore dev extra

Force-pushed from 135d4c6 to cd8b245.
- Switch CI install step from `pip install -r requirements.txt` to `pip install -e ".[dev]"` since requirements.txt was removed
- Add noqa: E402 to intentionally-late imports in conftest.py (HOME must be isolated before mempalace imports)
- Remove unused KnowledgeGraph import in test_knowledge_graph.py
- Apply ruff formatting to test files
CI should pass now
PR MemPalace#147 renamed compression_stats fields (ratio -> size_ratio, compressed_chars -> summary_chars) and switched count_tokens to a word-based heuristic, but the test_dialect tests from PR MemPalace#131 still assert the old API and fail on main.

Bring TestCompressionStats.test_stats in line with the current dict keys (size_ratio, summary_chars, summary_tokens_est) and update test_count_tokens to match the word-based formula, with extra coverage for the empty and single-word edge cases around max(1, ...).

This unblocks CI on main, which currently fails on these two tests.
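The commit only says the new count_tokens is "word-based" with a max(1, ...) floor; the exact formula is not shown here. A plausible sketch of such a heuristic, purely to illustrate the edge cases the new tests cover (this is an assumption, not mempalace's actual implementation, which may apply a tokens-per-word multiplier):

```python
def count_tokens(text: str) -> int:
    """Word-based token estimate (assumed formula).

    max(1, ...) guarantees a floor of 1, so empty input and a
    single word both report at least one token.
    """
    return max(1, len(text.split()))
```

Whatever the exact formula, the max(1, ...) guard is why the empty and single-word cases need explicit test coverage: both collapse to the floor value.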
Summary
Expands the test suite from 20 → 92 tests covering previously untested modules, and migrates the project to uv for dependency management.
Test Coverage Added
- conftest.py: shared fixtures (isolated palace, KG, ChromaDB collection)
- test_knowledge_graph.py: entity CRUD, temporal queries (as_of), invalidation, timeline limits, stats
- test_mcp_server.py: protocol dispatch and tool handlers (read/write/KG/diary tools)
- test_searcher.py: search_memories API: wing/room filters, result limits, error handling, response shape
- test_dialect.py: AAAK compression, entity/emotion detection, zettel encoding
- setuptools → hatchling (lighter, uv-native)
- Dev dependencies moved to [dependency-groups] (PEP 735): pytest>=7.0, ruff>=0.4.0
- Removed requirements.txt (redundant with uv.lock)
- Added uv.lock for reproducible installs

Bug Fix (trivial)
- __version__ mismatch: __init__.py said 2.0.0 while pyproject.toml said 3.0.0

How to run
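The original run instructions are not preserved in this scrape; given the uv migration described above, the suite would plausibly be run along these lines (commands assumed, not quoted from the PR):

```shell
# Install the project plus the dev dependency group, then run the tests
uv sync
uv run pytest
```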
All 92 tests pass on Python 3.13 with chromadb 0.6.3 (~16s).