fix: security hardening, test expansion (640→719), and 5 open issue fixes (#535, #536, #521, #478, #531) by mrdeeme · Pull Request #542 · MemPalace/mempalace

mrdeeme · 2026-04-10T15:00:47Z

Summary

Comprehensive audit producing 12 code fixes, 79 new tests (640 → 719 total), and 5 additional bug fixes for open issues. All 612 unit tests pass.

Commit 1: Security hardening, concurrency safety, and test suite expansion

Code Fixes (12 total)

Security (`mcp_server.py`)

Sanitize 8 str(e) leaks in error responses → generic messages + logger.error
Remove duplicate _client_cache declaration
Update docstring to reflect all 19 tools

Data Integrity (`convo_miner.py`)

Replace collection.add() with collection.upsert() to prevent duplicate ID crashes
Fix diary_write to use upsert instead of add

Concurrency (`knowledge_graph.py`)

Add threading.Lock for thread-safe SQLite access in WAL mode

Performance (`palace_graph.py`)

Add TTL cache (30s) + count-based invalidation for build_graph()
Replace hardcoded limit=10000 with _iter_all_metadata() batched pagination (1000/batch)
Singleton ChromaDB client keyed by palace_path (was creating new client per call)

Error Handling (`mcp_server.py`)

Replace bare except Exception: pass with proper error handling + warning field

Cleanup (`split_mega_files.py`)

Remove hardcoded personal names from fallback list → empty list

Test Suite Expansion (640 → 719 tests, +79 net new)

Category	File	Tests
New	`test_palace.py`	15 — `get_collection`, `file_already_mined`, `SKIP_DIRS`
New	`test_concurrency.py`	7 — parallel search, concurrent KG writes, MCP
New	`test_failure_modes.py`	16 — error paths: palace, config, searcher, KG, MCP
Expanded	`test_config.py`	4 → 24 — precedence, sanitize_name/content, invalid configs
Expanded	`test_convo_miner.py`	1 → 12 — decomposed monolith into focused scenarios
Expanded	`test_version_consistency.py`	2 → 7 — semver, module consistency, server info

Benchmark Assertion Thresholds (5 files)

test_memory_profile.py: RSS growth limits (<100MB/<150MB)
test_search_bench.py: concurrent error count (0) + p95 latency (<5s)
test_recall_threshold.py: recall@10 ≥ 50% at ≤1000 drawers
test_palace_boost.py: wing/room boost ≥ −10% degradation
test_chromadb_stress.py: batch insertion faster than sequential

Commit 2: Fix 5 open issues (#535, #536, #521, #478, #531)

#535 — UnicodeEncodeError on Windows (cp1251/cp1252)

Add sys.stdout/stderr.reconfigure(encoding='utf-8', errors='replace') at the CLI entry point (cli.py), fixing checkmark (U+2713) and CJK crashes across all subcommands.

#536 — Overly broad emotion regex misclassifies 66% of content

Remove wildcard regex \*[^*]+\* from EMOTION_MARKERS in general_extractor.py. This matched all Markdown bold/italic, routing technical content to the emotional room.

#521 — hnswlib SIGSEGV on modified-file re-mine (macOS ARM64)

Add collection.delete(where={'source_file': ...}) before the upsert loop in miner.py::process_file(). Converts re-mines from upsert-over-existing (which hits hnswlib's thread-unsafe updatePoint/repairConnectionsForUpdate) into clean delete+insert.

#478 — `miner.py::status()` truncates at 10,000 drawers

Replace hardcoded limit=10000 with batched pagination (1000/batch), matching the fix already applied to mcp_server.py.

#531 — README chromadb version mismatch

Update README from chromadb>=0.4.0 to chromadb>=0.5.0,<0.7 to match pyproject.toml.

Validation

612 unit tests passed, 0 failed, 1 skipped (Unix-only permissions on Windows)
719 total tests collected (unit + benchmark)

Closes #535, closes #536, closes #521, closes #478, closes #531.

Copilot

Pull request overview

This PR focuses on hardening MemPalace’s core runtime (MCP server, palace graph, knowledge graph, convo miner) while substantially expanding test coverage, including new concurrency and failure-mode suites and additional benchmark assertions.

Changes:

Harden MCP server responses and metadata scanning (paged iteration, warning fields, reduced error-detail leakage).
Improve data integrity and performance (Chroma upserts; palace graph caching; reduced full-scan patterns).
Expand automated testing (+new integration, concurrency, and failure-mode tests; benchmark threshold assertions).

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`mempalace/mcp_server.py`	Adds paginated metadata iteration + safer error handling/messages; updates tool surface documentation.
`mempalace/palace_graph.py`	Introduces TTL/count-based caching and a per-`palace_path` client cache to reduce repeated O(N) scans.
`mempalace/knowledge_graph.py`	Adds a lock and helper methods intended to improve SQLite concurrency safety.
`mempalace/convo_miner.py`	Switches from `add()` to `upsert()` when writing drawers to avoid duplicate-ID crashes.
`mempalace/split_mega_files.py`	Removes hardcoded fallback personal names.
`tests/test_palace.py`	New tests for shared palace utilities (collection creation, permissions, mined-file detection).
`tests/test_concurrency.py`	New tests exercising concurrent search, MCP calls, and KG operations.
`tests/test_failure_modes.py`	New tests covering degraded/error scenarios across palace/config/search/KG/MCP.
`tests/test_convo_miner.py`	Refactors/expands convo miner tests into multiple focused integration scenarios.
`tests/test_config.py`	Adds config precedence/edge-case tests plus sanitizer unit tests.
`tests/test_version_consistency.py`	Expands version consistency checks across package/module/MCP initialize response.
`tests/benchmarks/test_search_bench.py`	Adds concurrency error and p95 latency thresholds (benchmark-only).
`tests/benchmarks/test_recall_threshold.py`	Adds recall@10 minimum threshold at small scales (benchmark-only).
`tests/benchmarks/test_palace_boost.py`	Adds minimum “boost” threshold for filtering degradation (benchmark-only).
`tests/benchmarks/test_memory_profile.py`	Adds RSS growth thresholds for leak detection (benchmark-only).
`tests/benchmarks/test_chromadb_stress.py`	Adds speedup assertion for batched insertion vs sequential (benchmark-only).
`.gitignore`	Adds an ignore entry related to the new instructions file.
`.github/instructions/codacy.instructions.md`	Adds Codacy MCP-server instruction content.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T15:08:44Z

+        """JSON array instead of object — .get() will fail, but config
+        catches the exception and falls back to defaults."""
+        (tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")
+        cfg = MempalaceConfig(config_dir=str(tmp_path))
+        # _file_config will be [1,2,3] — accessing .get() raises AttributeError
+        # The config is loaded successfully but properties that call .get() may fail.
+        # Verify the property doesn't crash at construction time.
+        assert cfg is not None
+        # Accessing collection_name will call list.get() which raises.
+        # This reveals the config doesn't guard against non-dict JSON.
+        with pytest.raises(AttributeError):
+            _ = cfg.collection_name


This test currently asserts that a malformed config.json (JSON array) leads to an AttributeError when accessing collection_name. That bakes in a crashy behavior for invalid configs; it would be more robust to harden MempalaceConfig to treat non-dict JSON as {} and update this test to assert fallback defaults instead of expecting an exception.

Suggested change

"""JSON array instead of object — .get() will fail, but config

catches the exception and falls back to defaults."""

(tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")

cfg = MempalaceConfig(config_dir=str(tmp_path))

# _file_config will be [1,2,3] — accessing .get() raises AttributeError

# The config is loaded successfully but properties that call .get() may fail.

# Verify the property doesn't crash at construction time.

assert cfg is not None

# Accessing collection_name will call list.get() which raises.

# This reveals the config doesn't guard against non-dict JSON.

with pytest.raises(AttributeError):

_ = cfg.collection_name

"""JSON array instead of object should fallback to default config values."""

(tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")

cfg = MempalaceConfig(config_dir=str(tmp_path))

default_dir = tmp_path / "defaults"

default_dir.mkdir()

default_cfg = MempalaceConfig(config_dir=str(default_dir))

# Malformed-but-valid JSON that is not an object should be treated like

# an empty config and use the same defaults as a clean config dir.

assert cfg is not None

assert cfg.collection_name == default_cfg.collection_name

assert cfg.palace_path == default_cfg.palace_path

Copilot · 2026-04-10T15:08:45Z

-
-
-# ==================== WRITE-AHEAD LOG ====================
+# ==================== WRITE-AHEAD LOG ==


Section header comment appears to have a typo/inconsistency (# ==================== WRITE-AHEAD LOG ==). Consider keeping the delimiter consistent (same number of = as other section headers) to avoid accidental grep/format mismatches.

Suggested change

# ==================== WRITE-AHEAD LOG ==

# ==================== WRITE-AHEAD LOG ====================

Copilot · 2026-04-10T15:08:45Z

+import pytest
+


pytest is imported but not used in this module. If Ruff/pyflakes linting runs on tests, this will raise an unused-import error (F401).

Suggested change

import pytest

Copilot · 2026-04-10T15:08:46Z

 import tempfile
 import shutil


tempfile and shutil are imported but not used in this updated test file. Consider removing them to avoid unused-import lint failures (e.g., Ruff F401) if tests are linted.

Suggested change

import tempfile

import shutil

Copilot · 2026-04-10T15:08:46Z

+
+import threading
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from unittest.mock import MagicMock, patch


MagicMock and patch are imported but not used anywhere in this module. Consider removing them to avoid unused-import lint failures (e.g., Ruff F401).

Suggested change

from unittest.mock import MagicMock, patch

Copilot · 2026-04-10T15:08:46Z

+
+import json
+import os
+from unittest.mock import MagicMock, patch, PropertyMock


PropertyMock is imported but not used in this module. Consider removing it to avoid unused-import lint failures (e.g., Ruff F401).

Suggested change

from unittest.mock import MagicMock, patch, PropertyMock

from unittest.mock import MagicMock, patch

Copilot · 2026-04-10T15:08:47Z

+    def _execute(self, sql, params=()):
+        """Thread-safe execute wrapper."""
+        with self._lock:
+            return self._conn().execute(sql, params)
+


Potential deadlock: _execute() acquires self._lock and then calls _conn(), which also acquires self._lock (non-reentrant threading.Lock). Any call path using _execute/_executemany/_commit can hang. Consider switching to threading.RLock() or restructuring so _conn() doesn’t lock, and then ensure all DB operations consistently go through the locked helpers (currently many methods still call conn.execute(...) directly).

Copilot · 2026-04-10T15:08:47Z

+    description: Configuration for AI behavior when interacting with Codacy's MCP Server
+    applyTo: '**'
+---
+---


The top of this file has an extra --- separator (line 5) and the YAML front-matter keys are indented. Many front-matter parsers expect a single --- ... --- block with top-level keys; please fix the front matter so tooling that consumes .github/instructions/*.md can parse it reliably.

Suggested change

description: Configuration for AI behavior when interacting with Codacy's MCP Server

applyTo: '**'

---

---

description: Configuration for AI behavior when interacting with Codacy's MCP Server

applyTo: '**'

---

web3guru888

Review — PR #542 (mrdeeme)

Solid contribution. Saw your threading work in #459 and this is a clear step up in scope. Here's my take:

Strengths:

Paginated _iter_all_metadata() — this is the correct fix for #478. PR #539 took a caching approach that just cached the truncated result; batched pagination actually resolves the root cause. 1000-item batches are sane for typical palace sizes.
KG threading.Lock() — concurrent MCP tool calls absolutely can race on SQLite writes. Good catch. The _execute()/_executemany()/_commit() wrapper pattern keeps the locking contained.
add() → upsert() in convo_miner — this was flagged as a correctness issue back in the #298 thread. Right fix.
79 new tests covering concurrency, failure modes, and benchmark thresholds — this alone would be worth merging. test_concurrency.py in particular fills a real gap.
#521 (hnswlib SIGSEGV): delete-before-upsert is the right approach given hnswlib's non-thread-safe updatePoint. Clean.

Concerns:

#536 overlap with #543 — both PRs remove \*[^*]+\* from EMOTION_MARKERS, but #543 goes further: it replaces with a more targeted emote pattern and adds _strip_markdown(). This PR's approach (bare removal) is simpler but may over-remove. The maintainer should resolve the overlap before merging either.
Codacy instructions file — .github/instructions/codacy.instructions.md is Copilot agent context, not repo documentation. It shouldn't be committed to the public repo. The .gitignore entry also uses a Windows-style backslash path separator (instructions\codacy...) which won't match on Linux/macOS — so it won't actually be ignored on those platforms even if it were intentional.
Minor: split_mega_files.py removing hardcoded personal names from the fallback list → empty list is correct (those shouldn't ship), but worth a changelog note if any users depend on that heuristic.

Bottom line: The core of this PR (pagination fix, KG locking, upsert, test suite) is merge-worthy. I'd ask for the Codacy file to be dropped and the #536 approach coordinated with the author of #543 before merge.

[MemPalace-AGI integration — 208 discoveries, 710 KG entities | dashboard]

web3guru888 · 2026-04-10T16:33:48Z

Solid security hardening PR — let me break down what stands out.

str(e) sanitization in MCP error responses — This is important work. Error message leaks in MCP stdio responses can expose internal paths, file names, and—in the worst case—partial secrets embedded in exception strings. The pattern of switching to generic user-facing messages while routing full context to logger.error is exactly right. Glad to see this applied consistently across 8 call sites rather than patched piecemeal.

add() → upsert() in convo_miner.py — Real correctness fix. Multiple mine runs on the same conversation would previously crash on duplicate IDs, which means any retry or re-index workflow was broken. Worth validating that the upsert is truly idempotent: if the same conversation produces the same content hash, the upsert should be a no-op. If the hash changes between runs (e.g., due to timestamp inclusion), you'd get silent data mutation instead of a crash — which is harder to debug. Worth a quick sanity check on the hashing logic.

threading.Lock for SQLite in WAL mode — Correct. WAL provides read/write concurrency at the SQLite level, but Python-level threading still needs coordination around the connection object. The lock closes that gap.

Issue coverage — Fixing #535, #536, #521, #478, and #531 in one pass is efficient for maintainers. The 79 new tests (640 → 719) are the right safety net for a wide-surface diff. One caution though: mega-PRs that touch security, concurrency, and data integrity simultaneously can obscure interactions between changes. I'd suggest the reviewer runs the full test suite on just the security-only commit (the str(e) sanitization) in isolation — confirming the error-suppression changes don't accidentally mask real failures during development. It's easy for a generic-message wrapper to hide a logic error if the surrounding test doesn't assert on the error path.

Overall: The str(e) sanitization alone would be worth merging; getting five issue fixes in the same sweep with full test coverage is a real contribution. Nice work @mrdeeme.

[MemPalace-AGI integration — production stats at https://milla-jovovich.github.io/mempalace/integrations/mempalace-agi/]

bensig · 2026-04-11T06:17:53Z

@mrdeeme conflicts here

… (640->719 tests) - Sanitize 8 str(e) leaks in error responses -> generic messages + logger.error - Remove duplicate _client_cache declaration - Update docstring to reflect all 19 tools - Replace collection.add() with collection.upsert() to prevent duplicate ID crashes - Fix diary_write to use upsert instead of add - Add threading.Lock for thread-safe SQLite access in WAL mode - Add TTL cache (30s) + count-based invalidation for build_graph() - Replace hardcoded limit=10000 with _iter_all_metadata() batched pagination (1000/batch) - Singleton ChromaDB client keyed by palace_path (was creating new client per call) - Replace bare 'except Exception: pass' with proper error handling + warning field - Remove hardcoded personal names from fallback list -> empty list - test_palace.py: 15 tests for previously untested palace.py core module - test_concurrency.py: 7 tests for thread safety (parallel search, KG writes, MCP) - test_failure_modes.py: 16 tests for error paths (palace, config, searcher, KG, MCP) - test_config.py: 4 -> 24 tests (precedence, sanitize_name/content, invalid configs) - test_convo_miner.py: 1 -> 12 tests (decomposed monolith into focused scenarios) - test_version_consistency.py: 2 -> 7 tests (semver, module consistency, server info) - test_memory_profile.py: RSS growth limits (<100MB/<150MB) - test_search_bench.py: concurrent error count (0) + p95 latency (<5s) - test_recall_threshold.py: recall@10 >= 50% at <=1000 drawers - test_palace_boost.py: wing/room boost >= -10% degradation - test_chromadb_stress.py: batch insertion faster than sequential - .github/instructions/codacy.instructions.md: Codacy MCP integration config - .gitignore: updated entries All 612 unit tests pass (1 skipped: Unix-only permissions on Windows).

…, MemPalace#478, MemPalace#531) - Add sys.stdout/stderr.reconfigure(encoding='utf-8', errors='replace') at CLI entry point (cli.py), fixing checkmark (U+2713) and CJK crashes in miner.py, convo_miner.py, split_mega_files.py, and entity_detector.py. - Remove wildcard regex \\\*[^*]+\\*\ from EMOTION_MARKERS in general_extractor.py. This pattern matched all Markdown bold/italic text, routing technical content to the 'emotional' room. - Add collection.delete(where={'source_file': ...}) before the upsert loop in miner.py::process_file(). Converts re-mines from upsert-over-existing (which hits hnswlib's thread-unsafe updatePoint/repairConnectionsForUpdate path) into clean delete+insert, bypassing the race entirely. - Replace hardcoded limit=10000 with batched pagination (1000/batch) in miner.py::status(), matching the fix already applied to mcp_server.py. - Update README from 'chromadb>=0.4.0' to 'chromadb>=0.5.0,<0.7' to match pyproject.toml. All 612 unit tests pass.

…rdening - Fix potential deadlock in KnowledgeGraph: threading.Lock → threading.RLock (reentrant) since _execute/_commit call _conn which also acquires the lock - Harden MempalaceConfig: non-dict JSON (e.g. [1,2,3]) now treated as empty config instead of crashing on .get() calls - Remove codacy.instructions.md from repo (not suited for public PRs) - Fix inconsistent section header in mcp_server.py - Remove unused imports: PropertyMock, MagicMock, patch, tempfile, shutil, pytest - Update test_json_array_instead_of_object to assert fallback defaults

…, MemPalace#549, MemPalace#541, MemPalace#608) - MemPalace#586: dry-run crash with TypeError when room=None — return 'general' fallback - MemPalace#602: claude.ai export sender/role field mismatch + large file skip warning - MemPalace#477: tool_search limit clamped to [1, 100] to prevent memory exhaustion - MemPalace#549: save hook no longer counts tool_result messages as human exchanges - MemPalace#541: remove non-existent Discussions reference from CONTRIBUTING.md - MemPalace#608: mtime-based cache invalidation for stale HNSW search results

…emPalace#477 MemPalace#549 MemPalace#608) - test_miner: process_file returns 'general' (not None) for tiny/unreadable files - test_normalize: sender fallback in _try_claude_ai_json (flat, privacy, mixed) - test_mcp_server: limit clamp (high/low/negative), cache invalidation (mtime/rate-limit/clear) - test_convo_miner: oversized file prints warning message - conftest: reset _cache_sqlite_mtime and _cache_last_check between tests - lint: fix unused imports (MIN_CHUNK_SIZE, os) and add missing pytest import

bensig · 2026-04-12T04:41:31Z

Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution!

Copilot AI review requested due to automatic review settings April 10, 2026 15:00

Copilot started reviewing on behalf of mrdeeme April 10, 2026 15:01 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

mrdeeme changed the title ~~fix: security hardening, concurrency safety, and test suite expansion (640→719 tests)~~ fix: security hardening, test expansion (640→719), and 5 open issue fixes (#535, #536, #521, #478, #531) Apr 10, 2026

This was referenced Apr 10, 2026

fix: strip Markdown before emotion scoring to prevent false classification #543

Open

fix: security hardening, performance, and input validation #539

Closed

web3guru888 reviewed Apr 10, 2026

View reviewed changes

web3guru888 mentioned this pull request Apr 10, 2026

--extract general dumps Markdown-bold content into emotional room via overly broad \*[^*]+\* regex #536

Closed

web3guru888 mentioned this pull request Apr 10, 2026

feat: add LanceDB backend abstraction and migration path #574

Open

mrdeeme added 3 commits April 11, 2026 08:01

mrdeeme force-pushed the main branch from 6f9acfb to 4b62093 Compare April 11, 2026 11:05

mrdeeme requested review from bensig and milla-jovovich as code owners April 11, 2026 11:06

mrdeeme added 2 commits April 11, 2026 08:44

bensig changed the base branch from main to develop April 11, 2026 22:21

bensig requested a review from igorls as a code owner April 11, 2026 22:21

bensig closed this Apr 12, 2026



		# ==================== WRITE-AHEAD LOG ====================
		# ==================== WRITE-AHEAD LOG ==

	from unittest.mock import MagicMock, patch, PropertyMock
	from unittest.mock import MagicMock, patch

Conversation

mrdeeme commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Commit 1: Security hardening, concurrency safety, and test suite expansion

Code Fixes (12 total)

Security (mcp_server.py)

Data Integrity (convo_miner.py)

Concurrency (knowledge_graph.py)

Performance (palace_graph.py)

Error Handling (mcp_server.py)

Cleanup (split_mega_files.py)

Test Suite Expansion (640 → 719 tests, +79 net new)

Benchmark Assertion Thresholds (5 files)

Commit 2: Fix 5 open issues (#535, #536, #521, #478, #531)

#535 — UnicodeEncodeError on Windows (cp1251/cp1252)

#536 — Overly broad emotion regex misclassifies 66% of content

#521 — hnswlib SIGSEGV on modified-file re-mine (macOS ARM64)

#478 — miner.py::status() truncates at 10,000 drawers

#531 — README chromadb version mismatch

Validation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

web3guru888 left a comment

Choose a reason for hiding this comment

Uh oh!

web3guru888 commented Apr 10, 2026

Uh oh!

bensig commented Apr 11, 2026

Uh oh!

bensig commented Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mrdeeme commented Apr 10, 2026 •

edited

Loading

Security (`mcp_server.py`)

Data Integrity (`convo_miner.py`)

Concurrency (`knowledge_graph.py`)

Performance (`palace_graph.py`)

Error Handling (`mcp_server.py`)

Cleanup (`split_mega_files.py`)

#478 — `miner.py::status()` truncates at 10,000 drawers