Skip to content

fix: security hardening, test expansion (640→719), and 5 open issue fixes (#535, #536, #521, #478, #531)#542

Closed
mrdeeme wants to merge 5 commits intoMemPalace:developfrom
mrdeeme:main
Closed

fix: security hardening, test expansion (640→719), and 5 open issue fixes (#535, #536, #521, #478, #531)#542
mrdeeme wants to merge 5 commits intoMemPalace:developfrom
mrdeeme:main

Conversation

@mrdeeme
Copy link
Copy Markdown

@mrdeeme mrdeeme commented Apr 10, 2026

Summary

Comprehensive audit producing 12 code fixes, 79 new tests (640 → 719 total), and 5 additional bug fixes for open issues. All 612 unit tests pass.


Commit 1: Security hardening, concurrency safety, and test suite expansion

Code Fixes (12 total)

Security (mcp_server.py)

  • Sanitize 8 str(e) leaks in error responses → generic messages + logger.error
  • Remove duplicate _client_cache declaration
  • Update docstring to reflect all 19 tools

Data Integrity (convo_miner.py)

  • Replace collection.add() with collection.upsert() to prevent duplicate ID crashes
  • Fix diary_write to use upsert instead of add

Concurrency (knowledge_graph.py)

  • Add threading.Lock for thread-safe SQLite access in WAL mode

Performance (palace_graph.py)

  • Add TTL cache (30s) + count-based invalidation for build_graph()
  • Replace hardcoded limit=10000 with _iter_all_metadata() batched pagination (1000/batch)
  • Singleton ChromaDB client keyed by palace_path (was creating new client per call)

Error Handling (mcp_server.py)

  • Replace bare except Exception: pass with proper error handling + warning field

Cleanup (split_mega_files.py)

  • Remove hardcoded personal names from fallback list → empty list

Test Suite Expansion (640 → 719 tests, +79 net new)

Category File Tests
New test_palace.py 15 — get_collection, file_already_mined, SKIP_DIRS
New test_concurrency.py 7 — parallel search, concurrent KG writes, MCP
New test_failure_modes.py 16 — error paths: palace, config, searcher, KG, MCP
Expanded test_config.py 4 → 24 — precedence, sanitize_name/content, invalid configs
Expanded test_convo_miner.py 1 → 12 — decomposed monolith into focused scenarios
Expanded test_version_consistency.py 2 → 7 — semver, module consistency, server info

Benchmark Assertion Thresholds (5 files)

  • test_memory_profile.py: RSS growth limits (<100MB/<150MB)
  • test_search_bench.py: concurrent error count (0) + p95 latency (<5s)
  • test_recall_threshold.py: recall@10 ≥ 50% at ≤1000 drawers
  • test_palace_boost.py: wing/room boost ≥ −10% degradation
  • test_chromadb_stress.py: batch insertion faster than sequential

Commit 2: Fix 5 open issues (#535, #536, #521, #478, #531)

#535 — UnicodeEncodeError on Windows (cp1251/cp1252)

Add sys.stdout/stderr.reconfigure(encoding='utf-8', errors='replace') at the CLI entry point (cli.py), fixing checkmark (U+2713) and CJK crashes across all subcommands.

#536 — Overly broad emotion regex misclassifies 66% of content

Remove wildcard regex \*[^*]+\* from EMOTION_MARKERS in general_extractor.py. This matched all Markdown bold/italic, routing technical content to the emotional room.

#521 — hnswlib SIGSEGV on modified-file re-mine (macOS ARM64)

Add collection.delete(where={'source_file': ...}) before the upsert loop in miner.py::process_file(). Converts re-mines from upsert-over-existing (which hits hnswlib's thread-unsafe updatePoint/repairConnectionsForUpdate) into clean delete+insert.

#478miner.py::status() truncates at 10,000 drawers

Replace hardcoded limit=10000 with batched pagination (1000/batch), matching the fix already applied to mcp_server.py.

#531 — README chromadb version mismatch

Update README from chromadb>=0.4.0 to chromadb>=0.5.0,<0.7 to match pyproject.toml.


Validation

  • 612 unit tests passed, 0 failed, 1 skipped (Unix-only permissions on Windows)
  • 719 total tests collected (unit + benchmark)

Closes #535, closes #536, closes #521, closes #478, closes #531.

Copilot AI review requested due to automatic review settings April 10, 2026 15:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on hardening MemPalace’s core runtime (MCP server, palace graph, knowledge graph, convo miner) while substantially expanding test coverage, including new concurrency and failure-mode suites and additional benchmark assertions.

Changes:

  • Harden MCP server responses and metadata scanning (paged iteration, warning fields, reduced error-detail leakage).
  • Improve data integrity and performance (Chroma upserts; palace graph caching; reduced full-scan patterns).
  • Expand automated testing (+new integration, concurrency, and failure-mode tests; benchmark threshold assertions).

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
mempalace/mcp_server.py Adds paginated metadata iteration + safer error handling/messages; updates tool surface documentation.
mempalace/palace_graph.py Introduces TTL/count-based caching and a per-palace_path client cache to reduce repeated O(N) scans.
mempalace/knowledge_graph.py Adds a lock and helper methods intended to improve SQLite concurrency safety.
mempalace/convo_miner.py Switches from add() to upsert() when writing drawers to avoid duplicate-ID crashes.
mempalace/split_mega_files.py Removes hardcoded fallback personal names.
tests/test_palace.py New tests for shared palace utilities (collection creation, permissions, mined-file detection).
tests/test_concurrency.py New tests exercising concurrent search, MCP calls, and KG operations.
tests/test_failure_modes.py New tests covering degraded/error scenarios across palace/config/search/KG/MCP.
tests/test_convo_miner.py Refactors/expands convo miner tests into multiple focused integration scenarios.
tests/test_config.py Adds config precedence/edge-case tests plus sanitizer unit tests.
tests/test_version_consistency.py Expands version consistency checks across package/module/MCP initialize response.
tests/benchmarks/test_search_bench.py Adds concurrency error and p95 latency thresholds (benchmark-only).
tests/benchmarks/test_recall_threshold.py Adds recall@10 minimum threshold at small scales (benchmark-only).
tests/benchmarks/test_palace_boost.py Adds minimum “boost” threshold for filtering degradation (benchmark-only).
tests/benchmarks/test_memory_profile.py Adds RSS growth thresholds for leak detection (benchmark-only).
tests/benchmarks/test_chromadb_stress.py Adds speedup assertion for batched insertion vs sequential (benchmark-only).
.gitignore Adds an ignore entry related to the new instructions file.
.github/instructions/codacy.instructions.md Adds Codacy MCP-server instruction content.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test_failure_modes.py Outdated
Comment on lines +64 to +75
"""JSON array instead of object — .get() will fail, but config
catches the exception and falls back to defaults."""
(tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")
cfg = MempalaceConfig(config_dir=str(tmp_path))
# _file_config will be [1,2,3] — accessing .get() raises AttributeError
# The config is loaded successfully but properties that call .get() may fail.
# Verify the property doesn't crash at construction time.
assert cfg is not None
# Accessing collection_name will call list.get() which raises.
# This reveals the config doesn't guard against non-dict JSON.
with pytest.raises(AttributeError):
_ = cfg.collection_name
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test currently asserts that a malformed config.json (JSON array) leads to an AttributeError when accessing collection_name. That bakes in a crashy behavior for invalid configs; it would be more robust to harden MempalaceConfig to treat non-dict JSON as {} and update this test to assert fallback defaults instead of expecting an exception.

Suggested change
"""JSON array instead of object — .get() will fail, but config
catches the exception and falls back to defaults."""
(tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")
cfg = MempalaceConfig(config_dir=str(tmp_path))
# _file_config will be [1,2,3] — accessing .get() raises AttributeError
# The config is loaded successfully but properties that call .get() may fail.
# Verify the property doesn't crash at construction time.
assert cfg is not None
# Accessing collection_name will call list.get() which raises.
# This reveals the config doesn't guard against non-dict JSON.
with pytest.raises(AttributeError):
_ = cfg.collection_name
"""JSON array instead of object should fallback to default config values."""
(tmp_path / "config.json").write_text("[1, 2, 3]", encoding="utf-8")
cfg = MempalaceConfig(config_dir=str(tmp_path))
default_dir = tmp_path / "defaults"
default_dir.mkdir()
default_cfg = MempalaceConfig(config_dir=str(default_dir))
# Malformed-but-valid JSON that is not an object should be treated like
# an empty config and use the same defaults as a clean config dir.
assert cfg is not None
assert cfg.collection_name == default_cfg.collection_name
assert cfg.palace_path == default_cfg.palace_path

Copilot uses AI. Check for mistakes.
Comment thread mempalace/mcp_server.py Outdated


# ==================== WRITE-AHEAD LOG ====================
# ==================== WRITE-AHEAD LOG ==
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Section header comment appears to have a typo/inconsistency (# ==================== WRITE-AHEAD LOG ==). Consider keeping the delimiter consistent (same number of = as other section headers) to avoid accidental grep/format mismatches.

Suggested change
# ==================== WRITE-AHEAD LOG ==
# ==================== WRITE-AHEAD LOG ====================

Copilot uses AI. Check for mistakes.
Comment thread tests/test_version_consistency.py Outdated
Comment on lines +4 to +5
import pytest

Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest is imported but not used in this module. If Ruff/pyflakes linting runs on tests, this will raise an unused-import error (F401).

Suggested change
import pytest

Copilot uses AI. Check for mistakes.
Comment thread tests/test_convo_miner.py Outdated
Comment on lines 4 to 5
import tempfile
import shutil
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tempfile and shutil are imported but not used in this updated test file. Consider removing them to avoid unused-import lint failures (e.g., Ruff F401) if tests are linted.

Suggested change
import tempfile
import shutil

Copilot uses AI. Check for mistakes.
Comment thread tests/test_concurrency.py Outdated

import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from unittest.mock import MagicMock, patch
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MagicMock and patch are imported but not used anywhere in this module. Consider removing them to avoid unused-import lint failures (e.g., Ruff F401).

Suggested change
from unittest.mock import MagicMock, patch

Copilot uses AI. Check for mistakes.
Comment thread tests/test_failure_modes.py Outdated

import json
import os
from unittest.mock import MagicMock, patch, PropertyMock
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PropertyMock is imported but not used in this module. Consider removing it to avoid unused-import lint failures (e.g., Ruff F401).

Suggested change
from unittest.mock import MagicMock, patch, PropertyMock
from unittest.mock import MagicMock, patch

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +105
def _execute(self, sql, params=()):
"""Thread-safe execute wrapper."""
with self._lock:
return self._conn().execute(sql, params)

Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential deadlock: _execute() acquires self._lock and then calls _conn(), which also acquires self._lock (non-reentrant threading.Lock). Any call path using _execute/_executemany/_commit can hang. Consider switching to threading.RLock() or restructuring so _conn() doesn’t lock, and then ensure all DB operations consistently go through the locked helpers (currently many methods still call conn.execute(...) directly).

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +5
description: Configuration for AI behavior when interacting with Codacy's MCP Server
applyTo: '**'
---
---
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The top of this file has an extra --- separator (line 5) and the YAML front-matter keys are indented. Many front-matter parsers expect a single --- ... --- block with top-level keys; please fix the front matter so tooling that consumes .github/instructions/*.md can parse it reliably.

Suggested change
description: Configuration for AI behavior when interacting with Codacy's MCP Server
applyTo: '**'
---
---
description: Configuration for AI behavior when interacting with Codacy's MCP Server
applyTo: '**'
---

Copilot uses AI. Check for mistakes.
@mrdeeme mrdeeme changed the title fix: security hardening, concurrency safety, and test suite expansion (640→719 tests) fix: security hardening, test expansion (640→719), and 5 open issue fixes (#535, #536, #521, #478, #531) Apr 10, 2026
Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — PR #542 (mrdeeme)

Solid contribution. Saw your threading work in #459 and this is a clear step up in scope. Here's my take:

Strengths:

  • Paginated _iter_all_metadata() — this is the correct fix for #478. PR #539 took a caching approach that just cached the truncated result; batched pagination actually resolves the root cause. 1000-item batches are sane for typical palace sizes.
  • KG threading.Lock() — concurrent MCP tool calls absolutely can race on SQLite writes. Good catch. The _execute()/_executemany()/_commit() wrapper pattern keeps the locking contained.
  • add()upsert() in convo_miner — this was flagged as a correctness issue back in the #298 thread. Right fix.
  • 79 new tests covering concurrency, failure modes, and benchmark thresholds — this alone would be worth merging. test_concurrency.py in particular fills a real gap.
  • #521 (hnswlib SIGSEGV): delete-before-upsert is the right approach given hnswlib's non-thread-safe updatePoint. Clean.

Concerns:

  1. #536 overlap with #543 — both PRs remove \*[^*]+\* from EMOTION_MARKERS, but #543 goes further: it replaces with a more targeted emote pattern and adds _strip_markdown(). This PR's approach (bare removal) is simpler but may over-remove. The maintainer should resolve the overlap before merging either.

  2. Codacy instructions file.github/instructions/codacy.instructions.md is Copilot agent context, not repo documentation. It shouldn't be committed to the public repo. The .gitignore entry also uses a Windows-style backslash path separator (instructions\codacy...) which won't match on Linux/macOS — so it won't actually be ignored on those platforms even if it were intentional.

  3. Minor: split_mega_files.py removing hardcoded personal names from the fallback list → empty list is correct (those shouldn't ship), but worth a changelog note if any users depend on that heuristic.

Bottom line: The core of this PR (pagination fix, KG locking, upsert, test suite) is merge-worthy. I'd ask for the Codacy file to be dropped and the #536 approach coordinated with the author of #543 before merge.

[MemPalace-AGI integration — 208 discoveries, 710 KG entities | dashboard]

@web3guru888
Copy link
Copy Markdown

Solid security hardening PR — let me break down what stands out.

str(e) sanitization in MCP error responses — This is important work. Error message leaks in MCP stdio responses can expose internal paths, file names, and—in the worst case—partial secrets embedded in exception strings. The pattern of switching to generic user-facing messages while routing full context to logger.error is exactly right. Glad to see this applied consistently across 8 call sites rather than patched piecemeal.

add() → upsert() in convo_miner.py — Real correctness fix. Multiple mine runs on the same conversation would previously crash on duplicate IDs, which means any retry or re-index workflow was broken. Worth validating that the upsert is truly idempotent: if the same conversation produces the same content hash, the upsert should be a no-op. If the hash changes between runs (e.g., due to timestamp inclusion), you'd get silent data mutation instead of a crash — which is harder to debug. Worth a quick sanity check on the hashing logic.

threading.Lock for SQLite in WAL mode — Correct. WAL provides read/write concurrency at the SQLite level, but Python-level threading still needs coordination around the connection object. The lock closes that gap.

Issue coverage — Fixing #535, #536, #521, #478, and #531 in one pass is efficient for maintainers. The 79 new tests (640 → 719) are the right safety net for a wide-surface diff. One caution though: mega-PRs that touch security, concurrency, and data integrity simultaneously can obscure interactions between changes. I'd suggest the reviewer runs the full test suite on just the security-only commit (the str(e) sanitization) in isolation — confirming the error-suppression changes don't accidentally mask real failures during development. It's easy for a generic-message wrapper to hide a logic error if the surrounding test doesn't assert on the error path.

Overall: The str(e) sanitization alone would be worth merging; getting five issue fixes in the same sweep with full test coverage is a real contribution. Nice work @mrdeeme.


[MemPalace-AGI integration — production stats at https://milla-jovovich.github.io/mempalace/integrations/mempalace-agi/]

@bensig
Copy link
Copy Markdown
Collaborator

bensig commented Apr 11, 2026

@mrdeeme conflicts here

mrdeeme added 3 commits April 11, 2026 08:01
… (640->719 tests)

- Sanitize 8 str(e) leaks in error responses -> generic messages + logger.error
- Remove duplicate _client_cache declaration
- Update docstring to reflect all 19 tools

- Replace collection.add() with collection.upsert() to prevent duplicate ID crashes
- Fix diary_write to use upsert instead of add

- Add threading.Lock for thread-safe SQLite access in WAL mode

- Add TTL cache (30s) + count-based invalidation for build_graph()
- Replace hardcoded limit=10000 with _iter_all_metadata() batched pagination (1000/batch)
- Singleton ChromaDB client keyed by palace_path (was creating new client per call)

- Replace bare 'except Exception: pass' with proper error handling + warning field

- Remove hardcoded personal names from fallback list -> empty list

- test_palace.py: 15 tests for previously untested palace.py core module
- test_concurrency.py: 7 tests for thread safety (parallel search, KG writes, MCP)
- test_failure_modes.py: 16 tests for error paths (palace, config, searcher, KG, MCP)

- test_config.py: 4 -> 24 tests (precedence, sanitize_name/content, invalid configs)
- test_convo_miner.py: 1 -> 12 tests (decomposed monolith into focused scenarios)
- test_version_consistency.py: 2 -> 7 tests (semver, module consistency, server info)

- test_memory_profile.py: RSS growth limits (<100MB/<150MB)
- test_search_bench.py: concurrent error count (0) + p95 latency (<5s)
- test_recall_threshold.py: recall@10 >= 50% at <=1000 drawers
- test_palace_boost.py: wing/room boost >= -10% degradation
- test_chromadb_stress.py: batch insertion faster than sequential

- .github/instructions/codacy.instructions.md: Codacy MCP integration config
- .gitignore: updated entries

All 612 unit tests pass (1 skipped: Unix-only permissions on Windows).
…, MemPalace#478, MemPalace#531)

- Add sys.stdout/stderr.reconfigure(encoding='utf-8', errors='replace') at
  CLI entry point (cli.py), fixing checkmark (U+2713) and CJK crashes in
  miner.py, convo_miner.py, split_mega_files.py, and entity_detector.py.

- Remove wildcard regex \\\*[^*]+\\*\ from EMOTION_MARKERS in
  general_extractor.py. This pattern matched all Markdown bold/italic text,
  routing technical content to the 'emotional' room.

- Add collection.delete(where={'source_file': ...}) before the upsert loop
  in miner.py::process_file(). Converts re-mines from upsert-over-existing
  (which hits hnswlib's thread-unsafe updatePoint/repairConnectionsForUpdate
  path) into clean delete+insert, bypassing the race entirely.

- Replace hardcoded limit=10000 with batched pagination (1000/batch) in
  miner.py::status(), matching the fix already applied to mcp_server.py.

- Update README from 'chromadb>=0.4.0' to 'chromadb>=0.5.0,<0.7' to match
  pyproject.toml.

All 612 unit tests pass.
…rdening

- Fix potential deadlock in KnowledgeGraph: threading.Lock → threading.RLock
  (reentrant) since _execute/_commit call _conn which also acquires the lock
- Harden MempalaceConfig: non-dict JSON (e.g. [1,2,3]) now treated as empty
  config instead of crashing on .get() calls
- Remove codacy.instructions.md from repo (not suited for public PRs)
- Fix inconsistent section header in mcp_server.py
- Remove unused imports: PropertyMock, MagicMock, patch, tempfile, shutil, pytest
- Update test_json_array_instead_of_object to assert fallback defaults
mrdeeme added 2 commits April 11, 2026 08:44
…, MemPalace#549, MemPalace#541, MemPalace#608)

- MemPalace#586: dry-run crash with TypeError when room=None — return 'general' fallback
- MemPalace#602: claude.ai export sender/role field mismatch + large file skip warning
- MemPalace#477: tool_search limit clamped to [1, 100] to prevent memory exhaustion
- MemPalace#549: save hook no longer counts tool_result messages as human exchanges
- MemPalace#541: remove non-existent Discussions reference from CONTRIBUTING.md
- MemPalace#608: mtime-based cache invalidation for stale HNSW search results
…emPalace#477 MemPalace#549 MemPalace#608)

- test_miner: process_file returns 'general' (not None) for tiny/unreadable files
- test_normalize: sender fallback in _try_claude_ai_json (flat, privacy, mixed)
- test_mcp_server: limit clamp (high/low/negative), cache invalidation (mtime/rate-limit/clear)
- test_convo_miner: oversized file prints warning message
- conftest: reset _cache_sqlite_mtime and _cache_last_check between tests
- lint: fix unused imports (MIN_CHUNK_SIZE, os) and add missing pytest import
@bensig bensig changed the base branch from main to develop April 11, 2026 22:21
@bensig bensig requested a review from igorls as a code owner April 11, 2026 22:21
@bensig
Copy link
Copy Markdown
Collaborator

bensig commented Apr 12, 2026

Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution!

@bensig bensig closed this Apr 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants