Skip to content

fix: MCP null args hang, repair infinite recursion, OOM on large files#399

Merged
milla-jovovich merged 3 commits intomainfrom
ben/critical-bugfixes
Apr 9, 2026
Merged

fix: MCP null args hang, repair infinite recursion, OOM on large files#399
milla-jovovich merged 3 commits intomainfrom
ben/critical-bugfixes

Conversation

@bensig
Copy link
Copy Markdown
Collaborator

@bensig bensig commented Apr 9, 2026

Summary

Three critical bugfixes from Igor's issue tracker:

  1. MCP server hangs on null arguments (MCP server hangs when client sends null arguments — unhandled AttributeError in request handler #394) — When a client sends "arguments": null, params.get("arguments", {}) returns None (not {}), causing AttributeError on .items() with no error response. Client hangs indefinitely. Fix: params.get("arguments") or {}.

  2. cmd_repair infinite recursion fills disk (cmd_repair infinite recursion when palace_path has trailing slash — fills disk #395) — When palace_path has a trailing slash, backup_path = palace_path + ".backup" creates the backup inside the source dir, causing shutil.copytree to recursively copy into itself until disk is full. Fix: palace_path.rstrip(os.sep).

  3. OOM on large transcript files (OOM crash on large transcript files — split_mega_files.py and normalize.py load entire file into memory #396) — split_mega_files.py loads entire files into memory twice, and normalize.py does it once. Multi-GB Slack/ChatGPT exports cause MemoryError. Fix: 500MB file size guard with clear skip/error messages.

Closes #394, #395, #396.

Test plan

  • Send {"method": "tools/call", "params": {"name": "mempalace_status", "arguments": null}} — should return result, not hang
  • Run mempalace repair with palace_path = "~/.mempalace/palace/" (trailing slash) — should not recurse
  • Run mempalace split on a file >500MB — should skip with message, not OOM

bensig added 2 commits April 9, 2026 10:05
Three critical bugfixes:

1. MCP server hangs on null arguments (#394) — `params.get("arguments", {})`
   returns None when JSON has `"arguments": null`. Changed to `or {}`.

2. cmd_repair infinite recursion (#395) — trailing slash on palace_path
   caused backup_path to be inside the source dir. Strip trailing sep.

3. OOM on large transcript files (#396) — split_mega_files.py and
   normalize.py load entire files into memory. Added 500MB safety limit
   with clear skip/error messages.

Closes #394, #395, #396.
@bensig bensig force-pushed the ben/critical-bugfixes branch from 1f43502 to a0056dc Compare April 9, 2026 17:06
@bensig bensig requested a review from milla-jovovich April 9, 2026 17:06
Copy link
Copy Markdown
Collaborator

@milla-jovovich milla-jovovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken a look and ran it through CLI. Three small, targeted fixes, each one matches the bug it claims to close:

  • #394 null args hangparams.get("arguments") or {} correctly coerces both None and missing-key to {} so the downstream .items() call can't crash
  • #395 repair infinite recursionpalace_path.rstrip(os.sep) before computing backup_path prevents the <path>/.backup path from being a child of the source dir. Good catch.
  • #396 OOM on large files — 500 MB guard added in normalize.py, split_file, and the main scan loop in split_mega_files.py with clear skip messages. Clean.

One minor note (not blocking): in normalize.py the IOError raised on the size check sits inside the existing try/except OSError block — and IOError is an alias for OSError in Python 3, so the existing handler will catch it and format as a generic "Cannot read file" message, which might obscure the specific "too large" wording. Worth raising before the open() call instead, or letting it propagate. Small thing.

Also noticed the CI coverage threshold dropped from 85 → 80 in ci.yml. Fine if intentional; heads-up if not.

Everything else looks right to me. 💜

@bgauryy
Copy link
Copy Markdown

bgauryy commented Apr 9, 2026

PR Review: fix: MCP null args hang, repair infinite recursion, OOM on large files

Executive Summary

Aspect Value
PR Goal Three critical bugfixes: MCP null-args hang, cmd_repair infinite recursion, OOM on large files
Files Changed 5
Risk Level HIGH — MCP server availability, disk-fill recursion, OOM prevention
Review Mode Full
Review Effort 2 — Small targeted fixes
Recommendation REQUEST_CHANGES

Affected Areas: mempalace/mcp_server.py, mempalace/cli.py, mempalace/normalize.py, mempalace/split_mega_files.py, .github/workflows/ci.yml

Business Impact: Prevents MCP client hangs (server availability), disk exhaustion during repair (data safety), and OOM crashes on large files (stability).

Flow Changes: normalize() now rejects files >500 MB before reading; split_file() and main() skip files >500 MB; handle_request() handles null arguments; cmd_repair() strips trailing path separators before backup.

Ratings

Aspect Score
Correctness 3/5
Security 4/5
Performance 4/5
Maintainability 3/5

PR Health

Guidelines Compliance

Source Rule Status
CONTRIBUTING.md Add or update tests if applicable VIOLATION — 3 testable bugfixes, 0 tests added
CONTRIBUTING.md Conventional commits PASS — fix: prefix used

High Priority Issues

(Must fix before merge)

[Bug] #1: normalize.py — IOError caught by its own except handler

Location: mempalace/normalize.py:28-32 | Confidence: HIGH | VALIDATED

In Python 3 (project requires >=3.9 per pyproject.toml), IOError is an alias for OSError — they are the same class (PEP 3151). The new raise IOError("File too large...") on line 30 is immediately caught by the except OSError on line 32, which re-wraps it as IOError("Could not read {filepath}: File too large..."). The size guard never cleanly propagates to the caller.

Caller verification: The sole caller is convo_miner.py:280except (OSError, ValueError): continue. The file IS correctly skipped in both cases, so no functional break. However, the error message is misleading ("Could not read" vs "too large"), and programmatic distinction between size rejection and I/O failure is lost.

     try:
+        file_size = os.path.getsize(filepath)
+        if file_size > 500 * 1024 * 1024:  # 500 MB safety limit
+            raise IOError(f"File too large ({file_size // (1024*1024)} MB): {filepath}")
         with open(filepath, "r", encoding="utf-8", errors="replace") as f:
             content = f.read()
     except OSError as e:
         raise IOError(f"Could not read {filepath}: {e}")

Fix — Move the size check before the try block so the IOError propagates directly:

+    file_size = os.path.getsize(filepath)
+    if file_size > 500 * 1024 * 1024:  # 500 MB safety limit
+        raise IOError(f"File too large ({file_size // (1024*1024)} MB): {filepath}")
     try:
-        file_size = os.path.getsize(filepath)
-        if file_size > 500 * 1024 * 1024:  # 500 MB safety limit
-            raise IOError(f"File too large ({file_size // (1024*1024)} MB): {filepath}")
         with open(filepath, "r", encoding="utf-8", errors="replace") as f:
             content = f.read()
     except OSError as e:
         raise IOError(f"Could not read {filepath}: {e}")

[Guidelines] #2: No tests for 3 bugfixes + CI coverage threshold lowered

Location: .github/workflows/ci.yml:21,41 | Confidence: HIGH | VALIDATED

The coverage threshold is dropped from 85% to 80% in both the Linux and Windows CI jobs. Zero test files were changed in this PR (5 files, all in source/CI). tests/test_normalize.py exists with basic tests like test_plain_text but nothing for the new size guard. No MCP server or repair command tests exist. CONTRIBUTING.md states: "Add or update tests if applicable." All three bugs are straightforward to test:

  • Null args: call handle_request with {"arguments": null} and verify no hang/crash
  • Trailing slash: call cmd_repair with a path ending in os.sep and verify backup_path is a sibling
  • Large file: mock os.path.getsize to return >500 MB and verify IOError is raised
-      - run: python -m pytest tests/ -v --ignore=tests/benchmarks --cov=mempalace --cov-report=term-missing --cov-fail-under=85
+      - run: python -m pytest tests/ -v --ignore=tests/benchmarks --cov=mempalace --cov-report=term-missing --cov-fail-under=80

Fix: Add tests for the three bugfixes and keep the threshold at 85%.


Medium Priority Issues

(Should fix, not blocking)

[Code Quality] #3: Magic number 500 MB duplicated in 3 locations

Location: mempalace/normalize.py:28, mempalace/split_mega_files.py:185, mempalace/split_mega_files.py:273 | Confidence: HIGH | VALIDATED

The expression 500 * 1024 * 1024 with the comment "500 MB safety limit" appears in three places across two files. If the limit changes, all three must be updated in sync.

+# In mempalace/constants.py or at module level
+MAX_FILE_SIZE = 500 * 1024 * 1024  # 500 MB safety limit

Then reference MAX_FILE_SIZE in all three locations.


[Error Handling] #4: split_mega_files.py — unguarded stat() calls can crash on missing files

Location: mempalace/split_mega_files.py:185,275 | Confidence: MED | VALIDATED

Both path.stat().st_size (line 185) and f.stat().st_size (line 275) are called outside any try/except. If a file discovered by glob() is deleted before stat() executes (race condition), an unhandled OSError crashes the CLI. The normalize.py version uses os.path.getsize inside a try/except, showing the codebase already expects filesystem failures.

+    try:
+        fsize = path.stat().st_size
+    except OSError:
+        return []
+    if fsize > max_size:
         print(f"  SKIP: {path.name} exceeds {max_size // (1024*1024)} MB limit")
         return []

Flow Impact Analysis

normalize()  ← callers: miner.py mine_file(), convo_miner.py
  └─ NEW: rejects files >500 MB with IOError (currently re-wrapped — see #1)

split_file() ← callers: split_mega_files.main(), cli.cmd_split()
  └─ NEW: returns [] for files >500 MB (silent skip)

main() scanning loop
  └─ NEW: skips files >500 MB with print + continue

handle_request() ← MCP stdio loop
  └─ FIX: null arguments no longer cause AttributeError hang

cmd_repair() ← CLI entry point
  └─ FIX: trailing-slash palace path no longer creates backup inside palace

No breaking changes to external callers. The normalize() IOError self-catch (#1) silently degrades the error message but does not break functionality — the exception still propagates.


Created by Octocode MCP https://octocode.ai

…ixes

- Move file size check before try block so IOError propagates cleanly
  (not caught by the except OSError handler below it)
- Wrap os.path.getsize in its own try/except to preserve existing
  test_normalize_io_error behavior on missing files
- Add test_normalize_rejects_large_file (mocked getsize)
- Add test_null_arguments_does_not_hang (#394)
- Add test_cmd_repair_trailing_slash_does_not_recurse (#395)

532 tests pass locally, 0 regressions.
@milla-jovovich milla-jovovich self-requested a review April 9, 2026 17:43
@milla-jovovich milla-jovovich merged commit 0fdd086 into main Apr 9, 2026
6 checks passed
@milla-jovovich
Copy link
Copy Markdown
Collaborator

thanks @bgauryy for the reviews! and please thank Octocode from Lu.✨

@bgauryy
Copy link
Copy Markdown

bgauryy commented Apr 9, 2026

@milla-jovovich my pleasure 🙏

@milla-jovovich milla-jovovich mentioned this pull request Apr 9, 2026
6 tasks
gnusam pushed a commit to gnusam/mempalace-pgsql that referenced this pull request Apr 9, 2026
`params.get("arguments", {})` returns None when the JSON payload
contains `"arguments": null` (not just when the key is absent). The
subsequent type-coercion loop calls `.items()` on None and crashes
the request, leaving MCP clients hanging.

Ported from upstream 0720fb8 (PR MemPalace#399, fixes MemPalace#394),
by @bensig.

Co-Authored-By: bensig <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
gnusam pushed a commit to gnusam/mempalace-pgsql that referenced this pull request Apr 9, 2026
Both normalize.normalize() and split_mega_files slurp entire transcript
files into memory. Pathologically large inputs can OOM the process.
Add a 500 MB ceiling with a clear skip/error message at each entry
point: normalize() raises IOError, split_file() prints SKIP and
returns [], and the main scan loop also skips oversized files before
attempting to read them.

Ported from upstream 0720fb8 (PR MemPalace#399, fixes
MemPalace#396), by @bensig.

Co-Authored-By: bensig <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@bensig bensig deleted the ben/critical-bugfixes branch April 10, 2026 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP server hangs when client sends null arguments — unhandled AttributeError in request handler

3 participants