memtomem-server leaves stale .server.pid on exit — risks mm uninstall liveness false-positive via PID recycling #387

@memtomem

Problem

memtomem-server writes ~/.memtomem/.server.pid on startup but does not unlink it on exit. After any non-clean termination (SIGKILL, parent-process death, system restart, crash) — and possibly after clean SIGTERM too — the file lingers on disk pointing at a PID that is no longer the server.

This is the mirror image of #384 (liveness check only sees MCP server pid).

Reproduction

# 1. Start a server (any path — a client-spawned one works too)
uvx --from memtomem memtomem-server </dev/null &
SERVER_PID=$!

# 2. Wait for pid file, then kill server abruptly
until [ -f ~/.memtomem/.server.pid ]; do sleep 0.2; done
cat ~/.memtomem/.server.pid                  # → 12345 (matches $SERVER_PID)
kill -9 $SERVER_PID                          # SIGKILL, no cleanup opportunity

# 3. Observe stale pid file
ls -la ~/.memtomem/.server.pid               # → still present
cat ~/.memtomem/.server.pid                  # → 12345, even though PID 12345 is dead

Observed in the wild on 2026-04-23 during first-time-user smoke testing. After pkill -f memtomem-server, ~/.memtomem/.server.pid remained on disk with the dead PID written to it.

Why it matters

mm uninstall probes liveness via os.kill(pid, 0) against the pid file (per #379 / project_mm_uninstall_cmd.md's HIGH 1 review item). Today, os.kill(<dead_pid>, 0) raises ProcessLookupError and the check correctly concludes "not alive" — so a freshly-stale pid file doesn't immediately break uninstall.

The failure mode arrives with PID recycling: once the kernel hands the stale PID to an unrelated process (a browser helper, a shell, anything), os.kill(<recycled_pid>, 0) succeeds, the liveness check says "alive", and mm uninstall refuses to proceed unless --force is passed. The user gets a confusing refusal caused by a process that has nothing to do with memtomem.
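To make the false positive concrete, here is a minimal sketch of a signal-0 probe on a POSIX system. The function name pid_is_alive is hypothetical and is not taken from the mm uninstall source; treating PermissionError as "alive" is likewise an assumption about how such a check would usually be written.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Naive probe (hypothetical name): signal 0 tests existence, sends nothing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process has this PID
    except PermissionError:
        return True   # a process exists, but it belongs to another user
    return True


# The probe succeeds for ANY live PID. A recycled PID is simulated here with
# this interpreter's own PID, which is clearly not a memtomem server.
recycled_pid = os.getpid()
print(pid_is_alive(recycled_pid))  # → True
```

The probe answers "does some process have this PID?", not "is this PID the server?", which is exactly the gap PID recycling exploits.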

Suggested directions

  1. Unlink the pid file on clean exit paths. Register an atexit handler + SIGTERM/SIGINT signal handlers in the server's stdio main loop. Covers SIGTERM and clean shutdown. SIGKILL / system crash cases remain uncoverable by the server itself (design limit of pid files).

  2. Make the liveness check robust to PID recycling. When the pid in the file points to a live process, additionally verify the cmdline includes memtomem-server (via psutil.Process(pid).cmdline() or /proc/{pid}/comm on Linux, ps -o comm= -p <pid> on macOS). Only refuse if it actually looks like a memtomem server.

  3. Move to a lockfile-based scheme: fcntl.flock on a file descriptor held by the running server. Automatically released on any process termination (normal or abnormal). Sidesteps both stale-pid and PID-recycling failure modes. Bigger change.
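Directions 1 and 2 could look roughly like the sketch below. All function names are hypothetical, and the sketch makes several assumptions: a synchronous main loop (an asyncio server would more likely use loop.add_signal_handler), signal registration from the main thread, and a command-line lookup that reads /proc/<pid>/cmdline on Linux and falls back to ps -o args= elsewhere, rather than the psutil or comm-based lookups mentioned above.

```python
import atexit
import os
import signal
import subprocess
import sys
from pathlib import Path

# Path taken from this issue.
PID_FILE = Path.home() / ".memtomem" / ".server.pid"


def remove_pid_file_if_ours(pid_file: Path = PID_FILE) -> None:
    """Unlink the pid file, but only if it still names this process."""
    try:
        if pid_file.read_text().strip() == str(os.getpid()):
            pid_file.unlink()
    except (OSError, ValueError):
        pass  # already gone or unreadable: nothing to clean up


def install_pid_cleanup(pid_file: Path = PID_FILE) -> None:
    """Direction 1: clean up on normal exit, SIGTERM, and SIGINT."""

    def _exit_on_signal(signum, _frame):
        # sys.exit raises SystemExit, so atexit handlers get a chance to run.
        sys.exit(128 + signum)

    atexit.register(remove_pid_file_if_ours, pid_file)
    signal.signal(signal.SIGTERM, _exit_on_signal)
    signal.signal(signal.SIGINT, _exit_on_signal)


def command_line(pid: int) -> str:
    """Best-effort command line of a PID; empty string if it cannot be read."""
    try:
        # Linux: argv is NUL-separated in /proc/<pid>/cmdline.
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
        return raw.replace(b"\0", b" ").decode(errors="replace").strip()
    except OSError:
        pass
    try:
        # Systems without /proc (e.g. macOS): ask ps for the full argv.
        result = subprocess.run(
            ["ps", "-o", "args=", "-p", str(pid)],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def looks_like_server(pid: int, needle: str = "memtomem-server") -> bool:
    """Direction 2: treat the PID as the server only if its argv mentions it."""
    return needle in command_line(pid)
```

In this sketch the server would call install_pid_cleanup() right after writing the pid file, and the uninstall check would refuse only when the PID is alive and looks_like_server(pid) is true. Neither helper can do anything about SIGKILL, which is why direction 2 is still needed alongside direction 1.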

(1) alone closes the common case. (1)+(2) closes the documented failure modes. (3) is the principled end-state if the team wants to stop maintaining two separate checks.
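For direction 3, a minimal sketch of an flock-based liveness model on a POSIX system follows. The function names and the idea of a dedicated lock file are assumptions for illustration; nothing here reflects existing memtomem code.

```python
import fcntl
import os
from pathlib import Path
from typing import IO, Optional


def acquire_server_lock(lock_path: Path) -> Optional[IO[str]]:
    """Server side: take an exclusive, non-blocking lock, or return None."""
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    handle = open(lock_path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None  # another process already holds the lock
    # The PID is written for humans only; the lock itself is the liveness signal.
    handle.seek(0)
    handle.truncate()
    handle.write(f"{os.getpid()}\n")
    handle.flush()
    return handle  # keep this object open for the server's whole lifetime


def server_is_running(lock_path: Path) -> bool:
    """Uninstall side: a server is alive iff the lock cannot be taken."""
    try:
        handle = open(lock_path, "r")
    except FileNotFoundError:
        return False
    with handle:
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(handle, fcntl.LOCK_UN)
        return False
```

Because the kernel drops the lock when the holding process ends, including after SIGKILL, a leftover lock file is harmless: server_is_running returns False for it no matter which PID it contains or who owns that PID now.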

Related

Together these three describe a pid-file-centric liveness model that is incomplete from both directions. A combined redesign (option 3, or option 1+2 elsewhere) would close all three.