Conversation
closes #817) PR #652 (v0.1.34) made the `mm` CLI Windows-clean by moving three callsites onto portalocker, but deliberately deferred the MCP server itself. The two `fcntl.flock` callsites in `server/__init__.py` sat inside lazy `import fcntl` blocks — keeping `mm` loadable on Windows, but crashing `memtomem-server` the moment `main()` reached the lock acquisition path. The #625 close note named this as the explicit follow-up. Three changes in `server/__init__.py`: - `_try_hold_legacy_flock` early-returns on Windows. The legacy flock exists solely to interlock with pre-0.1.25 `memtomem-server` instances, which were Linux-only by construction (the `mm` CLI did not load on Windows until #652 / 0.1.34). Skipping the probe is semantically correct — the cross-version mutex has no Windows-side counterparty — and sidesteps the question of whether portalocker's Windows backend selection (`MsvcrtLocker` vs `Win32Locker`) uniformly implements `LOCK_SH`. On POSIX, `LOCK_SH | LOCK_NB` swaps to `portalocker.lock(LOCK_SH | LOCK_NB)` 1:1. - `main()`'s XDG-runtime pid lock swaps `LOCK_EX | LOCK_NB` to `portalocker.lock(...)`, with the exception tuple widened to `(portalocker.LockException, BlockingIOError, OSError)` to mirror the canonical pattern in `cli/_liveness.py:probe_pid_file`. - `_install_sigterm_handler` POSIX-gates `signal.signal(SIGTERM, ...)`. Python's SIGTERM is a no-op on Windows (the C runtime does not deliver it), so registering a handler would silently mislead readers into thinking shutdown was wired. The Windows path relies on FastMCP's stdin-EOF teardown + `atexit` instead. Test surface adjustments: - `test_server_sigterm.py`: keep SIGTERM-firing and legacy-flock integration tests POSIX-skipped (with refreshed `reason` strings), add Windows skips to the three signal-capture unit tests, and add a Windows-only `test_install_sigterm_handler_is_noop_on_windows` pin so the no-op contract has a regression net. - `test_contended_server_start_preserves_pid_file_content` is now Windows-runnable: simulator switched to portalocker (`"rb+"` open for `MsvcrtLocker` compatibility), env dict carries both `HOME` and `USERPROFILE`, mode-bit chmods POSIX-gated. - `test_server_uses_tempdir_fallback_when_xdg_unset` is now Windows-runnable: subprocess env sets HOME / USERPROFILE / TMP / TEMP / TMPDIR (Windows reads multiple candidates), uid suffix collapses to 0 when `os.geteuid` is missing, mode-bit assert POSIX-gated. The full reasoning is in the test docstring. - New cross-platform `test_server_main_acquires_portalocker_pid_lock` runs `main()` in-process behind aggressive isolation: stubs `_install_sigterm_handler` and `mcp.run`, pins `Path.home()` and `XDG_RUNTIME_DIR` to tmp paths, captures `atexit.register` calls and runs them LIFO in `finally`. This is the regression net for any future "swap portalocker back to fcntl" attempt. - `test_uninstall_cmd.py::_hold_pid_lock` helper switched to portalocker; `TestServerAliveRefuses` class no longer Windows-skipped. - `test_no_posix_only_imports.py` docstring refreshed — `server/__init__.py` is now portalocker-only, so the lazy-import exemption is general infrastructure rather than that specific callsite. The AST scan logic is unchanged. Out of scope for this PR (separate follow-ups if needed): Windows ODR registration via MSIX manifest / MCPB bundle / `odr.exe`, and flipping the CI Windows job from informational to required (gated on the broader Windows-triage umbrella #634). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…#818) PR #818 review: with SIGTERM POSIX-gated, the only graceful teardown route on Windows is the existing two-step ``atexit`` registration: atexit.register(lambda: _lock_fp.close()) # LIFO: runs second atexit.register(lambda: pid_file.unlink(...)) # LIFO: runs first That ordering is correct on POSIX (unlink-while-open keeps the inode alive until close — defensive against another process replacing the path between operations) but breaks on Windows: NTFS refuses to delete an open or locked handle, so ``unlink`` raises ``PermissionError`` (WinError 32). The MCP host disconnects, atexit fires, the unlink is swallowed by atexit's exception isolation, and a stale ``server.pid`` plus an ignored "Error in atexit._run_exitfuncs" message survive every clean shutdown — and the next start misreads the leftover file. Replace both two-call patterns (XDG path and the legacy-flock path) with a single composite cleanup function that branches on platform: - POSIX: unlink first, then close. Preserves the ``#437`` invariant (delete exactly the inode we own the flock on). - Windows: close first, then unlink. The close releases the portalocker lock and the file handle, after which NTFS lets us unlink. The trade-off — a sub-millisecond window where the path exists without an active holder — is dominated by ``probe_pid_file``, which already opens its own handle to decide alive/dead and so is authoritative independent of this race. The legacy-flock cleanup gets the same composite shape even though ``_try_hold_legacy_flock`` short-circuits to ``None`` on Windows today. That short-circuit is the only thing keeping the legacy path Windows-safe; if a future change relaxes it, the same bug shape would silently re-appear. Using the same pattern in both blocks makes the defense uniform. Test pin: ``test_server_main_acquires_portalocker_pid_lock`` now runs the captured ``atexit`` callbacks and asserts the pid file is gone afterwards. On POSIX the assertion is satisfied by either the old or the new ordering, but on Windows the old ordering would raise inside ``unlink`` and leave the file behind — so any future revert to two ``atexit.register`` calls fails this test on Windows CI rather than shipping a stale-pid bug. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #817. Follow-up to #625 — that issue's close note promised this work as a separate PR if a Windows server target ever landed.
memtomem-server(the MCP server entry point) was the last POSIX-only surface in the codebase: PR #652 swapped the three CLI-sidefcntlcallsites ontoportalocker, but deliberately leftserver/__init__.py's two callsites for a follow-up. With users now wanting to point Windows MCP hosts (Claude Desktop / Code) at memtomem directly, that follow-up is here.Changes
packages/memtomem/src/memtomem/server/__init__.py— three edits:_try_hold_legacy_flockearly-returns on Windows. The legacy flock interlocks with pre-0.1.25memtomem-serverinstances, which were Linux-only by construction. Skipping the probe is semantically correct (no Windows-side counterparty exists) and sidesteps the question of whether portalocker's Windows backend selection (MsvcrtLockervsWin32Locker) uniformly implementsLOCK_SH. On POSIX, theLOCK_SH | LOCK_NBlock swaps toportalocker.lock(...)1:1.main()— XDG-runtime pid lock now usesportalocker.lock(LOCK_EX | LOCK_NB). Exception tuple widened to(portalocker.LockException, BlockingIOError, OSError)to match the canonical pattern incli/_liveness.py:probe_pid_file._install_sigterm_handlerPOSIX-gates thesignal.signal(SIGTERM, ...)registration. Python's SIGTERM is a no-op on Windows; on Windows the teardown path is FastMCP's stdin-EOF +atexit.Tests:
test_server_sigterm.py— three signal-capture unit tests now Windows-skipped; new Windows-onlytest_install_sigterm_handler_is_noop_on_windowspins the no-op contract; integration-skip reasons refreshed;test_contended_server_start_preserves_pid_file_contentandtest_server_uses_tempdir_fallback_when_xdg_unsetnow Windows-runnable (simulator on portalocker, env dict carriesHOME+USERPROFILE+TMP/TEMP/TMPDIR,uidfalls back to0, mode-bit asserts POSIX-gated).test_server_main_acquires_portalocker_pid_lockrunsmain()in-process behind aggressive isolation (stubs_install_sigterm_handlerandmcp.run, capturesatexit.registerand runs callbacks LIFO infinally) — regression net against any future "swap portalocker back to fcntl" attempt.test_uninstall_cmd.py::_hold_pid_lockhelper now portalocker-based;TestServerAliveRefusesclass no longer Windows-skipped.test_no_posix_only_imports.pydocstring refreshed —server/__init__.pyis now portalocker-only, so the lazy-import exemption is general infrastructure rather than that specific callsite. AST scan logic is unchanged.Out of scope (separate follow-ups)
odr.exe). That's packaging, not server code — the Windows MCP overview is registration-side concern and orthogonal to making the server actually run on Windows.continue-on-error: trueto required. That's gated on the Windows-triage umbrella ci: add Windows runner to CI matrix #634.SetConsoleCtrlHandler. Stdin-EOF +atexitcovers MCP-host disconnect cleanly; the explicit-kill story can come later if real users need it.Notes for Windows users
After
memtomem-servershutdown on Windows,~/.memtomem/.server.pidis not created — the legacy-flock probe short-circuits because pre-0.1.25 Windows servers cannot exist. The pid file lives at%LOCALAPPDATA%\Temp\memtomem-0\server.pidand is unlinked viaatexiton host disconnect.Test plan
uv run pytest -m "not ollama" packages/memtomem/tests/test_no_posix_only_imports.py packages/memtomem/tests/test_server_sigterm.py packages/memtomem/tests/test_uninstall_cmd.py— 280 passed, 2 skipped (Windows-only)uv run pytest -m "not ollama" packages/memtomem/tests/— 4228 passed, 9 skippeduv run ruff check packages/memtomem/src && uv run ruff format --check packages/memtomem/srcmemtomem-serverboots on a real Windows host, MCP host (Claude Desktop / Code) lists tools and answersmem_status, pid file appears at%LOCALAPPDATA%\Temp\memtomem-0\server.pidand is unlinked at shutdown.🤖 Generated with Claude Code