
[bug]: Windows CI fails on test_server_sigterm.py + TestServerAliveRefuses after #818 #819

@memtomem


Summary

Windows CI on PR #818 (https://github.com/memtomem/memtomem/actions/runs/25419209194/job/74557245591) failed 6 tests after the fcntl→portalocker swap landed. There are two distinct root causes:

Root cause A — Windows LockFileEx(LOCK_EX) blocks reads from other handles.

POSIX flock/fcntl locks are advisory: they only govern other lock calls, so reads from any handle proceed unaffected. Windows LockFileEx with LOCKFILE_EXCLUSIVE_LOCK is enforced by the OS: it blocks all I/O on the locked byte range from other handles, including reads. So pid_file.read_text(), which opens a new handle, raises PermissionError (ERROR_LOCK_VIOLATION) while another handle holds the lock.
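The difference can be reproduced without portalocker. The sketch below is a stdlib-only illustration, not the project's code: `read_while_locked` is a hypothetical helper, and it assumes `fcntl.flock` on POSIX and `msvcrt.locking` on Windows are reasonable stand-ins for the lock portalocker takes.

```python
import os
import sys
import tempfile
from pathlib import Path


def read_while_locked(path: Path) -> str:
    """Hold an exclusive lock on one handle, then read through a second one."""
    with open(path, "r+b") as holder:
        if sys.platform == "win32":
            import msvcrt

            # Locks bytes starting at the current position (0) of `holder`.
            size = max(os.path.getsize(path), 1)
            msvcrt.locking(holder.fileno(), msvcrt.LK_NBLCK, size)
        else:
            import fcntl

            fcntl.flock(holder.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            # read_text() opens a fresh handle, like the failing tests do.
            return path.read_text()
        except PermissionError as exc:
            return f"PermissionError: {exc}"
        # Closing `holder` when the with-block exits releases the lock.


if __name__ == "__main__":
    pid_file = Path(tempfile.mkdtemp()) / "server.pid"
    pid_file.write_text("4242")
    print(read_while_locked(pid_file))
```

On POSIX the read is expected to return the file content; on Windows it is expected to come back as a PermissionError, matching the CI failures.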

Failing tests (both at pid_file.read_text() calls):

  • test_server_sigterm.py::test_contended_server_start_preserves_pid_file_content (line 551) — the test holds LOCK_EX via holder and tries to verify content via a separate handle.
  • test_server_sigterm.py::test_server_main_acquires_portalocker_pid_lock (line 636) — main() holds LOCK_EX via _lock_fp and the test tries to verify content via a separate handle.

Both pass on POSIX because flock semantics let reads proceed.

Root cause B — TestServerAliveRefuses class-level skip removal was too coarse.

#818 dropped the class-level Windows skip on TestServerAliveRefuses because the _hold_pid_lock helper now uses portalocker. The lock-acquisition mechanics do work on Windows, but four tests in the class assert POSIX-specific user-facing text (lsof, the recorded pid in the message body) or POSIX-specific --force behavior (unlink-while-open).

Failing tests:

  • test_refuses_when_server_alive_at_legacy_path — asserts str(os.getpid()) in result.output; Windows output omits the pid in favor of "Find the holder via Sysinternals `handle.exe` or Resource Monitor."
  • test_refuses_when_server_alive_at_runtime_path — same shape.
  • test_refuses_with_unknown_pid_branch_when_pid_file_empty — asserts 'lsof' in result.output; Windows uses Sysinternals/Task Manager wording.
  • test_force_overrides_liveness — asserts --force succeeds; on Windows --force cannot wipe an open SQLite DB (per the contract in #730, "Windows: contract for mm uninstall --force against a live writer is undefined"), so it correctly refuses.

Fix sketch

A. Read pid-file content through the lock-owning handle (holder.seek(0); holder.read().decode()) in test_contended_server_start_preserves_pid_file_content. Cross-platform safe — POSIX is fine reading via either handle, Windows requires the owning one.
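A minimal sketch of fix A, assuming the holder is a binary read/write handle. `lock_exclusive` is a hypothetical stdlib stand-in for the test's portalocker call, and `read_through_holder` is an illustrative name, not an existing helper.

```python
import sys
import tempfile
from pathlib import Path


def lock_exclusive(fp) -> None:
    """Hypothetical stand-in for the test's portalocker-based lock call."""
    if sys.platform == "win32":
        import msvcrt

        fp.seek(0)
        msvcrt.locking(fp.fileno(), msvcrt.LK_NBLCK, 1)
    else:
        import fcntl

        fcntl.flock(fp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)


def read_through_holder(holder) -> str:
    """Read the whole file via the handle that owns the lock."""
    holder.seek(0)
    return holder.read().decode()


if __name__ == "__main__":
    pid_file = Path(tempfile.mkdtemp()) / "server.pid"
    pid_file.write_bytes(b"4242\n")
    with open(pid_file, "r+b") as holder:
        lock_exclusive(holder)
        # No second handle is opened, so the lock never blocks this read.
        print(read_through_holder(holder).strip())
```

Reading through the owning handle avoids opening a second handle, which is the operation Windows rejects.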

B. In test_server_main_acquires_portalocker_pid_lock, the test cannot reach main()'s closure-scoped _lock_fp. Guard the pid-content assertion on Windows; the alive-probe and post-cleanup deletion checks still run there.
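One way the guard in B might look, as a sketch only. `check_pid_content` is a hypothetical helper, and it assumes the pid file stores the pid as plain text; the real assertion in the test may differ.

```python
import os
import sys
import tempfile
from pathlib import Path


def check_pid_content(pid_file: Path, expected_pid: int) -> bool:
    """Assert the pid-file content where a second handle can read it.

    Returns True if the assertion ran, False if it was skipped (Windows).
    """
    if sys.platform == "win32":
        # The lock-owning handle lives inside main(); a fresh handle would
        # hit ERROR_LOCK_VIOLATION, so the content check is skipped here.
        return False
    # Assumption: the file holds the pid as text, possibly with a newline.
    assert pid_file.read_text().strip() == str(expected_pid)
    return True


if __name__ == "__main__":
    pid_file = Path(tempfile.mkdtemp()) / "server.pid"
    pid_file.write_text(f"{os.getpid()}\n")
    print("content checked:", check_pid_content(pid_file, os.getpid()))
```

The alive-probe and post-cleanup deletion checks would stay outside this guard so they still run on Windows.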

C. Re-add per-test @pytest.mark.skipif(sys.platform == "win32", ...) to the four TestServerAliveRefuses cases above. The remaining tests in the class run cleanly on Windows. Widening the assertions to accept Windows wording could come later; for now the minimum-blast-radius fix is per-test skips.
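The per-test skips in C could be written roughly as follows. The shared `skip_on_windows` marker and its reason string are illustrative assumptions, and the test bodies are elided placeholders rather than the real implementations.

```python
import sys

import pytest

# Hypothetical shared marker; the reason text is illustrative only.
skip_on_windows = pytest.mark.skipif(
    sys.platform == "win32",
    reason="asserts POSIX-specific output or --force behavior",
)


class TestServerAliveRefuses:
    @skip_on_windows
    def test_refuses_when_server_alive_at_legacy_path(self):
        ...  # existing body unchanged

    @skip_on_windows
    def test_refuses_when_server_alive_at_runtime_path(self):
        ...  # existing body unchanged

    @skip_on_windows
    def test_refuses_with_unknown_pid_branch_when_pid_file_empty(self):
        ...  # existing body unchanged

    @skip_on_windows
    def test_force_overrides_liveness(self):
        ...  # existing body unchanged

    # Other tests in the class stay unmarked and keep running on Windows.
```

Marking individual tests rather than the class keeps the Windows coverage that #818 gained for the remaining cases.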

memtomem version

main (post-#818)

Operating system

Windows


Labels

bug
