fix(cli): make mm uninstall deletion transactional via stage-then-rmtree#758
Merged
fix(cli): make mm uninstall deletion transactional via stage-then-rmtree#758
mm uninstall deletion transactional via stage-then-rmtree#758Conversation
…mtree (closes #757) The old `_delete_inventory` deleted inventory groups in place. If any group's deletion failed mid-run (permission denied, transient FS error, antivirus interference on Windows, etc.), the user was left with a half-gone state directory — config and memories already wiped while the DB and runtime files survived. Re-running uninstall against that half-cleaned dir was inconsistent and manual recovery impractical. PR #756 closed the Windows-`--force` slice of this hazard via clean refusal; this PR closes the orthogonal partial-failure case for all platforms by making the wipe transactional: 1. Stage every to-be-deleted path into a sibling `<anchor>/.uninstall-staging-<pid>/` dir via `os.replace` — atomic on the same FS, so each move either completes or doesn't. 2. After every group is staged successfully, `shutil.rmtree` the staging dirs to actually delete the data. 3. On any mid-stage failure, roll back: walk staged moves in reverse and `os.replace` each back to its original location. If rollback itself fails, surface the staging dir paths so the user can recover manually. Whole-dir staging for owned subdirs (`config.d`, `memories`, `uploads`) cuts N renames down to 1 per subdir and removes the post-deletion empty-prune dance the old code needed for those paths. Cross-FS layouts (e.g. a bind mount inside `~/.memtomem/`) can't satisfy `os.replace`'s same-FS atomicity guarantee. Detect via upfront `st_dev` probe and refuse with a clear message rather than falling back to copy+delete — the whole point of staging is atomicity, and copy+delete would just reintroduce the half-state risk. The late-detection branch on `EXDEV` from `os.replace` itself covers anything the upfront probe misses (stat race, transient FS errors). `--keep-config` / `--keep-data` still exempt their respective groups from the plan, so kept paths are never moved into staging at all.
Windows CI lane caught the snapshot dict using `str(Path(...))` for its keys — that's `'memories/x.md'` on POSIX but `'memories\\x.md'` on Windows, which broke the hardcoded `assert "memories/x.md" in before` sanity check on the windows-latest runner. Switch to `as_posix()` so keys are platform-stable.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #757.
Summary
_delete_inventorydeleted inventory groups in place; a mid-run failure (perm denied, transient FS error, antivirus) left the user with a half-gone state dir (config + memories wiped, DB + runtime files surviving) and no practical recovery path.os.replaceinto a sibling<anchor>/.uninstall-staging-<pid>/dir (atomic on same-FS), and only after every group is staged do weshutil.rmtreethe staging dirs to actually delete the data.os.replaceeach back to its original location. If rollback itself fails, the staging dir paths are surfaced so the user can recover manually (the original concern from the issue's "rollback granularity" question).config.d,memories,uploads) — one rename per subdir instead of N, and removes the post-deletion empty-prune dance the old code needed for those paths.~/.memtomem/) can't satisfyos.replace's same-FS atomicity guarantee. Detected upfront viast_devprobe and refused with a clear message; late-detection onEXDEVfromos.replacecovers what the upfront probe misses (stat race, transient errors). Detect-and-refuse rather than copy+delete fallback — the whole point of staging is atomicity.--keep-config/--keep-datastill exempt their respective groups from the plan, so kept paths are never moved into staging at all.Acceptance criteria
test_rollback_restores_original_state_on_mid_stage_failure).--keep-config/--keep-data/ partial-error coverage still passes (28 + 1 skipped before, 31 + 1 skipped after).EXDEV(real cross-FS layouts are too environment-dependent for CI; same code path either way).Out of scope
--forcecontract from Windows: contract formm uninstall --forceagainst a live writer is undefined #730 / fix(cli): refusemm uninstall --forceon Windows when writer is alive #756 — that decision stands; the upfront refusal still fires before staging would even start.Test plan
uv run ruff check+ruff format --check(clean)uv run mypy packages/memtomem/src/memtomem/cli/uninstall_cmd.py(clean)uv run pytest packages/memtomem/tests/test_uninstall_cmd.py -m "not ollama"— 31 passed, 1 skipped (Windows-only, as expected on macOS)uv run pytest packages/memtomem/tests/test_cli_*.py -m "not ollama"— 63 passed (no sibling regressions)test_force_refuses_on_windows_when_writer_alivestill fires before staging (refusal precedes_delete_inventory); new tests are platform-agnostic (monkeypatch-based).🤖 Generated with Claude Code