uninstall: transactional wipe via staging dir + atomic rename (Q2 follow-up from #730) #757

@memtomem

Context

Follow-up from #730 (closed by #756). The Windows contract decision in #730 only addressed the case where _delete_inventory is guaranteed to fail (Windows + live writer + --force). This issue covers the orthogonal partial-failure case that affects all platforms.

Problem

Today, _delete_inventory (packages/memtomem/src/memtomem/cli/uninstall_cmd.py:470-541) deletes inventory groups in order: session/pid → fragments → backups → config → memories → uploads → database. If any step fails mid-run (permission denied, transient FS error, antivirus interference on Windows, etc.), the partial-error path at uninstall_cmd.py:647-656 raises _UninstallPartialError and the user is left with a half-gone state directory — config and memories may already be deleted while the DB and runtime files survive.

Current output for partial failure:

Deletion failed at <path>: <error>
  Successfully removed up to: <group>

…which is informative but doesn't help the user recover. Re-running uninstall against a half-cleaned dir behaves inconsistently because some inventory groups are already missing, and manually reconstructing state from a half-deletion is impractical.

This was acknowledged but explicitly deferred from #730 / #756 to keep that PR focused on the Windows contract decision.

Proposed approach

Make the wipe transactional: move each inventory group into a sibling staging directory (e.g., ~/.memtomem/.uninstall-staging-<pid>/) with os.replace, and delete the staging directory only once every group has been staged:

  1. Create a fresh staging dir (e.g., ~/.memtomem/.uninstall-staging-<pid>/).
  2. Move each inventory group's paths into staging instead of deleting in-place. os.rename / os.replace is atomic on the same filesystem.
  3. After all groups are staged successfully, shutil.rmtree(staging_dir) to actually delete.
  4. On any mid-run failure: roll back by moving staged paths back to their original locations (or leaving staging in place and instructing the user to recover).
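
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the actual _delete_inventory code: the function name staged_wipe, its signature, and the staged-name scheme are hypothetical, and it assumes staging lives on the same filesystem as the paths being moved.

```python
import os
import shutil
from pathlib import Path


def staged_wipe(state_dir: Path, groups: list[list[Path]]) -> None:
    """Hypothetical sketch: stage every path, then delete; roll back on failure."""
    staging = state_dir / f".uninstall-staging-{os.getpid()}"
    staging.mkdir()
    moved: list[tuple[Path, Path]] = []  # (original, staged) pairs, in move order
    try:
        for group in groups:
            for src in group:
                if not src.exists():
                    continue  # nothing to stage for this path
                # Prefix with a counter so equal basenames cannot collide.
                dst = staging / f"{len(moved)}-{src.name}"
                os.replace(src, dst)  # atomic when src and dst share a filesystem
                moved.append((src, dst))
    except OSError:
        # Roll back in reverse order so each path returns to its original location.
        # If a rollback move itself fails, that error propagates and staging stays.
        for src, dst in reversed(moved):
            os.replace(dst, src)
        staging.rmdir()
        raise
    # Every group is staged; the irreversible delete happens only now.
    shutil.rmtree(staging)
```

If the rollback loop itself raises, the staging directory is left in place, which matches the "leave staging + print recovery instructions" option discussed under the open design questions.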

Open design questions:

  • Cross-FS edge case: if ~/.memtomem/ and the staging sibling end up on different filesystems (rare but possible with bind mounts / symlinks), os.rename / os.replace raises OSError (EXDEV) instead of moving; falling back to shutil.move would copy+delete and lose atomicity. Document the assumption or detect-and-refuse.
  • Rollback granularity: all-or-nothing rollback is the cleanest model, but if rollback itself fails (e.g., destination perms changed mid-run), we need a clear surfacing path. Probably just leave staging in place + print recovery instructions.
  • --keep-config / --keep-data interaction: the existing flags partially exempt groups from deletion. Staging logic must respect those exemptions (don't move kept paths into staging at all).
  • Externals: external integrations (_probe_external_integrations) are detected-but-unmodified today. Staging shouldn't change that — only _delete_inventory's own paths should be transactional.
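
Two of these questions lend themselves to small helpers, sketched below under stated assumptions. The names same_filesystem, is_cross_device, and paths_to_stage are hypothetical, as is modelling kept groups as a set of group names. Comparing st_dev is only a heuristic: on Linux, rename fails across two mount points even when they expose the same filesystem, so checking for EXDEV on the actual os.replace call remains the authoritative test.

```python
import errno
import os
from pathlib import Path


def same_filesystem(a: Path, b: Path) -> bool:
    """Heuristic pre-check: True when both existing paths report the same device ID."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def is_cross_device(exc: OSError) -> bool:
    """True when a failed rename/replace was rejected for crossing filesystems."""
    return exc.errno == errno.EXDEV


def paths_to_stage(groups: dict[str, list[Path]], kept: set[str]) -> list[Path]:
    """Flatten groups in order, skipping every group the user asked to keep."""
    return [p for name, paths in groups.items() if name not in kept for p in paths]
```

With helpers like these, a detect-and-refuse design would check same_filesystem before creating staging, and kept groups would never enter the staging directory at all.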

Out of scope

Acceptance criteria

  • Uninstall is all-or-nothing: if any single group fails, the user's state directory is left in its original (pre-uninstall) state; otherwise all groups are deleted. No half-states.
  • Existing test coverage for _keep_config / _keep_data / partial-error exit still passes.
  • New test: induce a mid-_delete_inventory failure (e.g., monkeypatch shutil.rmtree to raise on the 3rd group) and assert the original state dir is intact.
  • New test: cross-filesystem detection (skipif Windows or skipif no second mount available) — assert refusal-or-fallback behavior matches the chosen design.
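
For the induced-failure test, one possible fault-injection helper is sketched below. fail_on_nth_call is a hypothetical name, not an existing helper in the test suite; it wraps any callable so that one chosen call raises while the others pass through.

```python
from typing import Any, Callable


def fail_on_nth_call(real: Callable[..., Any], n: int, exc: Exception) -> Callable[..., Any]:
    """Wrap `real` so the n-th call (1-based) raises `exc`; other calls pass through."""
    count = 0

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        nonlocal count
        count += 1
        if count == n:
            raise exc
        return real(*args, **kwargs)

    return wrapper
```

In a pytest test this could be installed with the built-in monkeypatch fixture, e.g. monkeypatch.setattr(shutil, "rmtree", fail_on_nth_call(shutil.rmtree, 3, PermissionError("denied"))), after which the test would assert that the original state dir is intact.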

References
