research(tools): transactional ShellExecutor — snapshot+rollback for atomic file operations (arXiv:2512.12806) #2414
Closed
Labels
P2 (High value, medium complexity), enhancement (New feature or request), research (Research-driven improvement), tools (Tool execution and MCP integration)
Description
Finding
Fault-Tolerant Sandboxing / Transactional Filesystem (arXiv:2512.12806)
The paper describes a policy-based interception layer combined with atomic filesystem transactions that support rollback. It reports 100% interception of high-risk commands, 100% rollback success, and 14.5% overhead (~1.8s per transaction). The system runs headlessly, without interactive authentication.
Applicability to Zeph
Zeph's ShellExecutor (zeph-tools/src/shell.rs) has audit logging and a confirm_patterns blocklist, but no transactional rollback. A snapshot-before-execution and rollback-on-failure mechanism would meaningfully improve safety for autonomous coding sessions.
Proposed design:
- Before executing a shell command that matches a "write" pattern (file creation, deletion, modification), create a filesystem snapshot of affected paths
- If the command fails or the agent decides to undo, restore the snapshot
- Surface as a `transactional = true` flag in `[tools.shell]` config
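The snapshot and restore steps above could look roughly like the following. This is a minimal sketch under stated assumptions, not the paper's mechanism and not Zeph's actual implementation: it copies file contents into memory, handles regular files only (no directories, permissions, or symlinks), and takes the affected paths as given. The names `create_snapshot` and `restore_snapshot` follow the issue's sketch; the `Snapshot` struct and the `run_transactional` helper are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical in-memory snapshot: `Some(bytes)` for files that existed
/// when the snapshot was taken, `None` for paths that did not exist.
pub struct Snapshot {
    entries: HashMap<PathBuf, Option<Vec<u8>>>,
}

/// Record the current state of each path (regular files only in this sketch).
pub fn create_snapshot(paths: &[PathBuf]) -> io::Result<Snapshot> {
    let mut entries = HashMap::new();
    for path in paths {
        let saved = match fs::read(path) {
            Ok(bytes) => Some(bytes),
            Err(e) if e.kind() == io::ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        entries.insert(path.clone(), saved);
    }
    Ok(Snapshot { entries })
}

/// Put every recorded path back into its snapshotted state:
/// rewrite files that existed, delete files that did not.
pub fn restore_snapshot(snapshot: Snapshot) -> io::Result<()> {
    for (path, saved) in snapshot.entries {
        match saved {
            Some(bytes) => {
                if let Some(parent) = path.parent() {
                    if !parent.as_os_str().is_empty() {
                        fs::create_dir_all(parent)?;
                    }
                }
                fs::write(&path, bytes)?;
            }
            None => match fs::remove_file(&path) {
                Ok(()) => {}
                Err(e) if e.kind() == io::ErrorKind::NotFound => {}
                Err(e) => return Err(e),
            },
        }
    }
    Ok(())
}

/// Hypothetical wrapper: snapshot, run `op`, and roll back if `op` fails.
/// The outer `io::Result` reports snapshot/restore errors; the inner
/// `Result` is whatever `op` returned.
pub fn run_transactional<T, E>(
    paths: &[PathBuf],
    op: impl FnOnce() -> Result<T, E>,
) -> io::Result<Result<T, E>> {
    let snapshot = create_snapshot(paths)?;
    let result = op();
    if result.is_err() {
        restore_snapshot(snapshot)?;
    }
    Ok(result)
}
```

Keeping the rollback error separate from the command's own error lets a caller tell "the command failed and was rolled back" apart from "the command failed and the rollback failed too". For large trees, an in-memory copy would likely need to be replaced by copy-on-write or a temporary directory.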
Implementation sketch
```toml
[tools.shell]
transactional = true  # opt-in snapshot+rollback
transaction_scope = ["*.rs", "*.toml", "src/**"]  # paths to snapshot
```

```rust
// In ShellExecutor: before execution
if config.transactional && affects_watched_paths(&cmd) {
    let snapshot = create_snapshot(&affected_paths)?;
    let result = execute(cmd).await;
    if result.is_err() && config.auto_rollback {
        restore_snapshot(snapshot)?;
    }
}
```

Priority
P2 — directly improves safety of autonomous coding sessions; relatively self-contained implementation in zeph-tools.
Source
- arXiv:2512.12806 — Fault-Tolerant Sandboxing for Headless LLM Agents