feat(session): persist agent session IDs on quit and resume on next launch #238
forketyfork merged 9 commits into main
…aunch
Issue: When Architect quits while an AI agent (Claude, Codex, or Gemini) is running in a terminal, the session context is lost and the user has to manually restart and resume the agent on next launch.
Solution: Detect the foreground agent at quit time via macOS sysctl (p_comm for direct binaries, KERN_PROCARGS2 argv[1] for Node.js wrappers), send SIGTERM to the foreground process group, drain PTY output for up to 1.5 s, extract the session UUID from the agent's exit message, and persist agent type + UUID in persistence.toml. On next launch, the resume command is written to the PTY after the shell spawns, restoring the session.
…e checking exit
Three bugs found from log analysis:
- Claude CLI p_comm is its version string (e.g. "2.1.50"), not "claude", so direct p_comm matching failed; fix: run the KERN_PROCARGS2 scan for any unrecognized p_comm and check exec_path (full binary path) in addition to argv[1]
- Gemini argv[1] doesn't contain "gemini" in all installs; the exec_path check covers it
- drainOutputForMs exited immediately after SIGTERM because getForegroundPgrp saw the shell as foreground before we polled for output; fix: remove the early fg_pgrp check so we drain until waitpid, POLLHUP, or the 1.5 s deadline
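The matching order described in these fixes can be sketched in Python as a pure function. This is an illustrative sketch, not the Zig code in the PR: `classify_agent`, the substring matching, and the example paths are assumptions, and the real values would come from `sysctl` rather than function arguments.

```python
AGENTS = ("claude", "codex", "gemini")

def classify_agent(p_comm, exec_path="", argv1=""):
    """Return the matched agent name, or None if nothing matches.

    p_comm, exec_path and argv1 stand in for the values read via
    KERN_PROC_PID and KERN_PROCARGS2 (hypothetical inputs here).
    """
    name = p_comm.lower()
    if name in AGENTS:
        return name  # direct binary, e.g. p_comm == "codex"
    # Unrecognized p_comm ("node", or a version string such as "2.1.50"):
    # fall back to the executable path first, then argv[1].
    for candidate in (exec_path, argv1):
        lowered = candidate.lower()
        for agent in AGENTS:
            if agent in lowered:  # substring match is an assumption
                return agent
    return None

# Hypothetical install paths, for illustration only.
print(classify_agent("2.1.50", exec_path="/opt/tools/claude/versions/2.1.50"))
print(classify_agent("node", argv1="/usr/local/bin/codex"))
print(classify_agent("zsh", exec_path="/bin/zsh"))
```

A substring match can produce false positives when an unrelated path happens to contain an agent name, so a stricter rule (for example, matching path components) may be preferable.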
…t-specific prefix
Searching for the last RFC 4122 UUID in the drained PTY output is simpler and more robust than matching per-agent resume command prefixes. It works regardless of how an agent formats its exit message, and doesn't need updating when agents change their CLI syntax. The last UUID in the output is taken, so earlier UUIDs in scrollback don't shadow the session ID printed just before exit.
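A minimal Python sketch of the "last UUID wins" idea follows. It is not the implementation in `terminal_history.zig`: the function name `last_uuid` is hypothetical, and the pattern accepts any 8-4-4-4-12 hex string without checking the RFC 4122 version or variant bits.

```python
import re

# Generic 8-4-4-4-12 hex layout; version/variant nibbles are not validated.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def last_uuid(text):
    """Return the last UUID-shaped token in text, or None if there is none."""
    matches = UUID_RE.findall(text)
    return matches[-1] if matches else None

# Made-up scrollback: an older UUID followed by the exit message.
scrollback = (
    "request id 11111111-2222-4333-8444-555555555555\n"
    "To resume, run: claude --resume aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee\n"
)
print(last_uuid(scrollback))
```

Taking the last match rather than the first is what keeps earlier UUIDs in scrollback from shadowing the session ID printed at exit.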
…t leak
Issue: After SIGTERM teardown, codex detection failed silently and no UUID was found in terminal output, with no visibility into why. WaitContext allocations also leaked on quit because the xev loop stops before callbacks fire, leaving the completion.state() != .dead check always skipping the free.
Solution: Added debug logging in parseArgv to show exec_path and argv[1] so codex's actual install path is visible in logs. Added a 300-char tail log of the terminal text after drain so we can verify whether the UUID is present at all. Fixed deinit to free WaitContext unconditionally: deinit runs after the event loop stops, so processExitCallback will never fire for pending completions and the old completion.state() guard caused a guaranteed leak for every session reaped via drainOutputForMs.
…ture
Issue: Agents never printed their session UUID at quit because SIGTERM causes an immediate unclean exit. Claude only outputs "To resume, run: claude --resume <uuid>" when it processes the /exit command.
Solution: Write Ctrl+C + agent-specific exit command to the PTY master fd before draining. Ctrl+C (0x03) interrupts any in-progress generation, then the exit command (/exit for claude and gemini, q for codex) triggers a graceful shutdown that includes the resume UUID. Drain window extended to 2 s to give the agent time to print the UUID and terminate. If the agent doesn't exit within the drain window, SIGTERM is sent as a fallback with an additional 500 ms drain. Also added argc/raw-bytes logging in parseArgv to diagnose why codex's KERN_PROCARGS2 exec_path is empty.
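The shutdown sequence above might look roughly like this in Python. It is a sketch under stated assumptions, not the PR's Zig code: `exit_sequence`, `drain`, `graceful_stop`, and the `is_alive`/`send_sigterm` callbacks are hypothetical names, and the trailing carriage return after the exit command is a guess at what submits the line.

```python
import os
import select
import time

# Exit commands as listed in the commit message.
EXIT_COMMANDS = {"claude": b"/exit", "gemini": b"/exit", "codex": b"q"}

def exit_sequence(agent):
    """Ctrl+C (0x03), then the agent's exit command; the "\\r" is an assumption."""
    return b"\x03" + EXIT_COMMANDS[agent] + b"\r"

def drain(fd, window_s):
    """Read from fd until EOF, a read error, or window_s seconds have elapsed."""
    deadline = time.monotonic() + window_s
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break  # window expired with no further output
        try:
            data = os.read(fd, 4096)
        except OSError:
            break  # a closed PTY can surface as an I/O error
        if not data:
            break  # EOF: the other side is gone
        chunks.append(data)
    return b"".join(chunks)

def graceful_stop(fd, agent, is_alive, send_sigterm, window_s=2.0, fallback_s=0.5):
    """Ask the agent to exit, drain, then fall back to SIGTERM if it is still alive.

    is_alive and send_sigterm are caller-supplied callbacks (hypothetical).
    """
    os.write(fd, exit_sequence(agent))
    output = drain(fd, window_s)
    if is_alive():
        send_sigterm()
        output += drain(fd, fallback_s)
    return output
```

Keeping the fallback drain short bounds the total quit delay at roughly 2.5 s per agent in the worst case.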
Pull request overview
This PR implements automatic persistence and resumption of AI agent sessions (Claude, Codex, Gemini) when Architect quits and restarts. When the user quits Architect with agents running, the application detects them via macOS process inspection, signals them gracefully via a background worker thread, captures their session UUIDs from terminal output, and persists them in persistence.toml. On next launch, the resume commands are written directly to the PTY so agents restart automatically.
Changes:
- Added agent session detection and persistence using macOS `sysctl`/`KERN_PROCARGS2` with OSC 1 icon fallback
- Implemented asynchronous quit teardown worker with parallel per-agent tasks and UI blocking overlay
- Extended `persistence.toml` schema with `terminal_agent_types` and `terminal_session_ids` arrays
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/ui/mod.zig | Registers new quit_blocking_overlay component |
| src/ui/components/quit_blocking_overlay.zig | Full-screen animated overlay that blocks input during agent teardown |
| src/session/state.zig | Adds AgentKind enum, agent detection via process inspection, OSC 1 scanning, and new session fields for agent metadata |
| src/config.zig | Extends TerminalEntry with agent_type and agent_session_id, updates TOML serialization/deserialization |
| src/app/terminal_history.zig | Implements UUID extraction from terminal text and resume command construction |
| src/app/runtime.zig | Orchestrates quit teardown worker, UUID extraction after completion, and resume command injection on startup |
| docs/configuration.md | Documents new terminal_agent_types and terminal_session_ids persistence fields |
| docs/ARCHITECTURE.md | Adds ADR-014 documenting agent detection strategy and quit/resume flow |
| README.md | Describes agent session persistence feature |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a76615b4d8
When you quit Architect with an AI agent running in one of your terminals, the session is gone. Next time you open it, you're starting from scratch: re-launching the agent, waiting for it to load, manually running the resume command if you even remembered to save the session ID.
This PR fixes that. Architect now detects running agents at quit time, signals them gracefully, captures their session UUID from the PTY output, and writes it to persistence.toml. On next launch, the resume command is written straight to the PTY before you've even focused the terminal.
How it works
- `session/state.detectForegroundAgent()` reads the foreground process group leader's `p_comm` via `sysctl KERN_PROC_PID`. If `p_comm` is `claude`, `codex`, or `gemini`, that's a match. If it's `node`, we read `KERN_PROCARGS2` and check `argv[1]` for the same names, which covers Node.js-wrapped agent CLIs.
- SIGTERM goes to the entire foreground process group (not just the leader), so subprocesses get the signal too. Then we block-read the PTY for up to 1.5 s using `poll`/`read`, stopping early if `waitpid` or `getForegroundPgrp` shows the agent is gone.
- `terminal_history.extractAgentSessionId()` scans the drained PTY text for the last occurrence of the agent's resume command prefix (e.g., `claude --resume`) and slices out what follows.
- `config.zig` gains two new optional arrays in `TomlPersistenceV3`, `terminal_agent_types` and `terminal_session_ids`, parallel to `terminals`. Old files load fine; the new fields default to absent.
- On the next launch, if both fields are present for a slot, `runtime.zig` appends the resume command to `session.pending_write` right after spawning the shell. The shell reads it when ready.

The 1.5 s drain runs synchronously on the main thread, outside the frame loop. This is the same exception as git I/O in the diff overlay (ADR-013), now documented as ADR-014.
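To illustrate the parallel-array layout and the restore step, here is a small Python sketch. It is not the code in `config.zig` or `runtime.zig`: `resume_commands` is a hypothetical helper, representing an absent entry as an empty string is an assumption, and only the `claude --resume <uuid>` form quoted in this PR is included (the other agents' resume syntax isn't stated here).

```python
# Only the resume form quoted in this PR; other agents are intentionally omitted.
RESUME_TEMPLATES = {"claude": "claude --resume {uuid}"}

def resume_commands(terminals, agent_types=None, session_ids=None):
    """Return one entry per terminal slot: the text to queue, or None.

    agent_types and session_ids mirror terminal_agent_types and
    terminal_session_ids; both may be missing entirely (old files).
    """
    agent_types = agent_types or []
    session_ids = session_ids or []
    commands = []
    for i in range(len(terminals)):
        agent = agent_types[i] if i < len(agent_types) else ""
        uuid = session_ids[i] if i < len(session_ids) else ""
        template = RESUME_TEMPLATES.get(agent)
        if template and uuid:
            # The trailing newline (so the shell runs the line) is an assumption.
            commands.append(template.format(uuid=uuid) + "\n")
        else:
            commands.append(None)
    return commands

# Two slots: one had Claude running at quit, the other was a plain shell.
print(resume_commands(
    ["term-0", "term-1"],
    ["claude", ""],
    ["aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee", ""],
))
# An old file without the agent fields yields nothing to resume.
print(resume_commands(["term-0"]))
```

Requiring both fields for a slot means a half-written entry degrades to a normal terminal restore instead of queueing a malformed command.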
Test plan
- Run `claude` in a terminal, quit Architect with Cmd+Q, reopen: verify Claude resumes the previous session automatically
- Repeat with `codex` (Node.js wrapper) and `gemini`
- Load an old `persistence.toml` (without agent fields): verify it loads without error and terminals restore normally

Closes #237