ci::apm-sync wipes .claude/ mid-run, breaking running Claude Code stop hooks #468

@srid

Problem

just ci::apm-sync depends on the apm recipe, which implements its "clean sync" by deleting and recreating the APM-owned tree in place, inside the live .claude/ directory:

apm:
    find .claude -mindepth 1 -maxdepth 1 ! -name launch.json -exec rm -rf {} +
    {{ apm_cmd }} install

From the moment find ... -exec rm -rf runs until apm install finishes, .claude/settings.json and everything under .claude/hooks/, .claude/commands/, .claude/rules/, and .claude/skills/ is either missing from disk or only partially restored.

If a developer is running Claude Code in the same worktree while just ci is running, and Claude Code's stop hook fires during that window, the hook fails with:

Stop hook error: Failed with non-blocking status code: /bin/sh: line 1:
.claude/hooks/agency/scripts/do-stop-guard.sh: No such file or directory

I hit this today while /do was running just ci in the background — the stop hook fired at a turn boundary during the apm-sync step and couldn't find its own script.
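The window can be reproduced without touching a real worktree. The sketch below is only an illustration. It builds a throwaway directory with placeholder files, runs the same find command as the apm recipe, and then invokes the hook path the way the error message suggests it is invoked (via /bin/sh). The stand-in file contents and the variable names demo, hook_status, and remaining are made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway stand-in for a worktree; nothing here touches a real .claude/.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/.claude/hooks/agency/scripts"
printf '{}\n' > "$demo/.claude/launch.json"
printf '{}\n' > "$demo/.claude/settings.json"
printf '#!/bin/sh\nexit 0\n' > "$demo/.claude/hooks/agency/scripts/do-stop-guard.sh"
chmod +x "$demo/.claude/hooks/agency/scripts/do-stop-guard.sh"
cd "$demo"

# Same wipe as the apm recipe: delete every top-level entry except launch.json.
find .claude -mindepth 1 -maxdepth 1 ! -name launch.json -exec rm -rf {} +

# This is what a stop hook sees until the reinstall finishes.
hook_status=0
sh -c '.claude/hooks/agency/scripts/do-stop-guard.sh' 2>/dev/null || hook_status=$?
remaining=$(ls -A .claude)

echo "hook exit status: $hook_status"
echo "left in .claude/: $remaining"
```

After the wipe only launch.json is left, and the hook invocation exits non-zero because its script no longer exists.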

Why this is fragile

apm-sync is supposed to be a verification step ("does the vendored tree match the sources?") — but it mutates the live tree to perform that verification. Any concurrent reader of .claude/ during CI sees a half-built filesystem. That includes:

  • Claude Code sessions running in the same worktree (stop hooks, pre-tool hooks, skill loads, .claude/rules/*.md reads)
  • Editors/LSPs that watch .claude/
  • Other just recipes that expect .claude/ to be stable

Suggested fix

Perform the wipe-and-reinstall in a scratch directory, then compare against the live tree without touching it:

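apm-sync: apm-audit
    #!/usr/bin/env bash
    set -euo pipefail
    # Build the expected tree in a scratch directory; the live .claude/ is only read.
    scratch=$(mktemp -d)
    trap 'rm -rf "$scratch"' EXIT
    mkdir -p "$scratch/.claude"
    # launch.json is excluded from the wipe, so carry it over to keep the diff quiet.
    cp -a .claude/launch.json "$scratch/.claude/" 2>/dev/null || true
    # Assumes apm install writes to ./.claude and can find its inputs from the
    # scratch directory; the sources (e.g. .apm/) may need to be copied in first.
    (cd "$scratch" && {{ apm_cmd }} install)
    diff -r .claude "$scratch/.claude" || {
        echo "ERROR: .claude/ out of sync with .apm/; run: just ai::apm"
        exit 1
    }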

This keeps the CI check's semantics (fail if the live tree diverges from what apm install would produce) without ever wiping the live tree. The just ai::apm developer recipe can keep its destructive behavior — that's an explicit local-dev action, not a CI check.

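Alternatively, narrow the window: stage a fresh tree under .claude.new/, then swap it into place with mv so that nothing is missing for more than a few milliseconds. That's simpler, but with plain mv, replacing a non-empty directory takes two renames (move the old tree aside, then move the new one in), so it is still briefly inconsistent: a stop hook that fires between the two renames finds no .claude/ at all.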
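A minimal sketch of that staged swap, run against a throwaway directory. The helper name swap_tree, the .old suffix, and the placeholder file contents are assumptions for the example, not part of the existing justfile. Producing the staged tree (running apm install so that it writes into .claude.new/) is not shown.

```shell
#!/usr/bin/env bash
set -euo pipefail

# swap_tree LIVE STAGED: replace directory LIVE with the fully built STAGED.
# Each mv is a single rename when both paths are on the same filesystem,
# but two renames are needed, so LIVE is absent for the instant between them.
swap_tree() {
    local live=$1 staged=$2 old="$1.old"
    rm -rf "$old"
    mv "$live" "$old"      # LIVE is missing from here...
    mv "$staged" "$live"   # ...until this rename completes
    rm -rf "$old"
}

# Demo in a throwaway directory; the file contents are placeholders.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/.claude/hooks" "$demo/.claude.new/hooks"
echo old > "$demo/.claude/hooks/guard.sh"
echo new > "$demo/.claude.new/hooks/guard.sh"

swap_tree "$demo/.claude" "$demo/.claude.new"
result=$(cat "$demo/.claude/hooks/guard.sh")
echo "$result"
```

The staged tree has to live on the same filesystem as .claude/ (a sibling such as .claude.new/ does), otherwise mv falls back to copy-and-delete and the window grows back to the length of a full copy.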
