Quality gate plugin for Claude Code. Blocks Claude from stopping (via Claude Code's Stop hook) until work passes review by an independent agent. The independent agent uses consensus review when possible (asking Codex and/or Gemini for second opinions).
Be aware:
- It's kind of like a "super ultra extra thinking" mode for Claude Code, with some interesting and useful properties.
- If you're familiar with Codex, it makes Claude Code feel like Codex with high/xhigh reasoning, while keeping the coding capabilities and speed of Claude Code.
- Can intentionally be used to run Claude Code on a task for many hours without intervention.
- If you're tight on tokens, I would not recommend it unless you're on a variant of the Max plan. The reviews are extensive and exhaustive, and token usage is correspondingly large.
- I'm thinking about how to optimize this via prompting, but that work hasn't been done yet.
- When Codex takes part in consensus review on high or xhigh reasoning, each review can take several minutes, but it is extremely thorough. Keep this in mind.
For best results, I would mix in Codex and/or Gemini: just install the CLIs and authenticate them, and alice will pick them up automatically. Mixing multiple agents into the review process seems to really improve the steering.
What this plugin doesn't solve: what you desire or how you communicate it. Be clear about what you want before you turn it on.
curl -fsSL https://evil-mind-evil-sword.github.io/releases/alice/install.sh | sh
This installs:
- jwz - Agent messaging
- tissue - Issue tracking
- jq - JSON parsing (if needed)
- The alice plugin (registered with Claude Code)
The other two binaries (jwz and tissue) are small Zig programs that let Claude Code store issues and messages and retain state (all in JSONL + SQLite, like beads). alice uses them to track the state required to enforce the reviewer pattern, and they also give Claude Code a place to store issues, research notes, and so on. The plugin assumes these binaries are available and contains explicit instructions for how the agent should use them. The installer exists so that you shouldn't have to think about them.
#alice <your prompt>
alice uses Claude Code's UserPromptSubmit hook to inspect your prompt and check whether it invokes #alice. If it does, alice uses jwz to set a session message, which enables the Stop hook's review gate.
Review is opt-in per-prompt. After alice approves, the gate resets automatically.
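The prompt check described above can be sketched as a small shell helper. This is a minimal sketch, not alice's actual parser: the function names `is_alice_prompt` and `strip_alice_marker` are hypothetical, and it assumes the marker must be followed by a space or end the prompt.

```shell
#!/bin/sh
# Sketch only: succeeds when the prompt starts with the #alice marker.
# Function names and the exact matching rule are assumptions for illustration.
is_alice_prompt() {
  case "$1" in
    "#alice"|"#alice "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Remove the marker (and one following space) to recover the task text.
strip_alice_marker() {
  prompt=$1
  rest=${prompt#"#alice"}
  printf '%s\n' "${rest# }"
}

# Example: decide whether review should be enabled for this prompt.
if is_alice_prompt "#alice Fix the auth bug"; then
  echo "review enabled for: $(strip_alice_marker "#alice Fix the auth bug")"
fi
```

A real hook would read the prompt from the JSON that Claude Code passes on stdin and would record the enabled state via jwz; both steps are left out here.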
LLMs struggle to reliably evaluate their own outputs (Huang et al., 2023). A model asked to verify its work tends to confirm rather than critique. This creates a gap in agentic coding workflows—agents can exit believing they've completed a task when issues remain.
Research on multi-agent debate suggests a path forward: models produce more accurate outputs when they critique each other (Du et al., 2023; Liang et al., 2023).
alice applies this idea: rather than prompting agents to review themselves, it blocks exit until an independent reviewer (alice, a subagent) explicitly approves.
Agent works → tries to exit → Stop hook → alice reviewed? → block/allow
- #alice at start of prompt enables review (using session state stored via jwz)
- Stop hook runs on every agent "stop" attempt (when Claude Code stops and waits for you)
- If review enabled but no approval: blocks exit, agent must spawn alice
- alice (adversarial reviewer) examines the work
- Creates tissue issues for problems found
- Posts decision: COMPLETE allows exit, ISSUES keeps agent working
- The loop repeats until alice is satisfied that the main agent has completed the task in your prompt. alice is directed to disregard the main agent's attempts to persuade it that the task is done and to take an independent perspective instead.
- Otherwise, Claude Code operates normally.
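The block/allow logic in the list above can be sketched as follows. This is a rough sketch under stated assumptions: the real hook reads its state from jwz and tissue, whose commands are not shown here, so the state is passed in as plain arguments, and the function names are hypothetical. The JSON shape follows Claude Code's documented way for a Stop hook to block (a `decision` of `block` plus a `reason`).

```shell
#!/bin/sh
# Sketch only. Arguments stand in for state the real hook would read from jwz:
#   $1  review_enabled: "true" or "false"
#   $2  latest alice decision: "COMPLETE", "ISSUES", or "" if none was posted
gate_decision() {
  review_enabled=$1
  decision=$2
  if [ "$review_enabled" != "true" ]; then
    echo allow            # review was never requested: operate normally
    return 0
  fi
  case "$decision" in
    COMPLETE) echo allow ;;   # alice approved
    *)        echo block ;;   # ISSUES or no review yet
  esac
}

# Print the JSON that tells Claude Code to keep working; print nothing to allow.
# The reason must not contain double quotes in this simplified version.
emit_hook_output() {
  if [ "$1" = "block" ]; then
    printf '{"decision":"block","reason":"%s"}\n' "$2"
  fi
}

emit_hook_output "$(gate_decision true "")" "Review required: spawn alice"
```

Keeping the decision in a pure function makes the gate easy to test without a live session.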
┌────────────────────────────────────────────────────────────────┐
│                          Claude Code                           │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                       alice plugin                       │  │
│  │                                                          │  │
│  │   ┌─────────┐                                            │  │
│  │   │  alice  │  Reviewer agent (Claude Opus)              │  │
│  │   │         │  Read-only: cannot modify files            │  │
│  │   └────┬────┘                                            │  │
│  │        │ posts decision                                  │  │
│  │        ▼                                                 │  │
│  │  ┌───────────┐         ┌───────────┐                     │  │
│  │  │    jwz    │         │  tissue   │                     │  │
│  │  │ (messages)│         │ (issues)  │                     │  │
│  │  └───────────┘         └───────────┘                     │  │
│  │        ▲                     ▲                           │  │
│  │        │ reads status        │ checks issues             │  │
│  │        │                     │                           │  │
│  │  ┌─────┴─────────────────────┴─────┐                     │  │
│  │  │            Stop Hook            │                     │  │
│  │  │      (hooks/stop-hook.sh)       │                     │  │
│  │  └─────────────────────────────────┘                     │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘
Three principles guide alice's architecture:
| Principle | Implementation |
|---|---|
| Pull over push | Agents retrieve context on demand, not via large upfront injections |
| Safety over policy | Critical guardrails enforced mechanically (hooks), not via prompts |
| Pointer over payload | Messages contain references (issue IDs, session IDs), not inline content |
alice extends Claude Code with domain-specific capabilities:
| Skill | Purpose | When to Use |
|---|---|---|
| reviewing | Query other LLMs for a second opinion | When you want another model to check the work |
| researching | Cited research with source verification | Complex topics requiring evidence |
| issue-tracking | Git-native work tracking via tissue | Managing tasks, dependencies, priorities |
| technical-writing | Multi-layer document review (structure/clarity/evidence) | Documentation, design docs, papers |
| bib-managing | Bibliography curation with bibval | Academic citations, reference validation |
The reviewing skill queries external models for independent perspectives:
Priority: codex CLI → gemini CLI → claude -p fallback
This provides a second opinion from a different model. Configure the specific models via each CLI's settings.
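The priority order above amounts to "use the first CLI found on PATH". Here is a minimal sketch of that selection, assuming availability is the only criterion; `pick_reviewer` is a hypothetical name, and the skill's actual logic may also consider authentication or configuration.

```shell
#!/bin/sh
# Sketch only: print the first reviewer CLI found on PATH, in priority order.
# Returns non-zero when none of the three is installed.
pick_reviewer() {
  for cli in codex gemini claude; do
    if command -v "$cli" >/dev/null 2>&1; then
      echo "$cli"
      return 0
    fi
  done
  return 1
}

# Example: report which reviewer would be used on this machine.
if reviewer=$(pick_reviewer); then
  echo "second opinion via: $reviewer"
else
  echo "no reviewer CLI found" >&2
fi
```

When the result is `claude`, the caller would invoke it in print mode (`claude -p`) as the fallback; how codex and gemini are invoked is left to each CLI's own documentation.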
alice is the reviewer agent. It critiques the main agent's work rather than accepting it uncritically. Key properties:
- Model: Claude Opus
- Access: Read-only (cannot modify files)
- Tools: Read, Grep, Glob, Bash (restricted to tissue and jwz commands)
alice reviews proportionally to scope:
- Simple Q&A → instant COMPLETE
- Bug fix → verify the fix is correct
- New feature → check implementation completeness
- Refactor → ensure behavior is preserved
When issues are found, alice creates tissue issues tagged alice-review and blocks exit until they're resolved.
| Area | Reference | Relevance to alice |
|---|---|---|
| Self-correction limits | Huang et al., 2023 | Motivates using a separate reviewer rather than self-review |
| Multi-agent debate | Du et al., 2023 | Supports querying multiple models for review |
| Constitutional AI | Bai et al., 2022 | Informs alice's structured critique approach |
| Code review practices | Sadowski et al., 2018 | Supports mandatory review before landing code |
| Dependency | Purpose | Required |
|---|---|---|
| jwz | Agent messaging and coordination | Yes |
| tissue | Git-native issue tracking | Yes |
| jq | JSON parsing in hooks | Yes |
| codex | OpenAI CLI for second opinions (reviewing skill) | Optional |
| gemini | Google CLI for second opinions (reviewing skill) | Optional |
| bibval | Citation validation (bib-managing skill) | Optional |
alice captures session events for post-hoc analysis via the alice CLI:
# Show trace for a session
alice trace <session_id>
# Verbose mode - see tool inputs and outputs
alice trace <session_id> -v
# Export as GraphViz DOT
alice trace <session_id> --format dot > trace.dot
Example output:
=== Session abc123 ===
[1] prompt_received: "Fix the auth bug" (01KE5DEF)
[2] tool_completed: Read (01KE5GHI)
[3] tool_completed: Edit (01KE5JKL)
[4] tool_completed: Bash [FAILED] (01KE5MNO)
4 events total
=== End Session ===
Traces help debug tool failures and understand agent behavior.
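For tool failures in particular, the text output can be filtered with standard tools. This is a small sketch that assumes the line format matches the example output above (failed events carry a literal `[FAILED]` tag); the helper names are hypothetical and are not part of the alice CLI.

```shell
#!/bin/sh
# Sketch only: helpers that read `alice trace` text output on stdin.
# They assume failed events are tagged with the literal string [FAILED].

# Print only the failed events (prints nothing if there are none).
failed_events() {
  grep -F '[FAILED]' || true
}

# Print the number of failed events; grep -c already prints 0 on no match.
count_failed() {
  grep -cF '[FAILED]' || true
}

# Usage, with SESSION_ID set to a real session id:
#   alice trace "$SESSION_ID" | failed_events
#   alice trace "$SESSION_ID" | count_failed
```

On the example session above, `failed_events` would print the single Bash event and `count_failed` would print 1.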
alice/
├── .claude-plugin/
│   └── plugin.json           # Plugin metadata
├── agents/
│   └── alice.md              # Adversarial reviewer
├── hooks/
│   ├── hooks.json            # Hook configuration
│   ├── stop-hook.sh          # Quality gate
│   └── user-prompt-hook.sh
├── skills/
│   ├── reviewing/            # Multi-model consensus
│   ├── researching/          # Cited research
│   ├── issue-tracking/       # tissue integration
│   ├── technical-writing/    # Document review
│   └── bib-managing/         # Bibliography curation
├── src/                      # Zig CLI source
│   ├── main.zig
│   ├── root.zig
│   └── trace.zig
├── docs/
│   ├── architecture.md       # Detailed design
│   └── references.bib        # Academic sources
└── tests/
    └── stop-hook-test.sh     # Hook tests
- Architecture documentation - Detailed design and flow diagrams
- Contributing guide - How to add agents and skills
- Changelog - Version history
AGPL-3.0