A modular runtime and orchestration system
for AI agents.
Structured pipelines, gated phases, specialized agents. Works with Claude Code, OpenCode, Codex CLI, Cursor, and Kiro. 3,750 tests. Production-grade.
AI models write code.
That's not the hard part anymore.
The hard part is everything else. Picking what to work on. Managing branches. Reviewing output. Cleaning up AI artifacts. Handling CI. Addressing reviewer comments. Deploying. AgentSys automates all of it.
20 Commands. One Toolkit.
Each works standalone. Together, they automate everything.
/next-task
Task to production, fully automated
- 12-phase pipeline: discovery through deployment
- Multi-agent review loop (code, security, perf, tests)
- Persistent state - resume from any phase
- GitHub Issues, GitLab, or local task files
$ /next-task # Start new workflow
$ /next-task --resume # Resume interrupted workflow
/agnix
Lint agent configs before they break
- 385 validation rules across 36 categories
- 10+ AI tools: Claude Code, Cursor, Copilot, Codex, OpenCode, Gemini CLI
- 102 auto-fixable rules with --fix flag
- SARIF output for GitHub Code Scanning
$ /agnix # Validate current project
$ /agnix --fix # Auto-fix fixable issues
/ship
Branch to merged PR in one command
- Commits, pushes, creates PR, monitors CI
- Waits for auto-reviewers, addresses every comment
- Platform auto-detection (GitHub Actions, Railway, Vercel)
- Merges, deploys, and cleans up
$ /ship # Full workflow
$ /ship --dry-run # Preview without executing
/deslop
Kill AI slop before it ships
- 3-phase detection: regex, multi-pass analyzers, CLI tools
- Certainty-graded findings (HIGH / MEDIUM / LOW)
- JS/TS, Python, Rust, Go, Java
- Auto-fix HIGH certainty issues
$ /deslop # Report only (safe)
$ /deslop apply # Fix HIGH certainty issues
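The phase-1 regex pass with certainty grading can be sketched roughly like this. The patterns, grades, and `Finding` shape here are illustrative stand-ins, not /deslop's actual rule set:

```typescript
// Minimal sketch of regex-based slop detection with certainty grading.
// Patterns and their grades are hypothetical examples.
type Certainty = "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  line: number;
  match: string;
  certainty: Certainty;
}

const rules: { pattern: RegExp; certainty: Certainty }[] = [
  // Leftover placeholder comments are almost always slop.
  { pattern: /\/\/ TODO: implement/i, certainty: "HIGH" },
  // Apologetic or meta filler comments are usually slop.
  { pattern: /\/\/ (Note|As an AI)/i, certainty: "MEDIUM" },
  // A one-line empty catch might be intentional, so grade it LOW.
  { pattern: /catch\s*\(\w*\)\s*\{\s*\}/, certainty: "LOW" },
];

function detect(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    for (const { pattern, certainty } of rules) {
      const m = text.match(pattern);
      if (m) findings.push({ line: i + 1, match: m[0], certainty });
    }
  });
  return findings;
}
```

Only findings graded HIGH would be eligible for `/deslop apply`; MEDIUM and LOW stay report-only.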
/perf
Evidence-backed performance investigation
- 10-phase methodology with baselines and profiling
- Hypothesis generation and controlled experiments
- Breaking point analysis via binary search
- Based on recordings of real investigation sessions
$ /perf # Start new investigation
$ /perf --resume # Resume previous investigation
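The breaking-point analysis above reduces to a binary search over load. A minimal sketch, where `holdsAt` stands in for an actual load-test run against the target (not the /perf harness itself):

```typescript
// Binary-search the highest load (e.g. requests/sec) at which the
// system still meets its SLO. `holdsAt` is a stand-in for running a
// real controlled experiment at that load level.
function findBreakingPoint(
  holdsAt: (load: number) => boolean,
  lo: number,
  hi: number,
): number {
  // Invariant: the system holds at `lo` and fails at `hi`.
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (holdsAt(mid)) lo = mid;
    else hi = mid;
  }
  return lo; // highest load that still met the SLO
}
```

Each probe is one controlled experiment, so the breaking point is located in O(log n) runs instead of a linear sweep.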
/drift-detect
Find what's documented but not built
- AST-based plan vs code semantic analysis
- JavaScript collectors + single Opus call
- 77% token reduction vs multi-agent approaches
- Tested on 1,000+ repositories
$ /drift-detect # Full analysis
$ /drift-detect --depth quick # Quick scan
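Conceptually, the plan-vs-code comparison is a set difference: symbols the plan mentions minus symbols the code actually exports. A simplified sketch (the real collectors walk ASTs rather than comparing name lists):

```typescript
// Simplified drift detection: anything the plan promises that the
// codebase never exports is flagged as drift. Inputs here are plain
// name lists; the actual collectors extract symbols from the AST.
function findDrift(planSymbols: string[], exported: string[]): string[] {
  const built = new Set(exported);
  return planSymbols.filter((s) => !built.has(s));
}
```

The collectors produce these symbol sets mechanically; the single Opus call only judges the resulting diff, which is where the token savings come from.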
/audit-project
Multi-agent code review that iterates until clean
- Up to 10 specialized agents per project
- Security, performance, architecture, DB, API, frontend
- Iterates until zero open issues remain
- Auto-fixes all non-false-positive findings
$ /audit-project # Full review
$ /audit-project --domain security # Security only
/enhance
Analyze everything that shapes agent behavior
- 7 parallel analyzers for prompts, agents, plugins, docs
- Certainty-graded findings with auto-fix support
- Auto-learns false positives over time
- Hooks and skills analysis included
$ /enhance # Run all analyzers
$ /enhance --apply # Apply HIGH certainty fixes
/repo-intel
Unified static analysis for AI agents
- Git history intelligence: hotspots, coupling, ownership, bus factor
- AST symbols: exports, functions, classes, imports
- 9 plugins consume repo-intel data automatically
- Incremental updates, 20 query types
$ /repo-intel init # First-time scan
$ /repo-intel query hotspots # Most active files
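The hotspot query above can be approximated from raw git history: files touched by the most commits rank highest. A sketch with an illustrative commit shape, not repo-intel's actual schema:

```typescript
// Hotspot scoring sketch: count how many commits touched each file,
// then return the most frequently changed files. The commit shape
// here is illustrative, not repo-intel's stored format.
function hotspots(commits: { files: string[] }[], top: number): string[] {
  const counts = new Map<string, number>();
  for (const c of commits)
    for (const f of c.files) counts.set(f, (counts.get(f) ?? 0) + 1);
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, top)
    .map(([file]) => file);
}
```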
/sync-docs
Keep docs in sync with code
- Finds outdated references and stale examples
- Detects missing CHANGELOG entries
- Version mismatch detection
- Auto-fixes safe issues like version numbers
$ /sync-docs # Check what needs updates
$ /sync-docs apply # Apply safe fixes
/learn
Research any topic, build a learning guide
- Progressive discovery: broad to specific to deep
- Quality-scored sources (authority, recency, depth)
- Structured guide with examples and best practices
- RAG index for future agent lookups
$ /learn react hooks --depth=deep # Comprehensive
$ /learn kubernetes --depth=brief # Quick overview
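Source quality scoring along the three axes above can be sketched as a weighted sum. The weights and 0-to-1 axis scores are illustrative, not /learn's actual model:

```typescript
// Rank discovered sources by a weighted quality score. Weights and
// axis values (0..1) are hypothetical examples.
interface Source {
  url: string;
  authority: number; // e.g. official docs high, anonymous blog low
  recency: number;   // newer material scores higher
  depth: number;     // worked examples and detail score higher
}

function rankSources(sources: Source[]): Source[] {
  const score = (s: Source) =>
    0.4 * s.authority + 0.3 * s.recency + 0.3 * s.depth;
  return [...sources].sort((a, b) => score(b) - score(a));
}
```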
/consult
Get a second opinion from another AI tool
- Cross-tool AI consultation via ACP transport
- 6 providers: Claude, Gemini, Codex, Copilot, Kiro, OpenCode
- Effort-mapped model selection per provider
- Session continuations and context injection
$ /consult "Is this the right approach?" --tool=gemini # Second opinion
$ /consult "Review for performance" --tool=codex # Codex review
/debate
Structured adversarial debate between AI tools
- Multi-round proposer/challenger format
- Evidence-backed arguments with mandatory counterpoints
- Any two AI tools as debaters (Claude, Gemini, Codex, Kiro, etc.)
- Final verdict from the orchestrator
$ /debate codex vs gemini about microservices vs monolith # Structured debate
$ /debate claude vs kiro about our auth implementation # Codebase debate
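The multi-round format above can be sketched as an orchestration loop: the proposer speaks, the challenger must respond, and after the final round a judge renders the verdict. `ask` stands in for a call out to an AI tool; round count and role names are illustrative:

```typescript
// Sketch of a proposer/challenger debate loop. `ask` is a stand-in
// for invoking an AI tool with the transcript so far; the orchestrator
// alternates sides each round and asks a judge for the final verdict.
type Role = "proposer" | "challenger" | "judge";

function debate(
  ask: (role: Role, context: string[]) => string,
  topic: string,
  rounds: number,
): { transcript: string[]; verdict: string } {
  const transcript: string[] = [topic];
  for (let r = 0; r < rounds; r++) {
    transcript.push(ask("proposer", transcript));
    transcript.push(ask("challenger", transcript)); // mandatory counterpoint
  }
  return { transcript, verdict: ask("judge", transcript) };
}
```

Because each side sees the full transcript, every round's argument must engage the previous counterpoint rather than restate the opening position.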
/web-ctl
Browser automation for AI agents
- Headless Playwright with encrypted session persistence
- Human-in-the-loop auth handoff with CAPTCHA detection
- Anti-bot measures and output sanitization
- Snapshot-based accessibility tree for element discovery
$ /web-ctl goto https://example.com # Navigate
$ /web-ctl auth github --url https://github.com/login # Auth handoff
/prepare-delivery
Pre-ship quality gates
- Deslop, simplify, review loop, delivery validation, docs sync
- Conditional agnix + enhance for config changes
- Works standalone or as part of /next-task
- Does not ship - use /gate-and-ship for full pipeline
$ /prepare-delivery # Run all quality gates
$ /prepare-delivery --skip-review # Skip review loop
/gate-and-ship
Quality gates then ship
- Chains /prepare-delivery then /ship
- One command from code-complete to merged PR
- All flags forwarded to sub-commands
- Each piece runs independently too
$ /gate-and-ship # Full pipeline
$ /gate-and-ship --base=develop # Custom base branch
/release
Versioned release with ecosystem detection
- Auto-detects: npm, cargo, go, python, maven, gradle
- Discovers release tools (semantic-release, goreleaser, etc.)
- Pre-release health check with repo-intel
- Tag, publish, create GitHub release
$ /release # Create release
$ /release --dry-run # Preview without publishing
/skillers
Learn from your workflow patterns
- Reads transcripts from Claude Code, Codex, OpenCode
- Clusters patterns into themed knowledge
- Suggests skills, hooks, and agents to automate repetitive work
- No per-turn overhead - works from saved transcripts
$ /skillers # Analyze workflow patterns
$ /skillers compact # Compact transcripts into knowledge
/onboard
Codebase orientation for newcomers
- Automated project data collection
- Interactive guided tour of the codebase
- Identifies key files, patterns, conventions
- Works on any codebase - no setup required
$ /onboard # Full onboarding tour
$ /onboard --quick # Quick overview
/can-i-help
Find where to contribute
- Matches developer skills to project needs
- Finds test gaps, stale docs, open issues
- Good-first-task identification
- Uses repo-intel for data-driven suggestions
$ /can-i-help # Find contribution opportunities
$ /can-i-help --skills=typescript # Match specific skills
Built Different
Not another AI wrapper. Engineering-grade workflow automation.
Code does code work. AI does AI work.
Static analysis, regex, and AST for detection. LLMs only for synthesis and judgment. 77% fewer tokens than multi-agent approaches.
One agent, one job, done well
47 specialized agents, each with a narrow scope and clear success criteria. No agent tries to do everything.
Pipeline with gates
Each step must pass before the next begins. Can't push before review. Can't merge before CI. Hooks enforce it.
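The gating idea can be sketched as a phase runner that halts the moment a gate fails, so later phases can never run on unreviewed or red-CI work. Phase names here are illustrative, not the actual pipeline:

```typescript
// Sketch of a gated pipeline: each phase must pass before the next
// begins. The phases below are illustrative examples.
interface Phase {
  name: string;
  run: () => boolean; // true = gate passed
}

function runPipeline(
  phases: Phase[],
): { completed: string[]; failedAt: string | null } {
  const completed: string[] = [];
  for (const phase of phases) {
    if (!phase.run()) return { completed, failedAt: phase.name };
    completed.push(phase.name);
  }
  return { completed, failedAt: null };
}
```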
Validate plan and results
Approve the plan. See the results. The middle is automated. One approval unlocks autonomous execution.
Benchmarks
Structured prompts and enriched context do more for output quality than model tier.
Sonnet + AgentSys beats raw Opus
Sonnet + AgentSys: $0.66, 6,084 tokens, specific recommendations. Raw Opus: $1.10, 2,841 tokens, generic output. 40% cheaper, 2x more output.
Model tier matters less
With AgentSys, Sonnet matches Opus quality. Pipeline structure captures the gains. 73-83% cost reduction with equivalent outcomes.
Invest in pipeline, not model spend
Better prompts, richer context, enforced phases - these compound in ways that model upgrades alone don't. Tested on real tasks against glide-mq.
47 Agents. 40 Skills.
Right model for the task. Opus reasons. Sonnet validates. Haiku executes.
Deep codebase analysis and context gathering
Step-by-step implementation design
Autonomous code writing and modification
Performance investigation coordination
Deep performance analysis and profiling
Web research and learning guide creation
Multi-source plan synthesis and merging
Agent configuration quality analysis
CLAUDE.md file optimization
Documentation quality improvement
Git hooks and automation analysis
Prompt engineering best practices
Skill definition quality analysis
Structured adversarial debate coordination
Workflow pattern analysis and automation suggestions
Task source scanning and prioritization
Pre-ship quality gate validation
CI failure diagnosis and auto-repair
Test coverage analysis and gap detection
AI slop pattern detection and cleanup
Cross-file semantic analysis
Plugin configuration validation
Hot code path identification
Performance investigation logging
Performance hypothesis generation
Controlled experiment execution
Documentation sync and update
Agent config linting orchestration
Cross-tool AI consultation orchestration
Browser automation and session management
Transcript compaction into knowledge themes
Versioned release with ecosystem detection
Codebase orientation and guided onboarding
Git worktree creation and cleanup
CI pipeline status polling
Mechanical code fixes and formatting
Repo map structural validation
Code style and quality patterns
Security vulnerability detection
Runtime performance optimization
Test quality and coverage review
System architecture analysis
Database schema and query review
API design and consistency
Frontend patterns and accessibility
Backend architecture and scaling
CI/CD and infrastructure review
40 Skills across 19 Plugins
Get Started in 30 Seconds
Recommended
$ /plugin marketplace add agent-sh/agentsys
$ /plugin install next-task@agentsys
$ /plugin install ship@agentsys
Interactive installer for Claude Code, OpenCode, and Codex CLI
$ npm install -g agentsys && agentsys
Clone and install from source
$ git clone https://github.com/agent-sh/agentsys.git
$ cd agentsys
$ npm install