# Skill Forge (SKF) Documentation (Full)

> Complete documentation for AI consumption
> Generated: 2026-04-17
> Repository: https://github.com/armelhbobdad/bmad-module-skill-forge

## 404 — no source cited

No page resolves to that URL. In keeping with the rest of this site, if a claim can't cite a source, we don't fake it — even for our own docs.

Try one of these cited sources instead, or use the **search bar** at the top of the page:

**Why** — [Why Skill Forge?](/why-skf/) · [Verifying a Skill](/verifying-a-skill/)
**Try** — [Getting Started](/getting-started/) · [How It Works](/how-it-works/) · [Examples](/examples/) · [Workflows](/workflows/)
**Reference** — [Concepts](/concepts/) · [Architecture](/architecture/) · [Skill Model](/skill-model/) · [Agents](/agents/) · [BMAD Synergy](/bmad-synergy/) · [Troubleshooting](/troubleshooting/)

---

### Think this page should exist?

A missing page is drift — a citation the docs site made to itself that no longer resolves. [Open an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) with the URL you tried. SKF's docs ship a [drift validator](/verifying-a-skill/#build-time-drift-detection-for-docs-themselves) that catches exactly this pattern — if one slipped through, it's a bug we want to know about.

## Ferris — Skill Architect & Integrity Guardian

**ID:** `skf-forger`
**Icon:** ⚒️
**Role:** The only agent in SKF. Manages the entire skill compilation lifecycle. Ferris extracts, compiles, validates, and packages agent skills from code repositories, documentation, and developer discourse.

**When to Use:** Ferris handles all SKF workflows. You always interact with Ferris — he switches modes based on which workflow you invoke.
**Key Capabilities:**

- AST-backed code extraction via ast-grep
- Semantic code discovery via cocoindex-code for intelligent file pre-ranking
- QMD knowledge search for temporal context and evidence
- agentskills.io specification compliance and validation
- GitHub source navigation and package-to-repo resolution
- Cross-library synthesis for stack skills and integration patterns
- Skill authoring best practices enforcement (third-person voice, consistent terminology, discovery optimization)
- Source-derived scripts and assets extraction with provenance tracking
- **Pipeline orchestration** — chain multiple workflows with automatic data forwarding and circuit breakers
- **Headless mode** — skip confirmation gates for power users and batch operations (`--headless` or `-H`)

**Workflow-Driven Modes:**

| Mode | Behavior | Workflows |
|------|----------|-----------|
| **Architect** | Exploratory, structural, assembling | SF, AN, BS, CS, QS, SS, RA |
| **Surgeon** | Precise, semantic diffing, preserves [MANUAL] | US |
| **Audit** | Judgmental, drift reports, completeness scoring | AS, TS, VS |
| **Delivery** | Packaging, platform-aware, ecosystem-ready | EX |
| **Management** | Transactional rename/drop with platform context rebuild | RS, DS |

**Communication Style:**

- During work: structured reports with AST citations, no metaphor
- At transitions: forge language, brief and warm
- On completion: quiet craftsman's pride
- On errors: direct and actionable

**Menu:**

```
⚒️ Ferris — Skill Forge

START HERE:
  [SF] Setup Forge — Initialize your forge environment
  [AN] Analyze Source — Discover what to skill

CREATE:
  [BS] Brief Skill — Design a skill scope
  [CS] Create Skill — Compile a skill from brief
  [QS] Quick Skill — Fast skill, no brief needed
  [SS] Stack Skill — Consolidated project stack skill (code-mode or compose-mode)

VERIFY:
  [VS] Verify Stack — Pre-code architecture feasibility check
  [RA] Refine Architecture — Improve architecture with skill evidence

MAINTAIN:
  [US] Update Skill — Regenerate after changes
  [AS] Audit Skill — Check for drift
  [TS] Test Skill — Verify completeness

DELIVER:
  [EX] Export Skill — Package for distribution

MANAGE:
  [RS] Rename Skill — Rename across all versions (transactional)
  [DS] Drop Skill — Deprecate or purge a skill version

  [WS] Workflow Status — Show current lifecycle position
  [KI] Knowledge Index — List available knowledge fragments
```

**Pipeline Aliases:** Ferris chains multiple workflows in one command via named aliases (`forge`, `forge-quick`, `onboard`, `maintain`). The full alias table, expansion rules, and target-resolution contract live in [Workflows → Pipeline Mode](../workflows/#pipeline-mode) — the canonical source. Example: `@Ferris forge-quick cognee` chains Quick → Test → Export with automatic data forwarding.

**Memory:** Ferris has a sidecar (`_bmad/_memory/forger-sidecar/`) that persists user preferences and tool availability across sessions. Set `headless_mode: true` in preferences to make headless the default.

Skill Forge runs entirely inside the LLM context window through structured instructions. There is no external orchestrator — just an agent persona (Ferris), a set of workflows, and a curated knowledge base. This page covers the machinery. For an end-to-end walkthrough, see [How It Works](../how-it-works/). For what a skill contains, see [Skill Model](../skill-model/).

---

## How BMAD Works

[BMAD](https://docs.bmad-method.org/) tackles complex, open-ended work by decomposing it into **repeatable workflows**. Every workflow is a sequence of small, explicit steps, so the AI takes the same route on every run. A **shared knowledge base** of standards and patterns backs those steps, keeping outputs consistent instead of improvised. The formula is simple: **structured steps + shared standards = reliable results**.

SKF plugs into BMAD the same way a specialist plugs into a team.
It uses the same step-by-step workflow engine and shared standards, but focuses exclusively on skill compilation and quality assurance.

---

## Building Blocks

Each workflow directory contains these files, and each has a specific job:

| File | What it does | When it loads |
|------|--------------|---------------|
| `SKILL.md` | Human-readable entry point — goals, role definition, initialization sequence, invocation contract, routes to first step | Entry point per workflow |
| `steps-c/*.md` | **Create** steps — primary execution, 4–10 sequential files per workflow (the last one always chains to the shared health check) | One at a time (just-in-time) |
| `references/*.md` | Workflow-specific reference data — rules, patterns, protocols | Read by steps on demand |
| `assets/*.md` | Workflow-specific output formats — schemas, templates, heuristics | Read by steps on demand |
| `templates/*.md` | Output skeletons with placeholder vars — steps fill these in to produce the final artifact | Read by steps when generating output |
| `scripts/*.py` | Deterministic Python scripts — scoring, validation, structural diffing, manifest operations | Invoked by steps via `uv run` for reproducible computation |

**Module-level shared files** (not per-workflow — loaded by the agent or referenced across workflows):

| File | What it does | When it loads |
|------|--------------|---------------|
| `skf-forger/SKILL.md` | Expert persona — identity, principles, critical actions, menu of triggers | First — always in context |
| `knowledge/skf-knowledge-index.csv` | Knowledge fragment index — id, name, tags, tier, file path | Read by steps to decide which fragments to load |
| `knowledge/*.md` | 14 reusable fragments + overview.md index — cross-cutting principles and patterns (e.g., `zero-hallucination.md`, `confidence-tiers.md`, `ccc-bridge.md`) | Selectively read into context when a step directs |
| `shared/scripts/*.py` | 7 cross-workflow Python scripts — preflight checks, manifest ops, managed-section rebuilds, frontmatter validation, severity classification, structural diffing, skill inventory | Invoked by any workflow that needs deterministic computation |

```mermaid
flowchart LR
    U[User] --> A[Agent Persona]
    A --> W[Workflow Entry: SKILL.md]
    W --> S[Step Files: steps-c/]
    S --> K[Knowledge Fragments<br>skf-knowledge-index.csv → knowledge/*.md]
    S --> D[References & Assets<br>references/*.md, assets/*.md, templates/*.md]
    S --> P[Scripts<br>scripts/*.py, shared/scripts/*.py]
    S --> O[Outputs: skills/, forge-data/, sidecar<br>when a step writes output]
```

## Runtime flow

1. **Trigger** — User types `@Ferris CS` (or fuzzy match like `create-skill`). The agent menu in `skf-forger/SKILL.md` maps the trigger to the workflow path.
2. **Agent loads** — `skf-forger/SKILL.md` injects the persona (identity, principles, critical actions) into the context window. Sidecar files (`forge-tier.yaml`, `preferences.yaml`) are loaded for persistent state.
3. **Workflow loads** — `SKILL.md` presents the mode choice and routes to the first step file.
4. **Step-by-step execution** — Only the current step file is in context (just-in-time loading). Each step explicitly names the next one. The LLM reads, executes, saves output, then loads the next step. No future steps are ever preloaded.
5. **Sub-agent delegation** — When a step needs to process large files (full `SKILL.md` documents, multiple `references/*.md` files, parallel per-library extraction), it spawns sub-agents via the Agent tool instead of loading the content into the parent context. Each sub-agent receives a file path, extracts a compact JSON summary, and returns it. Up to 8 sub-agents run concurrently. The parent collects JSON summaries without ever loading the full source — context isolation by design, preventing one step's data from bloating the window for the next. Used in TS (coverage check), RA (gap analysis), VS (integration verification), AS (structural diff), and SS (parallel extraction).
6. **Knowledge injection** — Steps consult `skf-knowledge-index.csv` and selectively load fragments from `knowledge/` by tags and relevance. Cross-cutting principles (zero hallucination, confidence tiers, provenance) are loaded only when a step directs — not preloaded.
7. **Reference and asset injection** — Steps read `references/*.md` and `assets/*.md` files as needed (rules, patterns, schemas, heuristics). This is deliberate context engineering: only the data relevant to the current step enters the context window.
8. **Script execution** — Steps invoke deterministic Python scripts (`scripts/*.py`, `shared/scripts/*.py`) via `uv run` for computation that must be reproducible: scoring, structural diffing, manifest operations, frontmatter validation. The LLM prepares inputs, the script computes, the LLM uses the output. Same inputs always produce the same result.
9. **Templates** — When a step produces output (e.g., a skill brief or test report), it reads the template file and fills in placeholders with computed results. The template provides consistent structure; the step provides the content.
10. **Progress tracking** — Each step appends to an output file with state tracking. Resume mode reads this state and routes to the next incomplete step.

## Ferris Operating Modes

Ferris operates in five workflow-driven modes (mode is determined by which workflow is running, not conversation state):

| Mode | Workflows | Behavior |
|------|-----------|----------|
| **Architect** | SF, AN, BS, CS, QS, SS, RA | Exploratory, assembling, refining — discovers structure, scopes skills, and improves architecture |
| **Surgeon** | US | Precise, semantic diffing — preserves [MANUAL] sections during regeneration |
| **Audit** | AS, TS, VS | Judgmental, scoring — evaluates quality and detects drift |
| **Delivery** | EX | Validates package, generates snippets, injects into context files |
| **Management** | RS, DS | Transactional rename/drop — copy-verify-delete with platform context rebuild |

---

## Tool Ecosystem

### 7 Tools

| Tool | Wraps | Purpose |
|------|-------|---------|
| **`gh_bridge`** | GitHub CLI (`gh`) | Source code access, issue mining, release tracking, PR intelligence |
| **`skill-check`** | [thedaviddias/skill-check](https://github.com/thedaviddias/skill-check) | Validation + auto-fix (`check --fix`), quality scoring (0-100), security scan, split-body, diff comparison |
| **`tessl`** | [tessl](https://tessl.io) | Content quality review, actionability scoring, progressive disclosure evaluation, AI judge with suggestions |
| **`ast_bridge`** | ast-grep CLI | Structural extraction, custom AST queries, co-import detection |
| **`ccc_bridge`** | cocoindex-code | Semantic code search, project indexing, file discovery pre-ranking |
| **`qmd_bridge`** | QMD (local search) | BM25 keyword search, vector semantic search, collection indexing |
| **`doc_fetcher`** | Environment web tools | Remote documentation fetching for T3-confidence content. Tool-agnostic — uses whatever web fetching is available (Firecrawl, WebFetch, curl, etc.). Output quarantined as T3. |

Bridge names are **conceptual interfaces** used throughout workflow steps. Each bridge resolves to concrete MCP tools, CLI commands, or fallback behavior depending on the IDE environment. See [`src/knowledge/tool-resolution.md`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/knowledge/tool-resolution.md) for the complete resolution table.

### Conflict Resolution

When tools disagree, higher priority wins for instructions. Lower priority is preserved as annotations:

| Priority | Source | Tool |
|----------|--------|------|
| 1 (highest) | AST extraction | `ast_bridge` |
| 1b | CCC discovery (pre-ranking) | `ccc_bridge` |
| 2 | QMD evidence | `qmd_bridge` |
| 3 | Source reading (non-AST) | `gh_bridge` |
| 4 | External documentation | `doc_fetcher` |

### Manifest Detection

Stack skill workflows detect project dependencies by scanning for manifest files.
This isn't a tool — it's a reference pattern ([`skf-create-stack-skill/references/manifest-patterns.md`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/skf-create-stack-skill/references/manifest-patterns.md)) consulted by workflow steps:

| Ecosystem | Manifest Files |
|-----------|----------------|
| JavaScript / TypeScript | `package.json` |
| Python | `requirements.txt`, `setup.py`, `pyproject.toml`, `Pipfile` |
| Rust | `Cargo.toml` |
| Go | `go.mod` |
| Java | `pom.xml`, `build.gradle` |
| Ruby | `Gemfile` |
| PHP | `composer.json` |
| .NET | `*.csproj` |

Detection runs at depth 0-1 from project root, excluding dependency trees (`node_modules/`, `.venv/`, `vendor/`), build output (`dist/`, `target/`, `__pycache__/`), and VCS directories.

---

## Workspace Artifacts

Build artifacts are committable — another developer can reproduce the same skill:

```
forge-data/{skill-name}/
├── skill-brief.yaml            # Compilation config (version-independent)
└── {version}/
    ├── provenance-map.json     # Source map with AST bindings
    ├── evidence-report.md      # Build audit trail
    └── extraction-rules.yaml   # Language-specific ast-grep schema
```

The `provenance-map.json` includes per-export `entries` with a `source_library` field identifying which library each export belongs to. For stack skills, it also includes an `integrations` array (cross-library patterns) and a `constituents` array (compose-mode only — tracks the compose-time snapshot of each source skill for staleness detection via metadata hash comparison). The `file_entries` array handles script/asset file-level provenance (SHA-256 hashes, source paths).

### Pipeline Result Contracts

Pipeline-facing workflows write a machine-readable result JSON file alongside their human-readable output. This enables reliable CI integration and pipeline chaining — downstream workflows or scripts can verify what the prior step produced without parsing markdown.

Each run writes two files: a timestamped per-run record (`{skill-name}-result-{YYYYMMDD-HHmmss}.json`) that preserves the full audit trail across retries and aborts, and a stable `{skill-name}-result-latest.json` copy that pipeline consumers read without enumerating timestamps. The schema follows a consistent format: `skill`, `status` (success/failed/partial), `timestamp`, `outputs` (array of produced artifacts with type and path), and a skill-specific `summary` object.

`skills/` and `forge-data/` are committed. Agent memory (`_bmad/_memory/forger-sidecar/`) is gitignored.

---

## Knowledge Base

SKF relies on a curated skill compilation knowledge base:

- Index: [`src/knowledge/skf-knowledge-index.csv`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/knowledge/skf-knowledge-index.csv)
- Fragments: [`src/knowledge/`](https://github.com/armelhbobdad/bmad-module-skill-forge/tree/main/src/knowledge)

Workflows load only the fragments required for the current task to stay focused and compliant.
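The pipeline result contract described above is what makes CI consumption trivial. A minimal sketch of a CI-side check, assuming only the documented top-level fields (`skill`, `status`, `outputs`) and treating the location of the result file as a parameter, since the exact directory layout is not specified here:

```python
import json
from pathlib import Path


def check_pipeline_result(result_dir: Path, skill_name: str) -> dict:
    """Read the stable *-result-latest.json and fail fast on a bad run."""
    result_path = result_dir / f"{skill_name}-result-latest.json"
    result = json.loads(result_path.read_text(encoding="utf-8"))

    # Documented top-level fields: skill, status, timestamp, outputs, summary.
    if result["status"] != "success":
        raise SystemExit(f"{result['skill']}: status={result['status']}")

    # Verify every declared artifact actually exists on disk.
    missing = [o["path"] for o in result["outputs"] if not Path(o["path"]).exists()]
    if missing:
        raise SystemExit(f"missing artifacts: {missing}")
    return result
```

Because the `-latest.json` copy has a stable name, a CI job can run this after any pipeline step without enumerating the timestamped per-run records.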
---

## Module Structure

```
src/
├── skf-forger/              # Agent skill (SKILL.md + manifest)
├── skf-setup/               # Setup skill (forge initialization)
├── skf-analyze-source/
├── skf-brief-skill/
├── skf-create-skill/
├── skf-quick-skill/
├── skf-create-stack-skill/
├── skf-verify-stack/
├── skf-refine-architecture/
├── skf-update-skill/
├── skf-audit-skill/
├── skf-test-skill/
├── skf-export-skill/
├── skf-rename-skill/
├── skf-drop-skill/
├── forger/
│   ├── forge-tier.yaml
│   ├── preferences.yaml
│   └── README.md
├── knowledge/
│   ├── skf-knowledge-index.csv
│   └── *.md (14 knowledge fragments + overview.md index)
├── shared/                  # Cross-workflow resources
├── module.yaml              # Module metadata (code, name, config vars)
└── module-help.csv          # Skill menu for bmad-help integration
```

---

## Security

- All tool wrappers use array-style subprocess execution — no shell interpolation
- Input sanitization: allowlist characters for repo names, file paths, patterns
- File paths validated against project root (no directory traversal)
- **Source code never leaves the machine.** All processing is local (AST, QMD, validation).
- `doc_fetcher` informs users which URLs will be fetched externally before processing

---

## Ecosystem Alignment

SKF produces skills compatible with the [agentskills.io](https://agentskills.io) ecosystem:

- Full [specification](https://agentskills.io/specification) compliance
- Distribution via [`npx skills add/publish`](https://www.npmjs.com/package/skills)
- Compatible with [agentskills/agentskills](https://github.com/agentskills/agentskills) and [vercel-labs/skills](https://github.com/vercel-labs/skills)

---

## Appendix: Key Design Decisions

| Decision | Rationale |
|----------|-----------|
| **Solo agent (Ferris), not multi-agent** | One domain (skill compilation) doesn't benefit from handoffs. Shared knowledge base (AST patterns, provenance maps) is the core asset. |
| **Workflows drive modes, not conversation** | Ferris doesn't auto-switch based on question content. Invoke a workflow to change mode. Predictable behavior. |
| **Hub-and-spoke cross-knowledge** | Each skill covers one source repository. Stack skills compose cross-library integration patterns in `references/integrations/`, citing each library's own skill. |
| **Stack skill = compositional** | SKILL.md is the integration layer. references/ contains per-library + integration pairs. Partial regeneration on dependency updates. |
| **Snippet updates only at export** | Create/update write a draft `context-snippet.md` to `skills/`. Export regenerates the final `context-snippet.md` and publishes it to the platform context file (CLAUDE.md/AGENTS.md/.cursorrules). No managed-section updates in draft workflows. |
| **Bundle spec, version-pin at release** | Offline-capable. SKF ships with a vendored agentskills.io spec pinned at release time; spec drift is a maintainer concern handled at SKF release, not a runtime concern for users. |

This page builds on BMAD concepts (BMM phases, TEA, modules). New to BMAD? Start with the [BMAD docs](https://docs.bmad-method.org/) first. New to SKF? Read [Getting Started](../getting-started/) instead.

---

## Launcher Skills vs Content Skills

A BMAD project that also uses SKF ends up with two different kinds of `SKILL.md` files living in the same IDE skills directory. SKF supports 23 IDEs — Claude Code (`.claude/skills/`), Cursor (`.cursor/skills/`), GitHub Copilot (`.github/skills/`), Windsurf (`.windsurf/skills/`), Cline (`.cline/skills/`), Roo Code (`.roo/skills/`), Gemini CLI (`.gemini/skills/`), and 16 others — each with its own skill directory. See the [complete IDE → Context File mapping](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/skf-export-skill/assets/managed-section-format.md) for the full list.

They look similar. They are not the same thing. This is the single most durable point of confusion, so get it straight up front.

| | BMAD launcher skill | SKF content skill |
|---|---|---|
| **Created by** | `npx bmad-method install` (when you pick a module) | `@Ferris CS` / `QS` / `SS` |
| **File contains** | A thin wrapper that loads a BMAD workflow, agent, or task | The instructions themselves, with citations to real source code |
| **Updates when** | You reinstall or upgrade BMAD | You re-run SKF compilation against a new upstream version |
| **Provenance** | Points to a BMAD workflow file inside `_bmad/` | Points to upstream repo commits, files, and line ranges |
| **Example** | `bmad-create-prd/SKILL.md` loads a PRD workflow | `hono-4.6.0/SKILL.md` contains verified Hono API signatures |

> BMAD skills *launch workflows*. SKF skills *are the workflows' output, frozen with citations*.

Both coexist in the same IDE skills directory on purpose. When a BMAD agent runs a workflow, that workflow can consult SKF content skills for verified API knowledge. The two kinds of skills compose — they don't compete.
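The distinction in the table lends itself to a mechanical triage. This is a sketch of a *hypothetical* heuristic, not an SKF tool: it assumes launcher skills can be recognized by `_bmad/` workflow paths and content skills by their provenance markers (`[AST:…]`, `[SRC:…]`, etc.), which matches the descriptions above but is not a documented contract:

```python
import re
from pathlib import Path

# Hypothetical heuristic: provenance markers mean compiled SKF output,
# a _bmad/ path means a thin BMAD launcher wrapper.
PROVENANCE = re.compile(r"\[(AST|SRC|QMD|EXT):")


def classify_skill(skill_md: str) -> str:
    """Triage one SKILL.md body into 'content', 'launcher', or 'unknown'."""
    if PROVENANCE.search(skill_md):
        return "content"
    if "_bmad/" in skill_md:
        return "launcher"
    return "unknown"


def inventory(skills_root: Path) -> dict:
    """Map each skill directory under an IDE skills root to its kind."""
    return {
        p.parent.name: classify_skill(p.read_text(encoding="utf-8"))
        for p in skills_root.glob("*/SKILL.md")
    }
```

Running `inventory(Path(".claude/skills"))` in a mixed project would show both kinds sitting side by side, which is exactly the intended coexistence.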
---

## SKF and BMM: Phase-by-Phase Playbook

BMM is BMAD's core [4-phase workflow](https://docs.bmad-method.org/) (Analysis → Planning → Solutioning → Implementation). SKF has five concrete entry points across those phases. The diagram below shows the end-to-end picture; the subsections that follow give the trigger, command, and artifact flow for each phase.

```mermaid
flowchart TD
    P1[BMM Phase 1: Analysis<br>product-brief · research] -.->|unfamiliar deps| AN[SKF: Analyze Source]
    AN --> BS[SKF: Brief Skill]
    BS -.->|risk register| P1
    P1 --> P2[BMM Phase 2: Planning<br>create-prd]
    P2 -.->|uncertain API| QS[SKF: Quick Skill]
    QS -.->|verified API ref| P2
    P2 --> P3[BMM Phase 3: Solutioning<br>create-architecture]
    P3 -.->|declared stack| VS[SKF: Verify Stack]
    VS --> RA[SKF: Refine Architecture]
    RA -.->|refined arch| P3b[BMM: check-implementation-readiness]
    P3 --> P3b
    P3b --> P4[BMM Phase 4: Implementation<br>create-story · dev-story]
    P4 -.->|story libs| CS[SKF: Create Skill / Stack Skill]
    CS -.->|verified skill context| P4
    P4 --> RETRO[BMM: retrospective]
    RETRO -.->|API confusion found| US[SKF: Update Skill]
    US -.->|patched skill| P4
```

### Phase 1 — Analysis

**Trigger:** A brownfield repo or an unfamiliar third-party dependency surfaces during `product-brief` or `research`. The team can't answer "what does this library actually expose?" from training data.

**SKF command:** `@Ferris AN` on the repo, then `@Ferris BS` to scope each priority library.

**What flows back:** Recommended skill boundaries, an analysis report of the discovered units, and one skill-brief per library that's ready to compile later. The scoping data is what PMs typically feed into their own risk register.

**Why now, not later:** Catching surprise libraries during Analysis keeps the PRD honest. Discovering the same unknowns during Implementation forces course corrections that a two-paragraph risk entry could have prevented.

### Phase 2 — Planning

**Trigger:** The PRD draft references an API and you realize nobody on the team is 100% sure how it behaves.

**SKF command:** `@Ferris QS` — no brief needed.

**What flows back:** A verified skill you can cite directly in acceptance criteria. PM and architect read the same source.

**Why now, not later:** Quick Skill is cheap insurance. It takes under a minute and prevents a whole class of "actually that function doesn't exist" moments during story writing.

### Phase 3 — Solutioning

This is the highest-value integration. BMM's architect agent works from assumptions about the declared stack; SKF is how those assumptions become evidence-backed before the team commits to an implementation readiness check.

**Trigger:** Architecture draft exists, `check-implementation-readiness` hasn't run yet.

**SKF commands:** `@Ferris QS` per declared dependency, then `@Ferris VS`, then `@Ferris RA` on any gaps or failures.
**What flows back:** A pass/fail feasibility report per component, version-pinned evidence for every claim, and a refined architecture document with verified API signatures filled in at the callout points.

**Why now, not later:** Running VS after Implementation has started means your stories are already built on an unverified foundation. The loop below is designed to iterate cheaply *before* code gets written.

```mermaid
flowchart TD
    ARCH[BMM: create-architecture draft] --> GEN["SKF: Create Skill | Quick Skill<br>(per declared dependency)"]
    GEN --> VS[SKF: Verify Stack]
    VS -->|pass| READY[BMM: check-implementation-readiness]
    VS -->|fail / gaps| RA[SKF: Refine Architecture]
    RA -.->|refined draft| VS
    RA --> READY
```

The "Pre-Code Architecture Verification — Greenfield Confidence" scenario in [Examples](../examples/) walks through a concrete case of this loop.

### Phase 4 — Implementation

Two distinct triggers fire during Implementation, one at the start of each story and one after each retrospective.

**Trigger A (before `create-story`):** The story touches a library whose API isn't already in a content skill.

**SKF command:** `@Ferris CS` for a single library, or `@Ferris SS` when the story spans several dependencies.

**What flows back:** A verified content skill the `dev-story` workflow can consult during implementation — no training-data guessing about function signatures.

**Trigger B (after `retrospective`):** The retro flagged something like "we kept getting API X wrong this sprint."

**SKF command:** `@Ferris US` on the affected skill.

**What flows back:** A patched skill with the newly-discovered edge cases captured — `[MANUAL]` sections preserved so human annotations aren't overwritten. Next sprint's stories consume the updated skill automatically.

This retrospective → update loop is the pattern that [Scenario A in Examples](../examples/#scenario-a-greenfield--bmm-integration) sketches for one project; it generalizes to any BMM project that runs more than a few sprints.

---

## SKF with Optional BMAD Modules

BMAD ships several [optional modules](https://github.com/orgs/bmad-code-org/repositories). Synergy with SKF ranges from very high (TEA) to narrow (CIS). This section is honest about both.

### TEA — Test Architect

TEA produces structured test strategies and release gates. SKF produces the verified skills TEA's workflows need when the test target is a library they don't fully know.

```mermaid
flowchart TD
    TD[TEA: Test Design] -.->|test lib unknown| CS[SKF: Create Skill<br>on Playwright/Vitest/Pact]
    CS --> AUTO[TEA: Automate / ATDD]
    AUTO --> GATE[TEA: Release Gate]
    GATE -.->|drift check| AS[SKF: Audit Skill]
    AS -->|no drift| GATE
    AS -.->|drift found| US[SKF: Update Skill]
    US -.->|refreshed skill| GATE
```

Two concrete integrations:

- **Before Test Design / ATDD / Automate / Framework Scaffolding** — run `@Ferris CS` on whichever test library the strategy depends on (Playwright, Vitest, Pact, etc.). TEA's test-authoring agents then work against verified API surfaces instead of training-data approximations.
- **Before Release Gate** — run `@Ferris AS` on the skills the gate cites. If the skill has drifted from the current source, the drift report itself becomes evidence the gate can act on, and `@Ferris US` closes the loop.

### BMB — BMAD Builder

BMB authors extend BMAD with new agents, workflows, or entire modules. SKF itself was built using BMB — it's a living proof-of-concept for the BMAD module architecture. When a new module depends on third-party libraries, ship a verified companion skill alongside it:

- During `module-builder`, run `@Ferris SS` on the module's declared stack. The resulting stack skill becomes part of the module's distribution — downstream users get both the BMAD module BMB built and the SKF content skill you compiled as its companion in a single install.

### GDS — Game Dev Studio

Narrow synergy. GDS covers GDD authoring, narrative design, and engine-specific guidance for 21+ game types, and most of that is conceptual work with no code to verify. The exception is when the GDD commits to a concrete engine SDK (Bevy, Godot-Rust, Unity DOTS):

- Once the engine binding is pinned, run `@Ferris CS` on it. The implementation team then has verified bindings to work against.

For narrative, character design, world-building, or genre research — no synergy. SKF has nothing to offer the creative side of GDS.

### CIS — Creative Intelligence Suite

Narrow but real synergy during the **brief-skill** phase. CIS's brainstorming coach, advanced elicitation, and party mode can sharpen scope decisions before compilation begins — especially when briefing a skill for a library you don't know well, or when multiple stakeholders disagree on what the skill should cover.

The [oh-my-skills](https://github.com/armelhbobdad/oh-my-skills) repository uses BMAD Core + CIS alongside SKF: when briefing [`oms-storybook-react-vite`](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-storybook-react-vite/skill-brief.yaml), CIS brainstorming, party mode, and advanced elicitation helped narrow a massive repo (Storybook supports Next.js, Astro, SvelteKit, and more — with extensive documentation for each) down to an accurate brief scoped specifically to React + Vite.

Beyond briefing, CIS and SKF don't overlap — CIS covers ideation, storytelling, and innovation strategy where there's no code to verify. Use CIS for the creative and strategic work, then bring SKF in once you're producing concrete technical artifacts.

---

## Delivery and Lifecycle in a BMAD Project

`@Ferris EX` is the **only workflow that introduces new skill context** into the three context files that serve all 23 IDEs: `CLAUDE.md` (Claude Code), `.cursorrules` (Cursor), and `AGENTS.md` (the remaining 21 IDEs — GitHub Copilot, Windsurf, Cline, Roo Code, Gemini CLI, and others). Each IDE also has its own skill root directory where skill files are installed (e.g., `.windsurf/skills/`, `.roo/skills/`, `.gemini/skills/`). Create-skill and update-skill produce draft artifacts that never touch those files directly — nothing reaches an agent's passive context until it has been through the EX gate. See [Skill Model → Dual-Output Strategy](../skill-model/#dual-output-strategy) for the architectural rationale.

This matters specifically in a BMAD project: you may have multiple BMAD modules, each with its own launcher skills, plus SKF content skills, all trying to contribute context. The write-guard means only verified, tested SKF skills ever reach an agent's passive context — nothing half-baked sneaks in. `@Ferris EX` injects managed sections that coexist cleanly with whatever BMAD's installer wrote in the same files.

For long-running BMAD projects, `@Ferris RS` (rename) and `@Ferris DS` (drop) keep the skill inventory clean as libraries get swapped, versions get deprecated, or naming conventions evolve across sprints. Both *rebuild* the existing managed sections in those context files so references stay consistent after a rename or drop — they never inject previously-unpublished content, so the EX gate still governs what initially enters those files.

---

## Where to Go Next

- [BMAD docs](https://docs.bmad-method.org/) — canonical reference for BMM phases, TEA workflows, BMB / GDS / CIS details, and the full module list
- [Workflows](../workflows/) — complete SKF workflow reference with commands and connection diagrams
- [Examples](../examples/) — concrete scenarios including the BMM retrospective loop and greenfield architecture verification
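The managed-section mechanic behind the EX/RS/DS guarantees can be sketched as a pure text transform. The marker strings below are **hypothetical** — the real format is defined in `skf-export-skill/assets/managed-section-format.md` — but the invariant is the one described above: only the managed block changes, and everything the installer or user wrote stays intact:

```python
import re

# Hypothetical markers for illustration; SKF's actual managed-section
# format is defined in skf-export-skill/assets/managed-section-format.md.
BEGIN = "<!-- SKF:BEGIN -->"
END = "<!-- SKF:END -->"


def rebuild_managed_section(context_text: str, new_snippet: str) -> str:
    """Replace only the managed section of a context file (CLAUDE.md, AGENTS.md, ...)."""
    block = f"{BEGIN}\n{new_snippet}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(context_text):
        # RS/DS-style rebuild: swap the block, touch nothing else.
        return pattern.sub(lambda _: block, context_text)
    # First export (EX): append the managed section after existing content.
    return context_text.rstrip() + "\n\n" + block + "\n"
```

Rename and drop reduce to calling this with a regenerated snippet, which is why they can keep references consistent without ever introducing unpublished content.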
These are the seven terms you'll meet in every other page of this site. Each one names something SKF does differently from generic docs tooling. For the full mechanism behind them, see [Architecture](../architecture/) and [Skill Model](../skill-model/). --- ## Agent Skills An agent skill is an instruction file that tells an AI agent how to use your code. Instead of guessing your API from its training data, the agent reads the skill and gets the actual function names, parameter types, and usage patterns. Skills follow the [agentskills.io](https://agentskills.io) open standard, so they work across Claude, Cursor, Copilot, and other AI tools. **Example:** A skill for [cognee](https://github.com/topoteretes/cognee) tells your agent: "The function is `cognee.search()`, its first parameters are `query_text`, `query_type`, `user`, `datasets`, and `dataset_ids`, and it's defined at `cognee/api/v1/search/search.py:L26` (v1.0.0, commit `3c048aa4`)." Every parameter and location is AST-verified from the actual source code. --- ## Provenance Provenance means every instruction in a skill traces back to where it came from. For code, that's a file and line number. For documentation, it's a URL. For developer discourse, it's an issue or PR reference. **If SKF can't point to a source, it doesn't include the instruction.** **Examples** (from a [real generated skill](https://github.com/armelhbobdad/oh-my-skills)): - `[AST:cognee/api/v1/search/search.py:L26]` — extracted from source code via AST parsing (T1) - `[SRC:cognee/api/v1/session/__init__.py:L7]` — read from source code without AST verification (T1-low) - `[QMD:cognee-temporal:issues.md]` — surfaced from indexed developer discourse (T2) - `[EXT:docs.cognee.ai/getting-started/quickstart]` — sourced from external documentation (T3) This is the opposite of how most AI tools work. They generate plausible-sounding content from training data; SKF only includes what it can cite. 
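The tag grammar is regular enough to parse mechanically. As a rough illustration (this parser is not part of SKF, and the pattern is inferred only from the examples above, so it may not cover every variant), here is how the four tag kinds split apart:

```python
import re

# Illustrative only: tag kinds and formats are taken from the examples
# above; this helper is not part of SKF.
TAG_RE = re.compile(r"\[(AST|SRC|QMD|EXT):([^\]]+)\]")

def parse_provenance(tag: str):
    """Split a provenance tag into (kind, locator, line-or-None)."""
    m = TAG_RE.fullmatch(tag)
    if m is None:
        raise ValueError(f"not a provenance tag: {tag}")
    kind, locator = m.groups()
    if kind in ("AST", "SRC"):           # code citations carry file:Lnn
        path, line = locator.rsplit(":L", 1)
        return kind, path, int(line)
    return kind, locator, None           # QMD/EXT locators are opaque here

print(parse_provenance("[AST:cognee/api/v1/search/search.py:L26]"))
# → ('AST', 'cognee/api/v1/search/search.py', 26)
```

Code citations resolve to a checkable file and line; QMD and EXT locators stay opaque strings, which mirrors how their targets (collections, URLs) are verified differently.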
Quick-tier skills rely on best-effort source reading rather than AST verification — but even Quick skills cite their sources, and nothing ships without a citation. --- ## Confidence Tiers (T1/T1-low/T2/T3) Each piece of information in a skill carries a confidence level based on where it came from: - **T1 — AST extraction:** Pulled directly from source code via AST parsing. The function signature exists in the code at the pinned commit. Cited as `[AST:file:Lnn]`. - **T1-low — Source reading:** Found by reading source files directly without AST parsing. The location is correct but the type signature may be inferred. Produced by Quick tier and by Forge/Forge+/Deep when ast-grep cannot parse a specific file. Cited as `[SRC:file:Lnn]`. - **T2 — Evidence (Deep tier only):** Surfaced by QMD knowledge search from issues, PRs, changelogs, or documentation within the repository. Available only when QMD is installed (Deep tier). Reliable context, but less definitive than source code itself. Cited as `[QMD:collection:document]`. T2 has two temporal subtypes: - **T2-past** — Historical context (closed issues, merged PRs, changelogs) explaining API design decisions. Surfaces in the skill's `references/` directory. - **T2-future** — Forward-looking context (open PRs, deprecation warnings, RFCs) about upcoming changes. Surfaces in SKILL.md Section 4b (Migration & Deprecation Warnings) and `references/`. - **T3 — External:** Pulled from external documentation or websites. Treated with caution and clearly marked. Cited as `[EXT:url]`. Forge+ semantic discovery (via cocoindex-code) does not introduce a new confidence tier — it influences *which* files are extracted, not *how* they're cited. Discovered files are verified by ast-grep (T1) or source reading (T1-low). --- ## Capability Tiers (Quick/Forge/Forge+/Deep) Your capability tier depends on which tools you have installed. Each tier builds on the previous one: - **Quick** — No tools required. 
SKF reads source files and builds best-effort skills. Works in under a minute. GitHub CLI used when available. - **Forge** — Adds [ast-grep](https://ast-grep.github.io). SKF uses AST parsing to verify instructions against the actual code structure. - **Forge+** — Adds [cocoindex-code](https://github.com/cocoindex-io/cocoindex-code). SKF uses semantic code search to discover relevant source regions before AST extraction, improving coverage on large codebases. - **Deep** — Full pipeline: requires [ast-grep](https://ast-grep.github.io) + [GitHub CLI](https://cli.github.com) + [QMD](https://github.com/tobi/qmd) (all three). SKF indexes knowledge for semantic search and performs GitHub repository exploration. Skills get enriched with historical context, deprecation warnings, and cross-reference intelligence. CCC (cocoindex-code) enhances Deep tier when installed — ast-grep + gh + qmd + ccc gives maximum capability. You don't need all tools to start. SKF detects what you have and sets your tier automatically. See [Skill Model → Progressive Capability Model](../skill-model/#progressive-capability-model) for the full technical treatment. --- ## Drift Drift happens when the source code changes but the skill instructions haven't been updated to match. A skill might still reference a function that was renamed, removed, or had its signature changed upstream. SKF detects drift by comparing the skill's recorded provenance against the current code. The `audit-skill` workflow (`@Ferris AS`) scans for these mismatches — for both individual skills and stack skills. Stack skills track per-library provenance and, in compose-mode, constituent freshness via metadata hash comparison. **Example:** Your skill says `createUser(name: string)` but the function was renamed to `registerUser(name: string, email: string)` in the last release. That's drift. For stack skills, constituent drift occurs when an individual skill is updated but the stack hasn't been re-composed to reflect the changes. 
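Conceptually, a drift probe is simple: re-read the cited location and check whether the cited symbol is still there. The sketch below is a minimal illustration of that idea only; SKF's `audit-skill` workflow does far more (full provenance comparison against the pinned commit, per-library tracking, metadata hash checks for stacks):

```python
from pathlib import Path

def probe_drift(repo_root: str, rel_path: str, line_no: int, symbol: str) -> str:
    """Best-effort drift probe: is the cited symbol still on the cited line?

    Minimal sketch only, not SKF's audit algorithm.
    """
    source = Path(repo_root) / rel_path
    if not source.exists():
        return "drift: cited file is gone"
    lines = source.read_text(encoding="utf-8").splitlines()
    if line_no > len(lines) or symbol not in lines[line_no - 1]:
        return "drift: symbol moved or renamed"
    return "ok"
```

Run against the rename example above, a skill citing `createUser` would report `"drift: symbol moved or renamed"` once the source only defines `registerUser`.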
--- ## Version Pinning Every skill records the exact version (or commit) of the source code it was built from. This means you always know which version of the library the instructions apply to. By default, the version is auto-detected from the source (package.json, pyproject.toml, etc.). You can also target a specific version — either by specifying it during `@Ferris BS` (brief-skill) or by appending `@version` to a quick skill command (`@Ferris QS cognee@1.0.0`). This is especially useful for docs-only skills where no source code is available for auto-detection. When targeting a specific version on a remote repository, SKF resolves the matching git tag and clones from it — so the extracted API signatures actually reflect the target version's code, not just the label applied to whatever happens to be on the default branch. When the source updates, you can re-run `@Ferris US` (update-skill) to regenerate the skill for the new version while preserving any manual additions you've made. --- ## BMAD Module SKF is a plugin (called a "module") for [BMAD Method](https://docs.bmad-method.org/), a framework for running structured AI workflows. You don't need to know BMAD to use SKF — the standalone installer sets everything up. If you already use BMAD, see [BMAD Synergy](../bmad-synergy/) for how SKF workflows pair with BMM phases and optional modules like TEA, BMB, and GDS. ## What the Output Looks Like When SKF generates a skill, you get a `SKILL.md` file with machine-readable frontmatter and provenance-backed instructions. 
Below is a trimmed example from the real [`oms-cognee` SKILL.md](https://github.com/armelhbobdad/oh-my-skills/blob/main/skills/oms-cognee/1.0.0/oms-cognee/SKILL.md) generated for [cognee](https://github.com/topoteretes/cognee) (full portfolio at [oh-my-skills](https://github.com/armelhbobdad/oh-my-skills)): **Frontmatter (tells AI agents when to load this skill):** ```yaml name: oms-cognee description: > Builds apps on top of cognee v1.0.0, the knowledge-graph memory engine for AI agents. Use when ingesting text/files/URLs into persistent memory, building knowledge graphs, searching graph-backed memory with multiple SearchType modes, enriching graphs with memify/improve, scoping memory with datasets and node_sets, configuring LLM/embedding/ graph/vector backends, running custom task pipelines, tracing operations, decorating agent entrypoints with `agent_memory`, connecting to Cognee Cloud with `serve`, or visualizing the graph. Covers cognee/__init__.py exports: the V1 API (add, cognify, search, memify, datasets, prune, update, run_custom_pipeline, config, SearchType, visualize_graph, pipelines, Drop, run_startup_migrations, tracing) and the V2 memory-oriented API (remember, RememberResult, recall, improve, forget, serve, disconnect, visualize, agent_memory). Do NOT use for: cognee internals, the HTTP REST API (use cognee-mcp or the FastAPI server), non-cognee memory/RAG libraries. 
``` **Body (what your AI agent reads):** ``` ## Key API Summary | Function | Purpose | Key Params | Source | |----------|---------|------------|--------| | add() | Ingest text, files, binary data | data, dataset_name | [AST:cognee/api/v1/add/add.py:L22] | | cognify() | Build knowledge graph | datasets, graph_model | [AST:cognee/api/v1/cognify/cognify.py:L44] | | search() | Query knowledge graph | query_text, query_type | [AST:cognee/api/v1/search/search.py:L26] | | memify() | Enrich graph with custom tasks | extraction_tasks, data | [AST:cognee/modules/memify/memify.py:L25] | | remember() | V2 one-shot memory ingest | data, dataset_name | [AST:cognee/api/v1/remember/remember.py:L339] | | DataPoint | Base class for custom graph nodes | inherit and add fields | [EXT:docs.cognee.ai/guides/custom-data-models] | ``` Every line number above is verbatim from the real [`forge-data/oms-cognee/1.0.0/provenance-map.json`](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-cognee/1.0.0/provenance-map.json) shipped with oh-my-skills — not illustrative. Provenance tags trace each instruction to its source: - `[AST:file:line]` — extracted from code via AST parsing (highest confidence) - `[SRC:file:line]` — read from source code without AST verification - `[EXT:url]` — sourced from external documentation - `[QMD:collection:doc]` — surfaced from indexed developer discourse (issues, PRs, changelogs) See [Skill Model → Output Architecture](../skill-model/#output-architecture) for the full output structure. 
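Because every `[AST:file:Lnn]` tag in the body has a counterpart in `provenance-map.json`, the two can be cross-checked mechanically. The sketch below assumes a flat `{path: line}` map purely for illustration (the real provenance-map.json is richer; inspect it in oh-my-skills for the actual shape):

```python
import re

# Assumed flat {path: line} map shape, for illustration only.
AST_TAG = re.compile(r"\[AST:([^:\]]+):L(\d+)\]")

def stale_citations(skill_md: str, prov: dict[str, int]) -> list[str]:
    """Return [AST:...] citations that disagree with the provenance map."""
    return [
        f"{path}:L{line}"
        for path, line in AST_TAG.findall(skill_md)
        if prov.get(path) != int(line)
    ]

prov = {"cognee/api/v1/search/search.py": 26}
print(stale_citations("| search() | [AST:cognee/api/v1/search/search.py:L26] |", prov))
# → []
```

An empty result means every code citation in the checked text agrees with the recorded provenance; any entry returned is a candidate for the audit-skill workflow.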
**Full skill directory structure** (real layout from [`oh-my-skills/skills/oms-cognee/`](https://github.com/armelhbobdad/oh-my-skills/tree/main/skills/oms-cognee)): ``` skills/oms-cognee/ ├── active -> 1.0.0 ├── 0.5.8/ │ └── oms-cognee/ │ ├── SKILL.md # Archived: v0.5.8, pinned to b51dcce1 │ ├── context-snippet.md │ ├── metadata.json │ └── references/ └── 1.0.0/ └── oms-cognee/ ├── SKILL.md # Active: pinned to cognee v1.0.0 (3c048aa4) ├── context-snippet.md # Compressed index for platform context files ├── metadata.json # Machine-readable provenance └── references/ # Progressive disclosure detail ├── config.md ├── core-workflow.md ├── full-api-reference.md └── pipelines-and-datapoints.md ``` This is the real directory listing from [`oh-my-skills/skills/oms-cognee/`](https://github.com/armelhbobdad/oh-my-skills/tree/main/skills/oms-cognee) after cognee shipped v1.0.0 upstream. SKF recompiled the skill from the v1.0.0 commit and wrote it next to the existing 0.5.8 tree — the older version stays pinned to its original commit (`b51dcce1`) and is still installable by any project that hasn't bumped its `CLAUDE.md` pin yet. The `active` symlink and the [`.export-manifest.json`](https://github.com/armelhbobdad/oh-my-skills/blob/main/skills/.export-manifest.json) both point at the current version. Some skills also include `scripts/` and `assets/` directories when the source repository contains executable scripts or static assets — oms-cognee doesn't have either, but see [Skill Model → Per-Skill Output](../skill-model/#per-skill-output) for the full schema. --- ## Example Workflows ### Quick Skill — Under a minute Developer adds [cognee](https://github.com/topoteretes/cognee) to a Python project for AI memory management. Agent keeps hallucinating method signatures and config options. ``` @Ferris QS https://github.com/topoteretes/cognee ``` Ferris reads the repository, extracts the public API, and validates against the agentskills.io spec. 
The skill is written to `skills/cognee/<version>/cognee/` (the version is auto-detected from the source manifest). The agent now reads the real signatures from the skill instead of guessing. Need a specific version? Append `@version`: ``` @Ferris QS cognee@1.0.0 ``` ### Brownfield Platform — Pipeline or per-workflow Alex, a platform engineer, adopts BMAD for 10 microservices spanning TypeScript, Go, and Rust. ``` @Ferris SF # Setup — Deep tier detected # — clear session — @Ferris onboard # Analyze → Create → Test → Export in one pipeline ``` Or one workflow per session: ``` @Ferris SF # Setup — Deep tier detected # — clear session — @Ferris AN # Analyze — 10 services mapped # — clear session — @Ferris CS --batch # Create — batch generation ``` 10 individual skills + 1 platform stack skill. The [BMM](../bmad-synergy/#skf-and-bmm-phase-by-phase-playbook) architect then navigates cross-service flows using verified knowledge. ### Release Prep — Trust Builder Jin, a Rust library maintainer, is preparing v1.0.0 with breaking changes she wants consumers' agents to pick up automatically. ``` @Ferris maintain cocoindex ``` Or one workflow per session: ``` @Ferris AS # Audit — finds 3 renames, 1 removal, 1 addition # — clear session — @Ferris US # Update — preserves [MANUAL] sections, adds annotations # — clear session — @Ferris TS # Test — verify completeness # — clear session — @Ferris EX # Export — package for npm release ``` Ships with the npm release. Consumers upgrade and their agents use the correct function names — no more "wrong signature" support tickets. ### Stack Skill — Integration Intelligence Armel, building a full-stack side project on Next.js + Serwist + SpacetimeDB + better-auth. ``` @Ferris SS ``` Ferris detects 8 significant dependencies, finds 5 co-import integration points. Generates a consolidated stack skill. The agent now knows: "When you modify the auth flow, update the Serwist cache exclusion at `src/sw.ts:L23`."
That integration detail isn't available from any other tool in the [comparison table](/#how-skf-compares). ### Pre-Code Architecture Verification — Greenfield Confidence Gery, a backend architect, is designing a new TypeScript service on Hono + Drizzle + SpacetimeDB. Architecture doc is written but no code exists yet — he wants to verify the stack holds together before anyone starts building. ``` @Ferris QS hono # Quick Skill per library @Ferris QS drizzle-orm @Ferris QS spacetimedb-sdk @Ferris VS # Verify Stack — feasibility report @Ferris RA # Refine Architecture — enrich with API evidence @Ferris SS # Stack Skill — compose-mode (no codebase needed) ``` VS flags the Drizzle↔SpacetimeDB integration as incompatible (query-model mismatch) and returns CONDITIONALLY FEASIBLE. Gery adds a bridge layer to the architecture, re-runs VS → FEASIBLE. RA fills in verified API signatures. SS compose-mode synthesizes the stack skill from existing skills + refined architecture. The agent now has integration intelligence for a project that doesn't have code yet. --- ## Common Scenarios ### Scenario A: Greenfield + BMM Integration BMAD user starts a new project. [BMM](../bmad-synergy/#skf-and-bmm-phase-by-phase-playbook) architect suggests skill generation after retrospective. ``` @Ferris BS # Brief — scope the skill @Ferris CS # Create — compile from brief @Ferris TS # Test — verify completeness @Ferris EX # Export — inject into platform context files ``` Skills accumulate over sprints. The agent's coverage improves each iteration. ### Scenario B: Multi-Repo Platform Blondin, a platform lead, needs cross-service knowledge for 10 microservices so agents can navigate shared types and cross-calls. One forge project, multiple QMD collections, hub-and-spoke skills with integration patterns. ### Scenario C: External Dependency Kossi, a developer integrating an uncommon library, needs a skill for it — nothing official exists yet. ``` @Ferris QS better-auth ``` Checks ecosystem first. 
If no official skill exists: generates from source. `source_authority: community`. ### Scenario D: Docs-Only (SaaS/Closed Source) No source code to clone — only API documentation. Example: you're integrating the [Stripe API](https://docs.stripe.com/api) and want your agent to know the real endpoints, parameters, and error codes instead of hallucinating from training data. ``` @Ferris BS # When asked for target, provide documentation URLs: # https://docs.stripe.com/api/charges # https://docs.stripe.com/api/payment_intents # https://docs.stripe.com/api/errors # Ferris sets source_type: "docs-only" and collects doc_urls # When asked for target version, specify: 2025-04-30.basil # Ferris confirms your doc URLs match that API version @Ferris CS # step-03 skips (no source to clone), step-03c fetches docs via doc_fetcher # All content is T3 [EXT:url] confidence. source_authority: community ``` The brief's `doc_urls` field drives the doc_fetcher step. The agent uses whatever web fetching tool is available in its environment (Firecrawl, WebFetch, curl, etc.) to retrieve documentation as markdown and extract API information with `[EXT:url]` citations. No AST parsing is possible without source code — every instruction carries T3 provenance instead of T1, and the skill is tagged `source_authority: community` regardless of tier. ### Scenario E: Rename a Skill You generated a quick skill for `cognee` and now want a more specific name to distinguish it from the official one. ``` @Ferris RS # Ferris asks: Which skill? → cognee # Ferris asks: New name? → cognee-skf-community # Ferris copies to new name across all versions, verifies every reference, # updates the export manifest, rebuilds CLAUDE.md/AGENTS.md, # then deletes the old name. ``` Transactional safety: if verification fails, the old skill stays intact. ### Scenario F: Drop a Deprecated Version You have `cognee` with versions 0.1.0, 0.5.0, and 0.6.0 (active). Version 0.1.0 is obsolete. 
``` @Ferris DS # Ferris asks: Which skill? → cognee # Ferris asks: Which version? → 0.1.0 # Ferris asks: Deprecate (keep files) or Purge (delete)? → Purge # Ferris updates the manifest, rebuilds context files, deletes the 0.1.0 directory. ``` Version 0.6.0 remains active. Version 0.5.0 is untouched. The managed sections in CLAUDE.md/AGENTS.md no longer reference 0.1.0. ### Scenario G: Maximum Accuracy for a High-Stakes Library You're building skills for a production payments library and need maximum citation density. Every signature must be AST-verified, and you want historical context (deprecations, migration notes) baked into the skill. **Workflow:** ``` @Ferris SF # Ferris detects installed tools and sets your tier automatically: # - Quick: no tools required (best-effort, source-read only) # - Forge: + ast-grep (T1 AST-verified signatures) # - Forge+: + cocoindex-code (semantic pre-ranking for large repos) # - Deep: + gh + qmd (T2 evidence — issues, PRs, changelogs) # Install the missing tools, then re-run @Ferris SF to promote your tier. @Ferris BS # Scope — confirm the forge tier is Deep (+ ccc if installed) @Ferris CS # Extract — AST + QMD enrichment @Ferris TS # Completeness score — 80%+ threshold ``` **What you get:** Every signature carries `[AST:file:Lnn]` at T1. Deprecation warnings and design rationale carry `[QMD:collection:doc]` at T2. Install tooling once, every downstream skill benefits. See [Capability Tiers](../concepts/#capability-tiers-quickforgeforgedeep). ### Scenario H: OSS Maintainer Publishing Official Skills You maintain an OSS library and want to ship official agent skills alongside each release — distributed via [skills.sh](https://skills.sh) or [oh-my-skills](https://github.com/armelhbobdad/oh-my-skills) so consumers install them with `npx skills add`. 
**Workflow:** ``` @Ferris BS # Scope the skill — set source_authority: official in the brief @Ferris CS # Compile — AST extraction + QMD enrichment (Deep tier recommended) @Ferris TS # Verify completeness before publishing (target: 90%+) @Ferris EX # Package for distribution — emits npx skills publish instructions ``` **What you get:** A verified skill pinned to the release commit, with `source_authority: official` surfaced in metadata as a trust signal so downstream tooling (and the ecosystem check in `@Ferris QS`) recognizes it as maintainer-published rather than community-forged. Re-run `@Ferris maintain <skill>` (AS → US → TS → EX) on every release to keep published skills current. --- ## Tips & Tricks ### Skip Permissions for Faster Forging > **Tip from Armel:** When forging skills with Claude Code, I run `claude --dangerously-skip-permissions` to bypass all permission prompts. SKF workflows only read source code, write to `skills/` and `forge-data/`, and call local tools (ast-grep, qmd, gh) — every step is auditable in the [open source](https://github.com/armelhbobdad/bmad-module-skill-forge). Skipping permissions drastically reduces forge time: I start a pipeline, go [grab one of those coffees ☕ you keep offering](https://buymeacoffee.com/armelhbobdad), and come back to a completed workflow. Review the output at the end, not at every gate. ### Progressive Capability Start with the Quick tier (no setup required), upgrade to Forge (install ast-grep), then Forge+ (install cocoindex-code for semantic discovery), then Deep (install QMD). Each tier builds on the previous — you never lose capability. ### Batch Operations Use `--batch` with `create-skill` to process multiple briefs at once. Progress is checkpointed — if interrupted, re-run `@Ferris CS --batch` and Ferris will resume automatically from where it left off. ### Stack Skills + Individual Skills Stack skills focus on integration patterns. Individual skills focus on API surface.
Use both together for maximum coverage. ### The Loop After each sprint's refactor, run `@Ferris US` to regenerate changed components. Export updates your platform context files (CLAUDE.md, AGENTS.md, .cursorrules) automatically. Skill generation becomes routine — like running tests. ### One Workflow Per Session Clear your conversation context (start a new chat) before invoking a new workflow. Each SKF workflow loads step files, knowledge fragments, and extraction data into context. Starting fresh ensures the next workflow operates without interference from prior steps. Sidecar state (forge tier, preferences) persists automatically across sessions — you don't lose configuration. ### Full Control Over Scope You can compile multiple skills from the same target (repo or docs) with different scopes and intents. Each brief defines what to extract and why, producing a distinct skill from the same source. For example, from a single library you could compile `cognee-core` for the public API, `cognee-graph-types` for the type system, and `cognee-migration` for upgrade patterns — each serving a different use case. ### Best Practices Built In Generated skills automatically follow authoring best practices: third-person descriptions for reliable agent discovery, consistent terminology, degrees-of-freedom matching (prescriptive for fragile operations, flexible for creative tasks), and table-of-contents headers in large reference files. Discovery testing recommendations are included in test reports. ### Scripts & Assets If your source repo includes executable scripts (`scripts/`, `bin/`) or static assets (`templates/`, `schemas/`), SKF detects and packages them automatically with provenance tracking. Custom scripts you add to `scripts/[MANUAL]/` are preserved during updates — just like `[MANUAL]` markers in SKILL.md.
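The preservation mechanic for manual content can be sketched roughly as "extract marked blocks from the old file, carry them into the regenerated one." The marker syntax below is hypothetical, chosen only for illustration (check a generated SKILL.md for the exact delimiters SKF emits):

```python
import re

# Hypothetical [MANUAL] delimiters, for illustration only — not
# necessarily the exact syntax SKF writes into SKILL.md.
MANUAL_RE = re.compile(r"<!-- \[MANUAL\] -->.*?<!-- \[/MANUAL\] -->", re.DOTALL)

def preserve_manual_sections(old_skill: str, regenerated: str) -> str:
    """Carry [MANUAL] blocks from the old skill into the regenerated one."""
    blocks = MANUAL_RE.findall(old_skill)
    return regenerated + "".join(f"\n\n{b}" for b in blocks)
```

The point of the sketch: regeneration can overwrite everything derived from source, while anything inside the markers survives verbatim, which is why update-skill can rebuild a skill without losing your additions.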
### Let the Health Check Run Every SKF workflow ends with a shared **health check** step where Ferris reflects on the session and offers to file friction, bugs, or gaps as GitHub issues (with your approval). Clean runs exit in one line — zero overhead. When something breaks, it's SKF's primary feedback channel, so **please let workflows run to completion**. If you had to cancel before the health check fired, ask Ferris to run it (`@Ferris please run the workflow health check for this session`) or [open an issue directly](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose). See [Workflow Health Check](../workflows/#terminal-step-health-check) for details. --- ## Something not working? See [Troubleshooting](../troubleshooting/) for common errors (ast-grep unavailable, "no brief found", ecosystem check messages) and how to resolve them. For general setup help, see [Getting Started → Need help?](../getting-started/#need-help). ## In 60 seconds One command. One verified skill. Here's a real snippet from a cognee skill SKF compiled: ```python await cognee.search( # [AST:cognee/api/v1/search/search.py:L26] query_text="What does Cognee do?" ) ``` Every instruction carries a receipt — a file, a line, and a commit SHA from the upstream repo. Your AI reads these instead of guessing from training data, and you can open the source at the pinned commit to confirm the function exists. Nothing is made up; everything is falsifiable. Want to see the full audit on a real shipped skill before you install anything? → [Verifying a Skill](../verifying-a-skill/). --- ## Install One command, on any platform. Requires Node.js ≥ 22, Python ≥ 3.10, and `uv` ([full tool matrix below](#prerequisites-full-reference)). ```bash npx bmad-module-skill-forge install ``` You'll be prompted for project name, output folders, and which IDEs to configure. The installer copies skill directories to each IDE's skills folder (e.g. 
`.claude/skills/`, `.cursor/skills/`) so skills are available natively. ### As a custom module during BMAD Method installation ```bash npx bmad-method install ``` Step through the installer prompts: - **"Would you like to browse community modules?"** — No (SKF isn't in the community catalog yet) - **"Would you like to install from a custom source (Git URL or local path)?"** — Yes - **"Git URL or local path:"** — paste the SKF repo URL: ``` https://github.com/armelhbobdad/bmad-module-skill-forge ``` Or, if you've already cloned the repo locally, provide the path to the repo root instead: ``` /path/to/bmad-module-skill-forge ``` This installs BMAD core + SKF together with full IDE integration, manifests, and help catalog. Best when you want the complete BMAD development workflow. See [BMAD Synergy](../bmad-synergy/) for how SKF workflows pair with BMM phases and other BMAD modules. ### Add SKF to an existing BMAD project If you already have BMAD installed, you can add SKF afterward by running the standalone installer in the same directory: ```bash npx bmad-module-skill-forge install ``` The installer detects the existing `_bmad/` directory and installs SKF alongside your current modules. See [BMAD Synergy](../bmad-synergy/) for integration patterns with your existing BMM workflows. ### Updating an existing SKF installation To move to a newer (or older) SKF version, run the installer again in your project directory: ```bash npx bmad-module-skill-forge@latest install ``` The installer reads the installed version from your manifest and shows the delta in the prompt — for example `v0.10.0 → v1.0.0 available`. Pick **Update** to replace SKF files while keeping your `config.yaml` intact. The option label adapts to the direction you're moving (upgrade, reinstall the same version, or downgrade) so you always see exactly what you're about to apply. Pick **Fresh install** instead if you want to wipe everything and start clean. 
> The `@latest` suffix forces npx to fetch the newest published version instead of reusing a cached copy from a previous run. --- ## Your first skill ### 1. Setup your forge ``` @Ferris SF ``` This detects your tools, sets your capability tier, and initializes the forge environment. You only need to do this once per project. ### 2. Generate your first skill **Fastest path (Quick Skill):** ``` @Ferris QS https://github.com/bmad-code-org/BMAD-METHOD ``` Ferris reads the repository, extracts the public API, and generates a skill in under a minute. **Targeting a specific version:** Append `@version` to pin the skill to a library version: ``` @Ferris QS cognee@1.0.0 ``` **Full quality path (pipeline mode):** ``` @Ferris forge https://github.com/cocoindex-io/cocoindex cocoindex ``` `forge` chains Brief → Create → Test → Export. It needs an explicit repo URL **and** a skill name because it starts with Brief Skill (BS), which doesn't guess targets. If you just want a fast skill from a package name, use `@Ferris forge-quick cognee` instead — that starts with Quick Skill (QS), which resolves packages via the registry. Or one workflow per session: ``` @Ferris BS # Brief — scope and design the skill # — clear session — @Ferris CS # Create — compile from the brief # — clear session — @Ferris TS # Test — verify completeness # — clear session — @Ferris EX # Export — package for distribution ``` > **One workflow per session.** Each SKF workflow loads step files, knowledge fragments, and extraction data into the LLM's context as it executes. Running a second workflow in the same session can cause leftover context to interfere — stale references, mode confusion, or degraded output. Clear your session (start a new conversation) before invoking a new workflow. Pipeline mode chains workflows automatically with headless mode; for manual control, start fresh between each one. Sidecar state (forge tier, preferences) persists across sessions, so no configuration is lost. ### 3. 
Stack skill (for full projects) ``` @Ferris SS ``` Analyzes your project's dependencies and generates a consolidated stack skill with integration patterns. > **After every workflow:** Ferris runs a **health check** — a reflection step that captures any friction, bugs, or gaps from the session. Clean runs exit in one line; when something breaks, Ferris offers to file structured findings as GitHub issues (with your approval). **Please let workflows run to completion** so the health check can fire. If it was skipped, ask Ferris to run it (`@Ferris please run the workflow health check for this session`) or [open an issue directly](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose). See [Workflow Health Check](../workflows/#terminal-step-health-check). --- ## Common use cases > **Looking for end-to-end examples?** See [Examples](../examples/) for eleven real-world scenarios with full command transcripts — from Quick Skill under a minute, to brownfield onboarding, stack verification, release-prep drift remediation, and SaaS docs-only skills. --- ## Prerequisites (full reference) Most users only need Node.js, Python, and `uv` — the rest unlock additional capabilities but SKF detects what's available and sets your tier automatically. You can install them later; your tier upgrades when you do. | Tool | Required For | Install | |------------------------------------------------------------------------|---------------------------------------------------------------------------------------|-----------------------------------------------------------| | `Node.js` >= 22 | Installation, npx commands | | | `Python` >= 3.10 | Deterministic scoring, validation, and utility scripts | | | `uv` (Python package runner) | Running Python scripts with automatic dependency management | | | `gh` (GitHub CLI) | Required for Deep mode. Optional convenience in Quick/Forge/Forge+ for source access. 
| | | `ast-grep` (CLI tool for code structural search, lint, and rewriting) | Forge + Deep modes | | | `ast-grep` MCP server (recommended alongside CLI) | Forge + Deep modes | | | `ccc` (cocoindex-code semantic code search) | Forge+ mode | | | `qmd` (local hybrid search engine for project files) | Deep mode | | | `SNYK_TOKEN` (Snyk API token — **Enterprise plan required**) | Optional security scan | | Security scanning via Snyk is optional and requires an Enterprise plan; it does not affect your tier level. ### Platform support **Linux and Windows** are exercised in CI on every PR (`ubuntu-latest` + `windows-latest` matrix on `validate` and `python` jobs). **macOS** works in practice — POSIX-equivalent to Linux — but isn't CI-gated; if you hit a macOS-specific bug, please [file an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues). On Windows, SKF transparently falls back to NTFS junctions when symlink privilege isn't held, so no Developer Mode or admin rights are required. Git Bash (bundled with [Git for Windows](https://git-scm.com/download/win)), PowerShell, and WSL2 all work. --- ## Configuration SKF has two install-time variables (defined in `src/module.yaml`), one Core Config variable inherited from BMAD, and one runtime preference: | Variable | Purpose | Default | |------------------------|----------------------------------------------------------------------------------------------------------|-----------------------------| | `skills_output_folder` | Where generated skills are saved | `{project-root}/skills` | | `forge_data_folder` | Where workspace artifacts are stored (VS reports, evidence) | `{project-root}/forge-data` | | `output_folder` | Where refined architecture documents are saved (used by RA workflow). 
*Inherited from BMAD Core Config.* | Defined by BMAD Core Config | | `tier_override` | Force a specific tier for comparison or testing (in `_bmad/_memory/forger-sidecar/preferences.yaml`) | `~` (auto-detect) | | `headless_mode` | Skip confirmation gates in all workflows (in `_bmad/_memory/forger-sidecar/preferences.yaml`) | `false` | Runtime configuration (tool detection, tier, and collection state) is managed by the `setup` workflow and persisted in `forge-tier.yaml`. --- ## What's next? - [Agents](../agents/) — learn about Ferris - [Workflows](../workflows/) — the full command reference - [Examples](../examples/) — real-world scenarios with transcripts --- ## Need help? If you run into issues: 1. Run `/bmad-help` — analyzes your current state and suggests what to do next (e.g. `/bmad-help my quick skill has low confidence scores, how do I improve them?`) *Provided by the [BMAD Method](https://github.com/bmad-code-org/BMAD-METHOD) — not available in standalone SKF installations.* 2. Run `@Ferris SF` to check your tool availability and tier 3. Check `forge-tier.yaml` in your forger sidecar for your current configuration 4. If a workflow gave you friction, ask Ferris to run the health check for that session, or [open an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) — see [Workflow Health Check](../workflows/#terminal-step-health-check) Skill Forge reads your code, extracts what your AI agents actually need, and compiles it into instructions with citations. This page walks through what that looks like end-to-end. For the machinery behind it, see [Architecture](../architecture/). For what ships inside a skill, see [Skill Model](../skill-model/). --- ## A walkthrough: building a cognee skill Your AI agent keeps hallucinating cognee API calls. 
You run one command: ``` @Ferris QS https://github.com/topoteretes/cognee ``` In under a minute, you get a `SKILL.md` your agent can load — with every instruction traceable to a specific file and line in cognee's source code. Here's what happens between those two moments. ### 1. Ferris picks a workflow `QS` is a trigger — short for *Quick Skill*. Ferris is the single AI agent that runs every Skill Forge workflow. He reads the trigger, loads the Quick Skill workflow, and prepares the context he needs to do the job. ### 2. The workflow resolves your target Ferris confirms the repository exists, detects its language, finds its version from the source manifest (`pyproject.toml`, `package.json`, etc.), and records the exact commit SHA he'll read from. This is the anchor — everything that follows traces back to this one commit. ### 3. He extracts the API He reads cognee's source code, identifies the public exports, and pulls out function signatures, parameter types, and return types. At the Forge tier (with ast-grep installed), each signature is verified against the real syntax tree of the source. At Quick tier (no extra tools), he reads the file directly. Either way, nothing is invented — if he can't cite it, he doesn't include it. ### 4. He writes the skill with receipts Each instruction in the output carries a receipt: ```python await cognee.search( # [AST:cognee/api/v1/search/search.py:L26] query_text="What does Cognee do?" ) ``` That tag means: *this came from AST extraction of this exact file at this exact line.* You can click through to the upstream source at the pinned commit and see it yourself. ### 5. You get two files Ferris writes a `SKILL.md` (the full instruction manual your agent loads on demand) and a `context-snippet.md` (an 80–120 token index). 
The snippet gets injected into your platform context file (`CLAUDE.md`, `AGENTS.md`, or `.cursorrules`) as an always-on reminder: *"This skill exists; read it before writing cognee code."* Both halves are load-bearing — see the [Dual-Output Strategy](../skill-model/#dual-output-strategy) for why. ### 6. The audit trail stays on disk Alongside the skill, Ferris leaves a `provenance-map.json` (every receipt), an `evidence-report.md` (build audit trail), and the compilation config (`skill-brief.yaml`). Commit these with the skill and any teammate — or any skeptic — can reproduce the same output from the same source. That's the whole pipeline. One trigger in, one verifiable skill out, every claim traceable back to a file and a commit. --- ## Next - **[Architecture](../architecture/)** — how Ferris loads workflows, how sub-agents handle large extractions, how the 7 tools resolve conflicts, where artifacts land on disk - **[Skill Model](../skill-model/)** — what a skill contains, confidence tiers (T1 / T2 / T3), capability tiers (Quick / Forge / Forge+ / Deep), and the dual-output strategy - **[Verifying a Skill](../verifying-a-skill/)** — the 60-second audit recipe and how completeness scoring works - **[BMAD Synergy](../bmad-synergy/)** — how SKF fits alongside BMAD Method, TEA, BMB, and other modules ## The problem AI agents hallucinate API calls. They invent function names, guess parameter types, and produce code that doesn't compile. ## The fix Skill Forge reads the source and hands your agent the truth — with receipts. Every function signature, every parameter type, every usage pattern traces back to a file, a line, and a commit SHA in the upstream repository.
A receipt looks like `[AST:cognee/api/v1/search/search.py:L26]`.
If SKF can't cite a source, it doesn't include the instruction.
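The receipt format is regular enough to check mechanically. A minimal sketch of a receipt extractor — a hypothetical helper for illustration, not part of SKF's tooling — that pulls every `[AST:file:Lnn]` tag out of a skill file:

```python
import re

# Matches [AST:path:Lnn] receipt tags as shown above.
# Pattern and helper are illustrative — SKF's own tooling may differ.
RECEIPT = re.compile(r"\[AST:(?P<file>[^:\]]+):L(?P<line>\d+)\]")

def extract_receipts(skill_text: str) -> list[tuple[str, int]]:
    """Return (source_file, source_line) for every AST receipt in the text."""
    return [(m["file"], int(m["line"])) for m in RECEIPT.finditer(skill_text)]

snippet = 'await cognee.search(  # [AST:cognee/api/v1/search/search.py:L26]'
print(extract_receipts(snippet))  # → [('cognee/api/v1/search/search.py', 26)]
```

Feeding a whole `SKILL.md` through a helper like this gives you the list of claims to spot-check against the pinned commit.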

[Verify any claim in 60 seconds →](/verifying-a-skill/)

## How SKF compares
| Approach | What it does well | Where it falls short | |----------|-------------------|----------------------| | Skill scaffolding (`npx skills init`) | Generates a spec-compliant skill file | The file is empty — you still have to write every instruction by hand | | LLM summarization | Understands context and intent | Generates plausible-sounding content that may not match the actual API | | RAG / context stuffing | Retrieves relevant code snippets | Returns fragments without synthesis — no coherent skill output | | Manual authoring | High initial accuracy | Drifts as the source code changes, doesn't scale across dependencies | | IDE built-in context (Copilot, Cursor) | Convenient, zero setup | Uses generic training data, not your project's specific integration patterns | | **Skill Forge** | **Every instruction cites upstream `file:line` at a pinned commit. Falsifiable in 60 seconds.** | **Coverage depends on which tools you've installed (Quick / Forge / Forge+ / Deep tiers).** |
## Quick install Requires [Node.js](https://nodejs.org/) >= 22, [Python](https://www.python.org/) >= 3.10, and [uv](https://docs.astral.sh/uv/). ```bash npx bmad-module-skill-forge install ``` Then generate your first skill: ``` @Ferris SF # Set up your forge @Ferris QS # Generate a skill in under a minute ``` See [Getting Started](./getting-started/) for platform support, tier selection, and troubleshooting.
A Skill Forge skill is more than a single markdown file. This page explains what ships when you compile and export a skill: the capability tier your forge runs at, the confidence level of every claim, the files in the output directory, and why every skill is shipped as both an active instruction manual and a passive context index. For a walkthrough of compilation, see [How It Works](../how-it-works/). For the audit recipe that ties this all together, see [Verifying a Skill](../verifying-a-skill/). --- ## Progressive Capability Model SKF uses an additive tier model. You never lose capability by adding a tool. | Tier | Required Tools | What You Get | |------|---------------|-------------| | **Quick** | None (`gh_bridge`, `skill-check`, `tessl` used when available) | Source reading + spec validation + content quality review. Best-effort skills in under a minute. **Note:** Quick Skill (QS) is tier-unaware by design — it always runs at community tier regardless of installed tools. | | **Forge** | + `ast_bridge` (ast-grep) | Structural truth. AST-verified signatures. Co-import detection. T1 confidence. | | **Forge+** | + `ccc_bridge` (cocoindex-code) | Semantic discovery. CCC pre-ranks files by meaning before AST extraction. Better coverage on large codebases. | | **Deep** | `ast_bridge` + `gh_bridge` (gh) + `qmd_bridge` (QMD). CCC optional — enhances when installed. | Knowledge search. Temporal provenance. Drift detection. Full intelligence. | Setup detects your installed tools and sets your tier automatically: ``` @Ferris SF ``` ``` ┌─────────────────────────────┐ │ FORGE STATUS │ └─────────────────────────────┘ Tier: Deep Deep tier active. Full capability unlocked — AST-backed code analysis, GitHub repository exploration, and QMD knowledge search with cross-repository synthesis. Maximum provenance and intelligence. 
Tools Detected: ast-grep — 0.42.0 gh — 2.89.0 qmd — operational (104 docs indexed globally) ccc — 0.2.10 (daemon healthy) QMD Registry: 0 collection(s) healthy 39 orphaned collection(s) removed (empty — collections are created automatically when you run [CS] Create Skill) CCC Index: indexed this run — semantic discovery ready (2 files, 14 chunks) SKF infrastructure excluded: _bmad, _bmad-output, .claude, _skf-learn, skills, forge-data ┌────────────────────────────────┐ │ Forge ready. Deep tier active. │ └────────────────────────────────┘ Health Check: Clean run. No workflow issues to report. Workflow complete. --- Forge is live at Deep tier, armel. All four core tools are available, the CCC semantic index is built, and the QMD registry is clean. What would you like to forge next? Common starting points: - QS — fast skill from a GitHub URL or package name - BS → CS — brief, then compile a skill from source - AN — analyze a large repo to discover what's worth skilling - WS — show current lifecycle status Or chain them: forge (BS CS TS EX), forge-quick (QS TS EX), onboard (AN CS TS EX). ``` Don't have ast-grep, cocoindex-code, or QMD yet? No problem — Quick mode works with no additional tools. Optional GitHub CLI improves source access. Install tools later; your tier upgrades automatically. ### Tier Override — Comparing Output Across Tiers You can force a specific tier by setting `tier_override` in your preferences file (`_bmad/_memory/forger-sidecar/preferences.yaml`): ```yaml # Force Forge tier regardless of detected tools tier_override: Forge ``` This is useful for comparing skill quality across tiers for the same target: ``` # 1. Set tier_override: Quick in preferences.yaml @Ferris CS # compile at Quick tier # 2. Change to tier_override: Forge @Ferris CS # recompile at Forge tier — compare output # 3. Change to tier_override: Forge+ @Ferris CS # recompile with semantic discovery — compare coverage # 4. 
Reset to tier_override: ~ (auto-detect) ``` Set `tier_override` to `Quick`, `Forge`, `Forge+`, or `Deep`. Set to `~` (null) to return to auto-detection. The override is respected by all tier-aware workflows (CS, SS, US, AS, TS). --- ## Confidence Tiers Every claim in a generated skill carries a confidence tier that traces to its source: | Tier | Source | Tool | What It Means | |------|--------|------|---------------| | **T1** | AST extraction | `ast_bridge` | Current code, structurally verified. Immutable for that version. | | **T1-low** | Source reading | `ast_bridge` (fallback) | Source-read without AST verification. Produced by Quick tier and by Forge/Forge+/Deep when ast-grep cannot parse a specific file. Location correct, signature may be inferred. | | **T2** | QMD evidence | `qmd_bridge` | Historical + planned context (issues, PRs, changelogs, docs). | | **T3** | External documentation | `doc_fetcher` | External, untrusted. Quarantined. | ### Temporal Provenance Confidence tiers map to temporal scopes: - **T1-now (instructions):** What ast-grep sees in the checked-out code. This is what your agent executes. - **T2-past (annotations):** Closed issues, merged PRs, changelogs — why the API looks the way it does. - **T2-future (annotations):** Open PRs, deprecation warnings, RFCs — what's coming. Progressive disclosure controls how much context surfaces at each level: | Output | Content | |--------|---------| | `context-snippet.md` | T1-now + T2-future gotchas (breaking changes, deprecation warnings) — compressed, always-on | | `SKILL.md` | T1-now + lightweight T2 annotations | | `references/` | Full temporal context with all tiers | ### Tier Constrains Authority Your forge tier limits what authority claims a skill can make: | Forge Tier | AST? | CCC? | QMD? 
| Max Authority | Accuracy Guarantee | |-----------|------|------|------|---------------|-------------------| | Quick | No | No | No | `community` | Best-effort | | Forge | Yes | No | No | `official` | Structural (AST-verified) | | Forge+ | Yes | Yes | No | `official` | Structural + semantic discovery | | Deep | Yes | opt. (enhances when installed) | Yes | `official` | Full (structural + contextual + temporal) | **Tier governs technical verification; authority is an ecosystem claim.** Reaching Deep tier unlocks the *capability* to claim `official` authority — it does not grant it. Only library maintainers can publish `source_authority: official` skills via the [agentskills.io](https://agentskills.io) open-format ecosystem. A Deep-tier skill compiled by a third party is `community` by default. See [oh-my-skills](https://github.com/armelhbobdad/oh-my-skills), where all four Deep-tier skills ship as `community` by design — audited, not blessed. --- ## Completeness Scoring Skills are graded on a 0–100 completeness scale. See [how the score is computed](../verifying-a-skill/#how-the-score-is-computed) in Verifying a Skill for the formula and tier adjustments. --- ## Output Architecture ### Per-Skill Output Every generated skill produces a self-contained, version-aware directory: ``` skills/{name}/ ├── active -> {version} # Symlink to current version ├── {version}/ │ └── {name}/ # agentskills.io-compliant package │ ├── SKILL.md # Active skill (loaded on trigger) │ ├── context-snippet.md # Passive context (compressed, always-on) │ ├── metadata.json # Machine-readable provenance │ ├── references/ # Progressive disclosure │ │ ├── {function-a}.md │ │ └── {function-b}.md │ ├── scripts/ # Executable automation (when detected in source) │ │ └── {script-name}.sh │ └── assets/ # Templates, schemas, configs (when detected in source) │ └── {asset-name}.json └── {older-version}/ └── {name}/ # Previous version preserved └── ... ``` Multiple versions coexist under the same skill name. 
The `active` symlink points to the current version. Updating a skill for a new library release creates a new version directory — users pinned to older versions keep their skill intact. The inner `{name}/` directory is a standalone [agentskills.io](https://agentskills.io) package, directly installable via `npx skills add`. The `scripts/` and `assets/` directories are optional — only created when the source repository contains executable scripts or static assets matching detection heuristics. Each file traces to its source via `[SRC:file:L1]` provenance citations with SHA-256 content hashes for drift detection. User-authored files go in `scripts/[MANUAL]/` or `assets/[MANUAL]/` subdirectories and are preserved during updates. ### SKILL.md Format Skills follow the [agentskills.io specification](https://agentskills.io/specification) with frontmatter: ```yaml --- name: oms-cognee description: > Builds apps on top of cognee v1.0.0, the knowledge-graph memory engine for AI agents. Use when ingesting text/files/URLs into persistent memory, building knowledge graphs, searching graph-backed memory with multiple SearchType modes, enriching graphs with memify/improve, scoping memory with datasets and node_sets, configuring LLM/embedding/ graph/vector backends, running custom task pipelines, tracing operations, decorating agent entrypoints with `agent_memory`, connecting to Cognee Cloud with `serve`, or visualizing the graph. Covers cognee/__init__.py exports: the V1 API (add, cognify, search, memify, datasets, prune, update, run_custom_pipeline, config, SearchType, visualize_graph, pipelines, Drop, run_startup_migrations, tracing) and the V2 memory-oriented API (remember, RememberResult, recall, improve, forget, serve, disconnect, visualize, agent_memory). Do NOT use for: cognee internals, the HTTP REST API (use cognee-mcp or the FastAPI server), non-cognee memory/RAG libraries. 
--- ``` Every instruction in the body traces to source: ```python await cognee.search( # [AST:cognee/api/v1/search/search.py:L26] query_text="What does Cognee do?" ) ``` ### metadata.json — The Birth Certificate Machine-readable provenance for every skill: This is a trimmed excerpt from the real [`oms-cognee/1.0.0/metadata.json`](https://github.com/armelhbobdad/oh-my-skills/blob/main/skills/oms-cognee/1.0.0/oms-cognee/metadata.json) shipped with the oh-my-skills canonical output. Every value below is verbatim from the file — not illustrative. ```json { "name": "oms-cognee", "version": "1.0.0", "skill_type": "single", "source_authority": "community", "source_repo": "https://github.com/topoteretes/cognee", "source_commit": "3c048aa4147776f14d4546704f986242554a9ef3", "source_ref": "v1.0.0", "confidence_tier": "Deep", "spec_version": "1.3", "generation_date": "2026-04-13T00:00:00Z", "language": "python", "ast_node_count": 34, "confidence_distribution": { "t1": 34, "t1_low": 0, "t2": 11, "t3": 15 }, "stats": { "exports_documented": 34, "exports_public_api": 34, "exports_internal": 0, "exports_total": 34, "public_api_coverage": 1.0, "total_coverage": 1.0 } } ``` Fields omitted from this excerpt for brevity: `description`, `exports[]`, `tool_versions`, `dependencies`, `compatibility`, `last_update`, `generated_by`. The full 93-line file lives at [`oh-my-skills/skills/oms-cognee/1.0.0/oms-cognee/metadata.json`](https://github.com/armelhbobdad/oh-my-skills/blob/main/skills/oms-cognee/1.0.0/oms-cognee/metadata.json). `scripts` and `assets` arrays are optional — omitted entirely (not empty) when the source has no scripts or assets. 
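The `stats` block is internally consistent and cheap to sanity-check. A minimal sketch — assuming only the field names visible in the excerpt above, not an SKF tool — that verifies the coverage ratios agree with the raw counts:

```python
import json

# Verbatim subset of the oms-cognee metadata.json excerpt above.
metadata = json.loads("""{
  "stats": {
    "exports_documented": 34,
    "exports_public_api": 34,
    "exports_internal": 0,
    "exports_total": 34,
    "public_api_coverage": 1.0,
    "total_coverage": 1.0
  }
}""")

stats = metadata["stats"]
# Assumed relationships between the counters, based on the field names:
assert stats["exports_total"] == stats["exports_public_api"] + stats["exports_internal"]
assert stats["public_api_coverage"] == stats["exports_documented"] / stats["exports_public_api"]
assert stats["total_coverage"] == stats["exports_documented"] / stats["exports_total"]
print("stats consistent")
```

A check like this is a cheap first gate before the full three-step audit described in Verifying a Skill.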
### Stack Skill Output Stack skills map how your dependencies interact — shared types, co-import patterns, integration points: ``` skills/{project}-stack/ ├── active -> {version} └── {version}/ └── {project}-stack/ ├── SKILL.md # Integration patterns + project conventions ├── context-snippet.md # Compressed stack index ├── metadata.json # Component versions, integration graph └── references/ ├── nextjs.md # Project-specific subset ├── better-auth.md # Project-specific subset └── integrations/ ├── auth-db.md # Cross-library pattern └── pwa-auth.md # Cross-library pattern ``` The primary source is your project repo. Component references trace to library repos. `skill_type: "stack"` in metadata. --- ## Dual-Output Strategy Every skill SKF compiles ships as **two** files on purpose — and the reason is empirical, not aesthetic. > **[Vercel research](https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals):** passive context (`AGENTS.md` / `CLAUDE.md`) achieves a **100% pass rate** in agent evals. Active skills loaded alone achieve **79%**. The 21-point gap is what the dual-output strategy closes. Every skill generates both: 1. **`SKILL.md`** — Active skill, loaded on trigger with the full instruction set. This is the instruction manual your agent opens when it knows it needs library guidance. 2. **`context-snippet.md`** — Passive context, compressed to 80–120 tokens per skill. Injected into platform context files (`CLAUDE.md` / `AGENTS.md` / `.cursorrules`) only when `export-skill` is run. This is the ambient index that tells your agent the skill exists in the first place and should be opened for relevant work. Without the snippet, the agent never knows to open `SKILL.md`. Without `SKILL.md`, the snippet has nothing to point at. **Both halves are load-bearing.** That's the 21-point delta. 
### Managed Context Section Export injects a managed section between markers: The block below is the real managed section currently in [`oh-my-skills/CLAUDE.md`](https://github.com/armelhbobdad/oh-my-skills/blob/main/CLAUDE.md), showing one of its four compiled skills. Every line is verbatim from the file: ```markdown [SKF Skills]|4 skills|0 stack |IMPORTANT: Prefer documented APIs over training data. |When using a listed library, read its SKILL.md before writing code. | |[oms-cognee v1.0.0]|root: .claude/skills/oms-cognee/ |IMPORTANT: oms-cognee v1.0.0 — read SKILL.md before writing cognee code. Do NOT rely on training data. |quick-start:SKILL.md#quick-start |api-v1: add(), cognify(), search(), memify(), update(), run_custom_pipeline(), visualize_graph(), datasets, prune, config, SearchType, pipelines, Drop, run_startup_migrations(), session, tracing |api-v2: remember()→RememberResult, recall(), improve(), forget(), serve()/disconnect(), visualize(), @agent_memory |key-types:SKILL.md#key-types — SearchType: GRAPH_COMPLETION (default), RAG_COMPLETION, CHUNKS, CHUNKS_LEXICAL, SUMMARIES, TEMPORAL, CODING_RULES, CYPHER, FEELING_LUCKY, GRAPH_COMPLETION_DECOMPOSITION (+5 more); Task, Drop, RememberResult, DataPoint, 5 Cognee* exceptions |gotchas: cognee.low_level REMOVED from public API in v1.0.0 (import from cognee.infrastructure.engine directly); cognee.run_migrations REPLACED by cognee.run_startup_migrations (relational + vector); cognee.delete is DEPRECATED since v0.3.9 (use cognee.datasets.delete_data or cognee.forget); cognee.pipelines restructured in v1.0.0 (package with Drop + lazy re-exports); cognee.agent_memory requires async function; cognee.serve() without url triggers Auth0 Device Code Flow; cognee.start_ui is sync and needs pid_callback arg; all add/cognify/search/memify/remember/recall/improve/forget/serve are async — always await. 
| |(three more skills — oms-cocoindex, oms-storybook-react-vite, oms-uitripled — omitted here for brevity; see the full file)
```

~80–120 tokens per skill (version-pinned, retrieval instruction, section anchors, inline gotchas). Root paths are per-IDE — each of the 23 supported IDEs has its own skill directory (e.g., `.claude/skills/`, `.cursor/skills/`, `.github/skills/`, `.windsurf/skills/`). See [`skf-export-skill/assets/managed-section-format.md`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/skf-export-skill/assets/managed-section-format.md) for the complete IDE → Context File Mapping. This aligns with [Vercel's research](https://vercel.com/blog/agents-md-outperforms-skills-in-our-agent-evals) finding that an indexed format with explicit retrieval instructions dramatically improves agent performance.

The developer controls placement; Ferris controls content. Snippet updates only happen at `export-skill` — create and update are draft operations. An `.export-manifest.json` tracks which skills have been explicitly exported, preventing draft skills from leaking into the managed section.

---

## Ownership Model

| Context | `source_authority` | Distribution |
|---------|-------------------|--------------|
| OSS library (maintainer generates) | `official` | `npx skills publish` to agentskills ecosystem |
| Internal service (team generates) | `internal` | `skills/` in repo, ships with code |
| External dependency (consumer generates) | `community` | Local `skills/`, marked as community |

Provenance maps enable verification: an `official` skill's provenance must trace to the actual source repo owned by the author.

If something isn't working, start here. For general setup help see [Getting Started → Need help?](../getting-started/#need-help).

---

## Common errors

### Forge reports ast-grep is unavailable

If setup reports that ast-grep was not detected, install it to unlock the Forge tier, then re-run `@Ferris SF` — your tier upgrades automatically.
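Tier detection comes down to checking which bridge tools are on `PATH`. A rough sketch of the idea — illustrative only; SKF's real detection lives in the setup workflow and also probes versions and daemon health — using the tool-to-tier mapping from the Progressive Capability Model:

```python
import shutil

def detect_tier() -> str:
    """Simplified additive tier detection: ast-grep unlocks Forge,
    +ccc unlocks Forge+, ast-grep + gh + qmd unlocks Deep."""
    has = lambda cmd: shutil.which(cmd) is not None
    if has("ast-grep") and has("gh") and has("qmd"):
        return "Deep"
    if has("ast-grep") and has("ccc"):
        return "Forge+"
    if has("ast-grep"):
        return "Forge"
    return "Quick"

print(f"Detected tier: {detect_tier()}")
```

Running this before `@Ferris SF` gives a quick preview of which tier setup is likely to report.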
### "No brief found" Run `@Ferris BS` first to create a skill brief, or use `@Ferris QS` for brief-less generation. `CS` requires either a brief or a direct invocation with scope arguments. ### "Ecosystem check: official skill exists" An official skill already exists for this package. Consider installing it with `npx skills add` instead of generating your own — the official skill is typically better tested and kept up-to-date by the library maintainer. ### Quick-tier skills have lower confidence scores Quick tier reads source without AST analysis, so signatures are read directly from files rather than structurally verified. Install ast-grep to upgrade to the Forge tier for AST-verified signatures (T1 confidence) — see [Capability Tiers](../concepts/#capability-tiers-quickforgeforgedeep). ### Want semantic discovery for large codebases? Install [cocoindex-code](https://github.com/cocoindex-io/cocoindex-code) to unlock the Forge+ tier. CCC indexes your codebase and pre-ranks files by semantic relevance before AST extraction, improving coverage on projects with 500+ files. --- ## Still stuck? 1. Run `@Ferris SF` to check your tool availability and current tier 2. Check `forge-tier.yaml` in your forger sidecar for your configuration 3. If `/bmad-help` is installed (via full BMAD Method), run it and describe your state — e.g. `/bmad-help my batch creation failed halfway, how do I resume?` 4. [File an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) — SKF's [health check system](../workflows/#terminal-step-health-check) is the primary feedback channel, and manual issues feed the same pipeline **Nothing is made up.** Every instruction in every skill traces back to a specific file, a specific line, and a specific commit in the upstream source. If a skill claims a function exists, you can open the real source tree at the pinned commit and see it with your own eyes. 
If the claim and the source disagree — that's a bug, and SKF treats it as one. --- ## The three-step audit Pick any symbol in any SKF-compiled skill. You can trace it to the exact line of upstream source in under 60 seconds. ### 1. Open the skill's `metadata.json` Every skill ships a `metadata.json` next to its `SKILL.md`. Note two fields: - `source_commit` — the exact commit SHA the skill was compiled from - `source_repo` — the upstream repository This is the anchor. Everything else traces back to this commit. ### 2. Open the skill's `provenance-map.json` Provenance maps live in `forge-data/{skill}/{version}/provenance-map.json` alongside each compiled skill. Find your symbol. Every entry carries its own `source_file` and `source_line`: ```json { "export_name": "search", "export_type": "function", "params": ["query_text: str", "query_type: SearchType = GRAPH_COMPLETION", "top_k: int = 10"], "return_type": "List[SearchResult]", "source_file": "cognee/api/v1/search/search.py", "source_line": 27, "confidence": "T1", "extraction_method": "ast-grep" } ``` The snippet above is a real entry from [`forge-data/oms-cognee/1.0.0/provenance-map.json`](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-cognee/1.0.0/provenance-map.json). Line number is not rounded. Confidence tier is explicit. Extraction method is named. Nothing is paraphrased. ### 3. Visit the upstream repo at the pinned commit Open `{source_repo}` at `{source_commit}`, jump to `{source_file}` line `{source_line}`. The signature in `SKILL.md` should match what you see in the source. If it doesn't, **that's a bug**. [Open an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose). SKF will republish the skill with a new commit SHA and a new provenance map. Falsifiability isn't a feature — it's the whole deal. ### Workflow-time enforcement The same anchor is enforced automatically by `skf-test-skill` and by gap-driven `skf-update-skill`. 
Before either workflow reads source at a recorded `source_line`, it runs `git rev-parse HEAD` on the local workspace and compares it to `metadata.source_commit`. If the workspace has drifted, the workflow halts with a `halted-for-workspace-drift` status and tells you the exact `git checkout {source_ref}` to re-sync — so spot-checks can never silently verify against the wrong tree. Pass `--allow-workspace-drift` to opt in to reading the current HEAD anyway; the override is recorded in the final report rather than hidden. --- ## Where to look for what Every file in the per-skill output carries a specific job. Here's the lookup table for the really skeptical: | Question | File | |---|---| | What commit was the source pinned to? | `skills/{name}/{version}/{name}/metadata.json` → `source_commit` | | Which symbols are documented and where did each come from? | `forge-data/{name}/{version}/provenance-map.json` | | What AST patterns were used for extraction? | `forge-data/{name}/{version}/extraction-rules.yaml` | | What signatures, types, and examples did the extractor actually capture? | `forge-data/{name}/{version}/evidence-report.md` | | How was the skill scored? Show me the math. | `forge-data/{name}/{version}/test-report-{name}.md` | | How was the skill scoped, and what was deliberately left out? | `forge-data/{name}/skill-brief.yaml` | Everything a reader needs to reconstruct the compilation is in the two sibling directories: `skills/` ships to consumers, `forge-data/` is the audit trail. --- ## The scores, including the ones we lose Completeness scoring is never 100%. The [scoring formula](#how-the-score-is-computed) is deterministic and the pass threshold is **80%** — but every test report also logs the specific edges where a skill falls short, so the numbers aren't marketing. Take oh-my-skills' four reference skills as an example. 
Their scores range from **99.0% to 99.49%** — none are perfect, and every test report names the specific drift it found: | Skill | Score | What the report discloses | |---|---|---| | [oms-cocoindex](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-cocoindex/0.3.37/test-report-oms-cocoindex.md) | **99.0%** | 114/114 provenance entries; 55 public-API denominator from `__init__.py` `__all__`; 20/20 sampled signatures matched. Two denominators (barrel vs. full surface) both disclosed with rationale. | | [oms-cognee](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-cognee/1.0.0/test-report-oms-cognee.md) | **99.0%** | 34/34 exports documented; denominator is the `cognee/__init__.py` barrel (61 lines, 34 public re-exports) at pinned commit `3c048aa4` (v1.0.0). | | [oms-storybook-react-vite](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-storybook-react-vite/10.3.5/test-report-oms-storybook-react-vite.md) | **99.49%** | 215/216 documented — the missing 1 entry is logged openly as **GAP-004**, a canonical surface count drift from the stated denominator. | | [oms-uitripled](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-uitripled/0.1.0/test-report-oms-uitripled.md) | **99.45%** | 34-entry denominator (not 11, not 25) with the full reconciliation reasoning in the report. | Perfection is suspicious. Visible fallibility is trustworthy. SKF writes down the edges it can't score cleanly — so you can read them and decide for yourself whether the remaining coverage is enough for your use case. ### GAP-004: a worked example of the 1% that fails The [`oms-storybook-react-vite` test report](https://github.com/armelhbobdad/oh-my-skills/blob/main/forge-data/oms-storybook-react-vite/10.3.5/test-report-oms-storybook-react-vite.md) scores **215/216** — not 216/216. 
The missing 1 entry is logged as **GAP-004**: a canonical export surface count (via the provenance map) diverges from the stated denominator in metadata.json. The report names the gap, shows the math, and leaves the drift visible for the next recompilation pass. Nothing was hidden. That's the pattern SKF asks you to trust: when scoring can't reach 100%, the report says so, cites the line, and leaves a fingerprint for the next audit. --- ## How the Score Is Computed The Test Skill workflow (`@Ferris TS`) calculates the completeness score — a weighted measure of how thoroughly and accurately a skill documents its target. This score is the quality gate: pass and the skill is ready for export; fail and it routes to update-skill for remediation. ### Categories and weights The score is the weighted sum of five categories: | Category | Weight | What it measures | |---|---|---| | **Export Coverage** | 36% | Percentage of source exports documented in `SKILL.md` | | **Signature Accuracy** | 22% | Documented function signatures match actual source signatures (parameter names, types, order, return types) | | **Type Coverage** | 14% | Types and interfaces referenced in exports are fully documented | | **Coherence** | 18% | Cross-references resolve, integration patterns are complete (contextual mode only) | | **External Validation** | 10% | Average of skill-check quality score (0–100) and tessl content score (0–100%) | ### Formula ``` total_score = sum(category_weight × category_score) ``` Each category score is a percentage: `(items_passing / items_total) × 100`. **Coherence** (contextual mode) combines two sub-scores: ``` coherence = (reference_validity × 0.6) + (integration_completeness × 0.4) ``` If no integration patterns exist, coherence equals reference validity alone. **External validation** averages the two tools when both are available. When only one tool is available, that tool's score is used. 
When neither is available, the 10% weight is redistributed proportionally to the other active categories. ### Deterministic scoring The weight redistribution and score aggregation are computed by a deterministic Python script ([`compute-score.py`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/skf-test-skill/scripts/compute-score.py)). The LLM extracts category scores from the test report, constructs a JSON input, invokes the script, and uses its output for the final score. Same inputs always produce the same score. If the script is unavailable, the LLM falls back to manual calculation using the same formulas. ### Naive vs contextual mode Test Skill runs in one of two modes, detected automatically: - **Contextual mode** (stack skills) — all five categories scored with the default weights above. - **Naive mode** (individual skills) — Coherence is not scored. Its 18% weight is redistributed: | Category | Naive Weight | |---|---| | Export Coverage | 45% | | Signature Accuracy | 25% | | Type Coverage | 20% | | External Validation | 10% | ### Tier adjustments Your forge tier determines which categories can be scored: | Tier | Skipped Categories | Reason | |---|---|---| | **Quick** | Signature Accuracy, Type Coverage | No AST parsing available | | **Docs-only** | Signature Accuracy, Type Coverage | No source code to compare against | | **Provenance-map** (State 2) | Signature Accuracy, Type Coverage | String comparison only, no semantic AST verification | | **Forge / Forge+ / Deep** | None | Full AST-backed scoring | When categories are skipped, their combined weight is redistributed proportionally to the remaining active categories. A Quick-tier skill and a Deep-tier skill both pass at the same 80% threshold — the score reflects what your tier can actually measure. 
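The redistribution rule is a plain renormalization over whichever categories were actually scored. A minimal Python sketch of the formulas above — an illustration of the math, not the actual `compute-score.py` interface, and note that naive mode uses its own fixed weight table rather than this proportional rule:

```python
DEFAULT_WEIGHTS = {
    "export_coverage": 0.36,
    "signature_accuracy": 0.22,
    "type_coverage": 0.14,
    "coherence": 0.18,
    "external_validation": 0.10,
}

def completeness_score(scores, weights=DEFAULT_WEIGHTS):
    """Weighted total over scored categories; a skipped category
    (score None) hands its weight to the rest, proportionally."""
    active = {c: s for c, s in scores.items() if s is not None}
    active_weight = sum(weights[c] for c in active)
    return sum(s * weights[c] / active_weight for c, s in active.items())

# Contextual mode, all five categories scored — the same numbers
# as the score-report example on this page:
full = completeness_score({
    "export_coverage": 92, "signature_accuracy": 85,
    "type_coverage": 100, "coherence": 80, "external_validation": 78,
})
print(round(full, 1))  # 88.0 — PASS at the default 80% threshold

# Quick tier: no AST parsing, so Signature Accuracy and Type Coverage
# are skipped and their combined 36% spreads over the remaining 64%.
quick = completeness_score({
    "export_coverage": 92, "signature_accuracy": None,
    "type_coverage": None, "coherence": 80, "external_validation": 78,
})
print(round(quick, 1))  # 86.4
```

Same inputs, same renormalization, same score — which is exactly why the workflow shells out to a script instead of trusting the LLM's arithmetic.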
### Pass/fail ``` threshold = custom_threshold OR 80% (default) score >= threshold → PASS → Recommend export-skill score < threshold → FAIL → Recommend update-skill ``` The default is 80%. You can override it by specifying a custom threshold when invoking the workflow (e.g., "test this skill with a 70% threshold"). ### Gap severities When the score is calculated, each finding is classified by severity to guide remediation: | Severity | Examples | |---|---| | **Critical** | Missing exported function/class documentation | | **High** | Signature mismatch between source and `SKILL.md` | | **Medium** | Missing type/interface documentation; scripts/assets directory inconsistencies | | **Low** | Missing optional metadata or examples; description optimization opportunities | | **Info** | Style suggestions; discovery testing recommendations | ### Score report output The test report includes a score breakdown table showing each category's raw score, weight, and weighted contribution: | Category | Score | Weight | Weighted | |---|---|---|---| | Export Coverage | 92% | 36% | 33.1% | | Signature Accuracy | 85% | 22% | 18.7% | | Type Coverage | 100% | 14% | 14.0% | | Coherence | 80% | 18% | 14.4% | | External Validation | 78% | 10% | 7.8% | | **Total** | | **100%** | **88.0%** | The report also records `analysisConfidence` (full, provenance-map, metadata-only, remote-only, or docs-only) and includes a degradation notice when source access was limited. --- ## Build-time drift detection (for docs themselves) The SKF docs you're reading right now are themselves verified against oh-my-skills. A `docs/_data/pinned.yaml` anchor file records the exact version, commit SHA, and confidence tier of every reference skill. A Node validator (`tools/validate-docs-drift.js`) runs as part of `npm run quality` and: 1. **Confirms canonical truth** — every anchor in `pinned.yaml` is cross-checked against the actual `metadata.json` in oh-my-skills. Version, commit, tier, and authority must match. 2. 
**Scans docs for stale prose** — every `.md` file is grepped for `<skill> v<version>` patterns and any version that disagrees with `pinned.yaml` is flagged with file + line number. If the validator flags drift, the CI fails before the docs get merged. It's the same "nothing is made up" contract SKF applies to skills, applied to the docs that describe SKF. When the anchor file is updated to reflect a new oh-my-skills release, the prose must update too — otherwise `npm run docs:validate-drift` blocks the merge. You can run it yourself from the SKF repo root: ```bash npm run docs:validate-drift ``` Or point it at a different local copy of oh-my-skills: ```bash OMS=/path/to/your/oh-my-skills npm run docs:validate-drift ``` Clean output looks like this: ``` OK: 4 skills checked against /home/you/oh-my-skills, no drift. ``` Dirty output cites exact file:line locations so the fix is mechanical. --- ## Reference output: oh-my-skills Every example in this page points at [**oh-my-skills**](https://github.com/armelhbobdad/oh-my-skills), the SKF reference portfolio. Four Deep-tier skills (cocoindex, cognee, Storybook v10, uitripled), each shipping its full audit trail alongside the compiled skill. It's both the worked example for this page and the continuing proof that the pipeline does what it says. If you want to see what SKF produces when you run it on real libraries, that's the answer. --- Skill Forge is the only AI-skills toolchain where every claim your agent reads cites a file, a line, and a commit SHA. Not "sourced from training data." Not "retrieved from context." **Cited.** You can open the upstream repo at the pinned commit and see the function exists — in under a minute. That's the wedge. This page explains why it matters, how SKF compares to alternatives, and who it's for. --- ## The problem you're hiring SKF to solve Your AI agents read your codebase through the lens of whatever happened to be in their training data.
When that training data is wrong, stale, or incomplete, your agent invents — function names that don't exist, parameter types that don't match, config options removed two versions ago. You catch some of it in review. You ship some of it by accident. Every sprint, your team spends hours untangling code that only compiles in the AI's imagination. SKF treats this as a citation problem, not a model problem. If a skill claims `cognee.search()` takes `query_text` as its first parameter, SKF points to `cognee/api/v1/search/search.py:L26` at commit `3c048aa4` in the upstream repo. That's the whole pitch: **nothing is made up, and everything is falsifiable in 60 seconds.** --- ## How SKF compares
| Approach | What it does well | Where it falls short | |----------|-------------------|----------------------| | Skill scaffolding (`npx skills init`) | Generates a spec-compliant skill file | The file is empty — you still have to write every instruction by hand | | LLM summarization | Understands context and intent | Generates plausible-sounding content that may not match the actual API | | RAG / context stuffing | Retrieves relevant code snippets | Returns fragments without synthesis — no coherent skill output | | Manual authoring | High initial accuracy | Drifts as the source code changes, doesn't scale across dependencies | | IDE built-in context (Copilot, Cursor) | Convenient, zero setup | Uses generic training data, not your project's specific integration patterns | | **Skill Forge** | **Every instruction cites upstream `file:line` at a pinned commit. Falsifiable in 60 seconds.** | **Coverage depends on which tools you've installed (Quick / Forge / Forge+ / Deep tiers).** |
--- ## What "falsifiable in 60 seconds" actually means Pick any symbol in any SKF-compiled skill. Three clicks: 1. Open the skill's `metadata.json` — it names the upstream repo and the exact commit SHA. 2. Open the skill's `provenance-map.json` — find your symbol; it lists the file and line. 3. Visit the upstream repo at that commit and that line. The signature in the skill should match. If it doesn't, **that's a bug.** [Open an issue](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) and SKF republishes the skill with a new commit SHA and a new provenance map. No other AI-skills tool treats disagreement between claim and source as a defect. SKF does. See the [Verifying a Skill](../verifying-a-skill/) page for the full three-step audit on real skills, the test reports that log *exactly* where coverage falls short, and the scoring formula behind the 80% pass threshold. --- ## Who's this for? ### The curious developer Your agent just hallucinated a method that doesn't exist, again. You want this to stop, and you don't want to read a 569-line architecture page before running your first command. → Start with [Getting Started](../getting-started/). ### The BMAD user You already use BMAD Method, BMM phases, TEA, or BMB, and you want to know where SKF fits. → Read [BMAD Synergy](../bmad-synergy/) for the phase-by-phase integration playbook. ### The skeptic "AI docs for AI" sounds like the problem pretending to be the solution. You want receipts before you install anything. → Start with [Verifying a Skill](../verifying-a-skill/) — the three-step audit on real skills, including the 1% that fails. ### The OSS maintainer You want to ship verified skills alongside your library releases — `npx skills publish`-ready, drift-detectable, version-pinned. → See [Examples → OSS Maintainer Publishing Official Skills](../examples/#scenario-h-oss-maintainer-publishing-official-skills). 
### The team lead evaluating adoption You're considering running SKF across a brownfield platform. You need to know about rollback safety, `[MANUAL]` section preservation, and the health-check feedback loop before committing. → Start with [Architecture](../architecture/), then [Workflows → Workflow Health Check](../workflows/#terminal-step-health-check). --- ## Not for you if… - You want docs that hand-hold through every happy path with screenshots and emojis. SKF is a citation machine, not a tutorial series. - You need perfect coverage of every private implementation detail. SKF extracts public APIs; if you want internals, read the code directly. - You don't have Node.js ≥ 22 and Python ≥ 3.10 installed. SKF is a Node/Python toolchain at its core. - You're looking for something that generates skills from natural-language descriptions alone. SKF compiles from source code and documentation — not prompts. Everything else is downstream of one question: *are the instructions your AI reads provably true?* If yes, SKF isn't adding value. If you can't be sure, SKF is the tool. --- ## Next - **[Install SKF](../getting-started/#install)** — Node ≥ 22, Python ≥ 3.10, `uv`, one `npx` command - **[Audit a skill in 60 seconds](../verifying-a-skill/)** — see the receipts before you install - **[Browse real skills](https://github.com/armelhbobdad/oh-my-skills)** — four Deep-tier skills, all shipping their audit trails
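The three-step audit above is mechanical enough to script. A sketch, assuming `metadata.json` exposes `repo` and `commit` fields and `provenance-map.json` maps each symbol to a `file` and `line` — the real field names in SKF's output may differ, so treat this as the shape of the check, not its implementation:

```python
import json
from pathlib import Path

def permalink(skill_dir, symbol):
    """Resolve one symbol's citation to a pinned GitHub permalink."""
    skill = Path(skill_dir)
    # Step 1: metadata.json names the upstream repo and exact commit SHA.
    meta = json.loads((skill / "metadata.json").read_text())
    # Step 2: provenance-map.json lists the file and line for the symbol.
    prov = json.loads((skill / "provenance-map.json").read_text())
    entry = prov[symbol]
    # Step 3: the upstream file at the pinned commit, down to the line.
    return (f"{meta['repo']}/blob/{meta['commit']}"
            f"/{entry['file']}#L{entry['line']}")
```

For the cognee claim earlier on this page, a call like `permalink(".../oms-cognee/1.0.0", "search")` would (hypothetically, under the assumed layout) yield the `cognee/api/v1/search/search.py#L26` permalink pinned to `3c048aa4`. Open it, compare signatures; disagreement is a reportable bug.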
Trigger workflows by typing commands to [Ferris](../agents/). See [Concepts](../concepts/) for definitions. > Already using BMAD? See [BMAD Synergy](../bmad-synergy/) for when to invoke each SKF workflow during BMM phases and alongside TEA, BMB, and GDS. --- ## Core Workflows ### Setup Forge (SF) **Command:** `@Ferris SF` **Purpose:** Initialize forge environment, detect tools (ast-grep, ccc, gh, qmd), set capability tier, index project in CCC (Forge+), verify QMD collection health (Deep). **When to Use:** First time using SKF in a project. Run once per project. **Key Steps:** Detect tools + Determine tier → CCC index check (Forge+) → Write forge-tier.yaml → QMD + CCC registry hygiene (Deep/Forge+) → Status report **Agent:** Ferris (Architect mode) --- ### Brief Skill (BS) **Command:** `@Ferris BS` **Purpose:** Scope and design a skill through guided discovery. **When to Use:** Before `Create Skill` when you want maximum control over what gets compiled. **Key Steps:** Gather intent → Analyze target → Define scope → Confirm brief → Write skill-brief.yaml **Agent:** Ferris (Architect mode) --- ### Create Skill (CS) **Command:** `@Ferris CS` **Purpose:** Compile a skill from a brief. Supports `--batch` for multiple briefs. **When to Use:** After Brief Skill, or with an existing skill-brief.yaml. **Key Steps:** Load brief → Ecosystem check → Extract (AST + scripts/assets) → QMD enrich (Deep) → Compile → Validate → Generate **Agent:** Ferris (Architect mode) --- ### Update Skill (US) **Command:** `@Ferris US` **Purpose:** Regenerates the skill while preserving `[MANUAL]` sections. Detects individual vs stack internally. **When to Use:** After source code changes when an existing skill needs updating. **Key Steps:** Load existing → Detect changes (incl. 
scripts/assets) → Re-extract → Merge (preserve MANUAL) → Validate → Write → Report **Agent:** Ferris (Surgeon mode) --- ## Feature Workflows ### Quick Skill (QS) **Command:** `@Ferris QS <package-or-url>` or `@Ferris QS <package>@<version>` **Purpose:** Brief-less fast skill with package-to-repo resolution. **When to Use:** When you need a skill quickly — no brief needed. Accepts package names or GitHub URLs. Append `@version` to target a specific version (e.g., `@Ferris QS cognee@1.0.0`). **Key Steps:** Resolve target → Ecosystem check → Quick extract → Compile → Validate → Write **Agent:** Ferris (Architect mode) --- ### Stack Skill (SS) **Command:** `@Ferris SS` **Purpose:** Consolidated project stack skill with integration patterns. Supports two modes: **code-mode** (analyzes a codebase) and **compose-mode** (synthesizes from existing skills + architecture document, no codebase required). **When to Use:** When you want your agent to understand your entire project stack — not just individual libraries. Use code-mode for existing projects; compose-mode activates automatically after the VS → RA verification path when skills exist but no codebase is present. **Key Steps (code-mode):** Detect manifests → Rank dependencies → Scope confirmation → Parallel extract → Detect integrations → Compile stack → Generate references **Key Steps (compose-mode):** Load existing skills → Confirm scope → Detect integrations from architecture doc → Compile stack → Generate references **Agent:** Ferris (Architect mode) --- ### Analyze Source (AN) **Command:** `@Ferris AN` **Purpose:** Decomposes a repo to discover what's worth skilling, and recommends a stack skill. **When to Use:** Brownfield onboarding of large repos or multi-service projects. **Key Steps:** Init → Scan project → Identify units → Map exports & detect integrations → Recommend → Generate briefs **Note:** Supports resume — if the session is interrupted mid-analysis, re-run `@Ferris AN` and Ferris will resume from where it left off.
**Agent:** Ferris (Architect mode) --- ### Audit Skill (AS) **Command:** `@Ferris AS` **Purpose:** Drift detection between skill and current source. **When to Use:** To check if a skill has fallen out of date with its source code. Works for both individual skills and stack skills. **Key Steps:** Load skill → Re-index source → Structural diff (incl. script/asset drift) → Semantic diff (Deep) → Classify severity → Report **Stack skill support:** Code-mode stacks are audited per-library against their sources. Compose-mode stacks check constituent freshness via metadata hash comparison — if a constituent skill was updated after the stack was composed, audit flags it as constituent drift. Stack skills that need updating are redirected to `@Ferris SS` for re-composition (surgical update is not supported for stacks). **Agent:** Ferris (Audit mode) --- ### Test Skill (TS) **Command:** `@Ferris TS` **Purpose:** Verifies whether a skill covers its target completely and accurately. Naive and contextual modes. Quality gate before export. **When to Use:** After creating or updating a skill, before exporting. **Key Steps:** Load skill → Detect mode → Coverage check → Coherence check → External validation (skill-check, tessl) → Score → Gap report **Scored Categories:** Export Coverage (36%), Signature Accuracy (22%), Type Coverage (14%), Coherence (18%), External Validation (10%). Default pass threshold: **80%**. Pass routes to Export Skill; fail routes to Update Skill with a gap report. See [Completeness Scoring](../verifying-a-skill/#how-the-score-is-computed) for the full formula and tier adjustments. **Agent:** Ferris (Audit mode) --- ## Architecture Verification Workflows ### Verify Stack (VS) **Command:** `@Ferris VS` **Purpose:** Pre-code stack feasibility verification. Cross-references generated skills against architecture and PRD documents with three passes: coverage, integration compatibility, and requirements. 
**When to Use:** After generating individual skills with CS/QS, before building a stack skill — to verify the tech stack can support the architecture. **Key Steps:** Load skills + docs → Coverage analysis → Integration verification → Requirements check → Synthesize verdict → Present report **Agent:** Ferris (Audit mode) --- ### Refine Architecture (RA) **Command:** `@Ferris RA` **Purpose:** Improves an architecture document using verified skill data as evidence. Takes the original architecture doc + generated skills + optional VS report, fills gaps, flags contradictions, and suggests improvements — all citing specific APIs. **When to Use:** After VS confirms feasibility, before running SS in compose-mode. Produces a refined architecture ready for stack skill composition. **Key Steps:** Load inputs → Gap analysis → Issue detection → Improvement detection → Compile refined doc → Present report **Agent:** Ferris (Architect mode) --- ## Utility Workflows ### Export Skill (EX) **Command:** `@Ferris EX` **Purpose:** Validate package structure, generate context snippets, and inject managed sections into CLAUDE.md/AGENTS.md/.cursorrules. **When to Use:** When a skill is ready for CLAUDE.md/AGENTS.md integration. Also provides a local install command (`npx skills add <skill>`) and distribution instructions for `npx skills publish`. See [Installation → Source Formats](https://www.npmjs.com/package/skills#installation) for other install methods. **Key Steps:** Load skill → Validate package → Generate snippet → Update context file (CLAUDE.md/AGENTS.md/.cursorrules) → Token report → Summary **Agent:** Ferris (Delivery mode) --- ## Management Workflows ### Rename Skill (RS) **Command:** `@Ferris RS` **Purpose:** Rename a skill across all its versions.
Because the agentskills.io spec requires `name` to match parent directory name, this is a coordinated move across outer/inner directories, SKILL.md frontmatter, metadata.json, context snippets, provenance maps, the export manifest, and platform context files. **When to Use:** You need to change a skill's name — for example, graduating a `QS`-generated skill (named from the repo) to a formal name, or adding a suffix like `-community` to distinguish from an official skill. **Key Steps:** Select skill + new name → Transactional copy → Update all references → Rebuild context files → Delete old name (point of no return) **Safety:** Transactional — if any step fails before the final delete, the old skill remains intact. Warns if `source_authority: "official"` (rename is local-only; published registry skill won't change). **Agent:** Ferris (Management mode) --- ### Drop Skill (DS) **Command:** `@Ferris DS` **Purpose:** Drop a specific skill version or an entire skill. Soft drop (default) marks the version as deprecated in the manifest and keeps files on disk. Hard drop (`--purge`) also deletes the files. **When to Use:** Retire a deprecated version (e.g., drop an older cognee skill version because it's obsolete), free disk space, or remove a skill you no longer need. **Key Steps:** Select skill → Select version(s) + mode → Update manifest → Rebuild context files → Delete files (if purge) **Safety:** Active version guard — cannot drop the currently active version when other non-deprecated versions exist (switch active first, or drop all). Soft drop is reversible by editing the manifest. 
**Agent:** Ferris (Management mode) --- ## Workflow Connections **Standard path (code-mode):** ```mermaid flowchart TD SF[Setup Forge — one-time] --> AN[Analyze Source] SF --> QS[Quick Skill] SF --> SS_code[Stack Skill — code-mode] AN --> BS[Brief Skill] BS --> CS[Create Skill] AN -->|direct| CS CS --> TS[Test Skill — quality gate] QS --> TS SS_code --> TS TS --> EX[Export Skill] TS --> AS[Audit Skill] AS --> US[Update Skill] US --> TS ``` **Pre-code verification path (compose-mode):** ```mermaid flowchart TD GEN["Create Skill | Quick Skill ×N
(per library)"] --> VS[Verify Stack — feasibility report] VS --> RA[Refine Architecture — refined doc] RA --> SS_compose[Stack Skill — compose-mode] SS_compose --> TS[Test Skill] TS --> EX[Export Skill] ``` > **One workflow per session** (unless using pipeline mode). Each arrow in the diagrams above represents a new conversation session. Clear your context between workflows for best results — or use pipeline mode to chain them automatically. See [Pipeline Mode](#pipeline-mode) below. --- ## Workflow Categories | Category | Workflows | Description | |----------|-----------|-------------| | Core | SF, BS, CS, US | Setup, brief, create, and update skills | | Feature | QS, SS, AN | Quick skill, stack skill, and analyze source | | Quality | AS, TS | Detect skill drift (AS) and verify skill completeness (TS) | | Architecture Verification | VS, RA | Pre-code architecture feasibility and refinement | | Management | RS, DS | Rename and drop skill versions with transactional safety | | Utility | EX | Package and export for consumption | | In-Agent | WS, KI | WS: show lifecycle position, active briefs, and forge tier; KI: list knowledge fragments (both in-agent, no file-based workflow) | --- ## Pipeline Mode Instead of running one workflow per session, you can chain multiple workflows in a single command. Ferris executes them left to right, passing data (brief path, skill name) between each workflow automatically. 
### Syntax ``` @Ferris BS CS TS EX — space-separated codes @Ferris QS[cocoindex] TS EX — with target argument in brackets @Ferris CS TS[min:80] EX — with circuit breaker threshold override @Ferris forge-quick cognee — named alias with target ``` ### Pipeline Aliases | Alias | Expands To | First Workflow | Required Target | |-------|-----------|----------------|-----------------| | `forge` | `BS CS TS EX` | BS | GitHub URL or local path **+** skill name | | `forge-quick` | `QS TS EX` | QS | GitHub URL **or** package name | | `onboard` | `AN CS TS EX` | AN | Project path (defaults to current directory) | | `maintain` | `AS US TS EX` | AS | Existing skill name | **The first workflow's input contract defines what arguments the pipeline needs.** A bare package name works for `forge-quick` (QS resolves packages via the registry) but **not** for `forge` — BS requires both an unambiguous target (URL or path) and a skill name. ### How It Works - Pipelines **automatically activate headless mode** — all confirmation gates auto-proceed with their default action - **Data flows automatically** — once the first workflow completes, the brief path or skill name becomes the input for downstream workflows - **Circuit breakers** halt the pipeline if quality drops below a threshold (e.g., test score < 60 blocks export) - **Anti-pattern warnings** — Ferris warns if you chain workflows in a problematic order (e.g., exporting before testing) - **Progress reporting** — Ferris reports completion of each workflow before starting the next - **Safe halt on ambiguity** — headless mode won't guess. If the initial target doesn't satisfy the first workflow's contract (e.g., `forge cognee` — ambiguous, not a URL or path), the pipeline halts at step 1 before any work happens and suggests concrete next steps. 
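Under the hood this is a small grammar: two-letter codes, optional bracket arguments, and aliases that expand before execution. A hypothetical parser sketch — the alias table and bracket syntax come from this page, but the function and its return shape are illustrative, not SKF's internals:

```python
import re

ALIASES = {
    "forge": ["BS", "CS", "TS", "EX"],
    "forge-quick": ["QS", "TS", "EX"],
    "onboard": ["AN", "CS", "TS", "EX"],
    "maintain": ["AS", "US", "TS", "EX"],
}

# A step is a two-letter code with an optional [argument], e.g. TS[min:80].
STEP = re.compile(r"^([A-Z]{2})(?:\[([^\]]*)\])?$")

def parse_pipeline(command):
    """Expand aliases and split a pipeline command into (code, arg) steps."""
    tokens = command.split()
    if tokens and tokens[0] in ALIASES:
        # Named alias: everything after it is the first workflow's target.
        codes = ALIASES[tokens[0]]
        target = " ".join(tokens[1:]) or None
        return [(codes[0], target)] + [(c, None) for c in codes[1:]]
    parsed = []
    for tok in tokens:
        m = STEP.match(tok)
        if not m:
            raise ValueError(f"not a workflow code: {tok!r}")
        parsed.append((m.group(1), m.group(2)))
    return parsed

parse_pipeline("QS[cocoindex] TS[min:80] EX")
# [('QS', 'cocoindex'), ('TS', 'min:80'), ('EX', None)]
parse_pipeline("forge-quick cognee")
# [('QS', 'cognee'), ('TS', None), ('EX', None)]
```

Note how `forge-quick cognee` attaches the target to QS, the alias's first workflow — which is why the first workflow's input contract decides whether a bare package name is enough.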
### Examples ``` @Ferris forge-quick @tanstack/query — QS + TS + EX for TanStack Query @Ferris forge https://github.com/topoteretes/cognee cognee — BS + CS + TS + EX, explicit URL + name @Ferris forge https://github.com/topoteretes/cognee cognee "public API only" — with scope hint @Ferris maintain cocoindex — AS + US + TS + EX for an existing cocoindex skill @Ferris onboard — AN + CS + TS + EX on the current project ``` --- ## Headless Mode Add `--headless` or `-H` to any workflow command to skip all confirmation gates. Ferris auto-proceeds with default actions (typically "Continue") and logs each auto-decision. Progress output is still shown — headless skips interaction, not reporting. ``` @Ferris QS cocoindex --headless — quick skill with no interaction gates @Ferris TS --headless — test a skill without the review pause @Ferris EX -H — export with auto-approved context update ``` You can also set `headless_mode: true` in your forge preferences (`_bmad/_memory/forger-sidecar/preferences.yaml`) to make headless the default for all workflows. --- ## Terminal Step: Health Check All 14 workflows above share the same final step — a **health check** defined in [`src/shared/health-check.md`](https://github.com/armelhbobdad/bmad-module-skill-forge/blob/main/src/shared/health-check.md). This isn't a workflow you invoke directly; there's no command code and no menu entry. Each workflow ends with a dedicated local `step-NN-health-check.md` whose `nextStepFile` points at the shared file, so the health check fires automatically once the main work is done. After the main work is done, Ferris reflects internally on the execution: - Did any step instruction lead the agent astray or cause unnecessary back-and-forth? - Was any step ambiguous, forcing the agent to guess? - Did a scenario arise that the workflow didn't account for? - Were any instructions wrong or contradictory? If the answer to all of these is "no", the health check exits in one line (`Clean run. 
No workflow issues to report.`). If real friction was observed, Ferris presents structured findings, waits for your review, and — on your approval — routes them to this repo. **Zero overhead for clean runs. High leverage when something breaks.** The health check is honest-by-default: zero findings is the expected outcome. Fabricated issues would hurt the signal, so Ferris only reports what the agent actually experienced. ### How findings are routed - **Severity gate.** Only `bug` findings submit live as GitHub issues by default. `friction` and `gap` findings — the most subjective categories — go to a **local queue** at `{output_folder}/improvement-queue/` unless you explicitly opt in to submit them live during the review gate. This keeps the high-signal reports (real defects) flowing to maintainers while the softer observations sit safely on your disk for you to batch or revisit. - **Fingerprint dedup.** Every finding gets a deterministic 7-hex fingerprint computed from `sha1(severity|workflow|step_file|section)` — no LLM similarity judgment, just a tuple hash. Before Ferris opens a new issue, it searches the repo for an existing open issue with the same `fp-*` label. If one exists, you're offered a choice: add a 👍 reaction (silent upvote), react + post a one-sentence environment delta, open a new issue anyway (if you're certain it's distinct), or skip. Re-reporting the same fingerprint is safe — it just adds to the signal-count on the canonical issue. - **Global seen-cache.** Once you've submitted or reacted for a given fingerprint, it's recorded at `~/.skf/health-check-seen.json` so the same user never re-reports the same defect across sessions or across different projects on the same machine. 
- **Server-side safety net.** If two users race past the client-side search and both open issues with the same fingerprint, a GitHub Action on this repo catches it: the later issue is auto-closed as a duplicate, linked to the canonical (lowest-numbered) issue, and a 👍 is added there to preserve the signal-count. Manual filers using the [issue template](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) feed the same pipeline. **Net effect:** 1,000 users hitting the same bug produce **one canonical issue** with a reaction-count of roughly 1,000 — not 1,000 duplicate issues or a 1,000-comment thread. The maintainer sees population impact at a glance, and your report is never lost. ### Please let workflows run to completion If you cancel a workflow early, or interrupt the agent before the terminal step, the health check doesn't run — and any friction from that session is lost. When you have time, let each workflow reach its natural end. The health check is how SKF learns to do better. ### If the health check didn't run You have two recovery options: 1. **Ask Ferris to run it now** — while the session context is still fresh: ``` @Ferris please run the workflow health check for this session ``` Ferris will load `shared/health-check.md` and reflect on what just happened, exactly as if the workflow had reached its natural end. 2. **Open an issue directly** — use the [Workflow Health Check issue template](https://github.com/armelhbobdad/bmad-module-skill-forge/issues/new/choose) on this repo. Any concrete, evidence-based report helps — cite the specific step file and section where the friction occurred, and describe what you actually observed (not what you think the problem is). Both paths feed the same improvement queue. > **Note:** Some gates cannot be skipped even in headless mode — for example, merge conflicts in Update Skill always require human judgment.
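The fingerprint that drives all of this dedup is deliberately dumb: a truncated SHA-1 over the pipe-joined tuple, so any two clients compute it identically with no similarity judgment involved. A sketch — the tuple order comes from this page, while the function name and example field values are hypothetical:

```python
import hashlib

def finding_fingerprint(severity, workflow, step_file, section):
    """Deterministic 7-hex fingerprint: sha1(severity|workflow|step_file|section).

    A plain tuple hash, so the same finding yields the same fp-* label
    across users, sessions, and machines.
    """
    raw = "|".join([severity, workflow, step_file, section])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:7]

# Hypothetical finding: a bug observed in one step of Update Skill.
label = "fp-" + finding_fingerprint(
    "bug", "update-skill", "step-04-merge.md", "MANUAL preservation")
```

Searching the repo for an open issue carrying that `fp-*` label is all the client-side dedup there is; the GitHub Action is the backstop for races.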