Releases: boshu2/agentops
v2.33.0
brew update && brew upgrade agentops · bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh) · checksums · verify provenance
Highlights
This release tightens execution hygiene, retrieval quality, and release operations. You can now benchmark retrieval quality from the CLI, run persona-based adversarial validation with /red-team, and let Crank surface stale or mergeable bead backlogs before burning worker time. Release prep is also less opinionated now that the enforced cadence gate is gone.
What's New
- Backlog hygiene gates — Crank and related scripts now surface stale or mergeable bead backlogs before execution starts.
- Retrieval benchmarking — `ao retrieval-bench` adds benchmark corpora, live mode, global scope, and nightly regression coverage.
- Adversarial validation — `/red-team` adds persona-based probing for docs and skills, with checked-in Codex runtime artifacts.
- Software factory lane — the CLI and startup flow now expose a dedicated software-factory operator surface.
- Release timing freedom — release prep no longer blocks on a minimum wait between tags.
For the full categorized diff, expand the changelog section below.
Full changelog
Added
- Backlog hygiene gates — added `bd-audit.sh`, `bd-cluster.sh`, and Crank/Codex guidance for cleaning stale or mergeable beads before execution
- Retrieval benchmarking and global scope — added `ao retrieval-bench`, benchmark corpora, `--live`, `--global`, and nightly IR regression coverage
- `/red-team` adversarial validation — added a persona-based validation skill plus checked-in Codex runtime artifacts
- Software factory operator lane — added a CLI/operator surface and Claude factory startup routing for software-factory workflows
- Flywheel maintenance utilities — added global garbage purge tooling and nightly retrieval benchmarking for knowledge quality tracking
Changed
- Release policy — removed the enforced release cadence gate so releases no longer block on a minimum wait between tags
- Knowledge operator surfaces — plan and validation now wire knowledge operator surfaces directly into execution flow
- Proof and runtime docs — goals, RPI docs, and contributor guidance now reflect the expanded proof surfaces and hookless runtime behavior
Fixed
- Codex artifact parity — restored checked-in Codex parity for red-team and cleaned Codex runtime metadata/frontmatter drift across crank, forge, post-mortem, release, and swarm artifacts
- Retrieval quality — replaced exact-substring filtering with token-level matching and tuned penalty, deduplication, and OR-fallback behavior
- Harvest metadata preservation — promotion now preserves source metadata and fills missing maturity, utility, and type fields safely
- Release tooling — release artifact directories are created safely and audit artifacts now resolve against release tag names
- Documentation and link drift — repaired the post-mortem Codex link and aligned runtime docs around the newer startup and lifecycle flows
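To illustrate the retrieval-quality fix, here is a minimal Python sketch of token-level matching with an OR fallback and deduplication. It is not the `ao` implementation: the function names, the tokenizer, and the ranking by token overlap are assumptions made for illustration, and the tuned penalty mentioned in the notes is not modeled.

```python
import re

def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def search(query, docs):
    """Return docs containing every query token; fall back to any-token matches.

    Hypothetical helper: duplicates (same token sequence) are dropped, and the
    OR fallback ranks results by how many query tokens they contain.
    """
    q = set(tokenize(query))
    seen = set()
    scored = []
    for doc in docs:
        key = " ".join(tokenize(doc))
        if key in seen:
            continue  # deduplicate on normalized text
        seen.add(key)
        overlap = len(q & set(tokenize(doc)))
        scored.append((overlap, doc))
    full = [d for n, d in scored if q and n == len(q)]
    if full:
        return full  # AND match: every query token is present
    partial = sorted(((n, d) for n, d in scored if n > 0), key=lambda p: -p[0])
    return [d for _, d in partial]  # OR fallback
```

An exact-substring filter would miss a document titled "Retry budget for flaky hooks" for the query "budget retry"; the token-level version matches it because word order no longer matters.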
Full Changelog: v2.32.0...v2.33.0
v2.32.0
Highlights
Knowledge activation and session intelligence are the headline features in this release. A new skill and CLI surfaces let the agent consume cross-domain knowledge at runtime — ranking, assembling, and explaining the context it injects into each session. The session intelligence engine adds trust policies and explainability so you can see exactly why certain knowledge was selected. The pre-push gate now runs 9 checks that previously required CI, giving faster local feedback before you push.
What's New
- Knowledge activation — new `/knowledge-activation` skill and `ao` CLI surfaces activate cross-domain knowledge at runtime with ranked intelligence context and operator surface consumption
- Session intelligence engine — complete runtime engine with explainability, trust policy enforcement, and ranked context assembly
- Runtime selection — `ao rpi serve` supports explicit runtime selection for Claude and Codex execution modes
- Faster local validation — 9 CI-only checks migrated to the pre-push gate for immediate feedback
All Changes
Added
- Knowledge activation skill with CLI surfaces and runtime operator consumption
- Session intelligence runtime engine with explainability and ranked context
- Runtime selection for `ao rpi serve`
- Quality signals hook with telemetry test coverage
- Nine checks shifted from CI-only to local pre-push gate
- Inject stability warnings, signal tests, and status dashboard improvements
Changed
- README rewritten with product-minded gain-framing and Strunk-style prose
- Philosophy doc and observations section added to README
- Repo front doors and codex artifact guidance aligned
- Retry budgets, stability flags, and orchestration patterns applied from Claude Code architecture lessons
- Homebrew formula updated to v2.31.0 with pre-built binaries
Fixed
- Post-mortem closure integrity file parsing normalized
- CI failures resolved across codex refs, test pairing, hook coverage, docs parity, and codex lifecycle
- Lookup now scans nested global knowledge directories
- Test stubs added for new pre-push checks
Dependencies
- codecov/codecov-action bumped from 5 to 6
- DavidAnson/markdownlint-cli2-action bumped from 22 to 23
Full changelog
Added
- Knowledge activation skill — new `/knowledge-activation` skill and CLI surfaces for activating cross-domain knowledge at runtime, with operator surface consumption and ranked intelligence context
- Session intelligence engine — complete runtime engine with explainability, ranked context assembly, and trust policy enforcement
- Runtime selection for `ao rpi serve` — serve now supports explicit runtime selection for Claude and Codex execution modes
- Quality signals hook — new `quality-signals.sh` hook with test coverage for session quality telemetry
- Pre-push gate expansion — 9 checks migrated from CI-only to the local pre-push gate for faster feedback
- Inject stability warnings and status dashboard — closed 3 harvest items with signal tests and dashboard improvements
Changed
- README refresh — product-minded rewrite with gain-framing and Strunk-style prose fixes
- Philosophy doc — new `docs/philosophy.md` and observations section added to README
- Documentation alignment — repo front doors and codex artifact guidance unified across entry points
- Claude Code architecture lessons — retry budgets, stability flags, quality signals, and orchestration patterns applied to skills
- Homebrew formula — updated to v2.31.0 with pre-built binaries
Fixed
- Post-mortem closure integrity — normalized file parsing for closure integrity audits
- CI reliability — resolved CI failures across codex refs, test pairing, hook coverage, worktree handling, docs parity, hook portability, and codex lifecycle
- Lookup nested scanning — `ao lookup` now scans nested global knowledge directories correctly
- Pre-push test stubs — added test stubs for new pre-push checks and skipped non-shell files in shellcheck
Dependencies
- Bumped `codecov/codecov-action` from 5 to 6
- Bumped `DavidAnson/markdownlint-cli2-action` from 22 to 23
Full Changelog: v2.31.0...v2.32.0
v2.31.0
Highlights
Nine new lifecycle skills let the agent handle bootstrapping, dependency audits, design reviews, performance analysis, refactoring, code review, scaffolding, and testing without manual invocation. A new ao harvest command pulls learnings from sibling workspaces so knowledge compounds across your entire multi-agent fleet, not just one repo. Context debugging is easier with ao context packet, and the hook system now formally supports both Claude Code and Codex runtimes.
What's New
- 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test are now part of the RPI workflow with automatic invocation and mechanical gates
- Cross-rig knowledge harvesting — `ao harvest` extracts and catalogs learnings from sibling crew workspaces so insights travel between agents
- Context packet inspector — `ao context packet` lets you debug what inter-session handoff state the agent actually sees
- Dual-runtime hook support — Hooks now have a formal runtime contract covering Claude Code, Codex, and manual execution modes
All Changes
Added
- Nine lifecycle skills wired into the RPI workflow with auto-invocation
- Cross-rig knowledge consolidation via `ao harvest`
- Context packet inspection via `ao context packet`
- Hook runtime contract with Claude/Codex/manual event mapping
- Research provenance tracking on pending learnings
- Context declarations for inject, provenance, and rpi skills
- Evidence-backed output templates for goals and product commands
Changed
- Documentation reframed around three-gap context lifecycle model
- Hook docs updated with runtime modes table for dual-runtime support
Fixed
- Four pre-existing CI failures resolved
- Lookup retrieval gaps that caused empty results
- Embedded file sync on first session start
- Closure integrity with 24h grace window for evidence timing
- Skill lint compliance across vibe, post-mortem, crank, and plan
- Codex tool naming rule and five Claude-era tool references
- ASCII diagram consistency across 23 documentation files
- Fork exhaustion in validation script replaced with lightweight parser
Full changelog
Added
- 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test skills wired into RPI with auto-invocation and mechanical gates
- `ao harvest` — cross-rig knowledge consolidation extracts and catalogs learnings from sibling crew workspaces
- `ao context packet` — inspect stigmergic context packets for debugging inter-session handoff state
- Hook runtime contract — formal Claude/Codex/manual event mapping with runtime-aware hook tooling
- Evidence-driven skill enrichment — production meta-knowledge, anti-patterns, flywheel metrics, and normalization defect detection baked into 9 skill reference files
- Research provenance — pending learnings now carry full research provenance for discoverability and citation tracking
- Context declarations — inject, provenance, and rpi skills declare their context requirements explicitly
- Goals and product output templates — `/goals` and `/product` produce evidence-backed structured output
Changed
- Three-gap context lifecycle contract — README, PRODUCT.md, positioning docs, and operational guides reframed around the context lifecycle model
- Dual-runtime hook documentation — runtime modes table and troubleshooting updated for Claude + Codex hook coexistence
Fixed
- CI reliability — resolved 4 pre-existing CI failures, restored headless runtime preflight, repaired codex parity drift checks
- `ao lookup` retrieval — fixed retrieval gaps that caused lookup to return no results
- Embedded sync — using-agentops SKILL.md and `.agents/.gitignore` now written correctly on first session start
- Closure integrity — 24h grace window for close-before-commit evidence, normalized file parsing
- Skill lint compliance — vibe, post-mortem, crank, and plan skills trimmed or restructured to stay under 800-line limit
- Codex tool naming — added CLAUDE_TOOL_NAMING rule and fixed 5 Claude-era tool references in codex skills
- ASCII diagram consistency — aligned box-drawing characters across 23 documentation files
- Fork exhaustion prevention — replaced jq with awk in validate-go-fast to prevent fork bombs on large repos
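The 24h grace window for close-before-commit evidence can be pictured with a small Python sketch. This is one possible reading of the note, not the shipped check: the function name, its parameters, and the rule that evidence arriving before the close is always accepted are assumptions.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(hours=24)  # window size taken from the release note

def evidence_within_grace(closed_at, evidence_at, grace=GRACE):
    """Accept evidence recorded before the close, or up to `grace` after it.

    Hypothetical helper: both arguments are timezone-aware datetimes.
    """
    return evidence_at <= closed_at + grace
```

Under this reading, an item closed at noon still passes the audit if its commit evidence lands by noon the next day.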
Full Changelog: v2.30.0...v2.31.0
v2.30.0
v2.30.0 — Codex hookless lifecycle, PROGRAM.md workflows, and stronger long-running RPI runs
Highlights
AgentOps now handles Codex hookless sessions more cleanly, gives autonomous workflows a clearer PROGRAM.md contract, and makes long-running RPI runs much easier to inspect. This release also hardens the local release and validation path itself, so the same gate stack you rely on for shipping is more trustworthy under headless and generated-artifact-heavy workflows.
What's New
- Hookless Codex lifecycle support — Codex sessions can now run through startup, follow-up, validation, and closeout without depending on legacy hook assumptions.
- `PROGRAM.md` for autonomous work — Autodev and evolve flows now share a concrete program contract instead of relying on looser ad hoc context.
- Artifact-aware long RPI runs — Mission control now shows run artifacts and evaluator output so you can inspect what happened during multi-phase autonomous runs.
- More reliable release validation — Headless runtime checks, reverse-engineer hygiene, and release-gate coverage are more deterministic.
All Changes
Added
- Hookless Codex lifecycle support across CLI commands and skill orchestration
- A first-class `PROGRAM.md` contract for autodev and evolve-driven workflows
- Artifact and evaluator visibility for long-running RPI sessions
Changed
- Codex bundle maintenance, lifecycle guidance, and release validation coverage around the expanded Codex execution path
Fixed
- Codex RPI scope and closeout issues that caused follow-up and validation drift
- Release-gate regressions in headless runtime validation and learning coherence
- Reverse-engineer repo scans so generated or temporary trees no longer contaminate detected CLI surfaces
Full changelog
Added
- Codex hookless lifecycle support — `ao codex` runtime commands, lifecycle fallback, and Codex skill orchestration now cover hookless sessions end to end
- PROGRAM.md autodev contract — Added a first-class `PROGRAM.md` contract for autodev flows and taught `/evolve` and related RPI paths to use it
- Long-running RPI artifact visibility — Mission control now exposes run artifacts and evaluator output so long-running RPI sessions are replayable and easier to inspect
Changed
- Codex runtime maintenance flow — Refreshed Codex bundle hashes, lifecycle guards, runtime docs, and release validation coverage around the expanded Codex execution path
Fixed
- Codex RPI scoping and closeout — Tightened objective scope, epic scope, closeout ownership, and validation gaps in the Codex RPI lifecycle
- Release gate reliability — Restored headless runtime coverage, runtime-aware Claude inventory checks, and release-gate coherence validation
- Reverse-engineer repo hygiene — Repo-mode reverse engineer now ignores generated and temp trees when identifying CLI and module surfaces
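As a rough picture of the reverse-engineer hygiene fix, the sketch below walks a repository while pruning generated and temporary directories. The ignore list and function name are hypothetical; the notes do not say which trees AgentOps actually skips.

```python
import os

# Hypothetical ignore list; the directories AgentOps skips are not stated in the notes.
IGNORED_DIRS = {"node_modules", "dist", "build", "tmp", ".git"}

def scan_sources(root, ignored=IGNORED_DIRS):
    """Yield file paths under root, never descending into ignored directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter the pruned directories.
        dirnames[:] = sorted(d for d in dirnames if d not in ignored)
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Pruning during the walk, rather than filtering results afterwards, also avoids the cost of traversing large generated trees.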
Full Changelog: v2.29.0...v2.30.0
v2.29.0
v2.29.0 — Config control, broader search, and stronger flywheel proof
Highlights
AgentOps now gives you more control over model spend, a broader default search path, and a tighter proof path for the knowledge flywheel. You can assign agent models by cost tier through config, ao search now pulls from both repo-local knowledge and upstream session history, and the flywheel claim is backed by deterministic proof fixtures instead of manual spot checks.
What's New
- Per-agent model routing — `ao config` now supports model cost tiers and direct config writes, so teams can tune quality and spend without manual file edits.
- Broader default search — `ao search` now brokers across upstream `cass` history and repo-local AgentOps artifacts instead of making you choose one surface up front.
- Stronger flywheel evidence — Close-loop validation now preserves research provenance and uses executable proof fixtures plus artifact-specific citation feedback.
- Richer review guidance — Council, research, swarm, vibe, athena, and post-mortem picked up new reference packs for reviewer routing, retrieval patterns, and write-time quality checks.
All Changes
Added
- Model cost tiers and direct config writes for per-agent routing
- Search brokerage across session history and repo-local knowledge
- New reference packs for reviewer routing, iterative retrieval, confidence scoring, conflict recovery, and write-time quality
Changed
- Comparison docs, command docs, and release smoke coverage around the expanded search and config surface
Fixed
- Flywheel proof, citation feedback, and closure reporting now agree on actual state
- Search stays aligned with forged session history and fallback behavior
- Pre-push and release validation is more deterministic under hook-launched git environments
- Council profile docs are synced between source and checked-in Codex artifacts
Full changelog
Added
- Model cost tiers and config writes — `ao config` can now assign per-agent models by cost tier and persist repo configuration changes directly
- Search brokerage over session history and repo knowledge — `ao search` now wraps upstream `cass` results with repo-local AgentOps artifacts by default
- Reviewer and post-mortem reference packs — Added model-routing, iterative-retrieval, confidence-scoring, write-time-quality, and conflict-recovery guidance across council, research, swarm, vibe, athena, and related skills
Changed
- Competitive comparison and CLI docs — Refreshed comparison docs, release smoke coverage, and command documentation around the expanded search/config surface
Fixed
- Flywheel proof and citation loop — Added deterministic proof fixtures, preserved exact research provenance, and made citation feedback artifact-specific so flywheel health reflects real closure state
- Search alignment with forged session history — Search now stays aligned with forged session artifacts and fallback behavior
- Hook-launched validation — Pre-push and release gates now isolate inherited git env/stdin correctly and cover newer hook scripts in integration tests
- Codex council profile parity — Source and checked-in Codex council docs are back in sync for the shared profile contract
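For readers unfamiliar with the hook-launched validation problem: git exports variables such as `GIT_DIR` and `GIT_INDEX_FILE` to hooks, and child processes inherit them along with the hook's stdin. The Python sketch below shows one way to isolate a child command. The helper names are hypothetical, and the AgentOps gates are shell scripts whose exact approach is not described in the notes.

```python
import os
import subprocess

def clean_git_env(env=None):
    """Return a copy of the environment without inherited GIT_* variables.

    Deliberately broad: it also drops variables like GIT_AUTHOR_NAME, so a
    narrower list may be preferable in practice.
    """
    source = os.environ if env is None else env
    return {k: v for k, v in source.items() if not k.startswith("GIT_")}

def run_isolated(cmd, cwd=None):
    """Run cmd with a scrubbed environment and stdin detached from the hook."""
    return subprocess.run(
        cmd,
        cwd=cwd,
        env=clean_git_env(),
        stdin=subprocess.DEVNULL,  # the child cannot consume the hook's stdin
        capture_output=True,
        text=True,
        check=False,
    )
```

Detaching stdin matters for pre-push hooks in particular, because git feeds the ref list to the hook on stdin and a child that reads it would swallow that input.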
Full Changelog: v2.28.0...v2.29.0
v2.28.0
v2.28.0 — Competitive Feature Integration
Five features adopted from reverse-engineering GSD v1.27 and Compound Engineer v2.47:
Highlights
- Smarter failure recovery — Crank now classifies failures and auto-recovers (retry, decompose, or escalate) instead of blindly retrying
- Knowledge stays clean — Athena defrag runs at every session end, pruning stale artifacts automatically
- Per-project review config — Drop a `.agents/reviewer-config.md` to control which council judges run
- Right-sized plans — Plans auto-scale detail level (minimal/standard/deep) based on complexity
- Red-team your ideas — Brainstorm now stress-tests every approach before you choose
All Changes
See CHANGELOG.md for the complete list.
Full changelog
Added
- Node repair operator — Crank now classifies task failures as RETRY (transient), DECOMPOSE (too complex), or PRUNE (blocked) with budget-controlled recovery
- Knowledge refresh auto-trigger — Lightweight athena defrag runs automatically at session end via new SessionEnd hook
- Configurable review agents — Project-level `.agents/reviewer-config.md` controls which judge perspectives council and vibe spawn
- Three-tier plan detail scaling — Plan auto-selects Minimal, Standard, or Deep templates based on issue count and complexity
- Adversarial ideation — Brainstorm Phase 3b stress-tests each approach with four red-team questions before user selection
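The node repair operator's three outcomes can be sketched as a small classifier. This is an illustration only: the failure kinds, the default budget of two retries, and the choice to prune once the budget is spent are assumptions, not documented Crank behavior.

```python
from enum import Enum

class Action(Enum):
    RETRY = "retry"          # transient failure
    DECOMPOSE = "decompose"  # task too complex
    PRUNE = "prune"          # blocked

# Hypothetical failure kinds; the real classifier's inputs are not described in the notes.
KIND_TO_ACTION = {
    "transient": Action.RETRY,
    "too_complex": Action.DECOMPOSE,
    "blocked": Action.PRUNE,
}

def next_action(kind, attempts, retry_budget=2):
    """Pick a recovery action and stop retrying once the budget is spent."""
    action = KIND_TO_ACTION.get(kind, Action.PRUNE)
    if action is Action.RETRY and attempts >= retry_budget:
        return Action.PRUNE  # assumption: an exhausted budget is treated as blocked
    return action
```

The budget check is what separates this from blind retrying: a transient failure is retried only while attempts remain.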
Fixed
- Crank SKILL.md line limit — Consolidated duplicate References sections to stay under 800-line skill lint limit
- Codex skill parity — Synced all five competitive features to skills-codex with reference file copies
Full Changelog: v2.27.1...v2.28.0
v2.27.1
v2.27.1 — Hotfix: Flywheel golden signals now visible by default
The flywheel status was telling you everything was fine while hiding the full picture behind an opt-in flag. ao flywheel status said "COMPOUNDING" but the golden signals analysis (hidden behind --golden) said "accumulating." Now golden signals always compute and display — no more misleading status.
What changed
- Golden signals always shown — `ao flywheel status` now includes the four golden signals (velocity trend, citation pipeline, research closure, reuse concentration) and the overall verdict in every output format (table, JSON, YAML).
- `--golden` flag deprecated — Kept for backward compatibility but now a no-op (hidden from help).
Full changelog
See CHANGELOG.md for complete details.
Full changelog
Fixed
- Flywheel golden signals always shown — Golden signals were gated behind the `--golden` flag, causing `ao flywheel status` to report "COMPOUNDING" while the hidden golden signals analysis showed "accumulating". Golden signals now compute and display by default.
Full Changelog: v2.27.0...v2.27.1
v2.27.0
Highlights
The knowledge flywheel now tells you whether it's actually working. Four golden signals answer the question every agent operator asks: is my knowledge compounding, or just collecting dust?
`ao flywheel status --golden`
What's New
Golden Signals for Flywheel Health
Four health indicators that go beyond escape velocity (σρ > δ):
| Signal | Question It Answers |
|---|---|
| Velocity Trend | Is σρ−δ increasing over time, or sliding back? |
| Citation Pipeline | Are citations actually delivering value, or just noise? |
| Research Closure | Is research being mined into learnings, or hoarded? |
| Reuse Concentration | Is the whole knowledge pool active, or just a few items? |
Each signal produces a verdict. Three or more healthy signals = compounding. Three or more critical = decaying. Mixed = accumulating — you know what to fix.
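The aggregation rule above is simple enough to sketch in a few lines of Python. The function names and the per-signal labels other than "healthy" and "critical" are assumptions; the actual `ao flywheel status` logic may differ.

```python
from collections import Counter

def above_escape_velocity(sigma, rho, delta):
    """Escape-velocity check as written in the release note: sigma * rho > delta."""
    return sigma * rho > delta

def flywheel_verdict(signals):
    """Combine per-signal verdicts into an overall flywheel status.

    `signals` maps a signal name to "healthy", "critical", or any other label
    (for example a hypothetical "warning").
    """
    counts = Counter(signals.values())
    if counts["healthy"] >= 3:
        return "compounding"
    if counts["critical"] >= 3:
        return "decaying"
    return "accumulating"  # mixed results
```

With four signals the two thresholds cannot both be met, so the order of the checks does not change the result.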
Forge-to-Pool Bridge
Forge now auto-writes pending learnings to .agents/knowledge/pending/ — closing the last manual gap in the flywheel loop. Knowledge flows from session → forge → pool → learnings → inject without intervention.
Session-Start Citation Priming
ao lookup runs at session start, surfacing relevant knowledge and creating the citation events that drive the feedback loop.
All Changes
Added
- Flywheel golden signals (`ao flywheel status --golden`)
- Forge-to-pool bridge for close-loop knowledge ingestion
- SessionStart citation priming via `ao lookup`
- Skill catalog quality improvements (descriptions, extraction, references)
Fixed
- `.agents/.gitignore` scope — replaced broad `!*/` with explicit subdirectory list
- Codex runtime skill parity hardening
- Codex install smoke test assertions
Changed
- CLI reference docs regenerated
Full Changelog: v2.26.1...v2.27.0
v2.26.1
v2.26.1 — DAG-ify orchestrator skills
Hotfix: /rpi was stopping after implementation (Phase 2) without running validation (Phase 3). The execution steps were spread across prose sections with ### headings that created natural LLM stopping points.
Highlights
- RPI now runs all three phases reliably. The execution sequence for `/rpi`, `/discovery`, and `/validation` is encoded as a compact DAG code block — no section breaks between steps, no natural stopping points for the LLM.
- -577 lines across 6 skill files (3 source + 3 codex variants). Less prose, more program.
What's New
Fixed
- `/rpi` stops after Phase 2 — restructured as compact DAG
- `/discovery` and `/validation` restructured to match
- Test patterns updated for new heading format
Changed
- GOALS.md rebuilt from first principles
- README leads with moats, progressive disclosure
- CLI reference docs regenerated
- Doctor + findings helper test coverage added
Full changelog
See CHANGELOG.md for the complete v2.26.1 entry.
Full changelog
Fixed
- RPI stops after Phase 2 — Restructured rpi, discovery, and validation orchestrator skills as compact DAGs with execution sequence in a single code block; eliminates LLM stopping between phases due to `###` section headings acting as natural breakpoints
- Test grep patterns for DAG headings — Updated `test-tuning-defaults.sh` to match new complexity-scaled gate headings after DAG restructure
Changed
- Goals reimagined — GOALS.md rebuilt from first principles with fitness gate fixes
- README progressive disclosure — Lead with moats, collapse detail into expandable sections
- CLI reference docs — Regenerated with updated date stamps
- Doctor + findings helpers — Added CLI test coverage for extracted helpers
Full Changelog: v2.26.0...v2.26.1
v2.26.0
v2.26.0 Release Notes
Highlights
- Test pyramid expanded to BF1–BF9 — Four new bug-finding levels cover regression replay, performance benchmarks, backward compatibility, and security-in-test patterns
- Language-specific test patterns — Go and Python standards now include concrete examples for every new BF level
- Codex audit: 60+ fixes — Orphaned references removed, lint warnings resolved, manifest hashes regenerated across all 54 Codex skills
What's New
- BF6 (Regression): Bug-specific replay tests with ID-based naming (`TestBug_AG_XYZ_...` / `test_bug_ag_xyz_...`)
- BF7 (Performance): Benchmark patterns using Go `testing.B` and Python `pytest-benchmark`
- BF8 (Backward Compatibility): Fixture corpus approach with `testdata/compat/` (Go) and `tests/fixtures/compat/` (Python)
- BF9 (Security): In-test secrets redaction and path traversal rejection patterns
- Decision tree extended with 4 new routing questions
- RPI phase mapping updated: bug fix mandates BF6, hot-path mandates BF7, format changes mandate BF8, secrets mandate BF9
- `regen-codex-hashes.sh` script for Codex manifest maintenance
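As an illustration of the BF9 level, the Python sketch below pairs a path traversal check and a secrets redactor with tests in the pytest style. The helpers `safe_join` and `redact`, the example paths, and the secret-like key names are hypothetical; the patterns in the AgentOps standards may look different.

```python
import os
import re

def safe_join(base, user_path):
    """Join user_path under base, rejecting anything that escapes base."""
    base_real = os.path.realpath(base)
    target = os.path.realpath(os.path.join(base_real, user_path))
    if os.path.commonpath([base_real, target]) != base_real:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return target

def redact(text):
    """Mask the value of key=value pairs whose key looks secret-like."""
    return re.sub(r"(?i)\b(token|password|secret)=\S+", r"\1=***", text)

def test_rejects_path_traversal():
    try:
        safe_join("/srv/data", "../etc/passwd")
    except ValueError:
        return  # expected: the traversal attempt is refused
    raise AssertionError("traversal was not rejected")

def test_redacts_secrets():
    assert redact("token=abc123 user=bo") == "token=*** user=bo"
```

Both tests assert on the failure path first, which is the point of a security-in-test level: the suite proves the unsafe input is refused rather than only checking the happy path.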
All Changes
Full changelog
Added
- BF6–BF9 test pyramid levels with language-specific Go and Python patterns
- Test pyramid decision tree expansion (4 new routing questions)
- RPI phase mapping for BF6–BF9
- `regen-codex-hashes.sh` manifest hash regeneration script
Changed
- Go standards: benchmark, backward compat, regression, security test patterns
- Python standards: Hypothesis, pytest-benchmark, compat fixtures, regression, security patterns
- Coverage assessment template extended from BF1–BF5 to BF1–BF9
Fixed
- Codex skill audit: 60+ findings across 54 skills
- Skill lint warnings in crank, rpi, recover
- README skill references and orphaned templates
- Skill linter refs in reverse-engineer-rpi
Full Changelog: See CHANGELOG.md
Full changelog
Added
- BF6–BF9 test pyramid levels — Regression (bug-specific replay), Performance/Benchmark, Backward Compatibility, and Security (in-test) bug-finding levels with language-specific patterns for Go and Python
- Test pyramid decision tree expansion — 4 new routing questions for BF6–BF9 in the "When to Use" guide
- RPI phase mapping for BF6–BF9 — Bug fix → BF6 mandatory, hot-path → BF7 benchmark, format change → BF8 compat fixture, secrets → BF9 redaction tests
- `regen-codex-hashes.sh` — Manifest hash regeneration script for Codex skill maintenance
Changed
- Go standards — Added benchmark tests (BF7), backward compat with `testdata/compat/` (BF8), regression test naming convention (BF6), security tests for path traversal (BF9)
- Python standards — Added Hypothesis property-based testing (BF1), `pytest-benchmark` patterns (BF7), backward compat with parametrized fixtures (BF8), regression test naming (BF6), secrets redaction tests (BF9)
- Coverage assessment template — Extended BF pyramid table from BF1–BF5 to BF1–BF9
Fixed
- Codex skill audit — 60+ findings fixed across all 54 Codex skills; removed orphaned `claude-code-latest-features.md` and `claude-cli-verified-commands.md` references
- Skill lint warnings — Resolved all warnings in crank, rpi, recover skills
- README skill references — Corrected broken references and linked orphaned templates
- Skill linter refs — Fixed directory reference and backtick formatting in reverse-engineer-rpi
Full Changelog: v2.25.1...v2.26.0