feat(skills): SAGE RL reward signal, trust governance, SkillsBench constraints by bug-ops · Pull Request #2348 · bug-ops/zeph

bug-ops · 2026-03-28T11:27:40Z

Summary

Implements three research-driven improvements to zeph-skills (#2232, #2233, #2261).

SAGE RL reward signal (#2232, "research: SAGE skill library RL - principled reward signal for skill evolution", arXiv:2512.17102): cross-session rollout tracking guards auto-promote/demote decisions using COUNT(DISTINCT conversation_id) on skill_outcomes. New config fields min_sessions_before_promote (default 2) and min_sessions_before_demote (default 1) allow asymmetric tuning. Migration 048 adds a composite index to prevent table scans.
Skill trust governance (#2233, "research: agent skill trust governance - 4-tier model, 26% community skills have vulnerabilities", arXiv:2602.12430): ScannerConfig, nested under TrustConfig, adds the injection_patterns and capability_escalation_check flags. check_capability_escalation() validates a skill's declared allowed_tools against the QUARANTINE_DENIED list for the Quarantined and Blocked trust levels. Provenance fields (x-source-url, x-git-hash) are added to SkillMeta. Migration 047 stores git_hash in skill_trust.
SkillsBench quality gates (#2261, "research(skills): SkillsBench - curated skills +16.2pp pass rate, self-generated skills provide zero benefit", arXiv:2602.12670): max_auto_sections caps auto-generated skill bodies at 3 H2 sections via validate_body_sections(). domain_success_gate triggers an LLM domain-relevance check before activating improved skills. A section limit is added to IMPROVEMENT_PROMPT_TEMPLATE.
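To make the section cap concrete, here is a minimal sketch of what a check like validate_body_sections() could look like. The signature, the error type, and the handling of fenced code are assumptions made for illustration; the PR text only states that auto-generated bodies are capped at 3 H2 sections.

```rust
/// Hypothetical stand-in for `validate_body_sections()`.
/// The real signature and error type are not shown in this PR text.
/// Returns the number of H2 sections, or an error when the cap is exceeded.
fn validate_body_sections(body: &str, max_sections: usize) -> Result<usize, String> {
    let mut in_fence = false;
    let mut count = 0;
    for line in body.lines() {
        let trimmed = line.trim_start();
        // Assumption: headings inside fenced code blocks are not counted.
        if trimmed.starts_with("```") {
            in_fence = !in_fence;
            continue;
        }
        // An H2 heading is exactly two hashes followed by a space,
        // so "### Detail" (H3) does not match.
        if !in_fence && trimmed.starts_with("## ") {
            count += 1;
        }
    }
    if count > max_sections {
        Err(format!("body has {count} H2 sections, limit is {max_sections}"))
    } else {
        Ok(count)
    }
}
```

With max_sections set to 3 (the stated default of max_auto_sections), a body with four "## " headings would be rejected under this sketch, while H3 subsections do not count toward the limit.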

Test plan

6442 unit tests pass (cargo nextest run --workspace --lib --bins)
cargo +nightly fmt --check passes
Clippy clean on changed crates
26 new tests covering all new code paths (session count queries, escalation checks, section validation, provenance parsing, serde roundtrips)
Migrations 047 + 048 apply cleanly to existing DB schema
All new config fields have serde defaults and appear in --init wizard and default.toml
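The cross-session guard and its defaults can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the PR's code: RolloutConfig, Transition, and transition_allowed are hypothetical names, a plain Default impl stands in for the serde defaults, and conversation_id is assumed to be an integer. Only the two field names, their defaults (2 and 1), and the COUNT(DISTINCT conversation_id) idea come from the PR text.

```rust
use std::collections::HashSet;

/// Hypothetical config struct; field names and default values are from the PR text.
#[derive(Debug, Clone, PartialEq)]
struct RolloutConfig {
    min_sessions_before_promote: u32,
    min_sessions_before_demote: u32,
}

impl Default for RolloutConfig {
    // Stand-in for the serde defaults mentioned in the test plan.
    fn default() -> Self {
        Self {
            min_sessions_before_promote: 2,
            min_sessions_before_demote: 1,
        }
    }
}

/// Hypothetical enum naming the two guarded decisions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Transition {
    Promote,
    Demote,
}

/// In-memory equivalent of COUNT(DISTINCT conversation_id) for one skill.
/// Each outcome is a (skill_name, conversation_id) pair; the id type is an assumption.
fn distinct_session_count(outcomes: &[(&str, i64)], skill: &str) -> u32 {
    let sessions: HashSet<i64> = outcomes
        .iter()
        .filter(|(name, _)| *name == skill)
        .map(|(_, id)| *id)
        .collect();
    sessions.len() as u32
}

/// A transition is allowed only once enough distinct sessions have been observed.
fn transition_allowed(cfg: &RolloutConfig, sessions: u32, transition: Transition) -> bool {
    let required = match transition {
        Transition::Promote => cfg.min_sessions_before_promote,
        Transition::Demote => cfg.min_sessions_before_demote,
    };
    sessions >= required
}
```

Under these defaults a skill that succeeded many times within a single conversation would not be promoted, while a single failing session is already enough to permit demotion, which is the asymmetry the two separate thresholds allow.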

…nstraints (#2232, #2233, #2261)

Issue #2232: cross-session rollout tracking for skill promotion/demotion:
- Add `cross_session_rollout` and `min_sessions_before_promote` config fields
- Add separate `min_sessions_before_demote` (default 1) to prevent symmetric misuse
- Add `distinct_session_count` SQL query on skill_outcomes using existing conversation_id
- Migration 048: composite index on skill_outcomes(skill_name, conversation_id)
- Guard both promotion and demotion in check_trust_transition when enabled

Issue #2233: skill trust governance and security scanning:
- Add `ScannerConfig` nested under TrustConfig with `injection_patterns` and `capability_escalation_check` fields
- Add `check_capability_escalation` in scanner.rs: validates allowed_tools against QUARANTINE_DENIED list for Quarantined/Blocked trust levels
- Add `EscalationResult` type and `check_escalations` registry method
- Wire escalation check into bootstrap when capability_escalation_check is enabled
- Add provenance fields `source_url` and `git_hash` to SkillMeta (x-source-url, x-git-hash frontmatter keys)
- Migration 047: git_hash column in skill_trust table

Issue #2261: SkillsBench section cap and domain evaluation gate:
- Add `max_auto_sections` config field (default 3): caps auto-generated skill bodies at 3 H2 sections via validate_body_sections()
- Add `domain_success_gate` config field (default false): LLM-based domain relevance check before activating auto-generated skill versions
- Add DOMAIN_GATE_PROMPT_TEMPLATE, DomainGateResult type, build_domain_gate_prompt()
- Add section limit instruction to IMPROVEMENT_PROMPT_TEMPLATE

All new config fields wired into --init wizard and auto-migrated via default.toml.
26 new unit tests across zeph-config, zeph-skills, zeph-memory, zeph-core.

Closes #2232, #2233, #2261
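The capability escalation check described for #2233 can be illustrated with a small sketch. Everything beyond the names check_capability_escalation, QUARANTINE_DENIED, Quarantined, and Blocked is an assumption: the Trusted variant, the contents of the denied list, the function signature, and the return type (the PR mentions an EscalationResult type whose shape is not shown) are placeholders.

```rust
/// Hypothetical subset of trust levels; only Quarantined and Blocked are named in the PR text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrustLevel {
    Trusted,
    Quarantined,
    Blocked,
}

/// Placeholder entries; the actual contents of QUARANTINE_DENIED are not shown in the PR text.
const QUARANTINE_DENIED: &[&str] = &["shell", "write_file"];

/// Returns the declared tools that a low-trust skill should not be allowed to request.
/// An empty result means no escalation was detected.
fn check_capability_escalation(level: TrustLevel, allowed_tools: &[&str]) -> Vec<String> {
    match level {
        // Restricted levels: flag every declared tool that is on the denied list.
        TrustLevel::Quarantined | TrustLevel::Blocked => allowed_tools
            .iter()
            .filter(|tool| QUARANTINE_DENIED.contains(*tool))
            .map(|tool| tool.to_string())
            .collect(),
        // Assumption: higher trust levels are not checked against the list.
        TrustLevel::Trusted => Vec::new(),
    }
}
```

Returning the offending tool names, rather than a bare boolean, lets the caller report exactly which declarations triggered the finding.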

bug-ops enabled auto-merge (squash) March 28, 2026 11:35

fix(init): allow too_many_lines in step_security after wizard additions

a372ab0

github-actions bot added the enhancement New feature or request label Mar 28, 2026

Merge commit 'ff649f17b4' into sage-skill-library-rl

7dc0b81

bug-ops merged commit 766d525 into main Mar 28, 2026
25 checks passed

bug-ops deleted the sage-skill-library-rl branch March 28, 2026 11:51

bug-ops mentioned this pull request Mar 28, 2026

feat(memory): All-Mem consolidation, MAGMA edge weights, RuntimeLayer hooks #2358

Merged

5 tasks

bug-ops merged 3 commits into main from sage-skill-library-rl
