feat(skills): SAGE RL reward signal, trust governance, SkillsBench constraints#2348
Merged
…nstraints (#2232, #2233, #2261)

Issue #2232 - cross-session rollout tracking for skill promotion/demotion:
- Add `cross_session_rollout` and `min_sessions_before_promote` config fields
- Add separate `min_sessions_before_demote` (default 1) to prevent symmetric misuse
- Add `distinct_session_count` SQL query on skill_outcomes using existing conversation_id
- Migration 048: composite index on skill_outcomes(skill_name, conversation_id)
- Guard both promotion and demotion in check_trust_transition when enabled

Issue #2233 - skill trust governance and security scanning:
- Add `ScannerConfig` nested under TrustConfig with `injection_patterns` and `capability_escalation_check` fields
- Add `check_capability_escalation` in scanner.rs: validates allowed_tools against QUARANTINE_DENIED list for Quarantined/Blocked trust levels
- Add `EscalationResult` type and `check_escalations` registry method
- Wire escalation check into bootstrap when capability_escalation_check is enabled
- Add provenance fields `source_url` and `git_hash` to SkillMeta (x-source-url, x-git-hash frontmatter keys)
- Migration 047: git_hash column in skill_trust table

Issue #2261 - SkillsBench section cap and domain evaluation gate:
- Add `max_auto_sections` config field (default 3): caps auto-generated skill bodies at 3 H2 sections via validate_body_sections()
- Add `domain_success_gate` config field (default false): LLM-based domain relevance check before activating auto-generated skill versions
- Add DOMAIN_GATE_PROMPT_TEMPLATE, DomainGateResult type, build_domain_gate_prompt()
- Add section limit instruction to IMPROVEMENT_PROMPT_TEMPLATE

All new config fields wired into --init wizard and auto-migrated via default.toml.
26 new unit tests across zeph-config, zeph-skills, zeph-memory, zeph-core.

Closes #2232, #2233, #2261
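For reviewers who want a concrete picture of the #2232 guard, here is a minimal sketch of how a distinct-session count could gate promotion and demotion with asymmetric thresholds. It is not the code from this PR: `RolloutConfig`, `Transition`, `transition_allowed`, and the in-memory `distinct_session_count` (a stand-in for the `COUNT(DISTINCT conversation_id)` query) are illustrative assumptions, and only the config field names come from the commit message.

```rust
use std::collections::HashSet;

/// Direction of a trust change (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transition {
    Promote,
    Demote,
}

/// Hypothetical mirror of the config fields named in the commit message.
struct RolloutConfig {
    cross_session_rollout: bool,
    min_sessions_before_promote: u32,
    min_sessions_before_demote: u32,
}

/// In-memory stand-in for `COUNT(DISTINCT conversation_id)` on skill_outcomes.
/// Each outcome is a (skill_name, conversation_id) pair.
fn distinct_session_count(outcomes: &[(&str, i64)], skill_name: &str) -> u32 {
    let sessions: HashSet<i64> = outcomes
        .iter()
        .filter(|(name, _)| *name == skill_name)
        .map(|(_, conversation_id)| *conversation_id)
        .collect();
    sessions.len() as u32
}

/// Returns true when the transition may proceed. With the feature disabled,
/// the session count is ignored, so the guard never blocks a transition.
fn transition_allowed(cfg: &RolloutConfig, transition: Transition, sessions: u32) -> bool {
    if !cfg.cross_session_rollout {
        return true;
    }
    let required = match transition {
        Transition::Promote => cfg.min_sessions_before_promote,
        Transition::Demote => cfg.min_sessions_before_demote,
    };
    sessions >= required
}
```

With thresholds of 2 for promotion and 1 for demotion, a skill that has only been observed in a single conversation can still be demoted but cannot yet be promoted, which is the asymmetry the separate demote threshold is meant to allow.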
This was linked to issues on Mar 28, 2026.
Summary
Implements three research-driven improvements to `zeph-skills` (#2232, #2233, #2261).

- Cross-session rollout tracking (#2232): `COUNT(DISTINCT conversation_id)` on `skill_outcomes`; new `min_sessions_before_promote` (default 2) and `min_sessions_before_demote` (default 1) config fields allow asymmetric tuning; migration 048 adds a composite index to prevent table scans
- Trust governance and security scanning (#2233): `ScannerConfig` with `injection_patterns` and `capability_escalation_check` flags nested under `TrustConfig`; `check_capability_escalation()` validates declared `allowed_tools` against `QUARANTINE_DENIED` for the `Quarantined` and `Blocked` trust levels; provenance fields (`x-source-url`, `x-git-hash`) added to `SkillMeta`; migration 047 stores `git_hash` in `skill_trust`
- SkillsBench section cap and domain gate (#2261): `max_auto_sections` caps auto-generated skill bodies at 3 H2 sections via `validate_body_sections()`; `domain_success_gate` triggers an LLM domain-relevance check before activating improved skills; section limit added to `IMPROVEMENT_PROMPT_TEMPLATE`

Test plan

- … (`cargo nextest run --workspace --lib --bins`)
- `cargo +nightly fmt --check` passes
- … `--init` wizard and `default.toml`

Closes #2232, #2233, #2261
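Two of the checks summarized above, the capability escalation check and the H2 section cap, can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: the function signatures, the `Trusted` variant, the helper `count_h2_sections`, and the contents of `QUARANTINE_DENIED` are all hypothetical, since the PR text names these items but does not show them.

```rust
/// Trust levels. `Quarantined` and `Blocked` are named in the PR;
/// `Trusted` is a hypothetical placeholder for any less restricted level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrustLevel {
    Trusted,
    Quarantined,
    Blocked,
}

/// Placeholder deny list. The real contents of QUARANTINE_DENIED are not
/// shown in the PR, so these tool names are assumptions.
const QUARANTINE_DENIED: &[&str] = &["shell", "write_file"];

/// Returns the declared tools that a restricted skill should not have.
/// Signature is assumed; the PR only names the function.
fn check_capability_escalation<'a>(level: TrustLevel, allowed_tools: &[&'a str]) -> Vec<&'a str> {
    match level {
        TrustLevel::Quarantined | TrustLevel::Blocked => allowed_tools
            .iter()
            .copied()
            .filter(|tool| QUARANTINE_DENIED.contains(tool))
            .collect(),
        // Unrestricted levels are not checked against the deny list.
        _ => Vec::new(),
    }
}

/// Counts Markdown H2 headings: lines that start with "## ".
/// Simplification: headings inside fenced code blocks are counted too.
fn count_h2_sections(body: &str) -> usize {
    body.lines().filter(|line| line.starts_with("## ")).count()
}

/// Rejects a skill body with more H2 sections than `max_auto_sections`.
/// Signature is assumed; the PR only names the function.
fn validate_body_sections(body: &str, max_auto_sections: usize) -> Result<(), String> {
    let found = count_h2_sections(body);
    if found > max_auto_sections {
        Err(format!(
            "body has {found} H2 sections, limit is {max_auto_sections}"
        ))
    } else {
        Ok(())
    }
}
```

Under these assumptions, a quarantined skill that declares `shell` in `allowed_tools` is reported as an escalation, while the same declaration on an unrestricted skill is not, and a generated body with four `## ` headings fails validation at the default cap of 3.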