-
Notifications
You must be signed in to change notification settings - Fork 2
research: agent skill trust governance — 4-tier model, 26% community skills have vulnerabilities (arXiv:2602.12430) #2233
Description
Source
arXiv:2602.12430 — "Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward" (Feb 2026 survey)
Summary
Comprehensive survey of SKILL.md-style skill packaging, MCP integration, RL-based skill acquisition, and a four-tier trust governance model (Trusted/Sandboxed/Untrusted/Quarantined). Critically finds 26.1% of community skills contain vulnerabilities — prompt injection, capability escalation, or data exfiltration vectors.
Applicability to Zeph
HIGH — zeph-skills registry and [skills.trust] config (directly Zeph's architecture).
Zeph already has [skills.trust] with default_level, local_level, hash_mismatch_level. The paper's 4-tier model maps to Zeph's existing enum. Key actionable findings:
- 26.1% vulnerability rate in community skills → Zeph's
quarantineddefault level for external skills is validated as necessary - Static analysis for skill source provenance → extend
scan_on_loadwith lightweight injection pattern detection - Capability escalation detection → validate that loaded skill YAML doesn't request elevated tool permissions not in its declared scope
Implementation Direction
- Extend
[skills.trust]scan with injection pattern checks on skill body (not just hash) - Add provenance tracking: source URL/git hash in skill metadata
- Consider adding a
[skills.trust.scanner]config section for pattern-based skill analysis
Priority: P2
Discovered: CI-211 research scan (2026-03-27)