research: agent skill trust governance — 4-tier model, 26% community skills have vulnerabilities (arXiv:2602.12430) #2233

@bug-ops

Source

arXiv:2602.12430 — "Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward" (Feb 2026 survey)

Summary

Comprehensive survey of SKILL.md-style skill packaging, MCP integration, RL-based skill acquisition, and a four-tier trust governance model (Trusted/Sandboxed/Untrusted/Quarantined). Critically finds 26.1% of community skills contain vulnerabilities — prompt injection, capability escalation, or data exfiltration vectors.

Applicability to Zeph

HIGH: zeph-skills registry and [skills.trust] config (directly Zeph's architecture).

Zeph already has [skills.trust] with default_level, local_level, hash_mismatch_level. The paper's 4-tier model maps to Zeph's existing enum. Key actionable findings:

  • 26.1% vulnerability rate in community skills → supports keeping quarantined as Zeph's default level for external skills
  • Static analysis for skill source provenance → extend scan_on_load with lightweight injection pattern detection
  • Capability escalation detection → validate that loaded skill YAML doesn't request elevated tool permissions not in its declared scope
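
The first and third findings can be sketched in Rust as follows. This is a minimal illustration, not Zeph's actual code: the `TrustLevel` variants reuse the tier names from the paper, their ordering (later means more restricted) is an assumption, and `TrustConfig`, `SkillOrigin`, `initial_level`, and `escalated_tools` are hypothetical names. Only the three config keys `default_level`, `local_level`, and `hash_mismatch_level` come from the existing [skills.trust] section.

```rust
use std::collections::BTreeSet;

/// The four tiers named in the survey. The ordering is an assumption:
/// variants declared later are treated as more restricted.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel {
    Trusted,
    Sandboxed,
    Untrusted,
    Quarantined,
}

/// Hypothetical mirror of the three `[skills.trust]` keys named in this issue.
#[derive(Debug, Clone, Copy)]
pub struct TrustConfig {
    pub default_level: TrustLevel,
    pub local_level: TrustLevel,
    pub hash_mismatch_level: TrustLevel,
}

/// Where a skill came from (hypothetical; a real loader may distinguish more cases).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillOrigin {
    Local,
    External,
}

/// Pick the initial trust level for a freshly loaded skill.
/// A hash mismatch can only make the result more restrictive, never less.
pub fn initial_level(cfg: &TrustConfig, origin: SkillOrigin, hash_ok: bool) -> TrustLevel {
    let base = match origin {
        SkillOrigin::Local => cfg.local_level,
        SkillOrigin::External => cfg.default_level,
    };
    if hash_ok {
        base
    } else {
        base.max(cfg.hash_mismatch_level)
    }
}

/// Return the tools a skill requests that fall outside its declared scope,
/// sorted and de-duplicated. An empty result means no escalation was detected.
pub fn escalated_tools(declared_scope: &[&str], requested: &[&str]) -> Vec<String> {
    let scope: BTreeSet<&str> = declared_scope.iter().copied().collect();
    let extra: BTreeSet<&str> = requested
        .iter()
        .copied()
        .filter(|tool| !scope.contains(*tool))
        .collect();
    extra.into_iter().map(String::from).collect()
}
```

With `default_level` set to `Quarantined`, every external skill starts in the most restricted tier regardless of its hash status. A loader could refuse or downgrade any skill for which `escalated_tools` returns a non-empty list.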

Implementation Direction

  • Extend [skills.trust] scan with injection pattern checks on skill body (not just hash)
  • Add provenance tracking: source URL/git hash in skill metadata
  • Consider adding a [skills.trust.scanner] config section for pattern-based skill analysis
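
A rough sketch of how these three items could fit together, under stated assumptions: `Provenance`, `ScannerConfig`, `Finding`, and `scan_body` are hypothetical names, the fields of `ScannerConfig` are a guess at what a [skills.trust.scanner] section might hold, and the default patterns are illustrative placeholders rather than a list taken from the paper. Matching is plain case-insensitive substring search, which is deliberately lightweight and will miss obfuscated injections.

```rust
/// Hypothetical provenance record stored alongside skill metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Provenance {
    pub source_url: String,
    pub git_hash: Option<String>,
}

impl Provenance {
    /// A skill is "pinned" when it records the exact git hash it was fetched at.
    pub fn is_pinned(&self) -> bool {
        self.git_hash.is_some()
    }
}

/// Hypothetical shape of a `[skills.trust.scanner]` config section.
#[derive(Debug, Clone)]
pub struct ScannerConfig {
    pub enabled: bool,
    pub patterns: Vec<String>,
}

impl Default for ScannerConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            // Illustrative patterns only; a real list would need curation.
            patterns: vec![
                "ignore previous instructions".to_string(),
                "disregard the system prompt".to_string(),
            ],
        }
    }
}

/// One pattern match in a skill body.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    /// 1-based line number within the skill body.
    pub line: usize,
    /// The configured pattern that matched.
    pub pattern: String,
}

/// Scan a skill body line by line for configured patterns (case-insensitive).
/// Returns no findings when the scanner is disabled.
pub fn scan_body(cfg: &ScannerConfig, body: &str) -> Vec<Finding> {
    if !cfg.enabled {
        return Vec::new();
    }
    let needles: Vec<String> = cfg.patterns.iter().map(|p| p.to_lowercase()).collect();
    let mut findings = Vec::new();
    for (idx, line) in body.lines().enumerate() {
        let haystack = line.to_lowercase();
        for (needle, original) in needles.iter().zip(&cfg.patterns) {
            if haystack.contains(needle.as_str()) {
                findings.push(Finding {
                    line: idx + 1,
                    pattern: original.clone(),
                });
            }
        }
    }
    findings
}
```

A loader could run `scan_body` in the same pass as the existing hash check and treat any finding, or an unpinned `Provenance`, as a reason to keep the skill at a restricted trust level.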

Priority: P2
Discovered: CI-211 research scan (2026-03-27)

Labels

P2 (High value, medium complexity), research (Research-driven improvement), skills (zeph-skills crate)
