research(security): attack/defense landscape for agentic AI — taxonomy for #2417 and #2420 (arXiv:2603.11088) #2426
Description
Source
arXiv:2603.11088 — The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey (2026-03-11)
Technique
First systematic survey of agent attack vectors and defenses across the full agent stack:
Attack taxonomy:
- Prompt injection (direct + indirect)
- Tool hijacking (malicious MCP server responses)
- Privilege escalation (agent acquiring unintended capabilities)
- Memory poisoning (corrupting stored context)
- Multi-agent cascade attacks (compromising one agent to propagate to others)
Defense taxonomy:
- Least-privilege sandboxing
- Containerized tool execution
- Input/output sanitization layers
- Per-tool access tokens (ephemeral credentials)
- Cross-agent message signing
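To make the cross-agent message signing item concrete, here is a minimal sketch assuming a shared-secret HMAC-SHA256 scheme between two agents. The survey entry above does not specify a scheme, so the helper names (`sign_message`, `verify_message`), the envelope layout, and the choice of a symmetric key are all illustrative assumptions. An asymmetric signature would work equally well as the primitive.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_message(key: bytes, sender: str, payload: dict) -> dict:
    """Serialize the message canonically and attach an HMAC-SHA256 tag.

    Hypothetical envelope format: {"body": <json string>, "tag": <hex digest>}.
    """
    body = json.dumps(
        {"sender": sender, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_message(key: bytes, envelope: dict) -> Optional[dict]:
    """Return the decoded message if the tag matches, otherwise None."""
    expected = hmac.new(
        key, envelope["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, envelope["tag"]):
        return None
    return json.loads(envelope["body"])
```

A receiving agent would call `verify_message` before acting on any peer message and drop anything that returns `None`, which limits how far a single compromised or spoofed agent can propagate instructions.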
Applicability to Zeph
Grounds #2417 (formal security model): The paper's structured taxonomy directly maps onto the Task/Action/Source/Data alignment framework. Use this as the reference survey when implementing the formal model.
Grounds #2420 (MCP tool trust metadata): The per-tool access token pattern is exactly what #2420 proposes — this paper validates the approach with empirical evidence across multiple agent systems.
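As an illustration of the per-tool access token pattern, the sketch below issues short-lived tokens that are each bound to a single tool name. It is not taken from the paper or from #2420. `ToolToken`, `TokenIssuer`, the 60-second default TTL, and the in-memory store are hypothetical choices made for the example.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ToolToken:
    tool: str          # the only tool this token may be used for
    value: str         # opaque random credential
    expires_at: float  # absolute expiry time, seconds since epoch


class TokenIssuer:
    """Hypothetical in-memory issuer of ephemeral, tool-scoped tokens."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self._ttl = ttl_seconds
        self._live: Dict[str, ToolToken] = {}

    def issue(self, tool: str, now: Optional[float] = None) -> ToolToken:
        now = time.time() if now is None else now
        token = ToolToken(tool, secrets.token_urlsafe(32), now + self._ttl)
        self._live[token.value] = token
        return token

    def authorize(self, value: str, tool: str, now: Optional[float] = None) -> bool:
        """Allow the call only if the token exists, is unexpired, and matches the tool."""
        now = time.time() if now is None else now
        token = self._live.get(value)
        if token is None:
            return False
        if now >= token.expires_at:
            del self._live[value]  # expired tokens are discarded on first use
            return False
        return token.tool == tool
```

Because each token names exactly one tool and expires quickly, a token leaked through a hijacked tool response cannot be replayed against a different tool or reused after its TTL.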
ContentSanitizer / ExfiltrationGuard review: Current Zeph defenses should be audited against the memory poisoning and tool hijacking attack classes. The survey may reveal gaps not yet covered by ContentIsolation or ExfiltrationGuard.
Confused-deputy analysis: The survey explicitly covers multi-agent privilege escalation — relevant when Zeph acts as both ACP server and MCP client simultaneously (the confused-deputy scenario from arXiv:2603.12230).
Implementation sketch
Use the taxonomy as a checklist in the #2417 implementation:
- Map each attack class to an existing or missing Zeph defense
- Identify uncovered classes → new [security] config options or guardrail rules
- File sub-issues for any critical uncovered classes
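The checklist steps above can be sketched as a small coverage table. The attack class names come from the taxonomy in this issue. The `coverage` mapping is a placeholder that the audit would fill in; the entries shown are not a statement of what Zeph currently covers, and `uncovered` is a hypothetical helper.

```python
from typing import Dict, List

# Attack classes from the survey taxonomy listed in this issue.
ATTACK_CLASSES: List[str] = [
    "prompt_injection",
    "tool_hijacking",
    "privilege_escalation",
    "memory_poisoning",
    "multi_agent_cascade",
]

# PLACEHOLDER data: which defense is believed to address each class.
# Replace these entries with the results of the actual audit.
coverage: Dict[str, List[str]] = {
    "prompt_injection": ["ContentSanitizer"],
    "tool_hijacking": [],
    "privilege_escalation": [],
    "memory_poisoning": ["ContentIsolation"],
    "multi_agent_cascade": [],
}


def uncovered(classes: List[str], coverage: Dict[str, List[str]]) -> List[str]:
    """Return attack classes with no mapped defense, in taxonomy order."""
    return [c for c in classes if not coverage.get(c)]


if __name__ == "__main__":
    for attack_class in uncovered(ATTACK_CLASSES, coverage):
        # Each line is a candidate for a sub-issue or a new guardrail rule.
        print(f"uncovered: {attack_class}")
```

Keeping the table in code or config means the gap list can be regenerated whenever a defense is added, rather than being maintained by hand in the issue.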
Related
- research(security): formal 4-property security model for Zeph — Task/Action/Source/Data alignment audit (arXiv:2603.19469) #2417 — formal 4-property security model
- research(security): MCP tool trust/confidentiality metadata — capability labels + STPA-based data-flow policy (arXiv:2601.08012) #2420 — MCP tool trust/confidentiality metadata
- research(security): VIGIL verify-before-commit for tool output streams — 22% attack reduction, intent-anchored sanitization (arXiv:2601.05755) #2306 — VIGIL verify-before-commit for tool output streams