research(security): attack/defense landscape for agentic AI — taxonomy for #2417 and #2420 (arXiv:2603.11088)

## Source

arXiv:2603.11088 — *The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey* (2026-03-11)

## Technique

First systematic survey of agent attack vectors and defenses across the full agent stack:

**Attack taxonomy:**
- Prompt injection (direct + indirect)
- Tool hijacking (malicious MCP server responses)
- Privilege escalation (agent acquiring unintended capabilities)
- Memory poisoning (corrupting stored context)
- Multi-agent cascade attacks (compromising one agent to propagate to others)

**Defense taxonomy:**
- Least-privilege sandboxing
- Containerized tool execution
- Input/output sanitization layers
- Per-tool access tokens (ephemeral credentials)
- Cross-agent message signing
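
As an illustration of the last defense class, cross-agent message signing can be sketched with a shared-secret HMAC. This is a minimal sketch, not the survey's or Zeph's scheme: the envelope layout and the names `sign_message` and `verify_message` are hypothetical, and a real deployment would more likely use per-agent asymmetric keys plus replay protection.

```python
import hashlib
import hmac
import json


def _canonical(sender: str, payload: dict) -> bytes:
    # Deterministic serialization so signer and verifier hash identical bytes.
    return json.dumps({"sender": sender, "payload": payload}, sort_keys=True).encode()


def sign_message(key: bytes, sender: str, payload: dict) -> dict:
    """Wrap a payload in a (hypothetical) envelope carrying an HMAC-SHA256 tag."""
    tag = hmac.new(key, _canonical(sender, payload), hashlib.sha256).hexdigest()
    return {"sender": sender, "payload": payload, "tag": tag}


def verify_message(key: bytes, envelope: dict) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(
        key, _canonical(envelope["sender"], envelope["payload"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, envelope.get("tag", ""))
```

A receiving agent would call `verify_message` before acting on any peer message, so a compromised agent cannot forge or alter messages that appear to come from another agent unless it also holds that agent's key.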

## Applicability to Zeph

**Grounds #2417** (formal security model): The paper's structured taxonomy directly maps onto the Task/Action/Source/Data alignment framework. Use this as the reference survey when implementing the formal model.

**Grounds #2420** (MCP tool trust metadata): The per-tool access token pattern is exactly what #2420 proposes — this paper validates the approach with empirical evidence across multiple agent systems.
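
The per-tool access token pattern could look roughly like the sketch below. Everything here is an assumption for illustration: `ToolToken`, `TokenIssuer`, the 30-second default lifetime, and the in-memory store are hypothetical and are not taken from the paper, from #2420, or from Zeph's codebase.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolToken:
    tool: str          # the single tool this credential is scoped to
    value: str         # opaque random secret
    expires_at: float  # clock reading after which the token is rejected


class TokenIssuer:
    """Hypothetical issuer of short-lived credentials scoped to one tool each."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._live: dict[str, ToolToken] = {}

    def issue(self, tool: str) -> ToolToken:
        token = ToolToken(tool, secrets.token_urlsafe(32), self._clock() + self._ttl)
        self._live[token.value] = token
        return token

    def authorize(self, value: str, tool: str) -> bool:
        token = self._live.get(value)
        if token is None:
            return False
        if self._clock() >= token.expires_at:
            del self._live[value]  # expired tokens are dropped on first use
            return False
        # A token issued for one tool never authorizes a different tool.
        return token.tool == tool
```

Scoping each credential to one tool and a short lifetime limits what a hijacked tool response can reuse: a leaked token is useless for other tools and stops working once it expires.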

**ContentSanitizer / ExfiltrationGuard review**: Current Zeph defenses should be audited against the memory poisoning and tool hijacking attack classes. The survey may reveal gaps not yet covered by `ContentIsolation` or `ExfiltrationGuard`.

**Confused-deputy analysis**: The survey explicitly covers multi-agent privilege escalation — relevant when Zeph acts as both ACP server and MCP client simultaneously (the confused-deputy scenario from arXiv:2603.12230).

## Implementation sketch

Use the taxonomy as a checklist in the #2417 implementation:
1. Map each attack class to an existing or missing Zeph defense
2. Identify uncovered classes → new `[security]` config options or guardrail rules
3. File sub-issues for any critical uncovered classes
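
The checklist steps above can be sketched as a small coverage report. The attack classes come from the taxonomy in this issue; the `uncovered_classes` helper and the `"example-guard"` entry are hypothetical placeholders, not the result of an actual audit of Zeph.

```python
from enum import Enum


class AttackClass(Enum):
    PROMPT_INJECTION = "prompt injection (direct + indirect)"
    TOOL_HIJACKING = "tool hijacking"
    PRIVILEGE_ESCALATION = "privilege escalation"
    MEMORY_POISONING = "memory poisoning"
    MULTI_AGENT_CASCADE = "multi-agent cascade"


def uncovered_classes(coverage: dict[AttackClass, list[str]]) -> list[AttackClass]:
    """Return attack classes with no mapped defense, in taxonomy order."""
    return [attack for attack in AttackClass if not coverage.get(attack)]


if __name__ == "__main__":
    # Placeholder mapping: to be filled in from the real audit (step 1).
    coverage = {AttackClass.PROMPT_INJECTION: ["example-guard"]}
    for attack in uncovered_classes(coverage):
        # Each line is a candidate for a new config option or sub-issue (steps 2-3).
        print(f"UNCOVERED: {attack.value}")
```

Keeping the mapping as data makes the gap list reproducible: re-running the report after adding a defense shows which classes still need a `[security]` option, a guardrail rule, or a sub-issue.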

## Related

- #2417 — formal 4-property security model
- #2420 — MCP tool trust/confidentiality metadata
- #2306 — VIGIL verify-before-commit for tool output streams
