research(security): VIGIL verify-before-commit for tool output streams — 22% attack reduction, intent-anchored sanitization (arXiv:2601.05755)

## Paper

**arXiv:2601.05755** — *VIGIL: Defending LLM Agents Against Tool Stream Injection via Verify-Before-Commit*

## Key Finding

A verify-before-commit protocol sanitizes tool output streams against constraints anchored to the user's intent. The paper reports that this reduces attack success rate by over 22% while more than doubling agent utility under attack, compared to static baselines.

## Applicability to Zeph

- **Tool output sanitization**: Zeph's `ContentSanitizer` currently checks *inputs* (user messages). VIGIL targets *outputs* — tool results before they enter the LLM context. This is a gap: a malicious MCP tool response could inject instructions into the next LLM turn.
- **Intent anchoring**: The key insight is that sanitization should be relative to the original user intent, not absolute. A tool result that says "ignore previous instructions" is obviously suspicious on its face; VIGIL formalizes the check by asking whether the output is consistent with what the user actually requested.
- **Integration point**: `agent/tool_execution/`, after the executor call returns but before `tool_result` is pushed into context. At that point, check the tool output against a cached representation of the user's original intent.
- **Relation to #2297**: PR #2297 (sanitize MCP tool descriptions before pruning) addresses a narrower case. VIGIL's approach is more general — covering all tool outputs at runtime.
- **Priority**: P3 — complements existing injection defense but requires implementing intent extraction/anchoring, which is non-trivial.
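
To make the integration point concrete, here is a minimal sketch of a verify-before-commit step, written in Python for brevity. It is not Zeph's code and not the paper's algorithm: every name (`IntentAnchor`, `Verdict`, `verify_before_commit`, `run_tool_step`) is hypothetical, and the regex checks are a crude stand-in for whatever verifier VIGIL actually uses. It only illustrates the ordering (execute, verify against cached intent, then commit) and why the check is relative to intent.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List

# All names below are hypothetical; none come from Zeph or the VIGIL paper.

@dataclass(frozen=True)
class IntentAnchor:
    """Cached representation of the user's original request."""
    text: str
    allowed_actions: FrozenSet[str]  # assumed to be derived from the request

@dataclass
class Verdict:
    committed: bool
    output: str  # what actually enters the LLM context
    reasons: List[str] = field(default_factory=list)

# Crude stand-ins for a real verifier.
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
]
ACTION_PATTERN = re.compile(r"\b(send_email|delete_file|transfer_funds)\b")

def verify_before_commit(anchor: IntentAnchor, tool_output: str) -> Verdict:
    reasons = []
    # Absolute check: text that addresses the model directly.
    for pattern in INSTRUCTION_PATTERNS:
        if pattern.search(tool_output):
            reasons.append(f"instruction-like text: {pattern.pattern}")
    # Intent-relative check: the same action name is acceptable or not
    # depending on what the user originally asked for.
    for action in ACTION_PATTERN.findall(tool_output):
        if action not in anchor.allowed_actions:
            reasons.append(f"action outside user intent: {action}")
    if reasons:
        return Verdict(False, "[tool output withheld: failed intent check]", reasons)
    return Verdict(True, tool_output)

def run_tool_step(anchor: IntentAnchor, context: list,
                  executor: Callable[[dict], str], call: dict) -> Verdict:
    raw = executor(call)                         # 1. call the executor
    verdict = verify_before_commit(anchor, raw)  # 2. verify against intent
    context.append({"role": "tool", "content": verdict.output})  # 3. commit
    return verdict
```

With an anchor built from "summarize my inbox" and `allowed_actions={"read_email"}`, a tool result that mentions `send_email` is withheld. The same result passes when the anchor was built from a request to send mail, which is the sense in which the check is relative rather than absolute.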
