security: OWASP AI Agent Security Cheat Sheet 2026 gap analysis #1650

@bug-ops

Summary

The OWASP AI Agent Security Cheat Sheet (2026 edition) identifies several controls that Zeph partially or fully lacks. This issue documents three of them.

Identified Gaps

Gap 1: Output schema validation before memory writes

Tool results are written to memory (via memory_save or auto-extraction) without schema/type validation. A compromised or malformed tool response could inject arbitrary data into the agent's memory, affecting future recall and behavior.

Current state: ExfiltrationGuard blocks outbound exfiltration from memory writes, but does not validate the structure of tool results before they are stored.

Recommendation: Add a lightweight schema check (expected field types, size bounds) on tool results before they reach memory_save or the graph extractor.
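
As a rough illustration of what such a check could look like, here is a minimal Python sketch. It is not Zeph's implementation: the `validate_tool_result` function, the `WEB_FETCH_SCHEMA` shape, and the 4096-byte bound are all hypothetical names and values chosen for the example. It only covers flat objects with expected field types and a size bound on strings.

```python
from typing import Any

# Assumed size bound for any single string field (hypothetical value).
MAX_STRING_BYTES = 4096

# Hypothetical schema for a web-fetch tool result: field name -> expected type.
WEB_FETCH_SCHEMA = {"url": str, "status": int, "body": str}


def validate_tool_result(
    result: Any, schema: dict[str, type], max_bytes: int = MAX_STRING_BYTES
) -> list[str]:
    """Return a list of violations; an empty list means the result may be stored."""
    if not isinstance(result, dict):
        return ["result is not an object"]

    errors = []
    # Reject fields the schema does not know about.
    for key in result:
        if key not in schema:
            errors.append(f"unexpected field: {key}")

    for key, expected in schema.items():
        if key not in result:
            errors.append(f"missing field: {key}")
            continue
        value = result[key]
        # bool is a subclass of int in Python, so reject it explicitly
        # wherever the schema expects something other than bool.
        if isinstance(value, bool) and expected is not bool:
            errors.append(f"wrong type for {key}: bool")
        elif not isinstance(value, expected):
            errors.append(f"wrong type for {key}: {type(value).__name__}")
        elif isinstance(value, str) and len(value.encode("utf-8")) > max_bytes:
            errors.append(f"field too large: {key}")
    return errors
```

A caller would run this before the write path and drop or quarantine the result when the returned list is non-empty, so that a malformed response never reaches memory_save or the graph extractor.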

Gap 2: PII filtering on outbound tool results

Tool outputs (shell, web scrape, file reads) are passed to the LLM and potentially logged to debug dumps without PII scrubbing. Sensitive data (emails, phone numbers, API keys embedded in files) can leak into logs, debug artifacts, and memory.

Current state: ContentSanitizer sanitizes LLM-sourced content, but there is no equivalent pipeline for PII in tool output.

Recommendation: Apply a configurable PII filter (regex patterns for common PII: email, phone, SSN, credit card, secret key patterns) to tool results before they enter the context and before they are written to debug dumps.
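
A regex-based filter along these lines might look like the Python sketch below. Everything in it is an assumption for illustration: the `scrub_pii` function, the `PII_PATTERNS` table, the `[REDACTED:...]` placeholder format, and the patterns themselves, which are deliberately simple (US-style phone and SSN formats, an `sk-` prefixed secret) and would need tuning against real data to control false positives and misses.

```python
import re

# Illustrative patterns only; order matters because each pattern runs on the
# output of the previous one. Longer digit runs (cards) are handled before
# shorter ones (SSN, phone).
PII_PATTERNS = {
    # Hypothetical secret format: "sk-" followed by 20+ alphanumerics.
    "SECRET_KEY": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CREDIT_CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    # US Social Security number written as NNN-NN-NNNN.
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # US-style phone number written as NNN-NNN-NNNN (or with spaces/dots).
    "PHONE": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
}


def scrub_pii(text: str, patterns: dict[str, re.Pattern] = PII_PATTERNS) -> str:
    """Replace every match of every pattern with a labeled placeholder."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


if __name__ == "__main__":
    print(scrub_pii("contact alice@example.com or 555-123-4567"))
```

Passing the pattern table as a parameter keeps the filter configurable, and calling the same function at both points (before context assembly and before debug dumps) avoids the two paths drifting apart.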

Gap 3: Agent action rate limiting

Shell sandbox and bearer auth have rate limiting, but there is no per-session or per-tool cap on agent-initiated actions (tool calls, memory writes, sub-agent spawns) within a single turn or session. A runaway or adversarially prompted agent can exhaust resources or make unbounded external calls.

Current state: max_tool_iterations limits the total number of tool calls per turn, but there is no rate limiter for external actions (HTTP, shell, sub-agent spawns) across time.

Recommendation: Add a configurable rate limiter for tool invocations (calls/minute per tool category: shell, web, memory) with a circuit-breaker that halts the agent and surfaces a warning when the limit is exceeded.
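
One possible shape for this is a sliding-window counter per tool category with a latch that stays open once tripped. The Python sketch below is only an illustration under assumptions: `ToolRateLimiter`, `RateLimitExceeded`, the category names, and the 60-second window are hypothetical and not part of Zeph. The clock is injectable so the behavior can be tested without sleeping.

```python
import time
from collections import defaultdict, deque


class RateLimitExceeded(Exception):
    """Raised when a category exceeds its limit or the breaker is already open."""


class ToolRateLimiter:
    def __init__(self, limits: dict[str, int], window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.limits = limits            # category -> max calls per window
        self.window = window_seconds
        self.clock = clock              # injectable for tests
        self.calls = defaultdict(deque)  # category -> timestamps of recent calls
        self.tripped = False            # circuit-breaker state

    def check(self, category: str) -> None:
        """Record one call, or raise if the call must not proceed."""
        if self.tripped:
            raise RateLimitExceeded("circuit breaker is open; agent halted")
        now = self.clock()
        recent = self.calls[category]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        limit = self.limits.get(category)  # categories without a limit are unbounded
        if limit is not None and len(recent) >= limit:
            self.tripped = True
            raise RateLimitExceeded(
                f"{category}: limit of {limit} calls per {self.window:.0f}s exceeded"
            )
        recent.append(now)

    def reset(self) -> None:
        """Close the breaker again, e.g. after an operator acknowledges the warning."""
        self.tripped = False
```

The agent loop would call `check("shell")`, `check("web")`, or `check("memory")` before dispatching a tool, and treat `RateLimitExceeded` as the signal to halt and surface a warning. Keeping the breaker open until an explicit `reset()` is a design choice in this sketch; an automatic cool-down would be an equally reasonable alternative.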

Priority

Medium. This is security hardening: there is no immediate exploit vector in typical deployments, but the gaps are relevant for multi-tenant or public-facing configurations.

Labels

security, enhancement
