-
Notifications
You must be signed in to change notification settings - Fork 2
security: OWASP AI Agent Security Cheat Sheet 2026 gap analysis #1650
Description
Summary
The OWASP AI Agent Security Cheat Sheet (2026 edition) identifies several controls that Zeph partially or fully lacks.
Identified Gaps
Gap 1: Output schema validation before memory writes
Tool results are written to memory (via memory_save or auto-extraction) without schema/type validation. A compromised or malformed tool response could inject arbitrary data into the agent's memory, affecting future recall and behavior.
Current state: ExfiltrationGuard blocks outbound exfiltration from memory writes, but does not validate the structure of tool results before they are stored.
Recommendation: Add a lightweight schema check (expected field types, size bounds) on tool results before they reach memory_save or the graph extractor.
Gap 2: PII filtering on outbound tool results
Tool outputs (shell, web scrape, file reads) are passed to the LLM and potentially logged to debug dumps without PII scrubbing. Sensitive data (emails, phone numbers, API keys embedded in files) can leak into logs, debug artifacts, and memory.
Current state: ContentSanitizer sanitizes LLM-sourced content. No equivalent pipeline for tool-output PII.
Recommendation: Apply a configurable PII filter (regex patterns for common PII: email, phone, SSN, credit card, secret key patterns) to tool results before they enter the context and before they are written to debug dumps.
Gap 3: Agent action rate limiting
Shell sandbox and bearer auth have rate limiting, but there is no per-session or per-tool cap on agent-initiated actions (tool calls, memory writes, sub-agent spawns) within a single turn or session. A runaway or adversarially prompted agent can exhaust resources or make unbounded external calls.
Current state: max_tool_iterations limits total tool calls per turn, but no rate limiter for external actions (HTTP, shell, sub-agent spawns) across time.
Recommendation: Add a configurable rate limiter for tool invocations (calls/minute per tool category: shell, web, memory) with a circuit-breaker that halts the agent and surfaces a warning when the limit is exceeded.
Priority
Medium — security hardening, no immediate exploit vector in typical deployments but relevant for multi-tenant or public-facing configurations.
Labels
security, enhancement