bug(tools): adversarial policy falsely denies internal agent operations (memory_save, write) #2469
Closed
Labels
P2 (High value, medium complexity), bug (Something isn't working), memory (zeph-memory crate (SQLite)), security (Security-related issue)
Description
Summary
When [tools.adversarial_policy] is enabled with a policy like "Do not write files", the gate denies memory_save — an internal agent operation, not a user-visible file write. This breaks core agent functionality.
Repro
Policy file:
Do not write files.
Agent attempts memory_save → denied:
command blocked by policy: [adversarial] Tool call denied by policy
Expected behavior
Adversarial policy should target externally visible side effects (filesystem writes, network calls, destructive shell commands). Internal agent state mutations (memory_save, read_overflow, load_skill, etc.) should be exempt by default.
Impact
- Memory persistence broken silently
- Policy "Do not write files" has unintuitive scope: users expect it to restrict the write tool, not memory_save
- Causes silent data loss: agent says "I couldn't save" but the user may not notice
Fix sketch
- Add an exemption list of always-allowed tool names to AdversarialPolicyConfig: exempt_tools = ["memory_save", "memory_search", "read_overflow", "load_skill", "schedule_deferred", ...]
- Or: mark certain tool categories as internal in ToolDefinition and skip the gate for them
- Related: LLM serialization gate in CI; policy prompts should clarify "user-facing file operations" scope
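The first option in the fix sketch could look roughly like the following. This is a minimal sketch, not the actual zeph code: the struct here is a hypothetical stand-in for the real AdversarialPolicyConfig, and the `enabled` field, `with_default_exemptions`, and `requires_policy_check` are names assumed for illustration. The default list contains only the tool names spelled out in this issue; the trailing "..." is left open.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the real `AdversarialPolicyConfig`;
/// the field and method names below are assumptions for illustration.
#[derive(Debug, Clone)]
pub struct AdversarialPolicyConfig {
    /// Whether the adversarial policy gate is active at all.
    pub enabled: bool,
    /// Tool names that bypass the gate because they only mutate
    /// internal agent state.
    pub exempt_tools: HashSet<String>,
}

impl AdversarialPolicyConfig {
    /// Builds a config whose exemption list holds the internal tools
    /// named in this issue (the list in the issue is open-ended).
    pub fn with_default_exemptions(enabled: bool) -> Self {
        let exempt_tools = [
            "memory_save",
            "memory_search",
            "read_overflow",
            "load_skill",
            "schedule_deferred",
        ]
        .iter()
        .map(|name| name.to_string())
        .collect();
        Self { enabled, exempt_tools }
    }

    /// Returns true when a tool call should be sent through the
    /// adversarial policy gate, false when it may skip the gate.
    pub fn requires_policy_check(&self, tool_name: &str) -> bool {
        self.enabled && !self.exempt_tools.contains(tool_name)
    }
}
```

With this shape, a call site would check `requires_policy_check("memory_save")` before invoking the gate and skip it on `false`, while `write` would still be evaluated against a policy such as "Do not write files". Keeping the list in config rather than hard-coding it would let users remove an exemption if they do want the policy applied to internal tools.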