bug(tools): adversarial policy falsely denies internal agent operations (memory_save, write) #2469

@bug-ops

Summary

When [tools.adversarial_policy] is enabled with a policy such as "Do not write files", the gate also denies memory_save, which is an internal agent operation rather than a user-visible file write. This breaks core agent functionality.

Repro

Policy file:

Do not write files.

Agent attempts memory_save → denied:

command blocked by policy: [adversarial] Tool call denied by policy

Expected behavior

Adversarial policy should target externally visible side effects (filesystem writes, network calls, destructive shell commands). Internal agent state mutations (memory_save, read_overflow, load_skill, etc.) should be exempt by default.
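
The distinction described above could be expressed as a small classification step that runs before the gate. This is a minimal sketch, not the project's actual code: `SideEffect`, `classify`, and `needs_policy_check` are hypothetical names, and the internal tool list contains only the names mentioned in this issue. Unknown tools are assumed to be external so the gate fails closed.

```rust
/// Hypothetical classification of what a tool call can affect.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SideEffect {
    /// Filesystem writes, network calls, destructive shell commands.
    External,
    /// Agent-internal state only (memory, overflow buffers, skills).
    Internal,
}

/// Hypothetical mapping from tool name to side-effect class.
/// Any tool not listed here is treated as external (fail closed).
pub fn classify(tool_name: &str) -> SideEffect {
    match tool_name {
        "memory_save" | "read_overflow" | "load_skill" => SideEffect::Internal,
        _ => SideEffect::External,
    }
}

/// Returns true when the adversarial gate should evaluate the call.
pub fn needs_policy_check(tool_name: &str) -> bool {
    classify(tool_name) == SideEffect::External
}
```

With this shape, `needs_policy_check("memory_save")` is false while `needs_policy_check("write")` stays true, so a policy like "Do not write files" would still apply to the write tool.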

Impact

  • Memory persistence breaks without any visible error
  • The policy "Do not write files" has an unintuitive scope: users expect it to restrict the write tool, not memory_save
  • Causes silent data loss: the agent says "I couldn't save", but the user may not notice

Fix sketch

  1. Add an exemption list of always-allowed tool names to AdversarialPolicyConfig: exempt_tools = ["memory_save", "memory_search", "read_overflow", "load_skill", "schedule_deferred", ...]
  2. Or: mark certain tool categories as internal in ToolDefinition and skip the gate for them
  3. Related: LLM serialization gate in CI. Policy prompts should clarify that the scope is "user-facing file operations"
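
Option 1 above might look roughly like the following. This is a sketch under stated assumptions: the real `AdversarialPolicyConfig` likely has other fields, and `with_default_exemptions`, `GateDecision`, `gate`, and the `evaluate` callback (standing in for the LLM policy check) are hypothetical. The default list contains only the tool names spelled out in this issue.

```rust
use std::collections::HashSet;

/// Hypothetical shape; the real AdversarialPolicyConfig may differ.
#[derive(Debug, Clone)]
pub struct AdversarialPolicyConfig {
    pub enabled: bool,
    /// Tool names that bypass the adversarial gate.
    pub exempt_tools: HashSet<String>,
}

impl AdversarialPolicyConfig {
    /// Builds a config whose exemptions are the tools named in this issue.
    pub fn with_default_exemptions(enabled: bool) -> Self {
        let exempt_tools = [
            "memory_save",
            "memory_search",
            "read_overflow",
            "load_skill",
            "schedule_deferred",
        ]
        .iter()
        .map(|name| name.to_string())
        .collect();
        Self { enabled, exempt_tools }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateDecision {
    Allow,
    Deny(String),
}

/// Checks the exemption list first and calls `evaluate` only for
/// non-exempt tools. `evaluate` returns true if the policy permits the call.
pub fn gate<F>(config: &AdversarialPolicyConfig, tool_name: &str, evaluate: F) -> GateDecision
where
    F: FnOnce(&str) -> bool,
{
    if !config.enabled || config.exempt_tools.contains(tool_name) {
        return GateDecision::Allow;
    }
    if evaluate(tool_name) {
        GateDecision::Allow
    } else {
        GateDecision::Deny("Tool call denied by policy".to_string())
    }
}
```

Checking the list before invoking the evaluator means exempt tools never reach the policy model, which also avoids the cost of the extra call. Whether users should be able to shrink the default list is left open here.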

Labels

  • P2: High value, medium complexity
  • bug: Something isn't working
  • memory: zeph-memory crate (SQLite)
  • security: Security-related issue
