fix(sanitizer): injection false positives from legitimate user queries in memory retrieval #2025

@bug-ops

Summary

During continuous improvement session CI-15 (2026-03-20, v0.16.0), the ContentSanitizer flagged 2 injection patterns (flags=2) when retrieving memory content that contained legitimate user queries.

Root Cause

assembly.rs::sanitize_memory_message() runs all retrieved memory messages through ContentSanitizer::detect_injections(). When a user query like:

"Use the memory_save tool to save this fact: ... Confirm when done."

is stored in memory and later retrieved via memory_search, the sanitizer flags it because its imperative phrasing (e.g., a direct instruction to use a tool) resembles injection patterns.
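To illustrate the mechanism, here is a minimal sketch of a substring-based detector. The pattern list and the `count_flags` function are hypothetical stand-ins; the actual pattern set and matching logic inside ContentSanitizer::detect_injections() are not shown in this issue and may differ.

```rust
/// Hypothetical stand-ins for injection patterns. The real pattern set used by
/// ContentSanitizer is not shown in this issue.
const HYPOTHETICAL_PATTERNS: &[&str] = &["use the memory_save tool", "confirm when done"];

/// Counts how many hypothetical patterns occur in `text`, ignoring ASCII case.
fn count_flags(text: &str) -> usize {
    let lowered = text.to_ascii_lowercase();
    HYPOTHETICAL_PATTERNS
        .iter()
        .filter(|pattern| lowered.contains(**pattern))
        .count()
}
```

Under these assumed patterns, the stored user query above matches both entries, while a plain question such as "What is the capital of France?" matches none. A detector of this kind sees only the text, not where it came from, which is why a legitimate prior turn can look like an injection.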

Observed Warning

WARN zeph_core::agent::context::assembly: injection patterns detected in memory retrieval flags=2

Impact

  • Severity: Low — sanitizer is advisory only (doc comment: "not a security boundary")
  • Functional impact: None — retrieval proceeds normally
  • Operational impact: Log noise; inflates sanitizer_injection_flags metric

Expected Behavior

User messages stored in memory (prior conversation turns) should not trigger injection warnings when retrieved. The sanitizer should distinguish between:

  • Actual injection: untrusted external content (web scrapes, MCP tool output, documents)
  • False positive: prior user conversation turns retrieved from SQLite
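One way to express this distinction is to tag content with its origin before deciding whether to scan it. The sketch below is an assumption for illustration only: the `Origin` enum and `should_scan` function are hypothetical names and do not correspond to existing zeph-core types.

```rust
/// Hypothetical origin tag for content entering the context window.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    /// Untrusted external content.
    WebScrape,
    McpToolOutput,
    Document,
    /// A prior user conversation turn retrieved from memory.
    PriorUserTurn,
}

/// Returns true when content from this origin should go through injection scanning.
/// Only prior user turns are treated as known-safe in this sketch.
fn should_scan(origin: Origin) -> bool {
    !matches!(origin, Origin::PriorUserTurn)
}
```

With this shape, `should_scan(Origin::WebScrape)` is true and `should_scan(Origin::PriorUserTurn)` is false, so retrieved user turns would no longer contribute to the warning or the metric.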

Potential Fix

  1. Reduce ContentSource specificity for memory retrieval paths by using a lower-sensitivity mode
  2. Add a context label to retrieved messages (e.g., role=user) so that injection scanning is skipped for known-safe sources
  3. Filter out patterns that commonly trigger on instruction-like user queries (e.g., patterns matching common verb phrases)
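Options 1 and 3 could be combined roughly as follows. This is a sketch under stated assumptions: `Sensitivity`, `Pattern`, the pattern strings, and `count_flags_with_mode` are all hypothetical and are not part of the existing ContentSanitizer API.

```rust
/// Hypothetical sensitivity mode for a scanning path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sensitivity {
    /// Apply every pattern (e.g., for untrusted external content).
    Strict,
    /// Skip patterns that often match ordinary user instructions (e.g., memory retrieval).
    Relaxed,
}

/// Hypothetical pattern entry; `instruction_like` marks patterns that commonly
/// trigger on legitimate imperative user queries.
struct Pattern {
    needle: &'static str,
    instruction_like: bool,
}

// Illustrative patterns only, not the real ContentSanitizer pattern set.
const PATTERNS: &[Pattern] = &[
    Pattern { needle: "ignore previous instructions", instruction_like: false },
    Pattern { needle: "confirm when done", instruction_like: true },
];

/// Counts matching patterns, dropping instruction-like ones in relaxed mode.
fn count_flags_with_mode(text: &str, mode: Sensitivity) -> usize {
    let lowered = text.to_ascii_lowercase();
    PATTERNS
        .iter()
        .filter(|p| mode == Sensitivity::Strict || !p.instruction_like)
        .filter(|p| lowered.contains(p.needle))
        .count()
}
```

In this sketch the memory retrieval path would call `count_flags_with_mode(text, Sensitivity::Relaxed)`, so the stored user query produces no flags, while a phrase like "ignore previous instructions" is still counted in either mode. The trade-off is that relaxed mode can also miss instruction-like injections that reached memory from an untrusted source, which is an argument for pairing it with a source label as in option 2.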

Evidence

  • Session: .local/testing/sessions/2026-03-20-session-ci15.md
  • Log: .local/testing/memory-ops-2026-03-20.log
  • Code: crates/zeph-core/src/agent/context/assembly.rs::sanitize_memory_message()

Metadata

Assignees

No one assigned

Labels

  • P2: High value, medium complexity
  • bug: Something isn't working
  • memory: zeph-memory crate (SQLite)
