Skip to content

[SEC-1.2] ContentSanitizer with injection pattern detection #1197

@bug-ops

Description

@bug-ops

Part of #1195 — Phase 1

Implement content sanitizer that detects injection patterns and wraps untrusted content with spotlighting delimiters.

Crates: zeph-core
Depends on: SEC-1.1

Tasks:

  • ContentSanitizer struct with configurable max size, pattern list, markup stripping
  • Injection pattern detection (flag-based, not blocking):
    • ignore (all|any|previous|prior)?instructions
    • you are now, new (instructions|directive|role|persona)
    • developer mode, system prompt
    • reveal|show|display your (instructions|prompt|rules)
    • Base64-encoded variants
    • Homoglyph/unicode substitution attempts
  • Spotlighting wrapper: <external-data source="..." trust="..."> with explicit "treat as data" preamble
  • Size truncation, null byte stripping, control character removal
  • InjectionFlag struct: pattern matched, confidence, location in content
  • Unit tests with OWASP injection payload corpus (20+ test cases)

Files: crates/zeph-core/src/sanitizer.rs, crates/zeph-core/src/sanitizer/patterns.rs (new)

Metadata

Metadata

Assignees

No one assigned

    Labels

    corezeph-core cratepriority/highHigh prioritysecuritySecurity-related issuesize/MMedium PR (51-200 lines)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions