[SEC-1.2] ContentSanitizer with injection pattern detection

Part of #1195 — Phase 1

Implement content sanitizer that detects injection patterns and wraps untrusted content with spotlighting delimiters.

**Crates**: zeph-core
**Depends on**: SEC-1.1

**Tasks**:
- [ ] `ContentSanitizer` struct with configurable max size, pattern list, markup stripping
- [ ] Injection pattern detection (flag-based, not blocking):
  - `ignore (all|any|previous|prior)?instructions`
  - `you are now`, `new (instructions|directive|role|persona)`
  - `developer mode`, `system prompt`
  - `reveal|show|display your (instructions|prompt|rules)`
  - Base64-encoded variants
  - Homoglyph/unicode substitution attempts
- [ ] Spotlighting wrapper: `<external-data source="..." trust="...">` with explicit "treat as data" preamble
- [ ] Size truncation, null byte stripping, control character removal
- [ ] `InjectionFlag` struct: pattern matched, confidence, location in content
- [ ] Unit tests with OWASP injection payload corpus (20+ test cases)

**Files**: `crates/zeph-core/src/sanitizer.rs`, `crates/zeph-core/src/sanitizer/patterns.rs` (new)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SEC-1.2] ContentSanitizer with injection pattern detection #1197

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[SEC-1.2] ContentSanitizer with injection pattern detection #1197

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions