Skip to content

[SEC-3.1] QuarantinedSummarizer for high-risk sources #1204

@bug-ops

Description

@bug-ops

Part of #1195 — Phase 3

Implement quarantined LLM summarizer (Dual LLM pattern) for extracting facts from high-risk external content without exposing the main agent to raw untrusted data.

Crates: zeph-core
Depends on: SEC-1.4

Tasks:

  • QuarantinedSummarizer struct wrapping Arc<dyn LlmProvider>
  • System prompt: data extraction only, explicit instruction to ignore embedded directives
  • Temperature 0, capped max_tokens, no tools available
  • extract_facts(content: &SanitizedContent, prompt: &str) -> Result<String>
  • Config:
    [security.content_isolation.quarantine]
    enabled = false
    sources = ["web_scrape", "a2a"]
    model = "haiku"
    max_tokens = 1024
  • Wire into sanitization pipeline as optional post-processing step
  • --init wizard: quarantine toggle and source selection
  • Unit tests with mock provider, integration test with real content
  • Metrics: quarantine invocation count, avg latency (via MetricsCollector)

Files: crates/zeph-core/src/sanitizer/quarantine.rs (new), crates/zeph-core/src/config/types.rs

Metadata

Metadata

Assignees

No one assigned

    Labels

    corezeph-core cratellmzeph-llm crate (Ollama, Claude)priority/mediumMedium prioritysecuritySecurity-related issuesize/LLarge PR (201-500 lines)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions