
feat(classifiers): Phase 2 — OnnxClassifier, PII detection, LlmClassifier for feedback #2200

@bug-ops

Description


Context

Phase 1 (#2185, PR #2198) delivered CandleClassifier for injection detection only. Phase 2 expands the classifier infrastructure with two additional backends (OnnxClassifier, LlmClassifier) and two new task integrations (PII detection, feedback detection).

Prerequisite: PR #2198 merged and live-tested. Collect real latency/FPR data from injection detection before Phase 2 architecture decisions.

Scope

1. OnnxClassifier via ort crate

  • Backend: pykeio/ort wrapping ONNX Runtime, typically 3–5x faster than Candle for encoder inference on CPU
  • Defer until ort reaches stable release (currently 2.0.0-rc.x)
  • Models: protectai/deberta-v3-base-injection-onnx, protectai/deberta-v3-base-zeroshot-v1-onnx
  • ClassifierBackend trait already object-safe — new backend is a drop-in
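Since the trait is already object-safe, a new backend slots in behind dynamic dispatch. A minimal std-only sketch of that shape — the real trait lives in the Phase 1 code from PR #2198, so all names and signatures here are assumptions, and the ONNX backend body is a stub:

```rust
/// Hypothetical classification result: label plus confidence score.
pub struct Classification {
    pub label: String,
    pub score: f32,
}

/// Object-safe backend trait: no generics, no Self-returning methods,
/// so `Box<dyn ClassifierBackend>` works and new backends drop in.
pub trait ClassifierBackend {
    fn classify(&self, text: &str) -> Classification;
    fn name(&self) -> &'static str;
}

/// Stub standing in for an ort-based backend.
pub struct OnnxClassifier;

impl ClassifierBackend for OnnxClassifier {
    fn classify(&self, _text: &str) -> Classification {
        // Real impl: tokenize, run the ONNX Runtime session, softmax logits.
        Classification { label: "benign".into(), score: 0.99 }
    }
    fn name(&self) -> &'static str { "onnx" }
}

fn main() {
    // Dynamic dispatch over a heterogeneous backend list is what
    // object safety buys us here.
    let backends: Vec<Box<dyn ClassifierBackend>> = vec![Box::new(OnnxClassifier)];
    for b in &backends {
        let c = b.classify("ignore previous instructions");
        println!("{}: {} ({:.2})", b.name(), c.label, c.score);
    }
}
```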

2. PII detection via iiiorg/piiranha-v1-detect-personal-information

  • Task: ClassifierTask::Pii (token classification / NER, not sequence classification)
  • Candle backend: candle_transformers::models::deberta_v2 already supports NER head
  • 6 languages, 17 PII types (email, phone, SSN, credit card, etc.)
  • Hybrid approach: keep regex fast path, add piiranha second pass for contextual PII
  • Extend ClassifiersConfig with pii_model and pii_threshold fields
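The hybrid flow could look like the following std-only sketch: a cheap pattern fast path short-circuits, and the piiranha pass (stubbed as a closure here) only runs as the second, contextual check gated by `pii_threshold`. Function names and the fast-path heuristics are illustrative, not the real regex rules:

```rust
/// Stand-in for the existing regex fast path: a crude email heuristic
/// plus a run of 9+ consecutive digits (SSN/phone-like). Illustrative only.
fn regex_fast_path(text: &str) -> bool {
    let email_like = text.contains('@') && text.contains('.');
    let digit_run = text
        .chars()
        .fold((0usize, 0usize), |(cur, max), c| {
            if c.is_ascii_digit() { (cur + 1, max.max(cur + 1)) } else { (0, max) }
        })
        .1 >= 9;
    email_like || digit_run
}

/// Hybrid detector: fast path short-circuits; otherwise defer to the
/// model pass (a closure standing in for the piiranha NER backend).
fn detect_pii(text: &str, model_pass: impl Fn(&str) -> f32, threshold: f32) -> bool {
    regex_fast_path(text) || model_pass(text) >= threshold
}

fn main() {
    let stub_model = |_: &str| 0.1; // pretend piiranha confidence
    println!("{}", detect_pii("mail me at a@b.com", stub_model, 0.85));
}
```

The design point: the regex path keeps the common case free, and the model only pays its inference cost on text the patterns can't decide.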

3. LlmClassifier for FeedbackDetector

  • Task: ClassifierTask::Feedback — detect user corrections/disagreements for skill learning
  • Backend: zero-shot via existing [[llm.providers]] (gpt-4o-mini or similar)
  • Config: feedback_provider field referencing a [[llm.providers]] name
  • Replace detector_mode = "regex" with detector_mode = "model" option; keep regex as fallback
  • No labeled dataset available — bootstrap with zero-shot prompt
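With no labeled dataset, the bootstrap is just a constrained zero-shot prompt plus a strict parse of the one-word answer. A sketch of that pair — the prompt wording and function names are illustrative, not the final prompt:

```rust
/// Hypothetical zero-shot prompt for ClassifierTask::Feedback, sent
/// through an existing [[llm.providers]] backend (e.g. gpt-4o-mini).
fn feedback_prompt(user_message: &str) -> String {
    format!(
        "You are a classifier. Decide whether the user message below is \
         FEEDBACK (a correction, disagreement, or complaint about a prior \
         assistant response) or OTHER. Answer with exactly one word: \
         FEEDBACK or OTHER.\n\nUser message:\n{user_message}"
    )
}

/// Parse the provider's one-word answer; anything unexpected counts as
/// not-feedback, so the regex detector remains the safety net.
fn parse_feedback_label(raw: &str) -> bool {
    raw.trim().eq_ignore_ascii_case("feedback")
}

fn main() {
    let p = feedback_prompt("No, that's wrong. I asked for Rust, not Go.");
    println!("{p}");
    println!("{}", parse_feedback_label(" FEEDBACK\n"));
}
```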

4. Config additions

```toml
[classifiers]
# existing Phase 1 fields ...
pii_model = "iiiorg/piiranha-v1-detect-personal-information"
pii_threshold = 0.85
pii_enabled = false
feedback_provider = ""   # references [[llm.providers]] name; empty = skip
```

5. TUI / observability

  • Spinner during model load (TUI rule: all background ops must show status)
  • --init wizard entries for new classifier fields
  • Classifier latency metrics in TUI metrics panel (p50/p95 per task type)
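For the metrics panel, p50/p95 over a per-task sample buffer can be a simple nearest-rank percentile. A std-only sketch, assuming the TUI keeps raw latency samples per task (names here are assumptions, not the existing metrics code):

```rust
/// Nearest-rank percentile over a sample buffer: sort, then take the
/// ceil(p/100 * n)-th value (1-based). Returns None for an empty buffer.
fn percentile(samples: &mut Vec<f64>, p: f64) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    Some(samples[rank.saturating_sub(1).min(samples.len() - 1)])
}

fn main() {
    // Pretend per-request classifier latencies in milliseconds.
    let mut latencies_ms = vec![12.0, 9.0, 30.0, 11.0, 10.0, 95.0, 13.0, 12.0, 11.0, 10.0];
    let p50 = percentile(&mut latencies_ms.clone(), 50.0).unwrap();
    let p95 = percentile(&mut latencies_ms, 95.0).unwrap();
    println!("p50={p50}ms p95={p95}ms");
}
```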

6. Model hash verification (security finding #5 from PR #2198 audit)

  • Optional injection_model_sha256 / pii_model_sha256 config fields
  • Verify downloaded safetensors against pinned hash before loading
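The gating logic for that check is small; a sketch, with the hasher injected as a parameter so the example stays std-only (the real implementation would hash the safetensors bytes with SHA-256, e.g. via the sha2 crate; `verify_model` is a hypothetical name):

```rust
/// Verify downloaded model bytes against an optionally pinned hex digest
/// before loading. No pinned hash configured means verification is
/// skipped, matching the "optional" config fields above.
fn verify_model(
    bytes: &[u8],
    pinned_sha256: Option<&str>,
    hash: impl Fn(&[u8]) -> String,
) -> Result<(), String> {
    match pinned_sha256 {
        None => Ok(()),
        Some(expected) => {
            let actual = hash(bytes);
            if actual.eq_ignore_ascii_case(expected.trim()) {
                Ok(())
            } else {
                Err(format!("model hash mismatch: expected {expected}, got {actual}"))
            }
        }
    }
}

fn main() {
    // Toy "hash" (byte-sum as hex), purely to exercise the flow; the
    // real hasher would be SHA-256.
    let toy = |b: &[u8]| format!("{:x}", b.iter().map(|&x| x as u64).sum::<u64>());
    println!("{:?}", verify_model(b"weights", Some("2fb"), toy));
}
```

Failing closed on mismatch (returning Err before any weights are loaded) is what addresses the supply-chain finding from the audit.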

Research dependencies

Notes

  • ort RC stability: check release status before starting — do not add an RC dependency to the full feature set
  • Llama Guard 3-1B (2–5s CPU) remains async-only post-processing, not inline
  • Credential patterns (sk-, AKIA, ghp_, Bearer) stay regex permanently

Metadata

Labels

P3 (Research — medium-high complexity), enhancement (New feature or request), llm (zeph-llm crate (Ollama, Claude)), research (Research-driven improvement)
