test(candle): add integration tests for Candle-backed classifier models #2190

@bug-ops

Context

Issue #2185 proposes replacing regex heuristics with Candle-backed lightweight classifiers. The research (CI-178/CI-180) confirmed:

  • candle_transformers::models::deberta_v2 supports both sequence classification and NER
  • EmbedModel in candle_provider/embed.rs already covers 90% of the boilerplate
  • parry-guard (vaporif/parry-guard) proves the pipeline works in Rust today

Required Tests

1. ClassifierModel unit tests (once implemented per #2185)

  • Load protectai/deberta-v3-small-prompt-injection-v2 via Candle deberta_v2 module
  • Score a known injection string (should exceed threshold)
  • Score a benign string (should be below threshold)
  • Verify tokenizer handles SentencePiece correctly
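The threshold checks above could be sketched as follows. This is a minimal illustration, not the `ClassifierModel` API from #2185: the `InjectionScorer` trait, the `StubScorer` type, the `[benign, injection]` logit order, and the 0.5 threshold are all assumptions made for the example. A real test would swap the stub for the Candle-backed model.

```rust
/// Hypothetical trait; the real ClassifierModel interface may differ.
pub trait InjectionScorer {
    /// Returns raw logits, assumed here to be ordered [benign, injection].
    fn logits(&self, text: &str) -> [f32; 2];
}

/// Numerically stable two-class softmax; returns P(injection).
pub fn injection_probability(logits: [f32; 2]) -> f32 {
    let max = logits[0].max(logits[1]);
    let e0 = (logits[0] - max).exp();
    let e1 = (logits[1] - max).exp();
    e1 / (e0 + e1)
}

/// True when the injection probability exceeds the threshold.
pub fn is_injection<S: InjectionScorer>(scorer: &S, text: &str, threshold: f32) -> bool {
    injection_probability(scorer.logits(text)) > threshold
}

/// Stub standing in for the Candle-backed model so the sketch runs offline.
pub struct StubScorer;

impl InjectionScorer for StubScorer {
    fn logits(&self, text: &str) -> [f32; 2] {
        // Placeholder logic only; a real model would run a forward pass.
        if text.to_lowercase().contains("ignore previous instructions") {
            [-2.0, 3.0]
        } else {
            [3.0, -2.0]
        }
    }
}
```

Keeping the softmax and threshold logic separate from model loading lets the scoring path be unit-tested without a download, while the model-backed variant stays behind `#[ignore]`.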

2. PII/NER model tests

  • Load iiiorg/piiranha-v1-detect-personal-information via Candle deberta_v2 NER mode
  • Token-level label output: verify EMAIL/PHONE/SSN entities are tagged correctly
  • Multi-language test: one English, one German PII phrase
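To assert on token-level output, the per-token labels first need to be merged into entities. The sketch below assumes BIO-style tags such as `B-EMAIL` / `I-PHONE` and whitespace-split tokens; both are assumptions for illustration, and the actual label set and SentencePiece token boundaries of the model may differ. `Entity` and `merge_bio` are hypothetical names.

```rust
/// A merged entity: label plus the joined token text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entity {
    pub label: String,
    pub text: String,
}

/// Merges per-token BIO labels into entities.
/// Tokens are joined with a single space (assumes word-level tokens).
pub fn merge_bio(tokens: &[&str], labels: &[&str]) -> Vec<Entity> {
    let mut entities: Vec<Entity> = Vec::new();
    // Whether the most recent entity may still be extended by an I- tag.
    let mut open = false;

    for (token, label) in tokens.iter().zip(labels.iter()) {
        if let Some(kind) = label.strip_prefix("B-") {
            entities.push(Entity { label: kind.to_string(), text: token.to_string() });
            open = true;
        } else if let Some(kind) = label.strip_prefix("I-") {
            let extend = open && entities.last().map_or(false, |e| e.label == kind);
            if extend {
                let last = entities.last_mut().expect("checked non-empty above");
                last.text.push(' ');
                last.text.push_str(token);
            } else {
                // Stray I- tag with no matching B-: treat it as a new entity.
                entities.push(Entity { label: kind.to_string(), text: token.to_string() });
                open = true;
            }
        } else {
            // "O" or any unrecognized tag closes the current entity.
            open = false;
        }
    }
    entities
}
```

A test can then compare the merged entities for an English and a German phrase against expected `(label, text)` pairs instead of inspecting raw label indices.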

3. Performance benchmarks (criterion)

  • Candle vs ort path latency for injection detection
  • Cold start (model load) vs warm inference latency
  • CPU-only target (no Metal/CUDA for CI)
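The cold-start versus warm distinction can be illustrated with a std-only timing sketch. This is not a criterion benchmark; `StubModel`, `LazyModel`, and the simulated 20 ms load cost are hypothetical stand-ins so the example runs without model weights.

```rust
use std::sync::OnceLock;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a loaded model.
pub struct StubModel;

impl StubModel {
    fn load() -> Self {
        // Simulates the cost of reading weights from disk.
        thread::sleep(Duration::from_millis(20));
        StubModel
    }

    fn score(&self, text: &str) -> f32 {
        // Placeholder arithmetic instead of a forward pass.
        (text.len() % 10) as f32 / 10.0
    }
}

/// Loads the model on first use and reuses it afterwards.
pub struct LazyModel {
    inner: OnceLock<StubModel>,
}

impl LazyModel {
    pub fn new() -> Self {
        Self { inner: OnceLock::new() }
    }

    /// Returns the score and the wall-clock time of this call.
    /// The first call includes model loading (cold); later calls do not (warm).
    pub fn timed_score(&self, text: &str) -> (f32, Duration) {
        let start = Instant::now();
        let model = self.inner.get_or_init(StubModel::load);
        let score = model.score(text);
        (score, start.elapsed())
    }
}
```

In an actual benchmark, the cold path would typically be measured separately from the warm loop so that a one-time load does not skew per-call latency.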

4. Integration with FeedbackDetector

  • detector_mode = "model" routes through the zero-shot provider path
  • Fallback to regex on provider timeout
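The timeout fallback could be exercised with a pattern like the one below. It is a sketch under stated assumptions: `detect_with_fallback`, `Source`, and `heuristic_detect` are hypothetical names, the heuristic is a plain substring check standing in for the existing regex path, and the provider is modeled as a blocking closure run on a worker thread rather than the project's real provider interface.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Which path produced the verdict.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Model,
    RegexFallback,
}

/// Placeholder for the regex heuristics (substring match keeps this std-only).
fn heuristic_detect(text: &str) -> bool {
    text.to_lowercase().contains("wrong")
}

/// Runs the provider on a worker thread and falls back to the heuristic
/// if no verdict arrives within `timeout` (or the provider panics).
pub fn detect_with_fallback<F>(text: &str, timeout: Duration, provider: F) -> (bool, Source)
where
    F: FnOnce(String) -> bool + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let owned = text.to_string();
    thread::spawn(move || {
        // The send fails if the receiver was dropped after a timeout; ignore that.
        let _ = tx.send(provider(owned));
    });

    match rx.recv_timeout(timeout) {
        Ok(verdict) => (verdict, Source::Model),
        // Covers both a timeout and a disconnected (panicked) worker.
        Err(_) => (heuristic_detect(text), Source::RegexFallback),
    }
}
```

Returning the `Source` alongside the verdict lets a test assert which path was taken. Note that in this sketch a timed-out worker thread keeps running in the background until the provider returns.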

5. Regression tests for #2189

Notes

  • Tests requiring model downloads are marked #[ignore]; run them manually or in a separate CI job with an HF cache
  • CPU-only inference must complete in < 500ms per call in CI
  • Future: many specialized lightweight models per task (strategic direction CI-178)

Metadata

Labels: P2 (High value, medium complexity), enhancement (New feature or request)
