feat(classifiers): add DetectorMode::Model and CandleNerClassifier (#2210, #2211) #2227
Merged
…2210, #2211)

Add ML-backed correction detection mode and NER classifier infrastructure:

- DetectorMode::Model variant in zeph-config with detector_model field in LearningConfig (mirrors judge_model pattern, preserves Copy on enum)
- CandleNerClassifier in zeph-llm/src/classifier/ner.rs: DeBERTa-v2 NER with BIO/BIOES span decoding, chunk-based inference, overlap deduplication
- NerSpan struct (label, score, start, end as char offsets) and spans field on ClassificationResult; all existing literal sites updated with spans: vec![]
- FeedbackDetector::detect_with_model() async method behind classifiers feature
- FeedbackState::model_backend field; Agent::with_feedback_classifier() builder
- Bootstrap wires CandleNerClassifier when detector_mode=model + classifiers.enabled
- Fallback to regex with warn!() when backend is absent (C-03 from critic)
- validate_detector_model_config() on Bootstrap; ner_model in ClassifiersConfig
- --init wizard updated with model option and detector_model prompt
- config/default.toml documents new fields in [skills.learning] and [classifiers]
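To illustrate the BIO/BIOES span decoding mentioned in this commit, here is a minimal sketch. It is not the code from `ner.rs`: the `TaggedToken` type, the field layout of `NerSpan`, the signature of `decode_bio_spans`, the score averaging, and the handling of orphan `I-`/`E-` tags are all assumptions made for illustration.

```rust
/// Assumed shape of the PR's `NerSpan`: label, score, and char offsets.
#[derive(Debug, Clone, PartialEq)]
pub struct NerSpan {
    pub label: String,
    pub score: f32,
    pub start: usize,
    pub end: usize,
}

/// Hypothetical per-token input: tag such as "B-EMAIL", "I-EMAIL", "S-NAME" or "O",
/// the model confidence, and the token's char offsets in the original text.
pub struct TaggedToken<'a> {
    pub tag: &'a str,
    pub score: f32,
    pub start: usize,
    pub end: usize,
}

/// Start a new span from one token; the `usize` counts merged tokens.
fn new_span(label: &str, tok: &TaggedToken<'_>) -> (NerSpan, usize) {
    let span = NerSpan {
        label: label.to_string(),
        score: tok.score,
        start: tok.start,
        end: tok.end,
    };
    (span, 1)
}

/// Close the open span (if any), turning the summed score into a mean.
fn flush(open: &mut Option<(NerSpan, usize)>, out: &mut Vec<NerSpan>) {
    if let Some((mut span, n)) = open.take() {
        span.score /= n as f32;
        out.push(span);
    }
}

/// Decode a BIO or BIOES tag sequence into entity spans.
pub fn decode_bio_spans(tokens: &[TaggedToken<'_>]) -> Vec<NerSpan> {
    let mut out = Vec::new();
    let mut open: Option<(NerSpan, usize)> = None;
    for tok in tokens {
        // "B-EMAIL" -> ("B", "EMAIL"); anything without a dash counts as "O".
        let (prefix, label) = tok.tag.split_once('-').unwrap_or(("O", ""));
        match prefix {
            "B" | "S" => {
                flush(&mut open, &mut out);
                open = Some(new_span(label, tok));
            }
            "I" | "E" => {
                let continues = matches!(&open, Some((span, _)) if span.label == label);
                if continues {
                    if let Some((span, n)) = open.as_mut() {
                        span.score += tok.score;
                        span.end = tok.end;
                        *n += 1;
                    }
                } else {
                    // Orphan or mismatched I-/E- tag: assumed to start a new span.
                    flush(&mut open, &mut out);
                    open = Some(new_span(label, tok));
                }
            }
            _ => flush(&mut open, &mut out),
        }
        // BIOES only: S- and E- tags end the span immediately.
        if matches!(prefix, "S" | "E") {
            flush(&mut open, &mut out);
        }
    }
    flush(&mut open, &mut out);
    out
}
```

Under these assumptions, a `B-EMAIL` token followed by an `I-EMAIL` token yields one `EMAIL` span covering both tokens' char offsets, and a lone `S-NAME` token yields a single-token span.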
…sors duplicate

- Remove duplicated validate_safetensors from candle.rs; use shared ner::validate_safetensors
- Add detect_with_model() tests: positive signal, low score, negative result, error swallowed
- Add DetectorMode::Model serde roundtrip tests in zeph-config
- Add validate_detector_model_config() tests: model+non-empty, model+empty, regex, judge
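The four `validate_detector_model_config()` cases listed above (model+non-empty, model+empty, regex, judge) suggest a check along the following lines. This is a sketch under assumptions: the real `LearningConfig` has more fields, and the field type of `detector_model`, the return type, and the error message are invented here. It also shows why keeping the model name in a separate field lets the enum stay `Copy`.

```rust
/// Stand-in for the zeph-config enum; the variant names are inferred from the PR text.
/// No variant carries data, so the enum can remain `Copy`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DetectorMode {
    Regex,
    Judge,
    Model,
}

/// Hypothetical subset of `LearningConfig`; `detector_model` as a plain `String`
/// is an assumption.
pub struct LearningConfig {
    pub detector_mode: DetectorMode,
    pub detector_model: String,
}

/// Reject `detector_mode = model` when no model name is configured.
/// Other modes ignore `detector_model` entirely.
pub fn validate_detector_model_config(cfg: &LearningConfig) -> Result<(), String> {
    if cfg.detector_mode == DetectorMode::Model && cfg.detector_model.trim().is_empty() {
        return Err("detector_mode = \"model\" requires a non-empty detector_model".to_string());
    }
    Ok(())
}
```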
…eduplication

After the IC-01 fix removed the private CandleClassifier::validate_safetensors method, the test references in candle.rs were not updated during the rebase. Update all 5 test functions to call crate::classifier::ner::validate_safetensors and gate them with #[cfg(feature = "classifiers")], since the ner module requires that feature.
…eature

The method is only called from code guarded by #[cfg(feature = "classifiers")]. Without the gate, cargo check --features chat/ide/desktop/server raises a dead_code error because no call site exists in those feature sets.
Summary
- Add `DetectorMode::Model` variant to `FeedbackDetector`, enabling offline CPU-based feedback detection via any `ClassifierBackend` without an LLM round-trip (feat(core): add DetectorMode::Model for classifier-backed feedback detection #2210)
- Add `CandleNerClassifier` wrapping `DebertaV2NERModel` with full BIO/BIOES span decoding for token-level NER/PII classification (piiranha default) (feat(classifiers): add NER/PII token-level classifier backend (piiranha) #2211)
- Extend `ClassificationResult` with `spans: Vec<NerSpan>` (char offsets); all 9 existing literal sites updated
- `validate_safetensors` deduplicated from `candle.rs` into shared `ner::validate_safetensors` (`pub(crate)`)
- Fall back to regex with `warn!()` when the `classifiers` feature is off or the backend is absent (C-03)
- `validate_detector_model_config()` at bootstrap validates non-empty `detector_model` when mode=Model
- `--init` wizard updated with 3-option `detector_mode` select

Test plan
- `cargo +nightly fmt --check` passes
- `cargo clippy --workspace --features full -- -D warnings` passes
- `cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins`: 6696/6696 passed (+100 vs base)
- New tests: `detect_with_model()` (4 branches), `DetectorMode::Model` serde roundtrip (3), `validate_detector_model_config()` (4), `decode_bio_spans` edge cases (8), `validate_safetensors` (5)

Closes #2210
Closes #2211
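
For reviewers who want a compact picture of the regex fallback and the four `detect_with_model()` branches described in the summary, a rough synchronous sketch follows. It is not the PR's code: the real method is async and uses the crate's own `ClassifierBackend` and `warn!()`. The trait shape, the `"correction"` label, the `Source` enum, `FixedBackend`, and the substring-based stand-in for the regex path are all hypothetical.

```rust
/// Hypothetical, simplified backend trait: returns (label, score) or an error.
pub trait ClassifierBackend {
    fn classify(&self, text: &str) -> Result<(String, f32), String>;
}

/// Which path produced the decision.
#[derive(Debug, PartialEq)]
pub enum Source {
    Model,
    Regex,
}

/// Stub backend that always returns a fixed result (for illustration only).
pub struct FixedBackend(pub Result<(String, f32), String>);

impl ClassifierBackend for FixedBackend {
    fn classify(&self, _text: &str) -> Result<(String, f32), String> {
        self.0.clone()
    }
}

/// Stand-in for the regex detector: plain substring checks to stay std-only.
fn regex_detect(text: &str) -> bool {
    let lower = text.to_lowercase();
    ["that's wrong", "no, i meant"].iter().any(|p| lower.contains(p))
}

/// Use the model backend when present; otherwise warn and fall back to regex.
pub fn detect(
    text: &str,
    backend: Option<&dyn ClassifierBackend>,
    min_score: f32,
) -> (bool, Source) {
    match backend {
        Some(b) => {
            let hit = match b.classify(text) {
                // Positive signal only for the assumed label at or above the threshold.
                Ok((label, score)) => label == "correction" && score >= min_score,
                // Backend errors are swallowed and treated as "no signal".
                Err(_) => false,
            };
            (hit, Source::Model)
        }
        None => {
            // The PR logs via warn!(); eprintln! keeps this sketch dependency-free.
            eprintln!("warn: detector_mode=model but no backend; falling back to regex");
            (regex_detect(text), Source::Regex)
        }
    }
}
```

Passing `None` as the backend models both situations the summary names: the `classifiers` feature being off and the backend being absent at runtime.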