feat(core): add multi-language FeedbackDetector support (#1424) #2060
Merged
Replace 3 separate LazyLock<Vec<Regex>> statics with a single LangPatterns struct using Vec<(Regex, f32)> tuples for per-pattern confidence.
Add regex patterns for Russian, Spanish, German, French, Chinese, and Japanese across all three correction categories (rejection, alternative, self-correction).
Dual anchoring strategy: anchored patterns at base confidence (0.85/0.80/0.70), unanchored mid-sentence patterns at -0.10 (0.75/0.70/0.65).
Principle 1 applied consistently: no bare ambiguous short words (Spanish "no", French "non", English "no" when followed by neutral content) without a rejection qualifier.
Add 132 tests covering 7 languages x 3 categories with positive and negative (false-positive guard) cases, including all mandatory cases from architect-1424.md.
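A minimal sketch of the per-pattern confidence idea, not the code from this PR. To stay dependency-free it substitutes plain string matching for `Regex`, so `Anchor`, `Pattern`, `sample()`, `best_rejection()` and the Spanish and German phrases are all hypothetical. Only the 0.85/0.75 rejection confidences come from the commit message.

```rust
/// Stand-in for `^`-anchored vs. unanchored regexes (assumption: the real
/// code expresses this inside the regex itself).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Anchor {
    Start,    // phrase must begin the message
    Anywhere, // phrase may appear mid-sentence
}

/// Hypothetical stand-in for one `(Regex, f32)` tuple.
struct Pattern {
    phrase: &'static str,
    anchor: Anchor,
    confidence: f32,
}

/// Hypothetical shape of the pattern table; only one category is shown.
struct LangPatterns {
    rejection: Vec<Pattern>,
}

impl LangPatterns {
    /// Illustrative phrases only; the PR's real patterns are not reproduced here.
    fn sample() -> Self {
        LangPatterns {
            rejection: vec![
                // Anchored: base rejection confidence.
                Pattern { phrase: "eso está mal", anchor: Anchor::Start, confidence: 0.85 },
                Pattern { phrase: "das ist falsch", anchor: Anchor::Start, confidence: 0.85 },
                // Unanchored mid-sentence variant: lower confidence.
                Pattern { phrase: "eso está mal", anchor: Anchor::Anywhere, confidence: 0.75 },
            ],
        }
    }

    /// Returns the highest confidence among matching rejection patterns.
    fn best_rejection(&self, text: &str) -> Option<f32> {
        let lower = text.trim().to_lowercase();
        self.rejection
            .iter()
            .filter(|p| match p.anchor {
                Anchor::Start => lower.starts_with(p.phrase),
                Anchor::Anywhere => lower.contains(p.phrase),
            })
            .map(|p| p.confidence)
            .fold(None, |best: Option<f32>, c| Some(best.map_or(c, |b| b.max(c))))
    }
}
```

When both the anchored and unanchored variants match, taking the maximum means a message that opens with the phrase keeps the higher anchored confidence.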
H-1: Fix Japanese chigau test assertion: replace silent if-let with explicit assert!(signal.is_some()) documenting the known CJK word segmentation limitation.
M-1: Add comment explaining English unanchored patterns retain 0.85 (multi-word guards carry equal signal strength regardless of position).
M-2: Add 5 English "no" tightening guard tests: no worries/problem/thanks must not match; bare "no" and "no." must match.
docs: add CHANGELOG.md entry under [Unreleased] ### Added for #1424.
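The five M-2 guard cases can be illustrated with a small std-only sketch. `is_bare_no` is a hypothetical helper, not the regex used in the PR; it assumes that "bare" means the whole message is "no", optionally followed by a period or exclamation mark.

```rust
/// Hypothetical guard: treat English "no" as a rejection only when it
/// stands alone, so "no worries" and similar phrases are not flagged.
fn is_bare_no(text: &str) -> bool {
    // Drop surrounding whitespace, then any trailing '.' or '!'.
    let trimmed = text
        .trim()
        .trim_end_matches(|c: char| matches!(c, '.' | '!'));
    trimmed.eq_ignore_ascii_case("no")
}
```

Under this assumption, "no" and "no." are accepted, while "no worries", "no problem" and "no thanks" are rejected because extra words follow.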
Replace Vec::new() with Vec::with_capacity() in build_*_patterns() functions to avoid clippy warning 'calls to push immediately after creation'. Estimated capacities: rejection (25), alternative (20), self_correction (20).
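A short sketch of the `Vec::with_capacity()` change, assuming the quoted warning is Clippy's `vec_init_then_push` lint. `build_rejection_phrases` and its entries are hypothetical and use plain strings instead of `Regex`. The capacity of 25 is the rejection estimate from the commit message.

```rust
/// Hypothetical builder mirroring the shape of the build_*_patterns() functions.
fn build_rejection_phrases() -> Vec<(&'static str, f32)> {
    // Pre-allocate the estimated number of entries instead of calling
    // Vec::new() and pushing immediately afterwards.
    let mut patterns = Vec::with_capacity(25);
    patterns.push(("that's wrong", 0.85));
    patterns.push(("das ist falsch", 0.85));
    patterns
}
```

`Vec::with_capacity(n)` guarantees room for at least `n` elements, so the pushes up to that estimate do not reallocate.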
Summary
Add multi-language support to FeedbackDetector for self-learning feedback detection across Russian, Spanish, German, French, Chinese (Simplified), and Japanese.
Changes
Replace LazyLock<Vec<Regex>> with LangPatterns struct using Vec<(Regex, f32)> tuples for per-pattern confidence
Architecture
crates/zeph-core/src/agent/feedback_detector.rs (+887 lines)
Test Results
Known Limitations
Validation
Closes #1424