fix(mcp): centroid drift resistance and trust score decay persistence (#2311, #2312) #2325
Merged
…#2311, #2312)

EmbeddingAnomalyGuard: switch record_clean() to adaptive EMA. During cold-start (n < min_samples) the existing running mean is preserved for fast convergence. After stabilization, alpha is clamped to ema_floor (default 0.01) so each adversarial clean sample can shift the centroid by at most 1%, bounding the boiling-frog attack surface.

TrustScoreStore: persist the decayed score back to SQLite in load() when decay > epsilon. Prevents apply_delta() from operating on the stale pre-decay value. load_all() applies decay without persisting (display path; documented trade-off). Consecutive load() calls within the same second skip the UPDATE.

Follow-up: #2322 (ema_floor config validation), #2323 (load-before-delta in probe paths).
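The adaptive EMA update described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `embedding_guard.rs`: `CentroidTracker` and its fields are hypothetical names, and it assumes one reading of "clamped to ema_floor", namely that alpha is `1/(n+1)` during cold-start and exactly `ema_floor` afterwards.

```rust
/// Hypothetical stand-in for the guard's centroid state (not the zeph-mcp type).
pub struct CentroidTracker {
    centroid: Vec<f32>,
    n: usize,           // clean samples recorded so far
    min_samples: usize, // cold-start threshold
    ema_floor: f32,     // post-stabilization weight, e.g. 0.01
}

impl CentroidTracker {
    pub fn new(dim: usize, min_samples: usize, ema_floor: f32) -> Self {
        Self { centroid: vec![0.0; dim], n: 0, min_samples, ema_floor }
    }

    /// Blend one clean embedding into the centroid.
    pub fn record_clean(&mut self, embedding: &[f32]) {
        assert_eq!(embedding.len(), self.centroid.len(), "dimension mismatch");
        let alpha = if self.n < self.min_samples {
            // Cold-start: alpha = 1/(n+1) reproduces the plain running mean.
            1.0 / (self.n as f32 + 1.0)
        } else {
            // Stabilized (assumed reading): a fixed small weight, so a single
            // sample moves the centroid by only ema_floor of its distance.
            self.ema_floor
        };
        for (c, x) in self.centroid.iter_mut().zip(embedding) {
            *c += alpha * (*x - *c);
        }
        self.n += 1;
    }

    pub fn centroid(&self) -> &[f32] {
        &self.centroid
    }
}
```

With `min_samples = 2` and `ema_floor = 0.01`, the first two samples are averaged exactly, while a third sample 100 units away from the centroid moves it by about 1 unit.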
a0fdd7c to 39b92c7
Summary

- `EmbeddingAnomalyGuard::record_clean()` now uses adaptive EMA to resist boiling-frog centroid drift. During cold-start (`n < min_samples`) the existing running mean is preserved. After stabilization, alpha is clamped to `ema_floor` (default `0.01`), so each adversarial clean sample can shift the centroid by at most 1%.
- `TrustScoreStore::load()` now persists the decayed score back to SQLite when decay > epsilon, ensuring `apply_delta()` always operates on the true current value rather than the stale pre-decay score.

Changes

- `crates/zeph-mcp/src/embedding_guard.rs`: adaptive EMA in `record_clean()`, `ema_floor` field + constructor parameter
- `crates/zeph-mcp/src/trust_score.rs`: decay write-back in `load()`, decay applied in `load_all()` without persisting
- `crates/zeph-config/src/sanitizer.rs`: `ema_floor: f32` field on `EmbeddingGuardConfig` with serde default `0.01`
- `CHANGELOG.md`: entries under `[Unreleased]`

Test plan

- `cargo nextest run -p zeph-mcp --lib`: 283 tests pass (includes 6 new tests)
- `cargo nextest run --workspace --features full --lib --bins`: 6915/6915 pass
- `cargo +nightly fmt --check`: clean
- `cargo clippy --workspace --features full -- -D warnings`: zero warnings

Follow-up issues filed

- #2322: `ema_floor` config field validation, range `(0.0, 1.0]`
- #2323: `apply_delta()` in probe paths operates on stale pre-decay score (load-before-delta gap)

Closes #2311
Closes #2312
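
The decay write-back in `load()` and the read-only decay in `load_all()` can be sketched as below. This is an illustration under stated assumptions, not the code in `trust_score.rs`: an in-memory `HashMap` stands in for SQLite, `TrustStore`, `Row`, `decay_writes`, `EPSILON`, and `HALF_LIFE_SECS` are hypothetical names and values, and exponential half-life decay is assumed because the PR does not state the decay formula.

```rust
use std::collections::HashMap;

const EPSILON: f64 = 1e-6;          // assumed write-back threshold
const HALF_LIFE_SECS: f64 = 3600.0; // assumed decay model: halve every hour

#[derive(Clone, Copy, Debug)]
pub struct Row {
    pub score: f64,
    pub updated_at: u64, // seconds
}

/// Hypothetical in-memory stand-in for the SQLite-backed store.
#[derive(Default)]
pub struct TrustStore {
    rows: HashMap<String, Row>,
    pub decay_writes: usize, // counts simulated UPDATE statements
}

fn decayed(row: Row, now: u64) -> f64 {
    let elapsed = now.saturating_sub(row.updated_at) as f64;
    row.score * 0.5f64.powf(elapsed / HALF_LIFE_SECS)
}

impl TrustStore {
    pub fn insert(&mut self, id: &str, score: f64, now: u64) {
        self.rows.insert(id.to_string(), Row { score, updated_at: now });
    }

    /// Returns the decayed score and persists it when the decay is non-trivial.
    pub fn load(&mut self, id: &str, now: u64) -> Option<f64> {
        let row = self.rows.get_mut(id)?;
        let current = decayed(*row, now);
        if (row.score - current).abs() > EPSILON {
            // Write-back: later deltas now start from the decayed value.
            row.score = current;
            row.updated_at = now;
            self.decay_writes += 1;
        }
        Some(current)
    }

    /// Display path: applies decay to the returned values without persisting.
    pub fn load_all(&self, now: u64) -> Vec<(String, f64)> {
        self.rows.iter().map(|(id, r)| (id.clone(), decayed(*r, now))).collect()
    }

    /// Adds a delta to the stored score; callers should load() first.
    pub fn apply_delta(&mut self, id: &str, delta: f64, now: u64) -> Option<f64> {
        let row = self.rows.get_mut(id)?;
        row.score += delta;
        row.updated_at = now;
        Some(row.score)
    }
}
```

In this sketch a second `load()` at the same timestamp sees zero elapsed time, so the decay is below `EPSILON` and no write happens, which mirrors the "same second skips the UPDATE" behavior described above.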