feat(security): MCP capability attestation, trust calibration, and injection defense #2310
Merged
…jection defense

Implements three security hardening features resolving issues #2217, #2216, #2254:

- Tool attestation (zeph-mcp): operator-declared expected_tools with blake3 schema fingerprinting for drift detection; unexpected tools filtered for Untrusted/Sandboxed servers at registration time
- MCPShield trust calibration (zeph-mcp): DefaultMcpProber scans resource/prompt descriptions for injection patterns on connect; TrustScoreStore persists per-server scores in SQLite with atomic delta updates and asymmetric decay (scores above 0.5 erode toward 0.5; penalized scores persist until positive evidence); AuditEntry extended with mcp_server_id, injection_flagged, embedding_anomalous fields
- Injection defense (zeph-core, zeph-sanitizer): EmbeddingAnomalyGuard performs fire-and-forget cosine distance checks against per-server clean centroids with cold-start regex fallback; ResponseVerifier extended with optional verifier_provider for post-generation LLM-based instruction-following verification
- Fenced-block executor bypass fixed: execute() now delegates to execute_tool_call() with full security pipeline applied to all tool invocations

New config: [mcp.trust_calibration], [security.content_isolation.embedding_guard], mcp.servers[].expected_tools, security.response_verification.verifier_provider

Closes #2217, #2216, #2254
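The asymmetric decay rule described above can be illustrated with a minimal sketch. This is not the `TrustScoreStore` implementation: the function names, the `rate` parameter, the linear decay form, and the clamping to `[0, 1]` are all assumptions made for illustration. Only the neutral point of 0.5 and the asymmetry itself come from the description.

```rust
/// Neutral point that high scores drift back toward (from the PR description).
const NEUTRAL: f64 = 0.5;

/// Hypothetical decay step. `rate` in [0, 1] is the fraction of the gap to
/// NEUTRAL removed per step; the name and the linear form are assumptions.
fn decay(score: f64, rate: f64) -> f64 {
    if score > NEUTRAL {
        // Earned trust erodes toward neutral but never drops below it.
        NEUTRAL + (score - NEUTRAL) * (1.0 - rate)
    } else {
        // Penalized (or neutral) scores are left untouched by decay.
        score
    }
}

/// Hypothetical delta update: evidence moves the score, clamped to [0, 1].
fn apply_delta(score: f64, delta: f64) -> f64 {
    (score + delta).clamp(0.0, 1.0)
}

fn main() {
    let trusted = decay(0.9, 0.25);
    let penalized = decay(0.2, 0.25);
    println!("trusted: {trusted:.3}, penalized: {penalized:.3}");

    // Only positive evidence lifts a penalized score.
    println!("after positive evidence: {:.3}", apply_delta(penalized, 0.1));
}
```

Under these assumptions a server cannot recover from a penalty simply by waiting, while a server with a high score has to keep earning it.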
Summary
- Tool attestation (`zeph-mcp`): operator-declared `expected_tools` with blake3 schema fingerprinting; unexpected tools filtered for Untrusted/Sandboxed servers at registration time
- `McpToolExecutor::execute()` now validates `server:tool` against the registered tool list and delegates to `execute_tool_call()`, applying the full sanitize→audit→policy pipeline to all fenced-block calls
- MCPShield trust calibration (`zeph-mcp`):
  - Phase 1: `DefaultMcpProber` scans resource/prompt descriptions for injection patterns on connect
  - Phase 2: `AuditEntry` gains `mcp_server_id`, `injection_flagged`, `embedding_anomalous`
  - Phase 3: `TrustScoreStore` (SQLite-backed, asymmetric decay, atomic delta updates)
- Injection defense (`zeph-mcp`): `EmbeddingAnomalyGuard` fire-and-forget cosine-distance check against per-server clean centroid; cold-start falls back to regex detection
- Injection defense (`zeph-sanitizer`): `ResponseVerifier` extended with optional `verifier_provider` for post-generation instruction-following check

New config sections:
- `[mcp.trust_calibration]`: trust calibration settings
- `[security.content_isolation.embedding_guard]`: anomaly guard settings
- `mcp.servers[].expected_tools`: operator-declared tool allowlist
- `security.response_verification.verifier_provider`: verifier model name

Test plan
- `cargo +nightly fmt --check`: clean
- `cargo clippy --workspace --features full -- -D warnings`: clean
- `cargo nextest run --workspace --features full --lib --bins`: 6896 passed

Follow-up issues (non-blocking)
- `mcp_server_id` population in non-MCP audit entries
- `tx.send()` error logging in embedding guard warm path

Closes #2217
Closes #2216
Closes #2254
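To illustrate the embedding guard summarized above, here is a minimal sketch of a cosine-distance check against a per-server centroid with a cold-start fallback. It is not the `EmbeddingAnomalyGuard` code: the `Verdict` enum, the `check` function, the `threshold` parameter, and the pattern list are hypothetical. Plain substring matching stands in for the regex detection to keep the example dependency-free, and the check runs synchronously rather than fire-and-forget.

```rust
/// Cosine distance (1 - cosine similarity) between two vectors.
/// Returns None for empty, mismatched-length, or zero-norm input.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Clean,
    Anomalous,
}

/// Hypothetical fallback patterns; substring matching stands in for regex.
const FALLBACK_PATTERNS: [&str; 2] = [
    "ignore previous instructions",
    "disregard the system prompt",
];

/// Hypothetical guard: compare against the centroid when one is usable,
/// otherwise (cold start or unusable vectors) fall back to pattern matching.
fn check(text: &str, embedding: &[f32], centroid: Option<&[f32]>, threshold: f32) -> Verdict {
    match centroid.and_then(|c| cosine_distance(embedding, c)) {
        Some(distance) if distance > threshold => Verdict::Anomalous,
        Some(_) => Verdict::Clean,
        None => {
            let lower = text.to_lowercase();
            if FALLBACK_PATTERNS.iter().any(|p| lower.contains(p)) {
                Verdict::Anomalous
            } else {
                Verdict::Clean
            }
        }
    }
}

fn main() {
    let centroid = [1.0_f32, 0.0];
    // Warm path: an orthogonal embedding is far from the clean centroid.
    println!("{:?}", check("tool output", &[0.0, 1.0], Some(&centroid[..]), 0.3));
    // Cold start: no centroid yet, so the pattern fallback decides.
    println!("{:?}", check("Please IGNORE previous instructions", &[0.0, 1.0], None, 0.3));
}
```

In a real guard the threshold and the way the centroid is built from clean samples would come from the `[security.content_isolation.embedding_guard]` settings, which this description does not enumerate.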