Skip to content

research: MCPShield three-phase probe-execute-reflect trust calibration for MCP tools (arXiv:2602.14281) #2216

@bug-ops

Description

@bug-ops

Source

arXiv:2602.14281 — MCPShield: A Security Cognition Layer for Adaptive Trust Calibration in Model Context Protocol Agents (Feb 2026)

Summary

Proposes a plug-in security layer that validates MCP tool invocations via: (1) metadata-guided pre-invocation probing, (2) constrained runtime execution monitoring, (3) post-invocation reflection on historical traces — evaluated against 6 novel attack scenarios across multiple LLMs.

Applicability to Zeph

Directly applicable to zeph-mcp. The three-phase pattern maps cleanly onto existing infrastructure:

  • Pre-invocation probing → between server discovery and tool registration in McpManager
  • Runtime monitoring → fits the audit layer in zeph-tools
  • Post-invocation trace analysis → can feed into existing anomaly detection (zeph-core)

Modular plug-in design means no core protocol changes needed. Complements PR #2213 (McpTrustLevel + tool_allowlist) and partial issue #2178.

Implementation Sketch

  • Add a McpProber trait invoked before tool registration (pre-invocation phase)
  • Extend ToolAuditEvent to capture runtime execution trace
  • Add a post-invocation summarizer (small model call) that updates a per-server trust score
  • Persist trust scores in SQLite; decay over time (similar to RAPS reputation tracking)

Complexity

Medium — probing and audit hooks are straightforward; LLM-based cognition update requires a dedicated small-model call and trust-score store.

Metadata

Metadata

Assignees

Labels

P2High value, medium complexityresearchResearch-driven improvementsecuritySecurity-related issuetoolsTool execution and MCP integration

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions