research(tools): NabaOS tool receipt layer — 94.2% hallucination detection at <15ms overhead (arXiv:2603.10060) #2266

@bug-ops

Summary

NabaOS (arXiv:2603.10060, March 2026) is a verification layer that categorizes every agent claim by its epistemic source: direct tool output, inference, external testimony, absence, or unsupported opinion. It generates a cryptographic receipt for each tool call, which prevents the LLM from fabricating or misrepresenting tool results. Tested on 1,800 scenarios across 4 languages, it detects 94.2% of fabricated tool references, 87.6% of count misstatements, and 91.3% of false absence claims, all under 15 ms of overhead versus 180,000 ms for ZK-proof alternatives.

Applicability to Zeph

Direct fit for zeph-tools. Zeph's ToolExecutor and audit log already capture tool call results. Adding an epistemically tagged receipt wrapper would provide:

  1. Fabrication prevention: tag each tool result with its provenance (real output vs. LLM inference), block untrusted claims from entering the context as facts
  2. Audit trail enrichment: the existing .local/testing/data/audit-test.jsonl audit log could carry claim-source tags per tool invocation
  3. Low overhead: <15ms is well within the tool execution latency budget; no model change required
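As a rough illustration of points 1 and 2, the sketch below tags a claim with its provenance, refuses to admit LLM inferences as facts, and renders an audit line that carries the tag. Every name here (Provenance, TaggedClaim, Admission, admit, audit_line) is hypothetical and not taken from Zeph or the paper, and the JSON field layout is an assumption rather than the actual audit-log schema.

```rust
/// Where a claim came from. Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    /// Text returned by an executed tool call.
    ToolOutput,
    /// Text the model produced without a backing tool call.
    LlmInference,
}

impl Provenance {
    /// Tag string written to the audit log (assumed naming).
    fn as_tag(self) -> &'static str {
        match self {
            Provenance::ToolOutput => "tool_output",
            Provenance::LlmInference => "llm_inference",
        }
    }
}

/// A claim about to enter the agent context.
#[derive(Debug, Clone)]
pub struct TaggedClaim {
    pub tool: String,
    pub text: String,
    pub provenance: Provenance,
}

/// How a claim may be presented to the model.
#[derive(Debug, PartialEq, Eq)]
pub enum Admission {
    Fact,
    Unverified,
}

/// Only real tool output is admitted as fact; inferences stay marked unverified.
pub fn admit(claim: &TaggedClaim) -> Admission {
    match claim.provenance {
        Provenance::ToolOutput => Admission::Fact,
        Provenance::LlmInference => Admission::Unverified,
    }
}

/// Minimal JSON string escaping: quotes, backslashes, and control characters.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one JSONL audit entry carrying the claim-source tag.
pub fn audit_line(claim: &TaggedClaim) -> String {
    format!(
        "{{\"tool\":\"{}\",\"claim_source\":\"{}\"}}",
        json_escape(&claim.tool),
        claim.provenance.as_tag()
    )
}
```

The gate is a plain match on an enum, which is why the overhead argument in point 3 seems plausible: no model call is involved in deciding how a claim is admitted.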

Implementation Sketch

  • SHORT term (LOW): add a claim_source: ClaimSource field to ToolResult, where ClaimSource is an enum with variants DirectOutput, Cached, LlmInference, NotFound
  • MEDIUM term: expose claim_source in debug dumps and audit log; use it in ContentSanitizer to downgrade trust of LlmInference results
  • LONG term: implement lightweight receipt hashing to detect cross-turn result substitution
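The short-term and long-term items above could look roughly like the following sketch. ClaimSource and its four variants come from this issue; the ToolResult fields, ReceiptLedger, Check, and digest are hypothetical and do not reflect Zeph's actual types. The digest uses std's DefaultHasher purely as a dependency-free stand-in: it is not cryptographic, and a real receipt would presumably use a cryptographic hash such as SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Variants as proposed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClaimSource {
    DirectOutput,
    Cached,
    LlmInference,
    NotFound,
}

/// Hypothetical shape; the real ToolResult has other fields.
#[derive(Debug, Clone)]
pub struct ToolResult {
    pub call_id: String,
    pub output: String,
    pub claim_source: ClaimSource,
}

/// Stand-in digest over call id and output. NOT cryptographic (assumption:
/// a production receipt would swap in a cryptographic hash).
fn digest(call_id: &str, output: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    call_id.hash(&mut hasher);
    output.hash(&mut hasher);
    hasher.finish()
}

/// Outcome of checking a result against the stored receipt.
#[derive(Debug, PartialEq, Eq)]
pub enum Check {
    Matches,
    Substituted,
    UnknownCall,
}

/// Maps call ids to the digest recorded when the tool actually ran.
#[derive(Default)]
pub struct ReceiptLedger {
    receipts: HashMap<String, u64>,
}

impl ReceiptLedger {
    /// Record a receipt; only direct tool output is accepted (assumed policy).
    pub fn record(&mut self, result: &ToolResult) -> bool {
        if result.claim_source != ClaimSource::DirectOutput {
            return false;
        }
        let d = digest(&result.call_id, &result.output);
        self.receipts.insert(result.call_id.clone(), d);
        true
    }

    /// Compare a result seen in a later turn with the recorded receipt.
    pub fn check(&self, result: &ToolResult) -> Check {
        match self.receipts.get(&result.call_id) {
            None => Check::UnknownCall,
            Some(&d) if d == digest(&result.call_id, &result.output) => Check::Matches,
            Some(_) => Check::Substituted,
        }
    }
}
```

A result whose call id has no receipt corresponds to a fabricated tool reference, while a known call id with a different digest corresponds to cross-turn result substitution. The medium-term item (trust downgrade in ContentSanitizer) is not shown here.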

Related

Complements the existing security pipeline: ContentSanitizer (injection flags), ExfiltrationGuard, and PiiFilter. This adds a fourth layer for output provenance.

Source: https://arxiv.org/abs/2603.10060

Metadata

Labels

  • P2: High value, medium complexity
  • research: Research-driven improvement
  • tools: Tool execution and MCP integration
