research(tools): NabaOS tool receipt layer — 94.2% hallucination detection at <15ms overhead (arXiv:2603.10060)

## Summary

NabaOS (arXiv:2603.10060, March 2026) is a verification layer that categorizes every agent claim by its epistemic source: direct tool output, inference, external testimony, absence, or unsupported opinion. It generates a cryptographic receipt for each tool call, which prevents the LLM from fabricating or misrepresenting tool results. Evaluated on 1,800 scenarios across 4 languages, it detects **94.2% of fabricated tool references**, 87.6% of count misstatements, and 91.3% of false absence claims, all at under **15ms overhead** versus 180,000ms for ZK-proof alternatives.

## Applicability to Zeph

Direct fit for `zeph-tools`. Zeph's `ToolExecutor` and audit log already capture tool call results. Adding an epistemically tagged receipt wrapper would provide:

1. **Fabrication prevention**: tag each tool result with its provenance (real output vs. LLM inference) and block untrusted claims from entering the context as facts
2. **Audit trail enrichment**: the existing `.local/testing/data/audit-test.jsonl` audit log could carry claim-source tags per tool invocation
3. **Low overhead**: <15ms is well within the tool execution latency budget; no model change required
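
A minimal sketch of the tagging idea in item 1, assuming Zeph is written in Rust. `TaggedResult`, `is_factual`, and `render_for_context` are hypothetical names used for illustration only and are not part of `zeph-tools`. The choice to treat `Cached` and `NotFound` as trusted is also an assumption.

```rust
/// Provenance of a claim about a tool result.
/// Variant names follow the enum proposed in the Implementation Sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClaimSource {
    DirectOutput,
    Cached,
    LlmInference,
    NotFound,
}

impl ClaimSource {
    /// Assumption: anything produced by (or replayed from) a real tool call
    /// counts as factual; only model inference is untrusted.
    pub fn is_factual(self) -> bool {
        !matches!(self, ClaimSource::LlmInference)
    }
}

/// Hypothetical stand-in for Zeph's real tool result type.
#[derive(Debug, Clone)]
pub struct TaggedResult {
    pub tool_name: String,
    pub content: String,
    pub claim_source: ClaimSource,
}

/// Render a result for the model context, marking untrusted claims
/// so they cannot be mistaken for real tool output.
pub fn render_for_context(result: &TaggedResult) -> String {
    if result.claim_source.is_factual() {
        format!("[tool:{}] {}", result.tool_name, result.content)
    } else {
        format!("[unverified:{}] {}", result.tool_name, result.content)
    }
}
```

Marking instead of dropping untrusted results keeps the audit trail complete; a stricter policy could refuse to insert them at all.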

## Implementation Sketch

- **SHORT term** (LOW): add `claim_source: ClaimSource` enum to `ToolResult` — variants: `DirectOutput`, `Cached`, `LlmInference`, `NotFound`
- **MEDIUM term**: expose `claim_source` in debug dumps and audit log; use it in `ContentSanitizer` to downgrade trust of `LlmInference` results
- **LONG term**: implement lightweight receipt hashing to detect cross-turn result substitution
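
The long-term receipt idea could look roughly like the sketch below. This is not the NabaOS scheme from the paper. `Receipt`, `ReceiptLedger`, `record`, and `verify` are hypothetical names, and `DefaultHasher` is a non-cryptographic stand-in chosen to keep the example dependency-free. A real implementation would need a keyed cryptographic MAC.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical receipt issued for one tool call.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Receipt {
    pub call_id: u64,
    pub tool_name: String,
    pub digest: u64,
}

/// Digest over call id, tool name, and raw output.
/// NOTE: DefaultHasher is NOT cryptographically secure; illustration only.
fn digest(call_id: u64, tool_name: &str, output: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    call_id.hash(&mut hasher);
    tool_name.hash(&mut hasher);
    output.hash(&mut hasher);
    hasher.finish()
}

/// In-memory ledger of receipts, kept outside the model's reach.
#[derive(Default)]
pub struct ReceiptLedger {
    next_id: u64,
    receipts: HashMap<u64, Receipt>,
}

impl ReceiptLedger {
    /// Record a real tool output and return its receipt.
    pub fn record(&mut self, tool_name: &str, output: &str) -> Receipt {
        let call_id = self.next_id;
        self.next_id += 1;
        let receipt = Receipt {
            call_id,
            tool_name: tool_name.to_string(),
            digest: digest(call_id, tool_name, output),
        };
        self.receipts.insert(call_id, receipt.clone());
        receipt
    }

    /// Check a later claim ("call N of tool T returned X") against the ledger.
    /// Fails for unknown call ids, wrong tools, and altered outputs.
    pub fn verify(&self, call_id: u64, tool_name: &str, claimed_output: &str) -> bool {
        match self.receipts.get(&call_id) {
            Some(r) => {
                r.tool_name == tool_name
                    && r.digest == digest(call_id, tool_name, claimed_output)
            }
            None => false,
        }
    }
}
```

Binding the call id into the digest is what catches cross-turn substitution: an output copied from one call does not verify against another call's receipt.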

## Related

Complements the existing security pipeline (ContentSanitizer for injection flags, ExfiltrationGuard, and PiiFilter) by adding a fourth layer for output provenance.

Source: https://arxiv.org/abs/2603.10060
