fix(security): add URL grounding gate to prevent fetch hallucination (#2191) by bug-ops · Pull Request #2209 · bug-ops/zeph

bug-ops · 2026-03-27T07:20:31Z

Summary

Fixes #2191 — agent was calling fetch with hallucinated URLs (e.g. https://api.anthropic.ai/v1/models) fabricated from training knowledge when asked about known entities.

Three independent root causes addressed with three-layer defense:

RC-1 (primary): fetch/web_scrape tool descriptions contained no grounding constraint and included a misleading api.example.com example that trained the LLM to construct plausible API endpoints
RC-2: System prompt ## Guidelines had no explicit rule prohibiting URL fabrication for network tools
RC-3: No pre-execution gate to verify a URL was user-provided before executing fetch

Changes

Fix 1 — Tool description hardening (`crates/zeph-tools/src/scrape.rs`)

Rewrote fetch and web_scrape descriptions to explicitly prohibit constructing or inferring URLs from entity names. Removed misleading api.example.com/data.json example.

Fix 2 — System prompt hardening (`crates/zeph-core/src/context.rs`)

Added to BASE_PROMPT_TAIL Guidelines: "Only call fetch or web_scrape with a URL that the user explicitly provided in their message or that appeared in prior tool output. Never fabricate, guess, or infer URLs from entity names."

Fix 3 — `UrlGroundingVerifier` (`crates/zeph-tools/src/verifier.rs`)

New pre-execution verifier (same pattern as DestructiveCommandVerifier and InjectionPatternVerifier):

Blocks fetch, web_scrape, and any *_fetch tool calls when the requested URL was not present in user messages
Returns: "fetch rejected: URL was not provided by the user"
user_provided_urls: Arc<RwLock<HashSet<String>>> in SecurityState — populated from each user turn via extract_flagged_urls, persists across turns, cleared on /clear
Config: [security.pre_execution_verify.url_grounding] with enabled = true default
URL matching: bidirectional prefix to handle sub-path fetches (user provides https://docs.rs/, agent fetches https://docs.rs/tokio/latest/)
Fail-open on poisoned RwLock

Fix 4 — Already present

flagged_urls.clear() between turns was already at tool_execution/legacy.rs:18.

Test plan

Unit tests: 7 new tests in verifier::tests — allow with user URL, block hallucinated URL, block when no URLs at all, allow non-guarded tool, guard *_fetch suffix, allow web_scrape with provided URL, allow prefix match
Regression: ask agent "what do you know about Anthropic?" without providing a URL → fetch must be blocked
Positive: paste https://docs.anthropic.com/ and ask to fetch it → proceeds normally
Multi-turn: provide URL in turn 1, fetch in turn 3 (no URL in turn 3) → proceeds (URL persists in session)
After /clear: URL cleared, same fetch now blocked until re-provided

Known limitation

Skill-generated fetch calls are blocked if the skill URL was not in a user message. Mitigation tracked separately.

…2191) Three-layer defense against the LLM hallucinating API endpoints and calling fetch/web_scrape with fabricated URLs: 1. Tool description hardening: fetch and web_scrape descriptions now explicitly prohibit constructing or inferring URLs from entity names, brand knowledge, or domain patterns. Removed misleading api.example.com example that was training the LLM to fabricate API endpoints. 2. System prompt hardening: add fetch/URL grounding rule to BASE_PROMPT_TAIL Guidelines section — only call fetch/web_scrape with URLs explicitly provided by the user or that appeared in prior tool output. 3. UrlGroundingVerifier: new pre-execution gate in zeph-tools/verifier.rs. Blocks fetch, web_scrape, and any *_fetch tool calls when the requested URL was not present in user messages for the session. Returns a clear error: "fetch rejected: URL was not provided by the user". - user_provided_urls: Arc<RwLock<HashSet<String>>> in SecurityState, populated from each user message via extract_flagged_urls, cleared on /clear, shared with verifier via Arc clone in builder.rs. - Config: [security.pre_execution_verify.url_grounding] enabled=true. - Guarded tools: fetch, web_scrape, plus any tool ending in _fetch. - URL matching: bidirectional prefix — covers sub-path fetches when user provided a root URL. - Fail-open on poisoned RwLock to avoid total tool outage. Fix 4 (flagged_urls.clear between turns) was already present at tool_execution/legacy.rs:18 — no change needed.

) dispatch_slash_command returns early via the handled! macro, so URL extraction at process_user_message was never reached for slash commands. Extract flagged URLs from the trimmed slash command text at the top of dispatch_slash_command so that /browse https://... and similar commands populate user_provided_urls before any early return.

github-actions bot added bug Something isn't working documentation Improvements or additions to documentation rust Rust code changes core zeph-core crate size/L Large PR (201-500 lines) and removed bug Something isn't working labels Mar 27, 2026

test(security): add regression tests for #2191 URL grounding verifier

6ca5f20

github-actions bot added the bug Something isn't working label Mar 27, 2026

bug-ops enabled auto-merge (squash) March 27, 2026 07:32

bug-ops merged commit f1778f4 into main Mar 27, 2026
25 checks passed

bug-ops deleted the 2191-fetch-hallucination branch March 27, 2026 07:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): add URL grounding gate to prevent fetch hallucination (#2191)#2209

fix(security): add URL grounding gate to prevent fetch hallucination (#2191)#2209
bug-ops merged 3 commits intomainfrom
2191-fetch-hallucination

bug-ops commented Mar 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bug-ops commented Mar 27, 2026

Summary

Changes

Fix 1 — Tool description hardening (crates/zeph-tools/src/scrape.rs)

Fix 2 — System prompt hardening (crates/zeph-core/src/context.rs)

Fix 3 — UrlGroundingVerifier (crates/zeph-tools/src/verifier.rs)

Fix 4 — Already present

Test plan

Known limitation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Fix 1 — Tool description hardening (`crates/zeph-tools/src/scrape.rs`)

Fix 2 — System prompt hardening (`crates/zeph-core/src/context.rs`)

Fix 3 — `UrlGroundingVerifier` (`crates/zeph-tools/src/verifier.rs`)