Skip to content

fix(web-fetch): expose ssrfPolicy.allowRfc2544BenchmarkRange config option#25258

Closed
byungsker wants to merge 1 commit intoopenclaw:mainfrom
byungsker:fix/web-fetch-ssrf-allow-rfc2544-benchmark-range
Closed

fix(web-fetch): expose ssrfPolicy.allowRfc2544BenchmarkRange config option#25258
byungsker wants to merge 1 commit intoopenclaw:mainfrom
byungsker:fix/web-fetch-ssrf-allow-rfc2544-benchmark-range

Conversation

@byungsker
Copy link
Copy Markdown

@byungsker byungsker commented Feb 24, 2026

Problem

web_fetch unconditionally applied the SSRF block list including the RFC 2544 benchmark range (198.18.0.0/15). Clash/mihomo and similar proxy tools use this range for fake-ip mode: DNS queries return virtual IPs in 198.18.x.x that are transparently forwarded to the real destination by the kernel.

This made web_fetch completely unusable for users running a local proxy after the RFC 2544 check was added in 2026.2.22-2.

Furthermore, even though SsrFPolicy.allowRfc2544BenchmarkRange already existed and partial wiring was in web-fetch.ts, the ToolsWebFetchSchema Zod schema used .strict() without including ssrfPolicy — so the config field was silently rejected at validation time with:

Unrecognized key: "ssrfPolicy" at path tools.web.fetch

Additionally, readability and firecrawl sub-config were in the same situation: present in types.tools.ts and schema.help.ts but missing from ToolsWebFetchSchema, causing config validation failures.

Fixes #25215

Changes

  • src/config/zod-schema.agent-runtime.ts — Add ssrfPolicy, readability, and firecrawl to ToolsWebFetchSchema (all three caused unrecognized-key errors due to .strict())
  • src/agents/tools/web-fetch.ts — Wire ssrfPolicy.allowRfc2544BenchmarkRange through to fetchWithSsrFGuard
  • src/config/types.tools.ts — TypeScript type for the new config field
  • src/config/schema.help.ts — Help text for tools.web.fetch.ssrfPolicy and tools.web.fetch.ssrfPolicy.allowRfc2544BenchmarkRange
  • src/config/schema.labels.ts — UI labels for the new ssrfPolicy fields
  • src/agents/tools/web-fetch.ssrf.test.ts — Tests for RFC 2544 range allow/block behavior
  • src/config/config.schema-regressions.test.ts — Config regression tests for ssrfPolicy, readability, and firecrawl

Config

Users running Clash/mihomo in fake-ip mode can now add to their config:

{
  "tools": {
    "web": {
      "fetch": {
        "ssrfPolicy": {
          "allowRfc2544BenchmarkRange": true
        }
      }
    }
  }
}

When enabled, the 198.18.0.0/15 range is permitted for web_fetch; all other SSRF protections (private IPs, loopback, cloud-metadata hostnames, etc.) remain active.

Greptile Summary

Added ssrfPolicy.allowRfc2544BenchmarkRange config option to fix web_fetch compatibility with Clash/mihomo proxy fake-ip mode, and fixed schema validation bugs that were silently rejecting ssrfPolicy, readability, and firecrawl config fields.

Key changes:

  • Wired ssrfPolicy.allowRfc2544BenchmarkRange through from config to fetchWithSsrFGuard in web-fetch.ts:539
  • Fixed ToolsWebFetchSchema missing fields (readability, ssrfPolicy, firecrawl) that caused silent rejection with .strict() validation
  • Added comprehensive tests covering RFC 2544 range blocking and allowlist behavior
  • Added config regression tests for all three previously-broken fields

Confidence Score: 5/5

  • Safe to merge with no blocking issues found
  • The implementation correctly wires the new config option through all layers (schema, types, implementation), includes comprehensive tests for all three config fields (ssrfPolicy, readability, firecrawl), and fixes a real schema validation bug where .strict() was silently rejecting valid user config. The security change is conservative (defaults to blocking, requires explicit opt-in), and all SSRF protections remain active when enabled.
  • No files require special attention

Last reviewed commit: 41ab60b

…ption

web_fetch unconditionally applied the SSRF block list, which includes
the RFC 2544 benchmark range (198.18.0.0/15). Clash/mihomo and similar
proxy tools use this range for their fake-ip mode: DNS queries return
virtual IPs in 198.18.x.x that are transparently forwarded to the real
destination by the kernel. The block made web_fetch completely unusable
for these users after the RFC 2544 check was added in 2026.2.22-2.

The underlying SsrFPolicy.allowRfc2544BenchmarkRange option already
existed (used by the Telegram delivery path), and was partially wired
into the web_fetch tool. However, ToolsWebFetchSchema used .strict()
without including ssrfPolicy, so the config field was rejected at
validation time with 'Unrecognized key: ssrfPolicy'.

Fix:
- Add ssrfPolicy, readability, and firecrawl to ToolsWebFetchSchema
  (all three were present in types.tools.ts and schema.help.ts but
   missing from the Zod validation schema, causing config rejections)
- Wire ssrfPolicy.allowRfc2544BenchmarkRange through to fetchWithSsrFGuard
- Add schema help and labels for tools.web.fetch.ssrfPolicy
- Add config regression tests for ssrfPolicy, readability, and firecrawl

When tools.web.fetch.ssrfPolicy.allowRfc2544BenchmarkRange is true,
the 198.18.0.0/15 range is allowed; all other SSRF protections (private
IPs, loopback, cloud-metadata hostnames, etc.) remain active.

Fixes openclaw#25215
@fatoncn
Copy link
Copy Markdown

fatoncn commented Feb 26, 2026

This would fix #25215 for us — we run Clash/mihomo in fake-ip mode and had to pin an older version to work around the SSRF block. Clean approach exposing it as a config option. 👍

@lumenclaw-cloud
Copy link
Copy Markdown

This fix is critical for many users in China who rely on Clash/mihomo with fake-ip mode. The web_fetch tool is completely unusable without this. Hope this can be merged soon! Thanks for the clean implementation.

@openclaw-barnacle
Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added the stale Marked as stale due to inactivity label Mar 3, 2026
@reallinzc
Copy link
Copy Markdown

This fix is critical for many users in China who rely on Clash/mihomo with fake-ip mode. The web_fetch tool is completely unusable without this. Hope this can be merged soon! Thanks for the clean implementation.

@openclaw-barnacle openclaw-barnacle bot removed the stale Marked as stale due to inactivity label Mar 4, 2026
@byungsker

This comment was marked as spam.

@jiesou
Copy link
Copy Markdown

jiesou commented Mar 7, 2026

Closing — issue #25215 is already being addressed by PR #26436. Deferring to avoid duplication.

These are different. The current issue affects the web_fetch tool, while #26436 only affects the discord channel.

@boat2moon
Copy link
Copy Markdown

These are different. The current issue affects the web_fetch tool, while #26436 only affects the discord channel.

These are different. The current issue affects the web_fetch tool, while #26436 only affects the discord channel.

@byungsker

This comment was marked as spam.

@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot bot commented Mar 9, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium SSRF and Firecrawl API key exfiltration via configurable firecrawl.baseUrl

1. 🟡 SSRF and Firecrawl API key exfiltration via configurable firecrawl.baseUrl

Property Value
Severity Medium
CWE CWE-918
Location src/agents/tools/web-fetch.ts:368-387

Description

tools.web.fetch.firecrawl.baseUrl is now accepted as an arbitrary string in the runtime config schema, and is used to construct the Firecrawl API endpoint without SSRF protections or allowlisting.

Impact:

  • SSRF/internal network access: If an attacker can influence runtime config (multi-tenant config, untrusted overrides, etc.), they can set firecrawl.baseUrl to internal hosts/IPs (e.g. http://127.0.0.1:..., http://169.254.169.254/...) and the server will fetch() it.
  • Credential leakage: The request includes Authorization: Bearer <firecrawl apiKey>. If baseUrl is attacker-controlled, the Firecrawl API key will be sent to the attacker-controlled endpoint.

Vulnerable flow:

  • input: tools.web.fetch.firecrawl.baseUrl (schema: z.string().optional(); no URL/host/scheme validation)
  • sink: fetch(endpoint, { headers: { Authorization: ... }}) to the computed endpoint

Vulnerable code:

const endpoint = resolveFirecrawlEndpoint(params.baseUrl);
const res = await fetch(endpoint, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${params.apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify(body),
});

Recommendation

Treat firecrawl.baseUrl as security-sensitive.

Preferred (most secure): disallow arbitrary baseUrl

  • Only allow the default SaaS host (e.g. api.firecrawl.dev) or a small administrator-controlled allowlist.

If custom base URLs must be supported (self-hosted Firecrawl):

  1. Validate and normalize the URL at config-parse time:
    • require https: (or explicitly allow http: only for local/dev)
    • block localhost, link-local, private IP ranges, and other special-use ranges
    • optionally require hostnames to match an allowlist
  2. Use the existing SSRF guard for the Firecrawl call as well.

Example hardening:

import { fetchWithSsrFGuard } from "../../infra/net/fetch-guard.js";// Validate baseUrl (https + allowlisted hostname) before use.
const endpoint = resolveFirecrawlEndpoint(validatedBaseUrl);

const { response } = await fetchWithSsrFGuard({
  url: endpoint,
  init: {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  },// ensure private/internal ranges remain blocked for this config-driven request
  policy: { dangerouslyAllowPrivateNetwork: false },
  auditContext: "firecrawl",
});

Additionally:

  • If baseUrl is not on an allowlist, do not send the API key (fail closed) to prevent exfiltration.

Analyzed PR: #25258 at commit 41ab60b

Last updated on: 2026-03-09T13:51:05Z

@openclaw-barnacle
Copy link
Copy Markdown

Closing this PR because the author has more than 10 active PRs in this repo. Please reduce the active PR queue and reopen or resubmit once it is back under the limit. You can close your own PRs to get back under the limit.

@openclaw-barnacle openclaw-barnacle bot closed this Mar 9, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR fixes two distinct bugs: (1) ToolsWebFetchSchema used Zod's .strict() without declaring readability, firecrawl, or ssrfPolicy, causing those keys to be silently rejected at validation time with "Unrecognized key" errors; and (2) even though the SsrFPolicy.allowRfc2544BenchmarkRange field already existed in the SSRF guard, it was never wired through from the user config, making web_fetch unconditionally block the 198.18.0.0/15 benchmark range used by Clash/mihomo fake-ip mode.

  • Added readability, ssrfPolicy, and firecrawl to ToolsWebFetchSchema in zod-schema.agent-runtime.ts, fixing the silent config rejection
  • Wired fetch.ssrfPolicy.allowRfc2544BenchmarkRange from config through createWebFetchToolrunWebFetchfetchWithSsrFGuard's policy parameter
  • Added the ssrfPolicy field to ToolsConfig in types.tools.ts
  • Added help text and UI labels for the new ssrfPolicy fields
  • Added schema regression tests covering all three previously-broken config keys (ssrfPolicy, readability, firecrawl)
  • Added unit tests for default-block, explicit-allow, and explicit-disable behaviors of the RFC 2544 range

Confidence Score: 5/5

  • This PR is safe to merge — it fixes genuine config-rejection and SSRF-bypass bugs with appropriate opt-in gating, tests, and documentation.
  • The fix is minimal and targeted: the Zod schema additions directly mirror the existing TypeScript types, the new runtime parameter is a simple boolean passed through an already-established policy interface, the opt-in flag is false by default so all existing SSRF protections remain unchanged, and new tests cover default, enabled, and disabled paths. No regressions are expected.
  • No files require special attention.

Last reviewed commit: 41ab60b

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR fixes a real-world breakage where web_fetch was completely unusable for users running Clash/mihomo in fake-ip mode. It does this by exposing the pre-existing SsrFPolicy.allowRfc2544BenchmarkRange option all the way through the config/schema/runtime stack, and simultaneously fixes a latent schema bug where ssrfPolicy, readability, and firecrawl sub-configs were silently rejected at validation time due to ToolsWebFetchSchema using .strict() without declaring those fields.

  • Schema fix: readability, ssrfPolicy, and firecrawl added to ToolsWebFetchSchema — closes the "Unrecognized key" validation errors.
  • Runtime wiring: allowRfc2544BenchmarkRange is read from fetch.ssrfPolicy and passed through to fetchWithSsrFGuard; defaults safely to false so no existing SSRF behaviour changes.
  • Type / help / labels: types.tools.ts, schema.help.ts, and schema.labels.ts updated consistently.
  • Tests: SSRF test file covers default-block, explicit-allow, and explicit-deny paths; schema regression tests confirm all three newly-added sub-configs are accepted.
  • Minor style note: The inline resolution of allowRfc2544BenchmarkRange (lines 773–780 of web-fetch.ts) uses a verbose "key" in obj guard chain that is equivalent to the simpler fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true; the existing resolveFetch* helper pattern in the same file would be more consistent.

Confidence Score: 5/5

  • Safe to merge — the change is opt-in, defaults to false, and all existing SSRF protections remain active unless explicitly overridden.
  • The feature is a clearly scoped opt-in config flag with no change to default behaviour. The root-cause bug (missing fields in .strict() schema) is straightforwardly fixed, tests cover the critical paths, and the wiring is consistent with how the rest of fetchWithSsrFGuard is parameterised. The only finding is a verbose but logically correct inline expression that could be simplified.
  • No files require special attention; the minor verbosity in src/agents/tools/web-fetch.ts lines 773–780 is style-only.

Last reviewed commit: 41ab60b

Comment on lines +773 to +780
allowRfc2544BenchmarkRange:
fetch &&
"ssrfPolicy" in fetch &&
typeof fetch.ssrfPolicy === "object" &&
fetch.ssrfPolicy !== null &&
"allowRfc2544BenchmarkRange" in fetch.ssrfPolicy
? fetch.ssrfPolicy.allowRfc2544BenchmarkRange === true
: false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overly verbose runtime check — simpler equivalent available

The inline guard chain (fetch && "ssrfPolicy" in fetch && typeof … === "object" && … !== null && "allowRfc2544BenchmarkRange" in fetch.ssrfPolicy) is considerably more verbose than necessary. Since fetch is typed as WebFetchConfig which now includes ssrfPolicy?: { allowRfc2544BenchmarkRange?: boolean }, TypeScript already guarantees the shape. Other boolean flags in this file (e.g. readabilityEnabled via resolveFetchReadabilityEnabled) use a lightweight resolver or a simple optional-chain expression. The entire expression collapses to:

Suggested change
allowRfc2544BenchmarkRange:
fetch &&
"ssrfPolicy" in fetch &&
typeof fetch.ssrfPolicy === "object" &&
fetch.ssrfPolicy !== null &&
"allowRfc2544BenchmarkRange" in fetch.ssrfPolicy
? fetch.ssrfPolicy.allowRfc2544BenchmarkRange === true
: false,
allowRfc2544BenchmarkRange: fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true,

This is semantically identical (returns false for undefined, null, missing key, or an explicit false) and aligns with how other optional boolean flags are resolved in the codebase.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/tools/web-fetch.ts
Line: 773-780

Comment:
**Overly verbose runtime check — simpler equivalent available**

The inline guard chain (`fetch && "ssrfPolicy" in fetch && typeof … === "object" && … !== null && "allowRfc2544BenchmarkRange" in fetch.ssrfPolicy`) is considerably more verbose than necessary. Since `fetch` is typed as `WebFetchConfig` which now includes `ssrfPolicy?: { allowRfc2544BenchmarkRange?: boolean }`, TypeScript already guarantees the shape. Other boolean flags in this file (e.g. `readabilityEnabled` via `resolveFetchReadabilityEnabled`) use a lightweight resolver or a simple optional-chain expression. The entire expression collapses to:

```suggestion
        allowRfc2544BenchmarkRange: fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true,
```

This is semantically identical (returns `false` for `undefined`, `null`, missing key, or an explicit `false`) and aligns with how other optional boolean flags are resolved in the codebase.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR fixes a real regression where web_fetch was completely unusable for users running a local proxy (e.g. Clash/mihomo in fake-ip mode) after the RFC 2544 benchmark range (198.18.0.0/15) was added to the SSRF block list. It also fixes a separate but related issue where three fields — ssrfPolicy, readability, and firecrawl — were silently rejected at config validation time because ToolsWebFetchSchema used .strict() without declaring them.

Key changes:

  • Schema fix: ToolsWebFetchSchema now includes readability, ssrfPolicy, and firecrawl — resolving the "Unrecognized key" errors on all three fields.
  • New config option: tools.web.fetch.ssrfPolicy.allowRfc2544BenchmarkRange can be set to true to permit connections to 198.18.0.0/15; all other SSRF protections (loopback, private ranges, cloud-metadata hostnames, etc.) remain active.
  • Test coverage: Three new SSRF tests cover the default-block, explicit-allow, and explicit-disable behaviour for RFC 2544 IP literals. Three new schema regression tests confirm all previously-rejected config keys are now accepted.
  • Minor style inconsistency: allowRfc2544BenchmarkRange is computed inline inside the execute callback using a verbose runtime type-inspection pattern, while every other config-derived value (readabilityEnabled, firecrawlEnabled, userAgent, etc.) is pre-computed outside the callback. Now that ssrfPolicy is a properly typed field on WebFetchConfig, the check could be simplified to fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true and extracted alongside the other const declarations.

Confidence Score: 4/5

  • Safe to merge; the security boundary is preserved — only the RFC 2544 range is conditionally permitted and all other SSRF protections remain active.
  • The schema fix and type additions are straightforward and well-tested. The SSRF wiring is correct end-to-end. The only deduction is a minor style inconsistency in how allowRfc2544BenchmarkRange is computed (inline in the execute callback using verbose runtime checks) compared to the established pattern for every other config value in the same function.
  • src/agents/tools/web-fetch.ts — the inline allowRfc2544BenchmarkRange computation at lines 773–780 is inconsistent with the rest of the function's config-resolution pattern.

Last reviewed commit: 41ab60b

Comment on lines +773 to +780
allowRfc2544BenchmarkRange:
fetch &&
"ssrfPolicy" in fetch &&
typeof fetch.ssrfPolicy === "object" &&
fetch.ssrfPolicy !== null &&
"allowRfc2544BenchmarkRange" in fetch.ssrfPolicy
? fetch.ssrfPolicy.allowRfc2544BenchmarkRange === true
: false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistency: config value computed inside execute callback

All other config-derived values (readabilityEnabled, firecrawlEnabled, firecrawlApiKey, userAgent, etc.) are pre-computed outside the execute callback and simply referenced inside it. allowRfc2544BenchmarkRange is the only one that is computed inline on every execute call.

Additionally, since ssrfPolicy is now a properly-typed field on WebFetchConfig (added to types.tools.ts in this PR), the verbose runtime inspection is unnecessary — TypeScript already knows the field exists. This could be simplified to a single line, extracted before the callback like all other values:

Suggested change
allowRfc2544BenchmarkRange:
fetch &&
"ssrfPolicy" in fetch &&
typeof fetch.ssrfPolicy === "object" &&
fetch.ssrfPolicy !== null &&
"allowRfc2544BenchmarkRange" in fetch.ssrfPolicy
? fetch.ssrfPolicy.allowRfc2544BenchmarkRange === true
: false,
allowRfc2544BenchmarkRange: fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true,

For full consistency with the rest of the function, the extraction should be placed alongside the other const declarations above the return { ... } block:

const allowRfc2544BenchmarkRange = fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true;

and then just reference allowRfc2544BenchmarkRange in the runWebFetch call.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/tools/web-fetch.ts
Line: 773-780

Comment:
**Inconsistency: config value computed inside execute callback**

All other config-derived values (`readabilityEnabled`, `firecrawlEnabled`, `firecrawlApiKey`, `userAgent`, etc.) are pre-computed outside the `execute` callback and simply referenced inside it. `allowRfc2544BenchmarkRange` is the only one that is computed inline on every execute call.

Additionally, since `ssrfPolicy` is now a properly-typed field on `WebFetchConfig` (added to `types.tools.ts` in this PR), the verbose runtime inspection is unnecessary — TypeScript already knows the field exists. This could be simplified to a single line, extracted before the callback like all other values:

```suggestion
        allowRfc2544BenchmarkRange: fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true,
```

For full consistency with the rest of the function, the extraction should be placed alongside the other `const` declarations above the `return { ... }` block:
```ts
const allowRfc2544BenchmarkRange = fetch?.ssrfPolicy?.allowRfc2544BenchmarkRange === true;
```
and then just reference `allowRfc2544BenchmarkRange` in the `runWebFetch` call.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling r: too-many-prs size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

web_fetch SSRF check blocks Clash/mihomo fake-ip range (198.18.0.0/15) — regression in 2026.2.22-2

6 participants