Skip to content

fix(browser): surface 429 rate limit errors with actionable hints#40491

Merged
altaywtf merged 6 commits intoopenclaw:mainfrom
mvanhorn:fix/browserbase-429-hints
Mar 10, 2026
Merged

fix(browser): surface 429 rate limit errors with actionable hints#40491
altaywtf merged 6 commits intoopenclaw:mainfrom
mvanhorn:fix/browserbase-429-hints

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

@mvanhorn mvanhorn commented Mar 9, 2026

Fixes #40390

Summary

When Browserbase returns 429 (max concurrent sessions exceeded), users see a generic "Browser control failed" message leading to unnecessary gateway restarts and agent retry loops.

This PR detects 429 responses and surfaces actionable error messages with explicit "do not retry" guidance.

Changes

  • src/browser/client-fetch.ts - Detect 429 in both HTTP and dispatcher paths, throw BrowserServiceError with rate-limit message and BROWSER_TOOL_MODEL_HINT
  • src/browser/cdp.helpers.ts - Detect 429 in fetchCdpChecked with descriptive error
  • src/browser/pw-session.ts - Break out of connectBrowser retry loop on rate-limit errors (prevents retry amplification)
  • src/browser/client-fetch.loopback-auth.test.ts - 4 test cases for 429 detection

Error message

Browserbase rate limit reached (max concurrent sessions).
Wait for the current session to complete, or upgrade your plan.
Do NOT retry the browser tool - it will keep failing.

This contribution was developed with AI assistance (Claude Code).

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR improves observability when Browserbase returns HTTP 429 (max concurrent sessions exceeded) by surfacing actionable, no-retry error messages instead of the previous generic "Browser control failed" text. The change touches three layers of the browser stack — the HTTP/dispatcher fetch client, the CDP helper used for WebSocket-endpoint discovery, and the Playwright session retry loop — and adds four test cases covering both fetch paths.

Key changes:

  • client-fetch.ts: Adds BROWSER_RATE_LIMIT_MESSAGE constant + isRateLimitStatus helper; both fetchHttpJson and the dispatcher branch now throw BrowserServiceError with the rate-limit message on 429.
  • cdp.helpers.ts: fetchCdpChecked detects 429 before the generic HTTP ${status} throw, but the message is hand-rolled rather than shared — it differs in structure and no-retry wording from client-fetch.ts.
  • pw-session.ts: Breaks out of the connectBrowser retry loop on rate-limit errors; the errMsg.includes("429") substring check is overly broad and could suppress retries for unrelated errors containing that token.
  • client-fetch.loopback-auth.test.ts: Four new tests cover HTTP-path 429 (with/without body), dispatcher-path 429, and a regression guard for non-429 errors.

Confidence Score: 4/5

  • This PR is safe to merge; the logic is correct and the two flagged issues are style/maintenance concerns with low runtime impact.
  • The core 429 detection and error-propagation logic is sound across all three layers, and the new tests validate the primary paths. Score is 4 rather than 5 because of two minor issues: the errMsg.includes("429") substring check in pw-session.ts could theoretically suppress retries on unrelated errors, and the rate-limit message in cdp.helpers.ts is a duplicate with slightly different wording/structure that will drift over time.
  • Minor attention on src/browser/pw-session.ts (fragile "429" substring match) and src/browser/cdp.helpers.ts (message duplication/inconsistency).

Last reviewed commit: 91a553f

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 91a553f70a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

lastErr = err;
// Don't retry rate-limit errors; retrying worsens the 429.
const errMsg = err instanceof Error ? err.message : String(err);
if (errMsg.includes("rate limit") || errMsg.includes("429")) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict retry break to true HTTP 429 rate-limit errors

The new retry guard treats any error message containing "429" as rate limiting, which can match unrelated transient connection failures that include a port or URL segment like :42900 (for example ECONNREFUSED 127.0.0.1:42900). In those cases connectBrowser now exits after the first failure instead of retrying up to 3 times, reducing robustness for valid CDP endpoints that happen to include 429 in their address.

Useful? React with 👍 / 👎.

Comment on lines +369 to +371
if (errMsg.includes("rate limit") || errMsg.includes("429")) {
break;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overly broad "429" substring match risks false positives

errMsg.includes("429") can match error messages that contain 429 for unrelated reasons — e.g., a timeout value like "timed out after 4290ms", a URL path like /api/v429/..., or a port number in a connection string. A false match would silently suppress retries for a transient (non-rate-limit) failure.

The errMsg.includes("rate limit") check is already sufficient: every 429 thrown by fetchCdpChecked and fetchBrowserJson in this codebase now carries "rate limit" in the message. The "429" fallback is only needed for raw Playwright errors, but those are unlikely to surface a plain "429" token in their messages.

Consider removing the "429" check and relying solely on the "rate limit" phrase, or using a more specific pattern like errMsg.includes("HTTP 429"):

Suggested change
if (errMsg.includes("rate limit") || errMsg.includes("429")) {
break;
}
const errMsg = err instanceof Error ? err.message : String(err);
if (errMsg.includes("rate limit")) {
break;
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/browser/pw-session.ts
Line: 369-371

Comment:
**Overly broad `"429"` substring match risks false positives**

`errMsg.includes("429")` can match error messages that contain `429` for unrelated reasons — e.g., a timeout value like `"timed out after 4290ms"`, a URL path like `/api/v429/...`, or a port number in a connection string. A false match would silently suppress retries for a transient (non-rate-limit) failure.

The `errMsg.includes("rate limit")` check is already sufficient: every 429 thrown by `fetchCdpChecked` and `fetchBrowserJson` in this codebase now carries `"rate limit"` in the message. The `"429"` fallback is only needed for raw Playwright errors, but those are unlikely to surface a plain `"429"` token in their messages.

Consider removing the `"429"` check and relying solely on the `"rate limit"` phrase, or using a more specific pattern like `errMsg.includes("HTTP 429")`:

```suggestion
        const errMsg = err instanceof Error ? err.message : String(err);
        if (errMsg.includes("rate limit")) {
          break;
        }
```

How can I resolve this? If you propose a fix, please make it concise.

@wirelessjoe
Copy link
Copy Markdown

Thanks for picking this up! The implementation looks solid.

A few observations from my testing that led to filing the original issue:

  1. The error message is exactly what I needed — "Do NOT retry" is critical because agents will otherwise loop indefinitely burning tokens.
  2. Question: Does this also catch the case where Browserbase returns 429 during session creation (via Stagehand init())? That's where I hit it most often with cron jobs competing for the single free-tier slot.
  3. Minor: Would it be worth including the current session limit in the error message if Browserbase returns it? e.g., "max concurrent sessions: 1" — helps users know if upgrading would even help.

Happy to test this on my setup (Docker/GCP, cron jobs using Stagehand + Browserbase) once it's ready for review.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

mvanhorn commented Mar 9, 2026

Thanks @wirelessjoe! Glad the error messages match what you needed. Your original issue report was really clear - made it easy to reproduce and fix.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Addressed bot feedback in 3702810:

P2 - "429" substring false positives: Removed errMsg.includes("429") from the retry-skip check in pw-session.ts. The errMsg.includes("rate limit") check alone is sufficient since every 429 thrown by fetchCdpChecked and fetchBrowserJson includes "rate limit" in the message. This avoids false positives from errors containing "429" in non-rate-limit contexts (e.g. "timed out after 4290ms").

P2 - Inconsistent rate-limit messages: Exported BROWSER_RATE_LIMIT_MESSAGE from client-fetch.ts and imported it in cdp.helpers.ts to replace the hand-rolled message, ensuring consistent wording across all 429 error paths.

@altaywtf altaywtf self-assigned this Mar 10, 2026
@altaywtf
Copy link
Copy Markdown
Member

This new 429 path reflects upstream response text directly into the thrown error in src/browser/cdp.helpers.ts:176-180. That text is provider-controlled for remote CDP/browser endpoints, so it should not be embedded verbatim in the error surface.

That message is later logged and returned to the tool caller unchanged in src/agents/pi-tool-definition-adapter.ts:179-188, which creates an avoidable log / agent-output injection surface. It also reads the full body before slicing, so I think this should use a stable message instead, or at most a bounded and sanitized snippet / request-id style metadata.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Good catch, @altaywtf - you're right that reflecting upstream response text verbatim into the error surface is an unnecessary injection risk.

Fixed in 490b675: removed the ${detail} interpolation from all three 429 error paths (cdp.helpers.ts and both client-fetch.ts callers). The stable BROWSER_RATE_LIMIT_MESSAGE already tells the user/agent exactly what happened - the provider-controlled body text added no actionable information.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 490b67536a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@altaywtf
Copy link
Copy Markdown
Member

@mvanhorn sir... your claude is not running format before pushing commits 😅

@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot bot commented Mar 10, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium Undrained fetch Response body on HTTP 429 in fetchCdpChecked can exhaust connections (DoS)

1. 🟡 Undrained fetch Response body on HTTP 429 in fetchCdpChecked can exhaust connections (DoS)

Property Value
Severity Medium
CWE CWE-404
Location src/browser/cdp.helpers.ts:175-180

Description

fetchCdpChecked throws on HTTP 429 without consuming or cancelling the response body.

In Node.js (v22+), fetch is backed by Undici. Undici will generally not reuse/release the underlying connection back to the pool until the response body is fully consumed or the body stream is cancelled. If many 429s occur (e.g., upstream rate limiting), this can lead to:

  • connection pool exhaustion / socket leak (connections stuck in-use)
  • increased memory usage from uncollected streams
  • eventual denial of service (subsequent fetches hang/fail)

This risk is amplified because fetchCdpChecked is used by CDP polling paths (e.g., reachability checks) and may be called repeatedly in loops; repeated 429s can accumulate leaked/undrained bodies.

Vulnerable code:

if (!res.ok) {
  if (res.status === 429) {// Do not reflect upstream response text into the error surface (log/agent injection risk)
    throw new Error(`${resolveBrowserRateLimitMessage(url)} Do NOT retry the browser tool.`);
  }
  throw new Error(`HTTP ${res.status}`);
}

Note: this repo already introduced a best-effort body cancellation helper (discardResponseBody) for the same 429 scenario in src/browser/client-fetch.ts, suggesting this is an expected mitigation but was missed here.

Recommendation

Before throwing, cancel (or drain) the response body so the underlying socket can be reused.

Minimal safe fix (best-effort cancel) for the new 429 branch:

if (!res.ok) {
  if (res.status === 429) {
    try {
      await res.body?.cancel();
    } catch {// best-effort
    }
    throw new Error(`${resolveBrowserRateLimitMessage(url)} Do NOT retry the browser tool.`);
  }
  throw new Error(`HTTP ${res.status}`);
}

Better: apply the same pattern for all non-OK responses (either await res.text().catch(() => "") without reflecting it, or await res.body?.cancel()), to prevent similar leaks on other error statuses.


Analyzed PR: #40491 at commit 13839c2

Last updated on: 2026-03-10T22:03:48Z

@openclaw-barnacle openclaw-barnacle bot added the cli CLI command changes label Mar 10, 2026
@altaywtf altaywtf force-pushed the fix/browserbase-429-hints branch from f2497e2 to d844788 Compare March 10, 2026 21:16
@openclaw-barnacle openclaw-barnacle bot removed the cli CLI command changes label Mar 10, 2026
mvanhorn and others added 6 commits March 11, 2026 00:48
Detect HTTP 429 responses from Browserbase at the fetch level and throw
errors with actionable messages instead of generic 'HTTP 429'. Break out
of the CDP retry loop on rate-limit errors to avoid amplifying the 429.

Closes openclaw#40390

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Remove errMsg.includes("429") fallback from retry-skip logic in
  pw-session.ts; the "rate limit" check alone is sufficient and avoids
  false positives from unrelated errors containing "429" (e.g. timeouts,
  port numbers, API paths)
- Export BROWSER_RATE_LIMIT_MESSAGE from client-fetch.ts and use it in
  cdp.helpers.ts instead of hand-rolled message for consistent wording

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Stop reflecting provider-controlled response body into thrown errors
on HTTP 429. The upstream text was embedded in error messages that
reach logs and agent tool output, creating an avoidable log/output
injection surface for remote CDP/browser endpoints.

The stable BROWSER_RATE_LIMIT_MESSAGE already provides all the
actionable information the user needs.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@altaywtf altaywtf force-pushed the fix/browserbase-429-hints branch from 5de372f to 13839c2 Compare March 10, 2026 21:48
@altaywtf altaywtf merged commit 5ed96da into openclaw:main Mar 10, 2026
26 checks passed
@altaywtf
Copy link
Copy Markdown
Member

Merged via squash.

Thanks @mvanhorn!

mrosmarin added a commit to mrosmarin/openclaw that referenced this pull request Mar 10, 2026
* main: (42 commits)
  test: share runtime group policy fallback cases
  refactor: share windows command shim resolution
  refactor: share approval gateway client setup
  refactor: share telegram payload send flow
  refactor: share passive account lifecycle helpers
  refactor: share channel config schema fragments
  refactor: share channel config security scaffolding
  refactor: share onboarding secret prompt flows
  refactor: share scoped account config patching
  feat(discord): add autoArchiveDuration config option (openclaw#35065)
  fix(gateway): harden token fallback/reconnect behavior and docs (openclaw#42507)
  fix(acp): strip provider auth env for child ACP processes (openclaw#42250)
  fix(browser): surface 429 rate limit errors with actionable hints (openclaw#40491)
  fix(acp): scope cancellation and event routing by runId (openclaw#41331)
  docs: require codex review in contributing guide (openclaw#42503)
  Fix stale runtime model reuse on session reset (openclaw#41173)
  docs: document r: spam auto-close label
  fix(ci): auto-close and lock r: spam items
  fix(acp): implicit streamToParent for mode=run without thread (openclaw#42404)
  test: extract sendpayload outbound contract suite
  ...
gumadeiras pushed a commit to BillChirico/openclaw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
frankekn pushed a commit to MoerAI/openclaw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
frankekn pushed a commit to Effet/openclaw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
frankekn pushed a commit to ImLukeF/openclaw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
Treedy2020 pushed a commit to Treedy2020/openclaw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
dhoman pushed a commit to dhoman/chrono-claw that referenced this pull request Mar 11, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
ahelpercn pushed a commit to ahelpercn/openclaw that referenced this pull request Mar 12, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
Ruijie-Ysp pushed a commit to Ruijie-Ysp/clawdbot that referenced this pull request Mar 12, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
leozhengliu-pixel pushed a commit to leozhengliu-pixel/openclaw that referenced this pull request Mar 13, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
senw-developers pushed a commit to senw-developers/va-openclaw that referenced this pull request Mar 17, 2026
…enclaw#40491)

Merged via squash.

Prepared head SHA: 13839c2
Co-authored-by: mvanhorn <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Surface Browserbase rate limit errors (429) with actionable hints

3 participants