fix(ai): fail fast on long quota delays in google-gemini-cli#504
Closed
mukhtharcm wants to merge 1 commit intobadlogic:mainfrom
Closed
fix(ai): fail fast on long quota delays in google-gemini-cli#504mukhtharcm wants to merge 1 commit intobadlogic:mainfrom
mukhtharcm wants to merge 1 commit intobadlogic:mainfrom
Conversation
When Antigravity API returns a rate limit error with a long retry delay (e.g., 12+ minutes for quota exhaustion), don't wait inside the retry loop. Instead, throw immediately so callers (like Clawdbot) can rotate to another account. Previously, if the API said 'wait 12 minutes', the code would sleep for 12 minutes inside the request, causing the caller's timeout to fire. Now, any delay > 30 seconds triggers an immediate error with a clear message indicating it's a rate limit issue. This enables multi-account setups to gracefully handle per-account quota limits by quickly failing over to accounts that aren't exhausted.
Owner
|
No, that is something clawdbot has to implement on its side. The sleep receives an abort signal. Use that to abort after the clawdbot specific timeout has passed. I'm afraid I will not add clawdbot specific "fixes" that can be done without modifications to pi. |
mukhtharcm
added a commit
to mukhtharcm/clawdbot
that referenced
this pull request
Jan 6, 2026
This commit fixes several issues with multi-account OAuth rotation that
were causing slow responses and inefficient account cycling.
## Changes
### 1. Fix usageStats race condition (auth-profiles.ts)
The `markAuthProfileUsed`, `markAuthProfileCooldown`, `markAuthProfileGood`,
and `clearAuthProfileCooldown` functions were using a stale in-memory store
passed as a parameter. Long-running sessions would overwrite usageStats
updates from concurrent sessions when saving.
**Fix:** Re-read the store from disk before each update to get fresh
usageStats from other sessions, then merge the update.
### 2. Capture AbortError from waitForCompactionRetry (pi-embedded-runner.ts)
When a request timed out, `session.abort()` was called which throws an
`AbortError`. The code structure was:
```javascript
try {
await session.prompt(params.prompt);
} catch (err) {
promptError = err; // Catches AbortError here
}
await waitForCompactionRetry(); // But THIS also throws AbortError!
```
The second `AbortError` from `waitForCompactionRetry()` escaped and
bypassed the rotation/fallback logic entirely.
**Fix:** Wrap `waitForCompactionRetry()` in its own try/catch to capture
the error as `promptError`, enabling proper timeout handling.
Root cause analysis and fix proposed by @erikpr1994 in openclaw#313.
Fixes openclaw#313
### 3. Fail fast on 429 rate limits (pi-ai patch)
The pi-ai library was retrying 429 errors up to 3 times with exponential
backoff before throwing. This meant a rate-limited account would waste
30+ seconds retrying before our rotation code could try the next account.
**Fix:** Patch google-gemini-cli.js to:
- Throw immediately on first 429 (no retries)
- Not catch and retry 429 errors in the network error handler
This allows the caller to rotate to the next account instantly on rate limit.
Note: We submitted this fix upstream (badlogic/pi-mono#504)
but it was closed without merging. Keeping as a local patch for now.
## Testing
With 6 Antigravity accounts configured:
- Accounts rotate properly based on lastUsed (round-robin)
- 429s trigger immediate rotation to next account
- usageStats persist correctly across concurrent sessions
- Cooldown tracking works as expected
## Before/After
**Before:** Multiple 429 retries on same account, 30-90s delays
**After:** Instant rotation on 429, responses in seconds
steipete
pushed a commit
to openclaw/openclaw
that referenced
this pull request
Jan 7, 2026
This commit fixes several issues with multi-account OAuth rotation that
were causing slow responses and inefficient account cycling.
## Changes
### 1. Fix usageStats race condition (auth-profiles.ts)
The `markAuthProfileUsed`, `markAuthProfileCooldown`, `markAuthProfileGood`,
and `clearAuthProfileCooldown` functions were using a stale in-memory store
passed as a parameter. Long-running sessions would overwrite usageStats
updates from concurrent sessions when saving.
**Fix:** Re-read the store from disk before each update to get fresh
usageStats from other sessions, then merge the update.
### 2. Capture AbortError from waitForCompactionRetry (pi-embedded-runner.ts)
When a request timed out, `session.abort()` was called which throws an
`AbortError`. The code structure was:
```javascript
try {
await session.prompt(params.prompt);
} catch (err) {
promptError = err; // Catches AbortError here
}
await waitForCompactionRetry(); // But THIS also throws AbortError!
```
The second `AbortError` from `waitForCompactionRetry()` escaped and
bypassed the rotation/fallback logic entirely.
**Fix:** Wrap `waitForCompactionRetry()` in its own try/catch to capture
the error as `promptError`, enabling proper timeout handling.
Root cause analysis and fix proposed by @erikpr1994 in #313.
Fixes #313
### 3. Fail fast on 429 rate limits (pi-ai patch)
The pi-ai library was retrying 429 errors up to 3 times with exponential
backoff before throwing. This meant a rate-limited account would waste
30+ seconds retrying before our rotation code could try the next account.
**Fix:** Patch google-gemini-cli.js to:
- Throw immediately on first 429 (no retries)
- Not catch and retry 429 errors in the network error handler
This allows the caller to rotate to the next account instantly on rate limit.
Note: We submitted this fix upstream (badlogic/pi-mono#504)
but it was closed without merging. Keeping as a local patch for now.
## Testing
With 6 Antigravity accounts configured:
- Accounts rotate properly based on lastUsed (round-robin)
- 429s trigger immediate rotation to next account
- usageStats persist correctly across concurrent sessions
- Cooldown tracking works as expected
## Before/After
**Before:** Multiple 429 retries on same account, 30-90s delays
**After:** Instant rotation on 429, responses in seconds
zooqueen
pushed a commit
to hanzoai/bot
that referenced
this pull request
Mar 6, 2026
This commit fixes several issues with multi-account OAuth rotation that
were causing slow responses and inefficient account cycling.
## Changes
### 1. Fix usageStats race condition (auth-profiles.ts)
The `markAuthProfileUsed`, `markAuthProfileCooldown`, `markAuthProfileGood`,
and `clearAuthProfileCooldown` functions were using a stale in-memory store
passed as a parameter. Long-running sessions would overwrite usageStats
updates from concurrent sessions when saving.
**Fix:** Re-read the store from disk before each update to get fresh
usageStats from other sessions, then merge the update.
### 2. Capture AbortError from waitForCompactionRetry (pi-embedded-runner.ts)
When a request timed out, `session.abort()` was called which throws an
`AbortError`. The code structure was:
```javascript
try {
await session.prompt(params.prompt);
} catch (err) {
promptError = err; // Catches AbortError here
}
await waitForCompactionRetry(); // But THIS also throws AbortError!
```
The second `AbortError` from `waitForCompactionRetry()` escaped and
bypassed the rotation/fallback logic entirely.
**Fix:** Wrap `waitForCompactionRetry()` in its own try/catch to capture
the error as `promptError`, enabling proper timeout handling.
Root cause analysis and fix proposed by @erikpr1994 in openclaw#313.
Fixes openclaw#313
### 3. Fail fast on 429 rate limits (pi-ai patch)
The pi-ai library was retrying 429 errors up to 3 times with exponential
backoff before throwing. This meant a rate-limited account would waste
30+ seconds retrying before our rotation code could try the next account.
**Fix:** Patch google-gemini-cli.js to:
- Throw immediately on first 429 (no retries)
- Not catch and retry 429 errors in the network error handler
This allows the caller to rotate to the next account instantly on rate limit.
Note: We submitted this fix upstream (badlogic/pi-mono#504)
but it was closed without merging. Keeping as a local patch for now.
## Testing
With 6 Antigravity accounts configured:
- Accounts rotate properly based on lastUsed (round-robin)
- 429s trigger immediate rotation to next account
- usageStats persist correctly across concurrent sessions
- Cooldown tracking works as expected
## Before/After
**Before:** Multiple 429 retries on same account, 30-90s delays
**After:** Instant rotation on 429, responses in seconds
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
When Antigravity API returns a rate limit error with a long retry delay (e.g.,
Your quota will reset after 12m10s), the code wouldsleep()for the full duration inside the retry loop.This caused issues for callers with shorter timeouts (like Clawdbot's 90-second timeout):
extractRetryDelay()returns 731,000mssleep(731000)starts waitingSolution
Don't wait for delays longer than 30 seconds. Instead, throw an error immediately with a clear message indicating it's a rate limit issue. This lets callers like Clawdbot quickly rotate to another account that isn't quota-exhausted.
Testing
Tested with a quota-exhausted Antigravity account: