Closed
Labels: bug (Something isn't working)
Description
Problem
When the conversation context grows too large (lots of screenshots, long logs, many tool calls), the Anthropic API returns 413 request_too_large. Currently, the gateway just forwards this error to the user, making the assistant completely unresponsive until someone manually clears the session.
This happened to me on Jan 6-7, 2026 - the assistant was stuck returning 413 errors for hours until I manually cleared the context with Claude Code.
Expected Behavior
The gateway should handle this gracefully:
- Detect approaching limit - Track token count and warn/act before hitting the 413
- Auto-truncate on 413 - When receiving a 413, automatically remove oldest messages from context and retry
- Preserve memory before cleanup - Write a summary to memory files before truncating so context isn't completely lost
- Notify user - Send a message like "Context was getting too large, I've cleaned up old messages but I'm still here"
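The "detect approaching limit" step could be sketched roughly as below. This is only an illustration, not the gateway's actual code: `estimate_tokens`, `context_status`, the 4-characters-per-token heuristic, and the message shape (dicts with a `content` field) are all assumptions. The 200k limit comes from the Environment section and the 80% warning ratio from the Suggested Implementation section of this issue. Screenshots would need their own size estimate, since text length says nothing about image cost.

```python
CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(messages: list) -> int:
    """Approximate the token count of a session from its text length."""
    total_chars = sum(len(str(m.get("content", ""))) for m in messages)
    return total_chars // CHARS_PER_TOKEN


def context_status(
    messages: list,
    context_limit: int = 200_000,  # model context size from this issue
    warn_ratio: float = 0.8,       # warning threshold proposed in this issue
) -> str:
    """Return "ok", "warn", or "over" for the current session."""
    used = estimate_tokens(messages)
    if used >= context_limit:
        return "over"
    if used >= context_limit * warn_ratio:
        return "warn"
    return "ok"
```

A gateway could call `context_status` before each request and start summarizing or truncating once it returns "warn", instead of waiting for the 413.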
Suggested Implementation
- In the request handler, catch 413 errors specifically
- On 413:
  - Log the event
  - Truncate the oldest 50% of messages (or until under limit)
  - Retry the request
  - If still failing, truncate more aggressively
  - Send a notification to the user about what happened
- Proactively:
  - Track approximate token count per session
  - When approaching 80% of model's context limit, start summarizing/truncating older messages
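The 413 handling described above might look something like the following minimal sketch. Everything named here is hypothetical: `RequestTooLarge` stands in for however the gateway surfaces the provider's 413, and `send`, `notify`, and `before_truncate` are placeholder callbacks rather than real gateway or Anthropic SDK functions. Halving the remaining history on each failed attempt is one way to read "truncate more aggressively". A real implementation would probably also need to keep the system prompt and keep tool call/result pairs together when cutting.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("gateway")

NOTICE = (
    "Context was getting too large, I've cleaned up old messages "
    "but I'm still here"
)


class RequestTooLarge(Exception):
    """Hypothetical stand-in for the provider's HTTP 413 request_too_large."""


def send_with_truncation(
    messages: list,
    send: Callable[[list], str],
    notify: Callable[[str], None],
    before_truncate: Optional[Callable[[list], None]] = None,
    max_attempts: int = 4,
) -> str:
    """Send messages; on a 413, drop the oldest half and retry."""
    current = list(messages)  # work on a copy, never mutate the caller's history
    truncated = False
    for attempt in range(1, max_attempts + 1):
        try:
            reply = send(current)
        except RequestTooLarge:
            logger.warning(
                "413 on attempt %d with %d messages", attempt, len(current)
            )
            if len(current) <= 1:
                raise  # nothing older left to drop
            cut = len(current) // 2
            if before_truncate is not None:
                # Hook for writing a summary of the dropped messages to memory.
                before_truncate(current[:cut])
            current = current[cut:]
            truncated = True
            continue
        if truncated:
            notify(NOTICE)
        return reply
    raise RequestTooLarge(f"still too large after {max_attempts} attempts")
```

The `before_truncate` hook is where the "preserve memory before cleanup" step would go, since it receives exactly the messages that are about to be dropped. The user is notified only after a retry actually succeeds, so the message is never sent while the assistant is still stuck.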
Impact
Without this fix, users who are away from their machine (for example, while traveling) have no way to recover: the assistant stays completely unresponsive until someone clears the session manually.
Environment
- Model: claude-opus-4-5 (200k context)
- Heavy use of peekaboo screenshots and long tool outputs contributed to the overflow