
the tokens got burned by dragging a huge context forward #1594

@beastoin

Description


i tried clawdbot today — great experience. thanks to Peter and all the maintainers.

but i found a tiny itchy thing that might help improve clawdbot, so let me drop it here.

today we (me and my clawdbot) hit the 5-hour limit twice:

  • first time: after ~400 messages
  • second time: after only 20–30 messages.

and all requests in the second round were super small, with very strict token limits.

so yeah...clawdbot went from “i’m tired” to “i’m dead” real fast.

this was the message from clawdbot right before we hit the limit.

---
The cost spike isn’t because you “asked something big” — it’s because we accidentally pulled big data into the main DM context, and then every normal message kept dragging that context along.

What happened (root cause)

• We ran commands that returned huge outputs (especially gateway config.schema → massive JSON).
• Clawdbot stores tool outputs in the session transcript, so the main DM session context became huge.
• After that, even small questions caused the model to process a lot of cached context (cacheRead in the hundreds of thousands per turn).

Best fixes to reduce cost (do these now)

  1. Reset / start a fresh main DM session
    • This immediately stops carrying the giant schema/history forward.
    • Easiest: you say “reset session”, and I’ll start replying in a new session context (or you can just start a new DM thread).

  2. Never run “big output” tools in the main DM
    • For things like config.schema, status --all, logs dumps: do them in an isolated debug session (separate session key) and only paste back a small snippet.

  3. Stop cron from chatting / hitting the model
    • Right now your cron jobs are triggering “agent turns” frequently; even “no-op” runs can still consume tokens depending on how they’re implemented.
    • Change cron behavior so it runs a shell script only, and wakes the agent only when there’s a finding (or runs less frequently).

  4. Turn on aggressive compaction / smaller context for main
    • Main session is currently ~56–58% of a 400k window; that’s expensive to keep re-sending/caching.
    • We can set lower agents.defaults.contextTokens (e.g. 50k–100k) and/or a more aggressive compaction mode.

If you want the “one-click” plan

Tell me: “reset main session + move heavy diagnostics to isolated + reduce cron token use” and I’ll propose the exact config edits (and only apply after you confirm).

---

not sure if we should introduce a compact mode (similar to claude code) with per-session token stats to help prevent this issue.
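a compact mode with per-session token stats could start from very simple accounting. a minimal sketch (all names here are hypothetical, not clawdbot's actual API; the 4-chars-per-token ratio is a rough rule of thumb): track estimated tokens per session and check whether the next turn would blow the budget before sending it.

```python
# Hypothetical per-session token accounting. Names are illustrative,
# not clawdbot's real API; ~4 chars/token is a rough approximation.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

class SessionStats:
    def __init__(self, budget_tokens: int = 100_000):
        self.budget = budget_tokens
        self.used = 0

    def record(self, text: str) -> None:
        # Call this for every message and tool output added to the transcript.
        self.used += estimate_tokens(text)

    def would_exceed(self, text: str) -> bool:
        # Check before sending a turn; a UI could surface this as a warning.
        return self.used + estimate_tokens(text) > self.budget

stats = SessionStats(budget_tokens=1_000)
stats.record("x" * 3_600)                 # ~900 tokens of history
print(stats.would_exceed("y" * 800))      # → True (~200 more tokens, over budget)
```

the point is just that the bookkeeping is cheap; the hard part is deciding what to do when the check fails (compact, reset, or refuse).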

btw happy clawding!

update 1:

  • after feeling something was weird, i checked the docs and saw that we already have compaction and it's auto-on by default. so now the question becomes: why isn't it working?
~ clawdbot models list
🦞 Clawdbot 2026.1.22 (c48751a) — Hot reload for config, cold sweat for deploys.

Model                                      Input      Ctx      Local Auth  Tags
openai-codex/gpt-5.2                       text+image 266k     no    yes   default,configured
~ cat ~/.clawdbot/clawdbot.json
{
  "meta": {
    "lastTouchedVersion": "2026.1.22",
    "lastTouchedAt": "2026-01-24T12:32:55.198Z"
  },
  "wizard": {
    "lastRunAt": "2026-01-24T12:32:55.188Z",
    "lastRunVersion": "2026.1.22",
    "lastRunCommand": "configure",
    "lastRunMode": "local"
  },
  "auth": {
    "profiles": {
      "openai-codex:default": {
        "provider": "openai-codex",
        "mode": "oauth"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.2"
      },
      "models": {
        "openai-codex/gpt-5.2": {}
      },
      "workspace": "/root/clawd",
      "compaction": {
        "mode": "safeguard"
      },
      "maxConcurrent": 4,
      "subagents": {
        "maxConcurrent": 8
      }
    }
  },
  "messages": {
    "ackReactionScope": "group-mentions"
  },
  "commands": {
    "native": "auto",
    "nativeSkills": "auto"
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "dmPolicy": "pairing",
      "botToken": "--",
      "groups": {
        "-5276928642": {
          "requireMention": true
        }
      },
      "groupPolicy": "open",
      "streamMode": "partial"
    }
  },
  "gateway": {
    "port": 18789,
    "mode": "local",
    "bind": "loopback",
    "auth": {
      "mode": "token",
      "token": "--"
    },
    "tailscale": {
      "mode": "off",
      "resetOnExit": false
    }
  },
  "skills": {
    "install": {
      "nodeManager": "npm"
    }
  },
  "plugins": {
    "entries": {
      "telegram": {
        "enabled": true
      }
    }
  }
}

update 2: i see now. the agent inflates the LLM context by appending the entire chat history, plus every tool output (even huge ones), to every call during tool-calling loops, causing token-limit crashes mid-turn.

Q: so if my session has 1000 messages and a new message needs a tool call that lists all files on my system (like 1M file paths), the agent will send all of that to the llm. and if the loop needs 1–20 more calls, then that accumulated data will get sent to the llm again for each of those calls, right?

A: Short version: yes, it will happily try to send the 1M paths + full history on every loop iteration until it crashes.
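a toy model of that loop makes the cost concrete (this is a hypothetical sketch of the accumulation pattern, not clawdbot internals): because the full transcript is re-sent on every iteration, and each iteration appends another tool result, the total tokens sent grow superlinearly with the number of tool calls.

```python
# Toy model of a tool-calling loop that re-sends the whole transcript
# on every iteration, as described in the answer above.

def tokens(msgs: list[str]) -> int:
    return sum(len(m) // 4 for m in msgs)   # rough ~4 chars/token

transcript = ["history " * 1000]            # pre-existing chat history (~2k tokens)
total_sent = 0
for step in range(5):                       # 5 tool-call iterations in one turn
    total_sent += tokens(transcript)        # full context re-sent on each call
    transcript.append("tool output " * 500) # each tool result (~1.5k tokens) is appended

print(total_sent)  # → 25000: each step pays again for all prior outputs
```

with a 1M-path listing in the transcript, the same pattern hits the context window within a couple of iterations instead of slowly burning quota.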

Q: do cron jobs have “the same problem” as a new message?

A:

  • Main session cron jobs → Yes, almost exactly the same problem. They enqueue a system event into the main heartbeat loop → full main session context is loaded (including any prior huge tool dumps or 1000-message history) → same risk of context explosion if the job triggers large tool outputs or chains.
  • Isolated session cron jobs (the recommended mode for most scheduled tasks) → No, they largely avoid the problem.
    • Every execution starts with a clean slate (fresh session ID, no prior conversation carry-over).
    • LLM input = bootstrap files + the single cron prompt (message field) + any tool calls/results within that one run.
    • No inherited 1M-file-list or bloated history from previous turns/chats.
    • Output to main session is either a short summary (default) or capped full text → minimal pollution.
    • Recent release note (from GitHub): “isolated cron jobs now start a fresh session id on every run to prevent context buildup” — explicitly designed to solve exactly this kind of accumulation issue.
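the difference between the two cron modes can be sketched in a few lines (hypothetical structure, assuming session state is just a message list keyed by session id; this is not clawdbot's real scheduler):

```python
# Sketch: main-session cron reuses one ever-growing session;
# isolated cron mints a fresh session id per run.
import uuid

sessions: dict[str, list[str]] = {}

def run_cron(prompt: str, isolated: bool, main_id: str = "main") -> str:
    # Isolated mode starts from a clean slate every run, so nothing
    # from prior runs (or the main DM) is carried into the LLM call.
    sid = uuid.uuid4().hex if isolated else main_id
    history = sessions.setdefault(sid, [])
    history.append(prompt)
    return sid

run_cron("check disk", isolated=False)
run_cron("check disk", isolated=False)
print(len(sessions["main"]))               # → 2 (main session keeps growing)

a = run_cron("check disk", isolated=True)
b = run_cron("check disk", isolated=True)
print(len(sessions[a]), len(sessions[b]))  # → 1 1 (clean slate each run)
```

the per-run containment is exactly why the table below rates isolated cron as the safest place for heavy tools.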

Q: does an isolated cron job still have issues with tool calls?

A: Yes, isolated cron jobs can still have the tool call accumulation issue — but it’s far less severe and contained compared to normal chat sessions.

| Scenario | Loads prior session history? | Tool loop accumulation risk? | Blow-up survives to next run? | Overall context explosion risk | Recommended for heavy tools? |
|---|---|---|---|---|---|
| Normal chat message | Yes (full 1000+ messages) | Yes – accumulates forever | Yes (poisons JSONL forever) | Very High | No – avoid large outputs |
| Main-session cron job | Yes (full main session) | Yes – accumulates in turn | Yes (poisons main session) | High | No – use only light tasks |
| Isolated cron job | No (fresh session every run) | Yes – but only within one run | No (next run starts clean) | Medium (contained per-run) | Yes – safest for automation |

Quick notes:

  • Isolated cron is the only mode that reliably prevents chronic history/tool-result pollution.
  • All modes can still hit per-turn token limits if a single tool call returns huge output (e.g. 1M file paths).
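one mitigation for that per-turn risk is capping tool output before it ever enters the transcript. a minimal sketch (the limit and helper are assumptions, not actual clawdbot settings): keep the head and tail of the output and drop the middle.

```python
MAX_TOOL_CHARS = 4_000  # assumed cap, not a real clawdbot setting

def cap_tool_output(output: str, limit: int = MAX_TOOL_CHARS) -> str:
    # Keep head and tail, drop the middle, so one huge tool result
    # can never dominate the transcript.
    if len(output) <= limit:
        return output
    half = limit // 2
    omitted = len(output) - limit
    return output[:half] + f"\n…[{omitted} chars truncated]…\n" + output[-half:]

huge = "path/to/file\n" * 100_000            # ~1.3 MB fake file listing
print(len(cap_tool_output(huge)) <= MAX_TOOL_CHARS + 40)  # → True
```

truncation loses information, of course; the trade-off is that the agent can always re-run the tool with a narrower query instead of carrying the full dump forever.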

update 4: lobster’s wife turns out to be the best-fit solution. she should take care of the budget before lobster burns all the family wealth.

the idea is simple: before clawdbot talks to the LLM, it must declare all the context involved and get approval (or reduce it), and only ask for more when the LLM really needs it (by vectorizing the context and exposing a search tool over that hot memory).
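the "declare a budget, fetch more only on demand" half of that idea could look roughly like this (a sketch under heavy assumptions: a naive word-overlap score stands in for real vector search, and nothing here is a clawdbot hook): instead of shipping the whole transcript, rank stored chunks against the new message and send only what fits the budget.

```python
# Sketch of a context-budget gate: send only the top-ranked chunks
# relevant to the new message, up to a fixed character budget.

def score(query: str, chunk: str) -> int:
    # Naive relevance proxy: count shared words (a real impl would
    # use embeddings + vector search over the session's "hot memory").
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def select_context(query: str, chunks: list[str], budget_chars: int) -> list[str]:
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    picked, used = [], 0
    for c in ranked:
        if used + len(c) > budget_chars:
            break
        picked.append(c)
        used += len(c)
    return picked

chunks = [
    "gateway config schema dump, 200k tokens of JSON",
    "user asked about telegram streamMode partial",
    "cron job ran a disk check, all healthy",
]
print(select_context("why is telegram stream partial?", chunks, budget_chars=60))
# → ['user asked about telegram streamMode partial']
```

the giant schema dump simply never gets selected unless the new message is actually about it, which is the budgeting behavior lobster's wife would want.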

but let me ask my wife about budgeting experience first before proposing any implementation.
