
the tokens got burned by dragging a huge context forward #1594

@beastoin

Description


i tried clawdbot today — great experience. thanks to Peter and all the maintainers.

but i found a tiny itchy thing that might help improve clawdbot, so let me drop it here.

today we (me and my clawdbot) hit the 5-hour limit twice:

  • first time: after ~400 messages
  • second time: after only 20–30 messages.

and all requests in the second round were super small, with very strict token limits.

so yeah...clawdbot went from “i’m tired” to “i’m dead” real fast.

this was the message from clawdbot right before we hit the limit.

---
The cost spike isn’t because you “asked something big” — it’s because we accidentally pulled big data into the main DM context, and then every normal message kept dragging that context along.

What happened (root cause)

• We ran commands that returned huge outputs (especially gateway config.schema → massive JSON).
• Clawdbot stores tool outputs in the session transcript, so the main DM session context became huge.
• After that, even small questions caused the model to process a lot of cached context (cacheRead in the hundreds of thousands per turn).

Best fixes to reduce cost (do these now)

  1. Reset / start a fresh main DM session
    • This immediately stops carrying the giant schema/history forward.
    • Easiest: you say “reset session”, and I’ll start replying in a new session context (or you can just start a new DM thread).

  2. Never run “big output” tools in the main DM
    • For things like config.schema, status --all, logs dumps: do them in an isolated debug session (separate session key) and only paste back a small snippet.

  3. Stop cron from chatting / hitting the model
    • Right now your cron jobs are triggering “agent turns” frequently; even “no-op” runs can still consume tokens depending on how they’re implemented.
    • Change cron behavior so it runs a shell script only, and wakes the agent only when there’s a finding (or runs less frequently).

  4. Turn on aggressive compaction / smaller context for main
    • Main session is currently ~56–58% of a 400k window; that’s expensive to keep re-sending/caching.
    • We can set lower agents.defaults.contextTokens (e.g. 50k–100k) and/or a more aggressive compaction mode.

If you want the “one-click” plan

Tell me: “reset main session + move heavy diagnostics to isolated + reduce cron token use” and I’ll propose the exact config edits (and only apply after you confirm).

---

not sure if we should introduce a compact mode (similar to claude code) with per-session token stats to help prevent this issue.
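a compact mode with per-session token stats could start from very simple accounting. a minimal sketch (all names here are hypothetical, not clawdbot's actual API; the 4-chars-per-token ratio is a rough rule of thumb): track estimated tokens per session and check whether the next turn would blow the budget before sending it.

```python
# Hypothetical per-session token accounting. Names are illustrative,
# not clawdbot's real API; ~4 chars/token is a rough approximation.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

class SessionStats:
    def __init__(self, budget_tokens: int = 100_000):
        self.budget = budget_tokens
        self.used = 0

    def record(self, text: str) -> None:
        # Call this for every message and tool output added to the transcript.
        self.used += estimate_tokens(text)

    def would_exceed(self, text: str) -> bool:
        # Check before sending a turn; a UI could surface this as a warning.
        return self.used + estimate_tokens(text) > self.budget

stats = SessionStats(budget_tokens=1_000)
stats.record("x" * 3_600)                 # ~900 tokens of history
print(stats.would_exceed("y" * 800))      # → True (~200 more tokens, over budget)
```

the point is just that the bookkeeping is cheap; the hard part is deciding what to do when the check fails (compact, reset, or refuse).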

btw happy clawding!

update 1:

  • after feeling something was weird, i checked the docs and saw that we already have compaction and it's auto-on by default. so now the question becomes: why isn't it working?
~ clawdbot models list
🦞 Clawdbot 2026.1.22 (c48751a) — Hot reload for config, cold sweat for deploys.

Model                                      Input      Ctx      Local Auth  Tags
openai-codex/gpt-5.2                       text+image 266k     no    yes   default,configured
~ cat ~/.clawdbot/clawdbot.json
{
  "meta": {
    "lastTouchedVersion": "2026.1.22",
    "lastTouchedAt": "2026-01-24T12:32:55.198Z"
  },
  "wizard": {
    "lastRunAt": "2026-01-24T12:32:55.188Z",
    "lastRunVersion": "2026.1.22",
    "lastRunCommand": "configure",
    "lastRunMode": "local"
  },
  "auth": {
    "profiles": {
      "openai-codex:default": {
        "provider": "openai-codex",
        "mode": "oauth"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.2"
      },
      "models": {
        "openai-codex/gpt-5.2": {}
      },
      "workspace": "/root/clawd",
      "compaction": {
        "mode": "safeguard"
      },
      "maxConcurrent": 4,
      "subagents": {
        "maxConcurrent": 8
      }
    }
  },
  "messages": {
    "ackReactionScope": "group-mentions"
  },
  "commands": {
    "native": "auto",
    "nativeSkills": "auto"
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "dmPolicy": "pairing",
      "botToken": "--",
      "groups": {
        "-5276928642": {
          "requireMention": true
        }
      },
      "groupPolicy": "open",
      "streamMode": "partial"
    }
  },
  "gateway": {
    "port": 18789,
    "mode": "local",
    "bind": "loopback",
    "auth": {
      "mode": "token",
      "token": "--"
    },
    "tailscale": {
      "mode": "off",
      "resetOnExit": false
    }
  },
  "skills": {
    "install": {
      "nodeManager": "npm"
    }
  },
  "plugins": {
    "entries": {
      "telegram": {
        "enabled": true
      }
    }
  }
}

update 2: i see now. the agent inflates the LLM context by appending the entire chat history, plus every tool output (even huge ones), to every call during tool-calling loops, causing token-limit crashes mid-turn.

Q: so if my session has 1000 messages and a new message needs a tool call that lists all files on my system (like 1M file paths), the agent will send all of that to the llm. and if the loop needs 1–20 more calls, then that accumulated data will get sent to the llm again for each of those calls, right?

A: Short version: yes, it will happily try to send the 1M paths + full history on every loop iteration until it crashes.
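a toy model of that loop makes the cost concrete (this is a hypothetical sketch of the accumulation pattern, not clawdbot internals): because the full transcript is re-sent on every iteration, and each iteration appends another tool result, the total tokens sent grow superlinearly with the number of tool calls.

```python
# Toy model of a tool-calling loop that re-sends the whole transcript
# on every iteration, as described in the answer above.

def tokens(msgs: list[str]) -> int:
    return sum(len(m) // 4 for m in msgs)   # rough ~4 chars/token

transcript = ["history " * 1000]            # pre-existing chat history (~2k tokens)
total_sent = 0
for step in range(5):                       # 5 tool-call iterations in one turn
    total_sent += tokens(transcript)        # full context re-sent on each call
    transcript.append("tool output " * 500) # each tool result (~1.5k tokens) is appended

print(total_sent)  # → 25000: each step pays again for all prior outputs
```

with a 1M-path listing in the transcript, the same pattern hits the context window within a couple of iterations instead of slowly burning quota.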

Q: do cron jobs have “the same problem” as a new message?

A:

  • Main session cron jobs → Yes, almost exactly the same problem. They enqueue a system event into the main heartbeat loop → full main session context is loaded (including any prior huge tool dumps or 1000-message history) → same risk of context explosion if the job triggers large tool outputs or chains.
  • Isolated session cron jobs (the recommended mode for most scheduled tasks) → No, they largely avoid the problem.
    • Every execution starts with a clean slate (fresh session ID, no prior conversation carry-over).
    • LLM input = bootstrap files + the single cron prompt (message field) + any tool calls/results within that one run.
    • No inherited 1M-file-list or bloated history from previous turns/chats.
    • Output to main session is either a short summary (default) or capped full text → minimal pollution.
    • Recent release note (from GitHub): “isolated cron jobs now start a fresh session id on every run to prevent context buildup” — explicitly designed to solve exactly this kind of accumulation issue.
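the difference between the two cron modes can be sketched in a few lines (hypothetical structure, assuming session state is just a message list keyed by session id; this is not clawdbot's real scheduler):

```python
# Sketch: main-session cron reuses one ever-growing session;
# isolated cron mints a fresh session id per run.
import uuid

sessions: dict[str, list[str]] = {}

def run_cron(prompt: str, isolated: bool, main_id: str = "main") -> str:
    # Isolated mode starts from a clean slate every run, so nothing
    # from prior runs (or the main DM) is carried into the LLM call.
    sid = uuid.uuid4().hex if isolated else main_id
    history = sessions.setdefault(sid, [])
    history.append(prompt)
    return sid

run_cron("check disk", isolated=False)
run_cron("check disk", isolated=False)
print(len(sessions["main"]))               # → 2 (main session keeps growing)

a = run_cron("check disk", isolated=True)
b = run_cron("check disk", isolated=True)
print(len(sessions[a]), len(sessions[b]))  # → 1 1 (clean slate each run)
```

the per-run containment is exactly why the table below rates isolated cron as the safest place for heavy tools.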

Q: does an isolated cron job still have issues with tool calls?

A: Yes, isolated cron jobs can still have the tool call accumulation issue — but it’s far less severe and contained compared to normal chat sessions.

| Scenario | Loads prior session history? | Tool loop accumulation risk? | Blow-up survives to next run? | Overall context explosion risk | Recommended for heavy tools? |
|---|---|---|---|---|---|
| Normal chat message | Yes (full 1000+ messages) | Yes – accumulates forever | Yes (poisons JSONL forever) | Very High | No – avoid large outputs |
| Main-session cron job | Yes (full main session) | Yes – accumulates in turn | Yes (poisons main session) | High | No – use only light tasks |
| Isolated cron job | No (fresh session every run) | Yes – but only within one run | No (next run starts clean) | Medium (contained per-run) | Yes – safest for automation |

Quick notes:

  • Isolated cron is the only mode that reliably prevents chronic history/tool-result pollution.
  • All modes can still hit per-turn token limits if a single tool call returns huge output (e.g. 1M file paths).
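one mitigation for that per-turn risk is capping tool output before it ever enters the transcript. a minimal sketch (the limit and helper are assumptions, not actual clawdbot settings): keep the head and tail of the output and drop the middle.

```python
MAX_TOOL_CHARS = 4_000  # assumed cap, not a real clawdbot setting

def cap_tool_output(output: str, limit: int = MAX_TOOL_CHARS) -> str:
    # Keep head and tail, drop the middle, so one huge tool result
    # can never dominate the transcript.
    if len(output) <= limit:
        return output
    half = limit // 2
    omitted = len(output) - limit
    return output[:half] + f"\n…[{omitted} chars truncated]…\n" + output[-half:]

huge = "path/to/file\n" * 100_000            # ~1.3 MB fake file listing
print(len(cap_tool_output(huge)) <= MAX_TOOL_CHARS + 40)  # → True
```

truncation loses information, of course; the trade-off is that the agent can always re-run the tool with a narrower query instead of carrying the full dump forever.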

update 4: lobster’s wife turns out to be the best-fit solution. she should take care of the budget before lobster burns all the family wealth.

the idea is simple: before clawdbot talks to the LLM, it must declare all the context involved and get approval (or reduce it), and only ask for more when the LLM really needs it (by vectorizing the context and exposing a search tool over that hot memory).
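the "declare a budget, fetch more only on demand" half of that idea could look roughly like this (a sketch under heavy assumptions: a naive word-overlap score stands in for real vector search, and nothing here is a clawdbot hook): instead of shipping the whole transcript, rank stored chunks against the new message and send only what fits the budget.

```python
# Sketch of a context-budget gate: send only the top-ranked chunks
# relevant to the new message, up to a fixed character budget.

def score(query: str, chunk: str) -> int:
    # Naive relevance proxy: count shared words (a real impl would
    # use embeddings + vector search over the session's "hot memory").
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def select_context(query: str, chunks: list[str], budget_chars: int) -> list[str]:
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    picked, used = [], 0
    for c in ranked:
        if used + len(c) > budget_chars:
            break
        picked.append(c)
        used += len(c)
    return picked

chunks = [
    "gateway config schema dump, 200k tokens of JSON",
    "user asked about telegram streamMode partial",
    "cron job ran a disk check, all healthy",
]
print(select_context("why is telegram stream partial?", chunks, budget_chars=60))
# → ['user asked about telegram streamMode partial']
```

the giant schema dump simply never gets selected unless the new message is actually about it, which is the budgeting behavior lobster's wife would want.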

but let me ask my wife about budgeting experience first before proposing any implementation.
