[Bug]: Heartbeat ignores lightContext: true, loads full agent context + unbounded session history #43767

@dwdupont

Description

Bug type

Regression (worked before, now fails)

Summary

Heartbeat ignores lightContext: true, loads full agent context + unbounded session history

Steps to reproduce

Bug Description

The heartbeat feature loads the full agent context and accumulated conversation history on every heartbeat tick, instead of honoring the lightContext: true directive in HEARTBEAT.md. This causes unbounded token growth that eventually maxes out the context window and burns through API credits.

Environment

  • OpenClaw version: 2026.3.8
  • Deployment: Docker/Podman (containerized)
  • LLM provider: Anthropic (claude-sonnet-4-6) via LiteLLM proxy
  • Heartbeat interval: 5m

Expected behavior

  1. HEARTBEAT.md specifies lightContext: true — heartbeat turns should inject only HEARTBEAT.md, not the full workspace context
  2. Each heartbeat should run in an isolated session (fresh context), not append to the main conversation history
  3. There should be a token budget/cap for heartbeat runs
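For context, the report implies the directive lives in HEARTBEAT.md front matter. A minimal sketch of what that might look like (the front-matter layout and prompt text here are illustrative assumptions, not a confirmed OpenClaw schema — only the lightContext: true key is taken from this report):

```markdown
---
lightContext: true
---
# Heartbeat

Check for pending tasks. If nothing needs attention, reply HEARTBEAT_OK.
```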

Actual behavior

  1. lightContext: true is completely ignored. The systemPromptReport in sessions.json shows ALL workspace files injected on every heartbeat: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, MEMORY.md (~29K chars system prompt)
  2. Heartbeat runs append to a single persistent session (agent:files:main). The session grows monotonically until it hits the model's context limit (200K tokens)
  3. No token budget exists — the heartbeat sent 200,656 tokens in its final request before failing with ContextWindowExceededError

OpenClaw version

OpenClaw version: 2026.3.8

Operating system

Windows 11

Install method

npm

Model

Anthropic claude-sonnet-4-6

Provider / routing chain

OpenClaw → LiteLLM proxy (http://litellm:4000, running in the openclaw-litellm container) → Anthropic API (api.anthropic.com/v1/messages)

Config file / key location

No response

Additional provider/model setup details

Evidence from sessions.json

{
  "deliveryContext": {
    "channel": "webchat",
    "to": "heartbeat"
  },
  "origin": {
    "provider": "heartbeat"
  },
  "contextTokens": 200000,
  "inputTokens": 118222,
  "outputTokens": 1416,
  "compactionCount": 1
}

LiteLLM Proxy Logs (showing the runaway calls)

06:04:09 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:05:21 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:06:05 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:06:23 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:06:38 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:11:11 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:11:31 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:11:44 - litellm.acompletion(model=anthropic/claude-sonnet-4-6) 200 OK
06:16:02 - ContextWindowExceededError: prompt is too long: 200656 tokens > 200000 maximum

Agent Logs (showing retry storm pattern)

Every 5 minutes, the heartbeat fires and retries 3-4 times on rate limit before giving up:

04:14:40 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:15:05 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:15:33 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:16:03 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
  -- 5 min pause, then repeats --
04:19:35 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:20:01 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:20:31 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
04:21:06 [agent/embedded] embedded run agent end: isError=true error=API rate limit reached
  -- pattern continued for hours --

Suggested Fixes

  1. Honor lightContext: true — only inject the file that declares it, not the full workspace
  2. Isolate heartbeat sessions — each heartbeat should start fresh, not accumulate history
  3. Add maxContextTokens config for heartbeat runs (e.g., 10K cap)
  4. Fail fast on rate limit — don't retry heartbeat runs on 429, just skip to next interval
  5. Add token budget tracking — if heartbeat exceeds N tokens, auto-reset the session
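To make fixes 3–5 concrete, here is a rough sketch of the guard logic being proposed. All names here (HeartbeatGuard, maxContextTokens, resetBudgetTokens) are hypothetical illustrations, not actual OpenClaw APIs:

```typescript
// Hypothetical guard for heartbeat runs. Illustrative only — not OpenClaw code.
interface HeartbeatGuardOptions {
  maxContextTokens: number;   // fix 3: hard cap on a single heartbeat prompt
  resetBudgetTokens: number;  // fix 5: cumulative budget before session reset
}

class HeartbeatGuard {
  private spent = 0;
  constructor(private opts: HeartbeatGuardOptions) {}

  // Fix 3: refuse to send a prompt larger than the per-run cap.
  allowRequest(promptTokens: number): boolean {
    return promptTokens <= this.opts.maxContextTokens;
  }

  // Fix 4: never retry a heartbeat on 429 — skip to the next interval.
  // Transient server errors (5xx) may still be retried.
  shouldRetry(httpStatus: number): boolean {
    return httpStatus !== 429 && httpStatus >= 500;
  }

  // Fix 5: track cumulative spend; signal a session reset once exceeded.
  recordUsage(tokens: number): "ok" | "reset-session" {
    this.spent += tokens;
    if (this.spent > this.opts.resetBudgetTokens) {
      this.spent = 0;
      return "reset-session";
    }
    return "ok";
  }
}

const guard = new HeartbeatGuard({
  maxContextTokens: 10_000,
  resetBudgetTokens: 100_000,
});
console.log(guard.allowRequest(9_500));   // a light heartbeat prompt passes
console.log(guard.allowRequest(200_656)); // the failing request would be blocked
console.log(guard.shouldRetry(429));      // rate limit: skip, don't retry
```

With a cap like this, the 200,656-token request in the logs above would have been rejected locally instead of ever reaching the API.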

Workaround

Set heartbeat.every to "off" in openclaw.json to disable heartbeats entirely.
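In openclaw.json that looks roughly like this (only the heartbeat key is shown; the rest of the config is elided):

```json
{
  "heartbeat": {
    "every": "off"
  }
}
```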

Impact and severity

  • Credit burn: With a 5m heartbeat, the agent was making 288 full-context API calls per day. When rate-limited, it retried 3-4 times per cycle, multiplying the cost
  • Retry storm on rate limit: Each rate-limited heartbeat retried with the full 200K token payload multiple times before giving up
  • No circuit breaker: Nothing stops the heartbeat from sending increasingly large requests as history accumulates
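The call-volume figures follow directly from the interval and the observed retry pattern (the daily-token figure is a worst-case bound assuming every cycle hits the retry storm at the final prompt size):

```typescript
// Back-of-the-envelope check on the numbers above.
const intervalMinutes = 5;
const callsPerDay = (24 * 60) / intervalMinutes; // 288 heartbeats per day
const retriesPerCycle = 4;        // worst case observed in the agent logs
const promptTokens = 200_656;     // final request size before the context error

// Upper bound on daily input tokens if every cycle retried at full size:
const worstCaseDailyTokens = callsPerDay * retriesPerCycle * promptTokens;
console.log(callsPerDay);           // 288
console.log(worstCaseDailyTokens);  // 231155712 — roughly 231M input tokens/day
```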

Additional information

No response

Metadata

Labels

bug (Something isn't working), regression (Behavior that previously worked and now fails)
