Skip to content

[Bug]: Auto-generated AGENTS.md puts load-bearing tool-use rules at the bottom; head-truncation by bootstrapMaxChars strips them #75187

@camerono

Description

@camerono

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

The AGENTS.md template auto-generated by openclaw doctor --fix orders content with personality/onboarding guidance at the top and the load-bearing ## Red Lines + tool-use guidance at the bottom; when a user lowers agents.defaults.bootstrapMaxChars (typically to fit a small/mid model context budget), the head-truncation strips exactly the rules that govern tool-dispatch reliability and external-action safety, while preserving the less-critical content.

Steps to reproduce

  1. Install OpenClaw 2026.4.9 standalone on a fresh host.
  2. Run openclaw doctor --fix. Inspect the auto-generated file:
    $ wc -c ~/.openclaw/workspace/AGENTS.md
    7809 ~/.openclaw/workspace/AGENTS.md
    $ head -10 ~/.openclaw/workspace/AGENTS.md
    # AGENTS.md - Your Workspace
    This folder is home. Treat it that way.
    ## First Run
    If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, ...
    ## Session Startup
    Before doing anything else: ...
    
    The relevant ## Red Lines and ## External vs Internal sections are at the bottom of the file (after ## Memory, ## Group Chats, etc.).
  3. Set a typical small-model trim:
    openclaw config set agents.defaults.bootstrapMaxChars 1500
    systemctl --user restart openclaw-gateway.service
    
  4. Run an agent turn against a small/mid local model (Hermes-3 8B, Qwen3 8B, etc.) requiring tool dispatch:
    openclaw agent --session-id repro -m "Search memory for X, then if you find anything fetch https://example.com and tell me whether they agree." --json
    
  5. Inspect the response and the session jsonl at ~/.openclaw/agents/main/sessions/<sessionId>.jsonl for toolCall events.

Expected behavior

The auto-generated AGENTS.md should put the most operationally-critical guidance — tool-use rules and red-lines — at the top, so head-truncation by bootstrapMaxChars preserves them. Either:

  • (a) Reorder the auto-generated template, OR
  • (b) Add a content-priority hint that the prompt builder respects (e.g. fenced sections marked <!-- bootstrap-priority: high -->), OR
  • (c) Switch from head-truncation to a content-aware truncation that keeps high-priority sections, OR
  • (d) Document the truncation behavior in bootstrapMaxChars's schema entry and tell users to manually re-order their AGENTS.md if they trim.

Actual behavior

With bootstrapMaxChars: 1500 and the default auto-generated AGENTS.md (7,809 chars), only the first 1,500 chars are injected. Those 1,500 chars contain ## First Run, ## Session Startup (read SOUL.md, USER.md, etc.), and the start of ## Memory. The injected content does NOT contain ## Red Lines, ## External vs Internal (which lists "Search the web" as safe-to-do-freely), or any explicit tool-use guidance. The agent then proceeds without instruction on when to invoke tools — and on a small/mid model, defaults to producing plausible-sounding text describing tool use rather than emitting structured tool_call events.

Concrete repro (Hermes-3-Llama-3.1-8B on bare OpenClaw + vLLM + --tool-call-parser hermes):

  • With auto-generated AGENTS.md + bootstrapMaxChars: 1500: rung 5 ("Fetch https://example.com and summarize") returned a hallucinated generic summary; session jsonl had 0 toolCall events; no actual HTTP request was made.
  • After manually rewriting AGENTS.md to put a ## How to use tools (READ THIS FIRST) section at the top (with explicit examples for memory_search, web_fetch, exec, and chained workflows), keeping bootstrapMaxChars: 1500: rung 5 emitted 1 structured toolCall, got a toolResult with HTTP 200 from example.com, and quoted the actual page content in the reply. Rung 7 (chained 3-step) emitted 3 structured toolCall events.

OpenClaw version

2026.4.9 (build 0512059)

Operating system

Ubuntu 24.04 LTS aarch64 (Linux 6.17.0-1014-nvidia)

Install method

npm global (npm install -g [email protected]), Node v22.22.2 via nvm

Model

NousResearch/Hermes-3-Llama-3.1-8B (representative; the same content-ordering bug surfaces on any small/mid model where users lower bootstrapMaxChars to fit context budgets)

Provider / routing chain

openclaw (standalone host gateway) → vLLM (http://127.0.0.1:8002/v1) → Hermes-3-Llama-3.1-8B

Additional provider/model setup details

Standalone OpenClaw on a host (no NemoClaw sandbox). vLLM 0.19.1 Docker container at :8002 with --enable-auto-tool-choice --tool-call-parser hermes --gpu-memory-utilization 0.20 --max-model-len 32768. ~/.openclaw/openclaw.json gateway.mode=local, primary model inference/hermes-3-llama-3.1-8b.

Logs, screenshots, and evidence

Auto-generated `AGENTS.md` outline (after `openclaw doctor --fix`):

# AGENTS.md - Your Workspace
This folder is home. Treat it that way.
## First Run                          ← gets injected when bootstrapMaxChars=1500
## Session Startup                    ← gets injected
## Memory                             ← partially injected
### 🧠 MEMORY.md - Your Long-Term Memory
### 📝 Write It Down - No "Mental Notes"!
## Red Lines                          ← TRUNCATED OUT (was the goal)
## External vs Internal               ← TRUNCATED OUT
## Group Chats                        ← TRUNCATED OUT
### 💬 Know When to Speak!            ← TRUNCATED OUT


Session jsonl from a "broken" run (auto-generated `AGENTS.md` + `bootstrapMaxChars: 1500`, prompt: "Fetch https://example.com and summarize"):

L5  message  role=user
L6  message  role=assistant  ctypes=['text']
            text="The fetched page from https://example.com is a security notice indicating..."
            ← hallucinated; gateway log has zero tool|fetch|invoke markers


Session jsonl from the "working" run (rewritten `AGENTS.md` with tool-rules at top, same `bootstrapMaxChars: 1500`):

L5  message  role=user
L6  message  role=assistant  ctypes=['toolCall']  ← STRUCTURED TOOL CALL
L7  message  role=toolResult ctypes=['text']
            text='{"url":"https://example.com","status":200,"contentType":"text/html",...}'
L8  message  role=assistant  ctypes=['text']
            text="The fetched page at https://example.com is a security notice..."
            ← real summary of real content


Same model, same vLLM, same parser, same `bootstrapMaxChars` — only difference is content ordering inside `AGENTS.md`.

Impact and severity

Affected: any user lowering bootstrapMaxChars to fit a small/mid model context budget — most commonly local-inference users on consumer GPUs, but also anyone optimizing token cost on cloud APIs. Per OpenClaw issue #22438 ("Tiered bootstrap file loading") and the body of public-domain research on long-prompt tool-following degradation (AGENTIF / Berkeley / Tsinghua), this user segment is significant and growing.

Severity: medium. Functional workaround exists (rewrite AGENTS.md manually, as we did), but the failure mode is silent and easy to misdiagnose as a model defect or a tool-call-parser defect (which is how we initially diagnosed it before the public-domain review surfaced #41304's root-cause analysis).

Frequency: deterministic when bootstrapMaxChars < ~3000 with default auto-generated content; behavior varies above that threshold.

Consequence: silent tool-dispatch hallucination — the agent claims to have searched/fetched/executed without actually doing so. Particularly dangerous because the hallucinated reply often sounds correct (Hermes-3 8B's example.com "summary" was generic enough to be plausible). Real-world consequences include: missed alerts, fabricated facts presented as authoritative, security guidance ignored.

Additional information

Related issues:

Recommended order of resolution:

  1. Quickest win — reorder the auto-generated AGENTS.md template content (single change to the bootstrap-template generator). 1-line config change for users to update existing workspaces by re-running openclaw doctor --fix --regenerate-bootstrap-files (if such a flag exists; otherwise document a manual reset).
  2. Better long-term — content-priority annotations the prompt builder respects.
  3. Best for power-users — tier-based bootstrap (per feat: Tiered bootstrap file loading for progressive context control #22438 / feat(workspace): add tiered bootstrap loading with configurable bootstrapTier #22439) with content priority.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingbug:behaviorIncorrect behavior without a crash

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions