Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
The AGENTS.md template auto-generated by openclaw doctor --fix puts personality/onboarding guidance at the top and the load-bearing ## Red Lines and tool-use guidance at the bottom. When a user lowers agents.defaults.bootstrapMaxChars (typically to fit a small/mid model context budget), truncation keeps only the head of the file, so it strips exactly the rules that govern tool-dispatch reliability and external-action safety while preserving the less-critical content.
Steps to reproduce
1. Install OpenClaw 2026.4.9 standalone on a fresh host.
2. Run openclaw doctor --fix. Inspect the auto-generated file:
   $ wc -c ~/.openclaw/workspace/AGENTS.md
   7809 ~/.openclaw/workspace/AGENTS.md
   $ head -10 ~/.openclaw/workspace/AGENTS.md
   # AGENTS.md - Your Workspace
   This folder is home. Treat it that way.
   ## First Run
   If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, ...
   ## Session Startup
   Before doing anything else: ...
   The relevant ## Red Lines and ## External vs Internal sections are at the bottom of the file (after ## Memory, ## Group Chats, etc.).
3. Set a typical small-model trim:
   openclaw config set agents.defaults.bootstrapMaxChars 1500
   systemctl --user restart openclaw-gateway.service
4. Run an agent turn against a small/mid local model (Hermes-3 8B, Qwen3 8B, etc.) requiring tool dispatch:
   openclaw agent --session-id repro -m "Search memory for X, then if you find anything fetch https://example.com and tell me whether they agree." --json
5. Inspect the response and the session jsonl at ~/.openclaw/agents/main/sessions/<sessionId>.jsonl for toolCall events.
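The last step can be scripted rather than checked by eye. The following is a minimal sketch, assuming each line of the session file is one JSON object and that a tool-call part carries "type": "toolCall" somewhere inside the record. That shape is inferred from the ctypes=['toolCall'] summaries in the evidence section of this report, not from a documented schema, and the function names are hypothetical.

```python
import json
from typing import Any, Iterable


def count_tool_calls(node: Any) -> int:
    """Recursively count dicts whose "type" field equals "toolCall".

    Searching recursively avoids assuming where in the record the
    content parts live (the exact schema is an assumption here).
    """
    if isinstance(node, dict):
        own = 1 if node.get("type") == "toolCall" else 0
        return own + sum(count_tool_calls(v) for v in node.values())
    if isinstance(node, list):
        return sum(count_tool_calls(v) for v in node)
    return 0


def tool_calls_in_session(lines: Iterable[str]) -> int:
    """Total toolCall parts across all JSONL records; blank lines are skipped."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        total += count_tool_calls(json.loads(line))
    return total
```

To use it, open the session file and pass the file object to tool_calls_in_session. A result of 0 for a prompt that clearly requires a fetch or a search suggests the reply was produced without any tool dispatch.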
Expected behavior
The auto-generated AGENTS.md should put the most operationally-critical guidance — tool-use rules and red-lines — at the top, so head-truncation by bootstrapMaxChars preserves them. Either:
(a) Reorder the auto-generated template, OR
(b) Add a content-priority hint that the prompt builder respects (e.g. fenced sections marked <!-- bootstrap-priority: high -->), OR
(c) Switch from head-truncation to a content-aware truncation that keeps high-priority sections, OR
(d) Document the truncation behavior in bootstrapMaxChars's schema entry and tell users to manually re-order their AGENTS.md if they trim.
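Options (b) and (c) could be combined roughly as follows. This is only a sketch under stated assumptions: the <!-- bootstrap-priority: high --> marker is the hypothetical hint from option (b), it is assumed to sit inside the section it marks (for example on the line after the heading), and truncate_by_priority is an illustrative name, not an existing OpenClaw function.

```python
import re

# Hypothetical marker taken from option (b); not an existing OpenClaw feature.
MARKER = "<!-- bootstrap-priority: high -->"


def split_sections(text: str) -> list[str]:
    """Split markdown into chunks that each start at a level-2 heading.

    Text before the first "## " heading becomes its own leading chunk.
    """
    return [part for part in re.split(r"(?m)^(?=## )", text) if part]


def truncate_by_priority(text: str, max_chars: int) -> str:
    """Fill the character budget with marked sections first, then the rest."""
    sections = split_sections(text)
    high = [s for s in sections if MARKER in s]
    rest = [s for s in sections if MARKER not in s]

    out: list[str] = []
    used = 0
    for section in high + rest:
        remaining = max_chars - used
        if remaining <= 0:
            break
        piece = section[:remaining]  # the last piece may be cut mid-section
        out.append(piece)
        used += len(piece)
    return "".join(out)
```

The design choice here is that marked sections are emitted first, so they survive any budget large enough to hold them, while unmarked sections keep their original relative order and absorb the cut.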
Actual behavior
With bootstrapMaxChars: 1500 and the default auto-generated AGENTS.md (7,809 chars), only the first 1,500 chars are injected. Those 1,500 chars contain ## First Run, ## Session Startup (read SOUL.md, USER.md, etc.), and the start of ## Memory. The injected content does NOT contain ## Red Lines, ## External vs Internal (which lists "Search the web" as safe-to-do-freely), or any explicit tool-use guidance. The agent then proceeds without instruction on when to invoke tools — and on a small/mid model, defaults to producing plausible-sounding text describing tool use rather than emitting structured tool_call events.
Concrete repro (Hermes-3-Llama-3.1-8B on bare OpenClaw + vLLM + --tool-call-parser hermes):
- Auto-generated AGENTS.md + bootstrapMaxChars: 1500: rung 5 ("Fetch https://example.com and summarize") returned a hallucinated generic summary; session jsonl had 0 toolCall events; no actual HTTP request was made.
- After manually rewriting AGENTS.md to put a ## How to use tools (READ THIS FIRST) section at the top (with explicit examples for memory_search, web_fetch, exec, and chained workflows), keeping bootstrapMaxChars: 1500: rung 5 emitted 1 structured toolCall, got a toolResult with HTTP 200 from example.com, and quoted the actual page content in the reply. Rung 7 (chained 3-step) emitted 3 structured toolCall events.
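To see which sections survive a given budget, the head-truncation described above can be approximated as a plain character slice. This is a sketch assuming the prompt builder keeps exactly the first bootstrapMaxChars characters (the real builder may count or cut differently); split_headings is a hypothetical helper name.

```python
import re

# Level-2 headings only; "### ..." subsections do not match this pattern.
HEADING = re.compile(r"^## .+$", re.MULTILINE)


def split_headings(text: str, max_chars: int) -> tuple[list[str], list[str]]:
    """Return (kept, dropped) level-2 headings when only text[:max_chars] is kept."""
    kept: list[str] = []
    dropped: list[str] = []
    for match in HEADING.finditer(text):
        # A heading counts as kept only if its whole line fits inside the budget.
        if match.end() <= max_chars:
            kept.append(match.group(0))
        else:
            dropped.append(match.group(0))
    return kept, dropped
```

Running this over the contents of ~/.openclaw/workspace/AGENTS.md with max_chars set to the configured bootstrapMaxChars lists the headings the model never sees.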
Logs, screenshots, and evidence
Auto-generated `AGENTS.md` outline (after `openclaw doctor --fix`):
# AGENTS.md - Your Workspace
This folder is home. Treat it that way.
## First Run                               ← gets injected when bootstrapMaxChars=1500
## Session Startup                         ← gets injected
## Memory                                  ← partially injected
### 🧠 MEMORY.md - Your Long-Term Memory
### 📝 Write It Down - No "Mental Notes"!
## Red Lines                               ← TRUNCATED OUT (was the goal)
## External vs Internal                    ← TRUNCATED OUT
## Group Chats                             ← TRUNCATED OUT
### 💬 Know When to Speak!                 ← TRUNCATED OUT
Session jsonl from a "broken" run (auto-generated `AGENTS.md` + `bootstrapMaxChars: 1500`, prompt: "Fetch https://example.com and summarize"):
L5 message role=user
L6 message role=assistant ctypes=['text']
text="The fetched page from https://example.com is a security notice indicating..."
← hallucinated; gateway log has zero tool|fetch|invoke markers
Session jsonl from the "working" run (rewritten `AGENTS.md` with tool-rules at top, same `bootstrapMaxChars: 1500`):
L5 message role=user
L6 message role=assistant ctypes=['toolCall'] ← STRUCTURED TOOL CALL
L7 message role=toolResult ctypes=['text']
text='{"url":"https://example.com","status":200,"contentType":"text/html",...}'
L8 message role=assistant ctypes=['text']
text="The fetched page at https://example.com is a security notice..."
← real summary of real content
Same model, same vLLM, same parser, same `bootstrapMaxChars` — only difference is content ordering inside `AGENTS.md`.
Impact and severity
Affected: any user lowering bootstrapMaxChars to fit a small/mid model context budget — most commonly local-inference users on consumer GPUs, but also anyone optimizing token cost on cloud APIs. Per OpenClaw issue #22438 ("Tiered bootstrap file loading") and the body of public-domain research on long-prompt tool-following degradation (AGENTIF / Berkeley / Tsinghua), this user segment is significant and growing.
Severity: medium. Functional workaround exists (rewrite AGENTS.md manually, as we did), but the failure mode is silent and easy to misdiagnose as a model defect or a tool-call-parser defect (which is how we initially diagnosed it before the public-domain review surfaced #41304's root-cause analysis).
Frequency: deterministic when bootstrapMaxChars < ~3000 with default auto-generated content; behavior varies above that threshold.
Consequence: silent tool-dispatch hallucination — the agent claims to have searched/fetched/executed without actually doing so. Particularly dangerous because the hallucinated reply often sounds correct (Hermes-3 8B's example.com "summary" was generic enough to be plausible). Real-world consequences include: missed alerts, fabricated facts presented as authoritative, security guidance ignored.
Recommended order of resolution:
1. Quickest win: reorder the auto-generated AGENTS.md template content (single change to the bootstrap-template generator). Users could then update existing workspaces with a single command by re-running openclaw doctor --fix --regenerate-bootstrap-files (if such a flag exists; otherwise document a manual reset).
2. Better long-term: content-priority annotations the prompt builder respects.
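Until the template is reordered upstream, the manual workaround can be scripted. The sketch below moves named level-2 sections to the top of an existing AGENTS.md, directly after the title preamble. It assumes sections are delimited by lines starting with "## " and that no such lines appear inside code fences; promote_sections is a hypothetical helper, not part of OpenClaw.

```python
import re


def promote_sections(text: str, priority_headings: list[str]) -> str:
    """Move the named level-2 sections directly after the preamble, in the given order."""
    parts = [p for p in re.split(r"(?m)^(?=## )", text) if p]
    # Make sure a moved chunk cannot fuse with the chunk that follows it.
    parts = [p if p.endswith("\n") else p + "\n" for p in parts]

    preamble = [p for p in parts if not p.startswith("## ")]
    sections = [p for p in parts if p.startswith("## ")]

    def title(section: str) -> str:
        # First line minus the leading "## ".
        return section.splitlines()[0][3:].strip()

    promoted = [s for name in priority_headings for s in sections if title(s) == name]
    remaining = [s for s in sections if title(s) not in priority_headings]
    return "".join(preamble + promoted + remaining)
```

For the default template, calling promote_sections(text, ["Red Lines", "External vs Internal"]) and writing the result back would place both sections within the first characters of the file. Subsections ("### ...") travel with their parent section because only "## " lines are treated as boundaries.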
OpenClaw version
2026.4.9 (build 0512059)
Operating system
Ubuntu 24.04 LTS aarch64 (Linux 6.17.0-1014-nvidia)
Install method
npm global (npm install -g openclaw@2026.4.9), Node v22.22.2 via nvm
Model
NousResearch/Hermes-3-Llama-3.1-8B (representative; the same content-ordering bug surfaces on any small/mid model where users lower bootstrapMaxChars to fit context budgets)
Provider / routing chain
openclaw (standalone host gateway) → vLLM (http://127.0.0.1:8002/v1) → Hermes-3-Llama-3.1-8B
Additional provider/model setup details
Standalone OpenClaw on a host (no NemoClaw sandbox). vLLM 0.19.1 Docker container at :8002 with --enable-auto-tool-choice --tool-call-parser hermes --gpu-memory-utilization 0.20 --max-model-len 32768. ~/.openclaw/openclaw.json: gateway.mode=local, primary model inference/hermes-3-llama-3.1-8b.
Additional information
Related issues:
- #22438 "Tiered bootstrap file loading": bootstrapTier: minimal | standard | full. This (#8) is more specific: the content order within each file, separate from how many files get loaded.
- #62182 (closed) "Config validation rejects bootstrapMaxCharsPerFile as unrecognized key (2026.4.5)": closed in favor of the uniform bootstrapMaxChars. Per-file priority would be one path; content-priority within a file (this issue) is another.