feat(doctor): warn when workspace bootstrap files exceed size limits #32654
Asclepeion wants to merge 2 commits into openclaw:main
When agents produce text between tool calls, each text block gets
delivered as a separate chat message ('narration leak'). This adds
a [THINK] prefix that agents can use to mark internal reasoning:
- [THINK] reasoning... → suppressed entirely
- [THINK] reasoning [/THINK] actual reply → delivers only 'actual reply'
The filter is applied at the normalizeReplyPayload layer, alongside
the existing NO_REPLY/HEARTBEAT_OK suppression. Case-insensitive,
tolerates leading whitespace, only matches at text start.
Includes 16 new tests (10 for stripThinkPrefix, 6 for normalize integration).
All existing tests pass with zero regressions.
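The matching rules described in this commit message (case-insensitive, leading whitespace tolerated, prefix only at the start of the text) can be sketched roughly as follows. This is an illustrative reconstruction, not the PR's actual code: only the name `stripThinkPrefix` and the `{ text, hadThinkPrefix }` result shape appear in the diff, and the regex-based matching is an assumption.

```typescript
// Illustrative sketch only; the real implementation lives in the PR's diff.
interface ThinkStripResult {
  text: string;
  hadThinkPrefix: boolean;
}

// Anchored at the start of the text, tolerating leading whitespace.
const THINK_OPEN_RE = /^\s*\[THINK\]/i;
const THINK_CLOSE_RE = /\[\/THINK\]/i;

function stripThinkPrefix(raw: string): ThinkStripResult {
  const open = THINK_OPEN_RE.exec(raw);
  if (!open) {
    // No prefix at the start: leave the text untouched.
    return { text: raw, hadThinkPrefix: false };
  }
  const rest = raw.slice(open[0].length);
  const close = THINK_CLOSE_RE.exec(rest);
  if (!close) {
    // No closing tag: the whole text is internal reasoning.
    return { text: "", hadThinkPrefix: true };
  }
  // Deliver only what follows the closing tag.
  const reply = rest.slice(close.index + close[0].length).trim();
  return { text: reply, hadThinkPrefix: true };
}
```

Under these assumptions, `stripThinkPrefix("[THINK] reasoning [/THINK] actual reply")` yields `actual reply`, while text that merely contains `[THINK]` somewhere after the start is passed through unchanged.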
Adds a new doctor check that scans workspace bootstrap files (AGENTS.md, SOUL.md, HEARTBEAT.md, etc.) against the configured per-file and total character limits. When files will be truncated in the system prompt, the check warns with:
- File name, actual size, limit, and truncation percentage
- Total size warning when all files combined exceed total limit
- Config hint suggesting bootstrapMaxChars / bootstrapTotalMaxChars
This makes the silent truncation visible during setup, so users discover the issue from `openclaw doctor` instead of debugging why their agent ignores instructions buried in truncated files.
6 new tests covering: within-limits, per-file warning, total warning, missing workspace, empty workspace, default limit config hint.
Addresses openclaw#17052
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 58f9d3ff80
    const thinkResult = stripThinkPrefix(text);
    if (thinkResult.hadThinkPrefix) {
Preserve THINK suppression state across streamed chunks
This check only strips when the current payload starts with [THINK], but normalizeReplyPayload is invoked independently for each queued block/tool/final payload. In streamed block replies, a long think block can be split so the first chunk starts with [THINK] (dropped) while subsequent chunks no longer start with the prefix and get delivered, leaking internal reasoning (including text before a later [/THINK]). This needs stream-level/stateful suppression until the closing tag is seen.
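One way to picture the stream-level suppression this comment asks for is a small stateful filter that remembers whether a think block is still open between chunks. The sketch below is hypothetical: the class name `ThinkStreamFilter` and its `push` method are invented for illustration, and it assumes one filter instance per reply and that a tag is never split across two chunks.

```typescript
// Hypothetical stream-level filter; names are illustrative, not from the PR.
// Assumes one instance per reply and that tags are never split across chunks.
class ThinkStreamFilter {
  private started = false;
  private insideThink = false;

  /** Returns the deliverable part of a chunk ("" when fully suppressed). */
  push(chunk: string): string {
    let text = chunk;
    if (!this.started) {
      this.started = true;
      // The prefix only counts at the very start of the reply.
      const open = /^\s*\[THINK\]/i.exec(text);
      if (open) {
        this.insideThink = true;
        text = text.slice(open[0].length);
      }
    }
    if (!this.insideThink) {
      return text;
    }
    const close = /\[\/THINK\]/i.exec(text);
    if (!close) {
      // Still inside the think block: suppress the whole chunk.
      return "";
    }
    this.insideThink = false;
    // Deliver what follows the closing tag, minus leading whitespace.
    return text.slice(close.index + close[0].length).replace(/^\s+/, "");
  }
}
```

With this shape, a think block split over three chunks is dropped until the chunk carrying `[/THINK]` arrives, and only the text after the tag is delivered. Handling tags that straddle a chunk boundary would need extra buffering, which this sketch deliberately leaves out.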
      continue;
    }
    const fileChars = file.content.length;
    totalChars += fileChars;
Base total-size warning on injected bootstrap chars
The total warning uses raw file lengths (totalChars += fileChars) and compares them to bootstrapTotalMaxChars, but runtime budgeting applies per-file truncation before consuming the total budget in buildBootstrapContextFiles. That mismatch can produce false "Total ... content will be lost" warnings (for example, a single huge file that is already truncated by the per-file cap and never reaches the total cap), which points users at the wrong config knob.
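The mismatch can be illustrated with a minimal sketch that compares the raw total against a total computed after the per-file cap. The helper names and the `BootstrapFile` shape here are assumptions for illustration; the real budgeting in `buildBootstrapContextFiles` may differ in details such as truncation markers.

```typescript
// Illustrative only: approximates "injected" size as min(length, per-file cap).
interface BootstrapFile {
  name: string;
  content: string;
}

// What the check currently sums: raw file lengths.
function rawTotalChars(files: BootstrapFile[]): number {
  let total = 0;
  for (const file of files) {
    total += file.content.length;
  }
  return total;
}

// What would actually count against the total budget after per-file truncation.
function injectedTotalChars(files: BootstrapFile[], maxChars: number): number {
  let total = 0;
  for (const file of files) {
    total += Math.min(file.content.length, maxChars);
  }
  return total;
}
```

For a single 400,000-character file with a 20,000 per-file cap and a 150,000 total cap, the raw total exceeds the total limit while the injected total (20,000) does not, so only the per-file warning is relevant.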
Greptile Summary
This PR bundles two separate features: (1) the primary
Key findings:
Confidence Score: 3/5
Last reviewed commit: 58f9d3f
    const closeIdx = upper.indexOf(THINK_CLOSE);
    if (closeIdx === -1) {
      // No closing tag — the entire text is think content
      return { text: "", hadThinkPrefix: true };
    }

    // Strip everything from start through the closing tag
    const afterClose = trimmed.slice(closeIdx + THINK_CLOSE.length).trim();
    return { text: afterClose, hadThinkPrefix: true };
Unicode expansion causes wrong slice index
`closeIdx` is derived from `upper.indexOf(THINK_CLOSE)` where `upper = trimmed.toUpperCase()`. `.toUpperCase()` can expand certain Unicode code points (most notably `ß` → `SS`, `ﬁ` → `FI`, `ﬀ` → `FF`), so `upper` may be longer than `trimmed`. Using that index to slice back into `trimmed` then produces the wrong result.
Concrete example: `"[THINK] ß [/THINK]reply"`. Here `ß` expands by one code unit in `upper`, shifting `closeIdx` by 1, so `trimmed.slice(closeIdx + THINK_CLOSE.length)` returns `"eply"` instead of `"reply"`. (If a space follows the closing tag, the shift only swallows that space, which the trailing `.trim()` would have removed anyway.)
The fix is to search directly on `trimmed` with a case-insensitive regex, which avoids the length mismatch entirely:
```suggestion
// Look for the closing tag (case-insensitive) using a regex so the match
// index refers to `trimmed` directly, avoiding toUpperCase() length expansion.
const closeMatch = /\[\/THINK\]/i.exec(trimmed);
if (!closeMatch) {
// No closing tag — the entire text is think content
return { text: "", hadThinkPrefix: true };
}
// Strip everything from start through the closing tag
const afterClose = trimmed.slice(closeMatch.index + closeMatch[0].length).trim();
return { text: afterClose, hadThinkPrefix: true };
```
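To make the index mismatch concrete, the following sketch isolates just the slicing step in two variants. The function names are hypothetical, and the input uses two `ß` characters so that the two-unit shift stays visible even after the trailing `.trim()`.

```typescript
const THINK_CLOSE = "[/THINK]";

// Variant under review: the index comes from the upper-cased copy,
// but is applied to the original string.
function afterCloseViaUpper(trimmed: string): string {
  const closeIdx = trimmed.toUpperCase().indexOf(THINK_CLOSE);
  if (closeIdx === -1) {
    return "";
  }
  return trimmed.slice(closeIdx + THINK_CLOSE.length).trim();
}

// Suggested variant: the match index refers to the original string.
function afterCloseViaRegex(trimmed: string): string {
  const closeMatch = /\[\/THINK\]/i.exec(trimmed);
  if (!closeMatch) {
    return "";
  }
  return trimmed.slice(closeMatch.index + closeMatch[0].length).trim();
}

// Each "ß" becomes "SS" when upper-cased, so the upper-cased copy is two
// code units longer than the original.
const sample = "[THINK] ßß [/THINK] reply";
const viaUpper = afterCloseViaUpper(sample); // "eply"
const viaRegex = afterCloseViaRegex(sample); // "reply"
```

The regex variant never builds a second string, so there is no second set of indices to get out of sync.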
    const configHint =
      maxChars === DEFAULT_BOOTSTRAP_MAX_CHARS && totalChars <= totalMaxChars
        ? `\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.`
        : totalChars > totalMaxChars
          ? `\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.`
          : "";
Config hint missing when custom maxChars still exceeded
The first condition in the ternary is maxChars === DEFAULT_BOOTSTRAP_MAX_CHARS && totalChars <= totalMaxChars, which means the per-file tip is only shown when the user is using the default limit. If they've already set a custom bootstrapMaxChars and a file still exceeds it (but the total is within limits), the configHint falls through to "" — the warning names the offending file but gives no guidance on what config key to change.
Consider providing the per-file tip whenever only per-file warnings are present, regardless of whether the limit was customized:
```suggestion
const hasPerFileWarning = warnings.some((w) => !w.includes("Total:"));
const configHint =
totalChars > totalMaxChars
? `\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.`
: hasPerFileWarning
? `\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.`
: "";
```
    # [THINK] Prefix Filter — Implementation Summary

    ## Problem

    When AI agents process multi-step tasks, text between tool calls gets delivered as separate chat messages ("narration leak"). Agents can't reliably suppress internal reasoning because `NO_REPLY` only works for the final text segment — intermediate text blocks still get sent.
Developer scratch file committed to repo
THINK_FILTER_SUMMARY.md reads as an internal implementation summary written during development (design decisions, test counts, a list of files modified). This kind of document is not reference documentation for end-users or maintainers and looks like it was accidentally staged alongside the actual code changes. If the intent is to document the [THINK] prefix feature, the content should live in the official docs rather than as a root-level markdown file.
Summary
Adds an `openclaw doctor` check that warns when workspace bootstrap files will be silently truncated in the system prompt.
Problem
Bootstrap files (AGENTS.md, SOUL.md, HEARTBEAT.md, etc.) are injected into the system prompt with a per-file character limit (default 20k) and a total limit (default 150k). When files exceed these limits, they're silently truncated — the agent receives a partial file with no indication that content was lost. Users have to discover this by debugging why their agent ignores instructions.
See #17052: a user's 6.3k char HEARTBEAT.md was truncated to 899 chars, and the agent only ever saw the philosophy header, never the actual health check instructions.
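The scale of the loss in that report can be estimated with a small helper. This is a sketch under stated assumptions: 6300 stands in for the approximate "6.3k" figure, and the function names and message format are hypothetical, not the wording the doctor check actually prints.

```typescript
// Illustrative helpers; names and message format are assumptions.

/** Percentage of a file's characters that never reach the system prompt. */
function truncationPercent(actualChars: number, keptChars: number): number {
  if (actualChars <= 0 || keptChars >= actualChars) {
    return 0;
  }
  return Math.round(((actualChars - keptChars) / actualChars) * 100);
}

/** Builds a one-line summary of how much of a file was dropped. */
function describeTruncation(name: string, actualChars: number, keptChars: number): string {
  const pct = truncationPercent(actualChars, keptChars);
  return `${name}: ${actualChars} chars, ${keptChars} kept (~${pct}% truncated)`;
}

// Roughly the case from the linked issue: about 6300 chars cut down to 899.
const heartbeatSummary = describeTruncation("HEARTBEAT.md", 6300, 899);
```

With those approximate inputs, about 86% of the file is lost, which is consistent with the agent seeing only the header.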
Solution
New `noteBootstrapFileSize` check in `openclaw doctor` that:
- checks each file against `bootstrapMaxChars` (per-file limit)
- checks the combined size against `bootstrapTotalMaxChars`
Example output
Changes
- `src/commands/doctor-bootstrap-size.ts`: the check
- `src/commands/doctor-bootstrap-size.test.ts`: 6 tests
- `src/commands/doctor.ts`: register check after workspace status
Tests
6 tests: within-limits (no warn), per-file warning, total warning, missing workspace, empty workspace, default limit config hint.
Related
Addresses #17052