feat(doctor): warn when workspace bootstrap files exceed size limits #32654

Closed
Asclepeion wants to merge 2 commits into openclaw:main from Asclepeion:feat/doctor-bootstrap-size

Conversation

@Asclepeion

Summary

Adds an openclaw doctor check that warns when workspace bootstrap files will be silently truncated in the system prompt.

Problem

Bootstrap files (AGENTS.md, SOUL.md, HEARTBEAT.md, etc.) are injected into the system prompt with a per-file character limit (default 20k) and a total limit (default 150k). When files exceed these limits, they're silently truncated — the agent receives a partial file with no indication that content was lost. Users have to discover this by debugging why their agent ignores instructions.

See #17052: a user's 6.3k char HEARTBEAT.md was truncated to 899 chars, and the agent only ever saw the philosophy header, never the actual health check instructions.

Solution

New noteBootstrapFileSize check in openclaw doctor that:

  1. Scans all workspace bootstrap files
  2. Compares each against bootstrapMaxChars (per-file limit)
  3. Compares total against bootstrapTotalMaxChars
  4. Warns with file name, actual size, limit, and truncation percentage
  5. Suggests the config key to increase limits

Example output

─── Bootstrap file size ──────────────────────
Workspace bootstrap files exceed size limits and will be truncated:
- AGENTS.md: 25.0k chars (limit 20.0k). ~20% will be truncated in system prompt.
- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.
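The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in doctor-bootstrap-size.ts: the `BootstrapFile` and `SizeLimits` shapes, `formatK`, and `collectSizeWarnings` are hypothetical names, and the wording of the total-size line is invented for the sketch. Like the snippet under review, it sums raw file lengths for the total.

```typescript
// Hypothetical shapes; the real check may be structured differently.
interface BootstrapFile {
  name: string;
  content: string;
}

interface SizeLimits {
  maxChars: number; // per-file limit (bootstrapMaxChars)
  totalMaxChars: number; // combined limit (bootstrapTotalMaxChars)
}

// Format a character count like the example output: 25000 -> "25.0k".
function formatK(chars: number): string {
  return `${(chars / 1000).toFixed(1)}k`;
}

// Compare each file against the per-file limit and the sum against the total limit.
function collectSizeWarnings(files: BootstrapFile[], limits: SizeLimits): string[] {
  const warnings: string[] = [];
  let totalChars = 0;
  for (const file of files) {
    const fileChars = file.content.length;
    totalChars += fileChars;
    if (fileChars > limits.maxChars) {
      // Share of the file that falls beyond the per-file limit.
      const lostPct = Math.round(((fileChars - limits.maxChars) / fileChars) * 100);
      warnings.push(
        `${file.name}: ${formatK(fileChars)} chars (limit ${formatK(limits.maxChars)}). ` +
          `~${lostPct}% will be truncated in system prompt.`,
      );
    }
  }
  if (totalChars > limits.totalMaxChars) {
    // Wording of this line is an assumption made for the sketch.
    warnings.push(`Total: ${formatK(totalChars)} chars (limit ${formatK(limits.totalMaxChars)}).`);
  }
  return warnings;
}
```

With a 25,000-character AGENTS.md and the default limits quoted in the description (20k per file, 150k total), this yields the single per-file line shown in the example output.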

Changes

  • New: src/commands/doctor-bootstrap-size.ts — the check
  • New: src/commands/doctor-bootstrap-size.test.ts — 6 tests
  • src/commands/doctor.ts — register check after workspace status

Tests

6 tests: within-limits (no warn), per-file warning, total warning, missing workspace, empty workspace, default limit config hint.

Related

Addresses #17052

Asclepeion and others added 2 commits March 1, 2026 12:47
When agents produce text between tool calls, each text block gets
delivered as a separate chat message ('narration leak'). This adds
a [THINK] prefix that agents can use to mark internal reasoning:

- [THINK] reasoning... → suppressed entirely
- [THINK] reasoning [/THINK] actual reply → delivers only 'actual reply'

The filter is applied at the normalizeReplyPayload layer, alongside
the existing NO_REPLY/HEARTBEAT_OK suppression. Case-insensitive,
tolerates leading whitespace, only matches at text start.

Includes 16 new tests (10 for stripThinkPrefix, 6 for normalize integration).
All existing tests pass with zero regressions.

Adds a new doctor check that scans workspace bootstrap files (AGENTS.md,
SOUL.md, HEARTBEAT.md, etc.) against the configured per-file and total
character limits. When files will be truncated in the system prompt,
the check warns with:

- File name, actual size, limit, and truncation percentage
- Total size warning when all files combined exceed total limit
- Config hint suggesting bootstrapMaxChars / bootstrapTotalMaxChars

This makes the silent truncation visible during setup, so users
discover the issue from `openclaw doctor` instead of debugging why
their agent ignores instructions buried in truncated files.

6 new tests covering: within-limits, per-file warning, total warning,
missing workspace, empty workspace, default limit config hint.

Addresses openclaw#17052
@openclaw-barnacle (bot) added the "commands" (Command implementations) and "size: M" labels on Mar 3, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 58f9d3ff80


Comment on lines +46 to +47
const thinkResult = stripThinkPrefix(text);
if (thinkResult.hadThinkPrefix) {

P1: Preserve THINK suppression state across streamed chunks

This check only strips when the current payload starts with [THINK], but normalizeReplyPayload is invoked independently for each queued block/tool/final payload. In streamed block replies, a long think block can be split so the first chunk starts with [THINK] (dropped) while subsequent chunks no longer start with the prefix and get delivered, leaking internal reasoning (including text before a later [/THINK]). This needs stream-level/stateful suppression until the closing tag is seen.
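One way to picture the stateful suppression described here is a small filter object that remembers whether an unclosed think block is in progress. The sketch below is purely illustrative: `ThinkStreamFilter` is a hypothetical name, it is not wired into normalizeReplyPayload, and it does not handle a closing tag that is itself split across two chunks.

```typescript
// Hypothetical stateful filter; not the project's actual implementation.
const THINK_OPEN_RE = /^\s*\[THINK\]/i; // prefix must be at the start of the chunk
const THINK_CLOSE_RE = /\[\/THINK\]/i;

class ThinkStreamFilter {
  private insideThink = false;

  // Returns the text that is safe to deliver for this chunk ("" if none).
  push(chunk: string): string {
    let rest = chunk;
    if (!this.insideThink) {
      const open = THINK_OPEN_RE.exec(rest);
      if (!open) return chunk; // ordinary text, deliver unchanged
      this.insideThink = true;
      rest = rest.slice(open[0].length);
    }
    const close = THINK_CLOSE_RE.exec(rest);
    if (!close) return ""; // still inside the think block, suppress everything
    this.insideThink = false;
    return rest.slice(close.index + close[0].length).trim();
  }
}

const filter = new ThinkStreamFilter();
const delivered = [
  filter.push("[THINK] plan the"),
  filter.push(" next step [/THINK] Done."),
  filter.push("More text"),
];
console.log(delivered);
```

The first chunk opens the block and is dropped, the second is suppressed up to the closing tag, and the third passes through untouched because the state has been reset.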


continue;
}
const fileChars = file.content.length;
totalChars += fileChars;

P2: Base total-size warning on injected bootstrap chars

The total warning uses raw file lengths (totalChars += fileChars) and compares them to bootstrapTotalMaxChars, but runtime budgeting applies per-file truncation before consuming the total budget in buildBootstrapContextFiles. That mismatch can produce false "Total ... content will be lost" warnings (for example, a single huge file that is already truncated by the per-file cap and never reaches the total cap), which points users at the wrong config knob.
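A rough sketch of the distinction, assuming that per-file truncation leaves at most bootstrapMaxChars characters of each file (any truncation marker the runtime may append is ignored here). `SizedFile`, `rawTotalChars`, and `injectedTotalChars` are hypothetical names, not functions from the codebase.

```typescript
// Hypothetical file shape used only for this sketch.
interface SizedFile {
  name: string;
  content: string;
}

// Raw total: what the reviewed check compares against the total limit.
function rawTotalChars(files: SizedFile[]): number {
  return files.reduce((sum, f) => sum + f.content.length, 0);
}

// Injected total: each file is capped at maxChars before it counts
// toward the total budget (assumed runtime behavior).
function injectedTotalChars(files: SizedFile[], maxChars: number): number {
  return files.reduce((sum, f) => sum + Math.min(f.content.length, maxChars), 0);
}

const files: SizedFile[] = [{ name: "AGENTS.md", content: "a".repeat(200_000) }];
const maxChars = 20_000;
const totalMaxChars = 150_000;

console.log(rawTotalChars(files) > totalMaxChars); // true: raw length trips the total warning
console.log(injectedTotalChars(files, maxChars) > totalMaxChars); // false: only 20,000 chars are injected
```

For the single huge file in the reviewer's example, only the per-file warning is warranted; the raw-length comparison would also raise the total warning.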


@greptile-apps

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR bundles two separate features: (1) the primary openclaw doctor bootstrap-file size check (doctor-bootstrap-size.ts) and (2) a [THINK] prefix filter for suppressing agent internal reasoning in auto-replies (tokens.ts, normalize-reply.ts). The two features are unrelated and would be easier to review and revert independently if they were split into separate PRs.

Key findings:

  • [THINK] prefix filter: Unicode slice bug (src/auto-reply/tokens.ts). stripThinkPrefix derives closeIdx via upper.indexOf(THINK_CLOSE) where upper = trimmed.toUpperCase(). .toUpperCase() can expand certain Unicode code points (ß → SS, ﬁ → FI, etc.), making upper longer than trimmed. Using closeIdx to slice back into trimmed then starts at the wrong offset, silently dropping characters from the content that follows [/THINK]. Switching to a case-insensitive regex (/\[\/THINK\]/i.exec(trimmed)) resolves this cleanly.
  • Bootstrap doctor — missing config hint (src/commands/doctor-bootstrap-size.ts): When a user has already set a custom bootstrapMaxChars and a file still exceeds it (while total is within limits), the configHint evaluates to "". The warning names the offending file but gives no actionable guidance. The hint should be shown regardless of whether the default limit is in use.
  • THINK_FILTER_SUMMARY.md accidentally committed: This root-level file reads as a developer working document (design decisions, test counts, list of modified files) rather than end-user or maintainer documentation, and appears to have been committed inadvertently.

Confidence Score: 3/5

  • Safe to merge the bootstrap-size doctor check; the [THINK] filter has a Unicode slice bug that should be fixed before merging.
  • The bootstrap-size check is well-tested and the logic is sound. The [THINK] filter introduces a real (if edge-case) bug in stripThinkPrefix where the toUpperCase() index is used to slice the original string — this can drop characters when the thinking block contains Unicode that expands on uppercase. The THINK_FILTER_SUMMARY.md file also appears to be an accidental commit. These two concerns lower confidence in the [THINK] filter portion of the PR.
  • src/auto-reply/tokens.ts (Unicode slice bug in stripThinkPrefix), THINK_FILTER_SUMMARY.md (accidental commit)

Last reviewed commit: 58f9d3f


@greptile-apps greptile-apps bot left a comment


8 files reviewed, 3 comments


Comment on lines +40 to +48
  const closeIdx = upper.indexOf(THINK_CLOSE);
  if (closeIdx === -1) {
    // No closing tag: the entire text is think content
    return { text: "", hadThinkPrefix: true };
  }

  // Strip everything from start through the closing tag
  const afterClose = trimmed.slice(closeIdx + THINK_CLOSE.length).trim();
  return { text: afterClose, hadThinkPrefix: true };

Unicode expansion causes wrong slice index

closeIdx is derived from upper.indexOf(THINK_CLOSE) where upper = trimmed.toUpperCase(). .toUpperCase() can expand certain Unicode code points (most notably ß → SS, ﬁ → FI, ﬀ → FF), so upper may be longer than trimmed. Using that index to slice back into trimmed then produces the wrong result.

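Concrete example: for "[THINK] ß [/THINK]reply", ß expands by one code unit in upper, shifting closeIdx by 1, so trimmed.slice(closeIdx + THINK_CLOSE.length) returns "eply" instead of "reply". (If a space follows the closing tag, the trailing .trim() absorbs a one-unit shift, so the bug only surfaces with a larger expansion.)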

The fix is to search directly on trimmed with a case-insensitive regex, which avoids the length mismatch entirely:

Suggested change
- const closeIdx = upper.indexOf(THINK_CLOSE);
- if (closeIdx === -1) {
-   // No closing tag: the entire text is think content
-   return { text: "", hadThinkPrefix: true };
- }
- // Strip everything from start through the closing tag
- const afterClose = trimmed.slice(closeIdx + THINK_CLOSE.length).trim();
- return { text: afterClose, hadThinkPrefix: true };
+ // Look for the closing tag (case-insensitive) using a regex so the match
+ // index refers to `trimmed` directly, avoiding toUpperCase() length expansion.
+ const closeMatch = /\[\/THINK\]/i.exec(trimmed);
+ if (!closeMatch) {
+   // No closing tag: the entire text is think content
+   return { text: "", hadThinkPrefix: true };
+ }
+ // Strip everything from start through the closing tag
+ const afterClose = trimmed.slice(closeMatch.index + closeMatch[0].length).trim();
+ return { text: afterClose, hadThinkPrefix: true };
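The length mismatch can be reproduced in isolation. The sketch below sets the two lookup strategies side by side; `afterCloseViaUpper` and `afterCloseViaRegex` are hypothetical helpers written for this comparison, not functions from tokens.ts. The input uses two ß characters so that the resulting two-unit shift is not absorbed by the trailing .trim().

```typescript
const THINK_CLOSE_TAG = "[/THINK]";

// Index taken from the uppercased copy, then applied to the original string
// (the pattern flagged in this review comment).
function afterCloseViaUpper(trimmed: string): string {
  const closeIdx = trimmed.toUpperCase().indexOf(THINK_CLOSE_TAG);
  if (closeIdx === -1) return "";
  return trimmed.slice(closeIdx + THINK_CLOSE_TAG.length).trim();
}

// Case-insensitive regex: the match index refers to `trimmed` itself.
function afterCloseViaRegex(trimmed: string): string {
  const closeMatch = /\[\/THINK\]/i.exec(trimmed);
  if (!closeMatch) return "";
  return trimmed.slice(closeMatch.index + closeMatch[0].length).trim();
}

const input = "[THINK] ßß [/THINK] reply";
console.log("ß".toUpperCase().length); // 2: one code unit becomes two
console.log(afterCloseViaUpper(input)); // "eply"
console.log(afterCloseViaRegex(input)); // "reply"
```

Because the regex runs against the original string, no index translation between the two strings is needed.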

Comment on lines +73 to +78
  const configHint =
    maxChars === DEFAULT_BOOTSTRAP_MAX_CHARS && totalChars <= totalMaxChars
      ? `\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.`
      : totalChars > totalMaxChars
        ? `\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.`
        : "";

Config hint missing when custom maxChars still exceeded

The first condition in the ternary is maxChars === DEFAULT_BOOTSTRAP_MAX_CHARS && totalChars <= totalMaxChars, which means the per-file tip is only shown when the user is using the default limit. If they've already set a custom bootstrapMaxChars and a file still exceeds it (but the total is within limits), the configHint falls through to "" — the warning names the offending file but gives no guidance on what config key to change.

Consider providing the per-file tip whenever only per-file warnings are present, regardless of whether the limit was customized:

Suggested change
- const configHint =
-   maxChars === DEFAULT_BOOTSTRAP_MAX_CHARS && totalChars <= totalMaxChars
-     ? `\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.`
-     : totalChars > totalMaxChars
-       ? `\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.`
-       : "";
+ const hasPerFileWarning = warnings.some((w) => !w.includes("Total:"));
+ const configHint =
+   totalChars > totalMaxChars
+     ? `\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.`
+     : hasPerFileWarning
+       ? `\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.`
+       : "";
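The same selection logic can be written as a small pure function, which makes the three outcomes easy to test. This is only a sketch: `pickConfigHint` is a hypothetical helper, and it takes two booleans rather than inspecting warning strings for "Total:" as the suggestion above does.

```typescript
// Hypothetical helper: choose the config tip from two explicit flags.
function pickConfigHint(perFileExceeded: boolean, totalExceeded: boolean): string {
  if (totalExceeded) {
    // The total limit takes priority, mirroring the suggested ternary.
    return "\n- Tip: increase total limit with agents.defaults.bootstrapTotalMaxChars in config, or trim files.";
  }
  if (perFileExceeded) {
    // Shown whether or not bootstrapMaxChars was customized.
    return "\n- Tip: increase per-file limit with agents.defaults.bootstrapMaxChars in config.";
  }
  return "";
}

console.log(pickConfigHint(true, false));
console.log(pickConfigHint(true, true));
```

Passing flags avoids coupling the hint to the exact wording of the warning lines, at the cost of tracking the flags while the warnings are collected.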

Comment on lines +1 to +5
# [THINK] Prefix Filter — Implementation Summary

## Problem

When AI agents process multi-step tasks, text between tool calls gets delivered as separate chat messages ("narration leak"). Agents can't reliably suppress internal reasoning because `NO_REPLY` only works for the final text segment — intermediate text blocks still get sent.

Developer scratch file committed to repo

THINK_FILTER_SUMMARY.md reads as an internal implementation summary written during development (design decisions, test counts, a list of files modified). This kind of document is not reference documentation for end-users or maintainers and looks like it was accidentally staged alongside the actual code changes. If the intent is to document the [THINK] prefix feature, the content should live in the official docs rather than as a root-level markdown file.

