
fix(doctor): guard apiKey trim against SecretRef objects #35456

Closed
lisitan wants to merge 10 commits into openclaw:main from lisitan:fix/doctor-secretref-apiKey

Conversation

Contributor

@lisitan lisitan commented Mar 5, 2026

Issue #35444: openclaw doctor crashes with a TypeError on SecretRef apiKey values.

When apiKey is a SecretRef object (an env-var reference), calling .trim() on it throws a TypeError. The fix checks typeof apiKey === "string" before trimming.

沐沐 and others added 10 commits March 5, 2026 11:17
When a skill's SKILL.md frontmatter has name: 12306 (without quotes),
YAML parses it as a number instead of a string. This causes
TypeError when calling name.startsWith() or name.trim() on a number.

This fix adds String() coercion in three places:
- bundled-context.ts: skill.name.trim() check
- config.ts: allowlist.includes() check
- workspace.ts: normalized.includes() check

Also adds comprehensive test coverage for numeric skill names.

Note: The underlying @mariozechner/pi-coding-agent package has the
same issue in validateName() function. Users should quote numeric
skill names in YAML (name: "12306") until that package is fixed.

Fixes openclaw#35252

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <[email protected]>
Co-Authored-By: Happy <[email protected]>
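The coercion pattern this commit describes can be sketched in isolation. `Frontmatter` and `normalizeSkillName` below are illustrative names, not the PR's actual code in bundled-context.ts, config.ts, or workspace.ts:

```typescript
// Illustrative sketch of the String() coercion described above; the names
// here are hypothetical stand-ins for the real call sites.
type Frontmatter = { name: unknown };

function normalizeSkillName(skill: Frontmatter): string {
  // YAML parses `name: 12306` (unquoted) as the number 12306; calling
  // .trim() or .startsWith() on a number throws, so coerce to string first.
  return String(skill.name).trim();
}

console.log(normalizeSkillName({ name: 12306 }));    // "12306"
console.log(normalizeSkillName({ name: "  web  " })); // "web"
```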
When blockStreaming is enabled, the deltaBuffer was being reset at
message_start without preserving content from the previous turn. This
caused text from consecutive API turns (e.g., before and after a tool
call) to be concatenated without any separator, producing garbled
output like "...with durableNO Fresh session...".

This fix adds a "\n\n" separator before resetting the deltaBuffer if
it contains content from a previous turn. The separator is added to
both deltaBuffer and blockBuffer (or blockChunker) to ensure proper
text separation in the streaming UI.

After a page refresh, the history renders correctly, confirming this
is purely a streaming display issue.

Fixes openclaw#35308

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <[email protected]>
Co-Authored-By: Happy <[email protected]>
The previous fix added \n\n before resetAssistantMessageState(),
which immediately cleared the separator.

This fix:
1. Captures if separator is needed before reset
2. Calls resetAssistantMessageState() to clear buffers
3. Adds separator AFTER reset

Fixes PR openclaw#35344 review feedback
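The corrected ordering (capture, then reset, then seed) can be sketched with simplified state. `StreamState` and `onMessageStart` are stand-ins; the real ctx and resetAssistantMessageState carry more fields:

```typescript
// Minimal sketch of the capture → reset → seed ordering described above.
interface StreamState {
  deltaBuffer: string;
  blockBuffer: string;
}

function onMessageStart(state: StreamState): void {
  // 1. Capture whether the previous turn left content behind.
  const needsSeparator = state.deltaBuffer.trim().length > 0;
  // 2. Reset clears both buffers.
  state.deltaBuffer = "";
  state.blockBuffer = "";
  // 3. Seed the cross-turn separator only AFTER the reset, so it survives.
  if (needsSeparator) {
    state.deltaBuffer = "\n\n";
    state.blockBuffer = "\n\n";
  }
}
```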
When buffer starts with paragraph break (index === 0), keep it
instead of stripping it. This ensures cross-turn separator
is prepended to next paragraph in flushOnParagraph mode.

Added test case for cross-turn separator preservation.

Fixes PR openclaw#35344 review feedback
Issue openclaw#35397: setup-podman.sh was not obeying OPENCLAW_DOCKER_APT_PACKAGES
environment variable. Now passes it as --build-arg to podman build, matching
the behavior of docker-setup.sh.
Issue openclaw#35416: Add null safety checks for model.input which may be
undefined for custom provider models. Uses optional chaining (?.) and
nullish coalescing (??) to prevent TypeError.

Changes:
- list.registry.ts: model.input?.join()
- model-scan.ts: model.input?.includes() and ...(model.input ?? [])
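The null-safety pattern can be illustrated standalone; `ModelInfo` below is a hypothetical shape, not the registry's actual type:

```typescript
// Hypothetical shape: `input` may be undefined for custom provider models.
interface ModelInfo {
  id: string;
  input?: string[];
}

// Mirrors the three call sites listed above, guarded with ?. and ??.
const joinInputs = (m: ModelInfo): string => m.input?.join(", ") ?? "";
const acceptsImage = (m: ModelInfo): boolean => m.input?.includes("image") ?? false;
const copyInputs = (m: ModelInfo): string[] => [...(m.input ?? [])];

const custom: ModelInfo = { id: "my-model" }; // no input field — no TypeError
console.log(joinInputs(custom));   // ""
console.log(acceptsImage(custom)); // false
```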
…w#35428)

Issue openclaw#35428: Control UI Core Files refresh never shows external edits.
Now onLoadFiles handler reloads the active file content with force: true
to fetch the latest version from disk.
…mpatibility

Issue openclaw#35443: llm-task plugin tool parameters missing type field.
Type.Unknown() generates schema without type, which llama.cpp rejects.
Now uses Type.Object({}, { additionalProperties: true }) to ensure type: object.
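The schema difference that llama.cpp cares about can be shown without TypeBox itself. These plain objects approximate what the two builders emit (an assumption for illustration, not TypeBox's verbatim output):

```typescript
// Approximation of the JSON Schema each TypeBox builder produces.
const fromUnknown: Record<string, unknown> = {
  description: "Optional JSON Schema to validate the returned JSON.",
  // no "type" field at all — this is what llama.cpp rejects
};

const fromObject: Record<string, unknown> = {
  type: "object",
  additionalProperties: true,
  description: "Optional JSON Schema to validate the returned JSON.",
};

console.log("type" in fromUnknown); // false
console.log(fromObject.type);       // "object"
```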
…law#35444)

Issue openclaw#35444: openclaw doctor crashes with TypeError on SecretRef apiKey.
When apiKey is a SecretRef object (env var reference), calling .trim()
throws TypeError. Now checks typeof apiKey === "string" first.
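The guard itself is small; the SecretRef shape below is an assumption for illustration (the real type lives in openclaw's config layer):

```typescript
// `SecretRef` is an assumed shape for an env-var reference; the real type
// is defined in openclaw's config code.
type SecretRef = { env: string };
type ApiKeyValue = string | SecretRef | undefined;

// Only trim when the value is actually a string; objects fall through safely.
function hasRemoteApiKey(apiKey: ApiKeyValue): boolean {
  return typeof apiKey === "string" && apiKey.trim().length > 0;
}

console.log(hasRemoteApiKey("sk-123"));              // true
console.log(hasRemoteApiKey("   "));                 // false
console.log(hasRemoteApiKey({ env: "OPENAI_KEY" })); // false — but no crash
```

Note that, as the Codex review below points out, this check treats a SecretRef value as "not configured" rather than crashing on it.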
Contributor

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR bundles the stated doctor fix (guarding apiKey.trim() against SecretRef objects) with several other improvements: defensive optional-chaining on model.input, String() coercion for YAML-parsed numeric skill names, a cross-turn paragraph-separator fix for block streaming, an onLoadFiles content auto-refresh in the UI, and a setup-podman.sh build-arg addition.

However, the PR introduces three significant issues in bundled changes:

  1. UI onLoadFiles callback mismatch: The new callback loads the file list for the provided agentId but then loads file content for resolvedAgentId from the outer closure. If these differ, mismatched file data is displayed.

  2. llm-task type narrowing regression: input and schema parameters were narrowed from Type.Unknown to Type.Object, rejecting arrays at the MCP validation layer. However, the execute handler serializes these as raw JSON and the schema validator explicitly allows non-object values (with !Array.isArray check), indicating arrays were valid callers before.

  3. pi-embedded-subscribe separator race condition: The cross-turn "\n\n" separator written after message_start can be silently overwritten by a late-arriving handleMessageEnd event, which unconditionally resets both deltaBuffer and the block chunker. The code comment acknowledges that late text_end events can arrive after message_start.

Confidence Score: 2/5

  • Not safe to merge as-is — three issues in bundled changes could cause runtime regressions unrelated to the stated apiKey fix.
  • The stated doctor fix (guarding apiKey.trim()) is correct and minimal. However, the PR bundles unrelated changes introducing real issues: the onLoadFiles callback uses the wrong agent ID (causing data mismatches), the input/schema type narrowing breaks array callers (regression), and the separator fix has an unhandled race condition with late events (acknowledged by existing code comments). These bundled issues lower confidence significantly.
  • ui/src/ui/app-render.ts (agentId mismatch), extensions/llm-task/src/llm-task-tool.ts (type narrowing), and src/agents/pi-embedded-subscribe.handlers.messages.ts (separator race condition).

Last reviewed commit: addaabb

Comment on lines +642 to +649
onLoadFiles: async (agentId) => {
  await loadAgentFiles(state, agentId);
  if (state.agentFileActive && resolvedAgentId) {
    await loadAgentFileContent(state, resolvedAgentId, state.agentFileActive, {
      force: true,
    });
  }
},

agentId parameter ignored for content reload

loadAgentFiles is called with the agentId parameter, but the subsequent loadAgentFileContent call uses resolvedAgentId from the outer closure instead. If agentId !== resolvedAgentId (e.g., when reloading files for a non-active agent), the file list and its content would be fetched for two different agents, potentially displaying stale or mismatched content.

Suggested change

onLoadFiles: async (agentId) => {
  await loadAgentFiles(state, agentId);
  if (state.agentFileActive && resolvedAgentId) {
    await loadAgentFileContent(state, resolvedAgentId, state.agentFileActive, {
      force: true,
    });
  }
},

onLoadFiles: async (agentId) => {
  await loadAgentFiles(state, agentId);
  if (state.agentFileActive && agentId) {
    await loadAgentFileContent(state, agentId, state.agentFileActive, {
      force: true,
    });
  }
},

Comment on lines +77 to 91
input: Type.Optional(
  Type.Object(
    {},
    { description: "Optional input payload for the task.", additionalProperties: true },
  ),
),
schema: Type.Optional(
-  Type.Unknown({ description: "Optional JSON Schema to validate the returned JSON." }),
+  Type.Object(
+    {},
+    {
+      description: "Optional JSON Schema to validate the returned JSON.",
+      additionalProperties: true,
+    },
+  ),
),

input and schema type narrowing breaks non-object callers

Both input and schema were changed from Type.Unknown to Type.Object({}, { additionalProperties: true }). TypeBox's Type.Object only accepts plain JavaScript objects — it rejects arrays and primitives at the MCP validation layer.

The execute handler at line 174 (const input = (params as any).input as unknown) and line 177 (JSON.stringify(input ?? null, ...)) are perfectly capable of handling any JSON-serialisable value, including arrays (e.g. input: [1, 2, 3]). The schema change creates a silent inconsistency: callers that were previously accepted (passing an array as input) will now fail validation before reaching execute.

Similarly, schema validation at line 234 explicitly guards !Array.isArray(schema) before using it, implying that non-object values were always expected to be possible.

Consider restoring Type.Unknown for input, or at least using Type.Union([Type.Object({}, { additionalProperties: true }), Type.Array(Type.Unknown())]) if only arrays + objects need to be supported.


Comment on lines +76 to +87
const needsSeparator = ctx.state.deltaBuffer.trim();

ctx.resetAssistantMessageState(ctx.state.assistantTexts.length);
// Use assistant message_start as the earliest "writing" signal for typing.

if (needsSeparator) {
  ctx.state.deltaBuffer = "\n\n";
  if (ctx.blockChunker) {
    ctx.blockChunker.append("\n\n");
  } else {
    ctx.state.blockBuffer = "\n\n";
  }
}

Late handleMessageEnd can clobber the separator written here

ctx.state.deltaBuffer is set to "\n\n" (and ctx.blockChunker.append("\n\n") is called) after the reset. However, handleMessageEnd for the previous message (lines 430–432) unconditionally resets these same fields:

ctx.state.deltaBuffer = "";
ctx.state.blockBuffer = "";
ctx.blockChunker?.reset();

The existing comment in this file explicitly notes that providers may deliver late text_end (and by extension message_end) events after message_start for the next turn. If handleMessageEnd fires after the separator is written here, the "\n\n" separator is silently lost and the original text-concatenation bug (#35308) reappears.

Consider tracking a "separator pending" flag on ctx.state that handleMessageEnd respects, or guard the handleMessageEnd reset so it does not overwrite state that belongs to a newer message (e.g. only clear deltaBuffer if it does not equal "\n\n", or check ctx.state.assistantTexts.length against the baseline).



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: addaabb4fb


Comment on lines +199 to +201
if (paragraphBreak.index === 0) {
  // Buffer starts with separator; keep it and wait for more content
  return;

P1: Keep maxChars fallback reachable after leading separator

In src/agents/pi-embedded-block-chunker.ts (#drainParagraphs), the new early return when paragraphBreak.index === 0 prevents the code from ever reaching the existing max-length fallback path for buffers that start with the injected cross-turn "\n\n" and then continue as a single long paragraph. In that common case, each non-force drain exits immediately, so block streaming stops emitting incremental chunks until a forced flush at message end, which can delay replies and let the buffer grow far beyond maxChars.


Comment on lines +29 to +30
const apiKey = resolved?.remote?.apiKey;
const hasRemoteApiKey = Boolean(typeof apiKey === "string" && apiKey.trim());

P2: Count SecretRef memory apiKey as configured

The new hasRemoteApiKey check only accepts string values, so SecretRef-style memorySearch.remote.apiKey values (the crash scenario described in this commit) are now treated as missing. In noteMemorySearchHealth, that falls through to provider-auth probing, which does not read memorySearch.remote.apiKey, so doctor can emit a false "no API key" warning even when memory search is correctly configured via SecretRef.
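A possible follow-up that addresses this note, sketched under the assumption that a SecretRef is a plain object with an env field (the real shape may differ):

```typescript
// Hypothetical widened check: count SecretRef values as configured while
// still never calling .trim() on a non-string. The SecretRef shape is assumed.
function isApiKeyConfigured(apiKey: unknown): boolean {
  if (typeof apiKey === "string") {
    return apiKey.trim().length > 0;
  }
  // Treat an env-var reference object as configured.
  return typeof apiKey === "object" && apiKey !== null && "env" in apiKey;
}

console.log(isApiKeyConfigured("sk-123"));           // true
console.log(isApiKeyConfigured({ env: "API_KEY" })); // true
console.log(isApiKeyConfigured(undefined));          // false
```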


@joshavant
Contributor

Thanks for the contribution here, and for helping narrow down the SecretRef doctor crash.

We merged #36835, which supersedes this PR and fixes the broader doctor/memory SecretRef surface in one place, including the related runtime embedding paths.

I’m closing this as superseded by #36835, but I appreciate the fix and the investigation that went into it.
