fix(agents): unify Think/Reasoning defaults via shared resolveReasoningDefault helper#37940
fix(agents): unify Think/Reasoning defaults via shared resolveReasoningDefault helper#37940who96 wants to merge 8 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR unifies the Key changes:
Confidence Score: 5/5
Last reviewed commit: 69a12d8 |
src/auto-reply/status.ts
Outdated
| const reasoningLevel = | ||
| args.resolvedReasoning ?? | ||
| (args.config ? resolveReasoningDefault({ cfg: args.config, provider, model }) : "off"); |
There was a problem hiding this comment.
resolveReasoningDefault called without catalog
buildStatusMessage calls resolveReasoningDefault without the catalog parameter. The function first checks the catalog for a matching entry with reasoning === true, then falls back to the user-config path (cfg.models.providers.[p].models.[m].reasoning). Since no catalog is passed here, the catalog-based detection is silently skipped.
This means: for any reasoning-capable model whose capability is declared only in the bundled model catalog (but not explicitly set in the operator's config), buildStatusMessage will display Reasoning: off even when the model is actually a reasoning model. The /status command path (buildStatusReply → resolveDefaultReasoningLevel) correctly loads the catalog and avoids this problem; this synchronous path does not.
The new test in status.test.ts only covers the case where reasoning: true is set in the user config, so it doesn't catch this gap.
A straightforward fix is to add an optional catalog?: ModelCatalogEntry[] field to StatusArgs and thread it through:
// StatusArgs
catalog?: ModelCatalogEntry[];
// buildStatusMessage
const reasoningLevel =
args.resolvedReasoning ??
(args.config
? resolveReasoningDefault({ cfg: args.config, provider, model, catalog: args.catalog })
: "off");Callers that already have a catalog available (e.g. the session_status tool) can then pass it in, and the synchronous path will behave consistently with the async command path.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/status.ts
Line: 383-385
Comment:
**`resolveReasoningDefault` called without catalog**
`buildStatusMessage` calls `resolveReasoningDefault` without the `catalog` parameter. The function first checks the catalog for a matching entry with `reasoning === true`, then falls back to the user-config path (`cfg.models.providers.[p].models.[m].reasoning`). Since no catalog is passed here, the catalog-based detection is silently skipped.
This means: for any reasoning-capable model whose capability is declared only in the bundled model catalog (but not explicitly set in the operator's config), `buildStatusMessage` will display `Reasoning: off` even when the model is actually a reasoning model. The `/status` command path (`buildStatusReply` → `resolveDefaultReasoningLevel`) correctly loads the catalog and avoids this problem; this synchronous path does not.
The new test in `status.test.ts` only covers the case where `reasoning: true` is set in the user config, so it doesn't catch this gap.
A straightforward fix is to add an optional `catalog?: ModelCatalogEntry[]` field to `StatusArgs` and thread it through:
```typescript
// StatusArgs
catalog?: ModelCatalogEntry[];
// buildStatusMessage
const reasoningLevel =
args.resolvedReasoning ??
(args.config
? resolveReasoningDefault({ cfg: args.config, provider, model, catalog: args.catalog })
: "off");
```
Callers that already have a catalog available (e.g. the `session_status` tool) can then pass it in, and the synchronous path will behave consistently with the async command path.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Addressed in a63ceac155acfbb9725922526b215fd39e24b2c1. That commit threads catalog-backed reasoning defaults through the synchronous status and session_status path so the sync surface now matches the async command path.
|
@greptile-apps review - New commit 9996058 addresses the catalog parameter gap identified in the initial review. buildStatusMessage now accepts an optional catalog field via StatusArgs and passes it to resolveReasoningDefault, covering the synchronous path. A catalog-only reasoning model test was also added. |
src/agents/model-selection.ts
Outdated
| const candidate = params.catalog?.find( | ||
| (entry) => entry.provider === params.provider && entry.id === params.model, | ||
| ); | ||
| if (candidate?.reasoning === true) { | ||
| return "on"; | ||
| } | ||
| const configured = | ||
| params.cfg?.models?.providers?.[params.provider]?.models?.find( | ||
| (entry) => entry.id === params.model, | ||
| )?.reasoning ?? false; | ||
| return configured ? "on" : "off"; |
There was a problem hiding this comment.
Catalog takes priority over operator config — user reasoning: false silently ignored
resolveReasoningDefault checks the bundled catalog first. If the catalog has an entry with reasoning: true, the function returns "on" immediately and never consults the operator's per-model reasoning flag. This means an operator who explicitly sets reasoning: false in models.providers.[provider].models to opt a specific model out of the reasoning default will have that setting silently ignored whenever the model also appears in the catalog with reasoning: true.
This is the opposite priority of resolveThinkingDefault, which checks cfg.agents.defaults.thinkingDefault (i.e., operator config) first and only falls through to the catalog as a secondary signal.
For consistent and predictable behaviour, operator-declared per-model config should win over catalog defaults:
export function resolveReasoningDefault(params: {
cfg?: OpenClawConfig;
provider: string;
model: string;
catalog?: ModelCatalogEntry[];
}): "on" | "off" {
// User config takes priority over catalog defaults
const configuredEntry =
params.cfg?.models?.providers?.[params.provider]?.models?.find(
(entry) => entry.id === params.model,
);
if (configuredEntry !== undefined && "reasoning" in configuredEntry) {
return configuredEntry.reasoning ? "on" : "off";
}
const candidate = params.catalog?.find(
(entry) => entry.provider === params.provider && entry.id === params.model,
);
if (candidate?.reasoning === true) {
return "on";
}
return "off";
}
The existing test in status.test.ts ("shows reasoning as on for reasoning-capable models when not explicitly overridden") only covers the case where the catalog entry is absent and user config has reasoning: true, so the conflict scenario is not currently caught by the test suite.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/model-selection.ts
Line: 428-438
Comment:
**Catalog takes priority over operator config — user `reasoning: false` silently ignored**
`resolveReasoningDefault` checks the bundled catalog **first**. If the catalog has an entry with `reasoning: true`, the function returns `"on"` immediately and never consults the operator's per-model `reasoning` flag. This means an operator who explicitly sets `reasoning: false` in `models.providers.[provider].models` to opt a specific model out of the reasoning default will have that setting silently ignored whenever the model also appears in the catalog with `reasoning: true`.
This is the **opposite priority** of `resolveThinkingDefault`, which checks `cfg.agents.defaults.thinkingDefault` (i.e., operator config) first and only falls through to the catalog as a secondary signal.
For consistent and predictable behaviour, operator-declared per-model config should win over catalog defaults:
```
export function resolveReasoningDefault(params: {
cfg?: OpenClawConfig;
provider: string;
model: string;
catalog?: ModelCatalogEntry[];
}): "on" | "off" {
// User config takes priority over catalog defaults
const configuredEntry =
params.cfg?.models?.providers?.[params.provider]?.models?.find(
(entry) => entry.id === params.model,
);
if (configuredEntry !== undefined && "reasoning" in configuredEntry) {
return configuredEntry.reasoning ? "on" : "off";
}
const candidate = params.catalog?.find(
(entry) => entry.provider === params.provider && entry.id === params.model,
);
if (candidate?.reasoning === true) {
return "on";
}
return "off";
}
```
The existing test in `status.test.ts` ("shows reasoning as on for reasoning-capable models when not explicitly overridden") only covers the case where the catalog entry is absent and user config has `reasoning: true`, so the conflict scenario is not currently caught by the test suite.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Addressed in a63ceac155acfbb9725922526b215fd39e24b2c1. That change makes explicit operator config win before catalog defaults, so an explicit reasoning: false override is preserved.
|
@greptile-apps review - New commit a63ceac addresses the priority ordering issue: resolveReasoningDefault now checks operator config first, then falls back to catalog, matching resolveThinkingDefault's priority. Added test for config reasoning:false overriding catalog reasoning:true. |
a86667e to
b02bc47
Compare
|
@scoootscooob Would appreciate a quick look when you have a moment — happy to adjust if the approach needs changes. Thanks! |
fd998fa to
0888164
Compare
|
@tyler6204 @hydro13 Friendly ping — this PR has been open for 9 days with no human review. Bot reviews (Greptile 4/5) indicate safe to merge. Happy to adjust if the approach needs changes. Thanks! |
|
@greptileai review |
| const currentReasoningLevel = | ||
| resolvedCurrentReasoningLevel ?? (await modelState.resolveDefaultReasoningLevel()); |
There was a problem hiding this comment.
resolveDefaultReasoningLevel fallback is unreachable dead code
resolveCurrentDirectiveLevels (in directive-handling.levels.ts) returns currentReasoningLevel typed as ReasoningLevel — never undefined. When the session has no explicit reasoningLevel, it fills in "off" as the hardcoded default (line 34–35 of directive-handling.levels.ts):
// directive-handling.levels.ts – BEFORE this PR
const currentReasoningLevel =
(params.sessionEntry?.reasoningLevel as ReasoningLevel | undefined) ?? "off";Because the return type is ReasoningLevel (a union of non-empty strings, all truthy), the ?? guard on the consuming side:
const currentReasoningLevel =
resolvedCurrentReasoningLevel ?? (await modelState.resolveDefaultReasoningLevel());…can never evaluate the right-hand branch — "off" is a truthy string, not null/undefined. resolveDefaultReasoningLevel() is dead code here.
For the directive-only path (an inline +status directive embedded in a regular message), sessions without an explicit reasoningLevel will therefore always display Reasoning: off, even when the model catalog marks the model as reasoning: true. This is exactly the bug the PR intends to fix, but the fix doesn't reach this code path.
The root fix needs to land in resolveCurrentDirectiveLevels — change its return type to currentReasoningLevel: ReasoningLevel | undefined and remove the ?? "off" default there, then let the caller apply the catalog fallback:
// directive-handling.levels.ts
const currentReasoningLevel =
params.sessionEntry?.reasoningLevel as ReasoningLevel | undefined;
// return it as ReasoningLevel | undefined — callers apply the catalog defaultWith that change the ?? guard in get-reply-directives-apply.ts would fire correctly when the session has no persisted value, and resolveDefaultReasoningLevel() would supply the catalog-backed default.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/get-reply-directives-apply.ts
Line: 163-164
Comment:
**`resolveDefaultReasoningLevel` fallback is unreachable dead code**
`resolveCurrentDirectiveLevels` (in `directive-handling.levels.ts`) returns `currentReasoningLevel` typed as `ReasoningLevel` — never `undefined`. When the session has no explicit `reasoningLevel`, it fills in `"off"` as the hardcoded default (line 34–35 of `directive-handling.levels.ts`):
```ts
// directive-handling.levels.ts – BEFORE this PR
const currentReasoningLevel =
(params.sessionEntry?.reasoningLevel as ReasoningLevel | undefined) ?? "off";
```
Because the return type is `ReasoningLevel` (a union of non-empty strings, all truthy), the `??` guard on the consuming side:
```ts
const currentReasoningLevel =
resolvedCurrentReasoningLevel ?? (await modelState.resolveDefaultReasoningLevel());
```
…can **never** evaluate the right-hand branch — `"off"` is a truthy string, not `null`/`undefined`. `resolveDefaultReasoningLevel()` is dead code here.
For the directive-only path (an inline `+status` directive embedded in a regular message), sessions without an explicit `reasoningLevel` will therefore always display `Reasoning: off`, even when the model catalog marks the model as `reasoning: true`. This is exactly the bug the PR intends to fix, but the fix doesn't reach this code path.
The root fix needs to land in `resolveCurrentDirectiveLevels` — change its return type to `currentReasoningLevel: ReasoningLevel | undefined` and remove the `?? "off"` default there, then let the caller apply the catalog fallback:
```ts
// directive-handling.levels.ts
const currentReasoningLevel =
params.sessionEntry?.reasoningLevel as ReasoningLevel | undefined;
// return it as ReasoningLevel | undefined — callers apply the catalog default
```
With that change the `??` guard in `get-reply-directives-apply.ts` would fire correctly when the session has no persisted value, and `resolveDefaultReasoningLevel()` would supply the catalog-backed default.
How can I resolve this? If you propose a fix, please make it concise.| expect(normalizeModelSelection(null)).toBeUndefined(); | ||
| expect(normalizeModelSelection(42)).toBeUndefined(); | ||
| }); | ||
|
|
There was a problem hiding this comment.
resolveReasoningDefault tests nested inside wrong describe block
The new describe("resolveReasoningDefault", ...) block is placed as a nested child of describe("normalizeModelSelection", ...). These are unrelated utilities, so test output (and IDE tree views) will show the new tests under the normalizeModelSelection suite, making failures harder to locate.
Consider promoting it to a top-level describe, or nesting it under an existing block that is specifically about model-selection helpers (there are several at the top of the file):
| }); | |
| describe("resolveReasoningDefault", () => { | |
| it("uses explicit config reasoning false before catalog reasoning true", () => { |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/model-selection.test.ts
Line: 852
Comment:
**`resolveReasoningDefault` tests nested inside wrong describe block**
The new `describe("resolveReasoningDefault", ...)` block is placed as a nested child of `describe("normalizeModelSelection", ...)`. These are unrelated utilities, so test output (and IDE tree views) will show the new tests under the `normalizeModelSelection` suite, making failures harder to locate.
Consider promoting it to a top-level describe, or nesting it under an existing block that is specifically about model-selection helpers (there are several at the top of the file):
```suggestion
});
describe("resolveReasoningDefault", () => {
it("uses explicit config reasoning false before catalog reasoning true", () => {
```
How can I resolve this? If you propose a fix, please make it concise.6bffec6 to
7117163
Compare
|
Both issues from the last review have been addressed:
@greptileai review |
|
Both issues from the last review have been addressed:
@greptileai review |
| const resolveDefaultReasoningLevel = async (): Promise<"on" | "off"> => { | ||
| if (defaultReasoningLevel) { | ||
| return defaultReasoningLevel; |
There was a problem hiding this comment.
Caching guard won't short-circuit for "off" — but only on first call
if (defaultReasoningLevel) is truthy for both "on" and "off" (non-empty strings), so the cache works correctly once a value has been stored. However, because the result of resolveReasoningDefault is synchronous, two concurrent await resolveDefaultReasoningLevel() calls that both see defaultReasoningLevel === undefined will both fall through and call loadModelCatalog a second time before the first write lands. This is an inherent race in the memoization pattern (also present in resolveDefaultThinkingLevel), but it's worth keeping in mind if loadModelCatalog ever becomes expensive or stateful.
No change is required for correctness; the worst case is a redundant catalog load that produces the same value.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/model-selection.ts
Line: 406-408
Comment:
**Caching guard won't short-circuit for `"off"` — but only on first call**
`if (defaultReasoningLevel)` is truthy for both `"on"` and `"off"` (non-empty strings), so the cache works correctly once a value has been stored. However, because the result of `resolveReasoningDefault` is synchronous, two concurrent `await resolveDefaultReasoningLevel()` calls that both see `defaultReasoningLevel === undefined` will both fall through and call `loadModelCatalog` a second time before the first write lands. This is an inherent race in the memoization pattern (also present in `resolveDefaultThinkingLevel`), but it's worth keeping in mind if `loadModelCatalog` ever becomes expensive or stateful.
No change is required for correctness; the worst case is a redundant catalog load that produces the same value.
How can I resolve this? If you propose a fix, please make it concise.|
Addressed both remaining observations:
@greptileai review |
|
@tyler6204 @hydro13 Friendly ping — this PR has been open for 8 days with no human review. All CI checks pass and Greptile gives 5/5 confidence. Would appreciate a review when you have a moment. Thanks! |
69a12d8 to
605a0bc
Compare
|
Rebased onto latest main (c4265a5) to resolve merge conflicts. Two earlier commits dropped as already obsoleted by upstream changes. All 5 substantive fix commits preserved. CI re-running. |
605a0bc to
12581b8
Compare
12581b8 to
7ce0cdd
Compare
|
Addressing remaining bot review conversations: Re: Re: Re: |
6b1d7ff to
ff8ab83
Compare
…evel | undefined Remove the hardcoded ?? "off" fallback in directive-handling.levels.ts so that the resolveDefaultReasoningLevel() fallback in get-reply-directives-apply.ts is actually reachable for sessions without an explicit reasoningLevel. Also promote the resolveReasoningDefault test suite to a top-level describe block so test output correctly attributes failures.
1. resolveReasoningDefault: remove unreachable entry.provider === key branch and the now-unused key variable. 2. resolveDefaultReasoningLevel: cache the in-flight Promise so concurrent callers share the same loadModelCatalog call instead of each triggering a redundant load.
ff8ab83 to
ac9f378
Compare
User Impact
Users asking
/statusorsession_statusabout a reasoning-capable model could see contradictory state such asThink: mediumwhileReasoning: offwhen no explicitreasoningLevelwas persisted. This fix makes both status surfaces report the same default reasoning state.Problem
OpenClaw already had a consistent default for
Think, butReasoningstill fell back tooffin several reply/status paths whenever the session entry did not carry an explicitreasoningLevel. In practice that produced contradictory state cards such asThink: mediumwhileReasoningwas treated as off. That mismatch then leaked into agent self-reports and confused operators about whether thinking was actually enabled.What this PR changes
resolveReasoningDefault()helper next to the existing thinking-default logicsession_statusand to interpret enabled/disabled fromThinkrather thanReasoningValidation
corepack pnpm exec vitest run src/agents/system-prompt.test.ts src/auto-reply/status.test.ts src/auto-reply/reply/commands-status.test.ts src/auto-reply/reply/commands.test.ts src/auto-reply/reply/commands-approve.test.ts src/auto-reply/reply/commands-parsing.test.ts src/auto-reply/reply/commands-policy.test.tscorepack pnpm buildAI Disclosure