
Onboarding: heuristic vision inputs for non-Azure custom models (#51869) #51965

Open
lurui1997 wants to merge 1 commit into openclaw:main from lurui1997:fix/51869-onboard-custom-vision-heuristic

Conversation

@lurui1997

Summary

See commit message on branch fix/51869-onboard-custom-vision-heuristic.

Test plan

  • CI / targeted tests as noted in commit

Fixes #51869

Made with Cursor

…claw#51869)

Mark common GPT-4o/5, o3/o4, Claude 3+, and Gemini-style model ids as
supporting image input when applying custom OpenAI-compatible provider config.

Made-with: Cursor
@openclaw-barnacle bot added the commands (Command implementations) and size: XS labels on Mar 22, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR adds an isLikelyVisionCapableNonAzureModel heuristic to onboard-custom.ts so that well-known vision-capable model IDs (OpenAI GPT-4o/4.1/turbo/5+, o3/o4, Claude 3/4.5+, Gemini) automatically default to input: ["text", "image"] during fresh onboarding on non-Azure custom providers. Previously, all non-Azure custom models defaulted to input: ["text"], requiring manual config edits for multimodal use.

  • The vision heuristic is correctly scoped to the non-Azure branch of applyCustomApiConfig; existing user-customised model entries are preserved on re-onboard (consistent with prior behaviour).
  • The isClaude45OrHigher helper's third regex alternative (-[5-9](?:\b|[.-])) can false-positive on claude-3-5-* ids due to the non-greedy quantifier consuming 3 and landing on -5. This is currently harmless at the call site (the \bclaude-3\b guard fires first), but the function's stated semantic contract is broken in isolation and could mislead future callers.
  • o3/o4 are now explicitly recognised as vision-capable, but their reasoning flag stays false on the non-Azure path — inconsistent with the Azure branch which already marks them as reasoning models via isLikelyReasoningModel.
  • The single new test exercises the happy path (gpt-4o-mini) but does not cover the Claude, Gemini, o3/o4, or negative branches of the heuristic.
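For context, the heuristic described above would look roughly like this (a sketch: the function name matches the PR, but the patterns are reconstructed from this summary rather than copied from onboard-custom.ts):

```typescript
// Sketch of the vision heuristic described in the summary above.
// The patterns are reconstructions, not the PR's exact regexes.
function isLikelyVisionCapableNonAzureModel(modelId: string): boolean {
  // OpenAI: gpt-4o / gpt-4.1 / gpt-4-turbo, and gpt-5 or later
  if (/\bgpt-4(?:o|\.1|-turbo)\b|\bgpt-(?:[5-9]|\d{2,})\b/i.test(modelId)) return true;
  // OpenAI o-series ids that also accept image input
  if (/\b(?:o3|o4)\b/i.test(modelId)) return true;
  // Anthropic: Claude 3.x, and 4.5 or later
  if (/\bclaude-3\b/i.test(modelId) || /\bclaude-[^\s/]*?4[.-]?5\b/i.test(modelId)) return true;
  // Google: any Gemini-style id
  if (/\bgemini\b/i.test(modelId)) return true;
  return false;
}
```

Anything matching falls through to input: ["text", "image"]; everything else keeps the old input: ["text"] default.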

Confidence Score: 4/5

  • Safe to merge after addressing the isClaude45OrHigher regex false-positive; the o3/o4 reasoning flag inconsistency and thin test coverage are non-blocking follow-ups.
  • The change is small and well-contained in the non-Azure onboarding path. The heuristic logic is correct for all tested cases and the existing re-onboard preservation tests remain green. The isClaude45OrHigher third-alternative false positive is a P1 concern because the function's contract is demonstrably wrong (it matches claude-3-5-sonnet), but it does not produce an incorrect end result today due to the surrounding guard. The o3/o4 reasoning gap and limited test coverage are P2 improvements rather than blockers.
src/commands/onboard-custom.ts: isClaude45OrHigher regex and o3/o4 reasoning handling; src/commands/onboard-custom.test.ts: additional heuristic branch coverage.

Last reviewed commit: "Onboarding: heuristi..."

Comment on lines +56 to +60

```ts
function isClaude45OrHigher(id: string): boolean {
  // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors.
  return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-[5-9](?:\b|[.-]))/i.test(
    id,
  );
```

P1: `isClaude45OrHigher` third alternative matches `claude-3-5-*` models

The third regex alternative `-[5-9](?:\b|[.-])` can fire for `claude-3-5-sonnet` because the non-greedy `[^\s/]*?` consumes `3` and then `-5` satisfies `-[5-9]`, followed by `-` which satisfies `[.-]`. In practice this is harmless here because the call site guards with `\bclaude-3\b || isClaude45OrHigher(…)` (so the result is still `true`), but the function's semantic contract ("4-5 or higher") no longer holds in isolation. Anchoring the third alternative with a preceding version separator makes the intent explicit:

Suggested change:

```ts
// Before
function isClaude45OrHigher(id: string): boolean {
  // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors.
  return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-[5-9](?:\b|[.-]))/i.test(
    id,
  );

// After
function isClaude45OrHigher(id: string): boolean {
  // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors.
  return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-(?:[5-9])(?:\b|[.-]))/i.test(
    id,
  );
}
```

Alternatively, the simplest fix is to only test the third branch when the version number directly follows `claude-`:

```ts
/\bclaude-(?:[a-z]+-)*(?:4-?(?:[5-9]|[1-9]\d)|[5-9])(?:\b|[.-])/i
```

Either way, the edge case should be covered by a dedicated test for `claude-3-5-sonnet` to confirm it is *not* matched by `isClaude45OrHigher` in isolation.
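The false positive is easy to demonstrate in isolation. The snippet below (a standalone check, not part of the PR) runs the current regex and the anchored variant quoted in this comment against the problem ids:

```typescript
// The regex currently in the diff and the anchored alternative suggested
// above, copied verbatim from this review comment.
const current =
  /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-[5-9](?:\b|[.-]))/i;
const anchored = /\bclaude-(?:[a-z]+-)*(?:4-?(?:[5-9]|[1-9]\d)|[5-9])(?:\b|[.-])/i;

// In "claude-3-5-sonnet", the lazy [^\s/]*? consumes "3", "-5" then
// satisfies -[5-9], and the trailing "-" satisfies [.-].
console.log(current.test("claude-3-5-sonnet"));  // true (the false positive)
console.log(anchored.test("claude-3-5-sonnet")); // false
console.log(current.test("claude-opus-4-5"));    // true
console.log(anchored.test("claude-opus-4-5"));   // true
```

The anchored variant still accepts genuine 4.5+ ids while rejecting the 3.5 family.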


Comment on lines +68 to +70

```ts
if (/\b(o3|o4)\b/i.test(modelId)) {
  return true;
}
```

P2: `o3`/`o4` vision flag inconsistent with Azure `reasoning` detection

This PR explicitly adds `o3` and `o4` to the vision heuristic, but the non-Azure `nextModel` block always sets `reasoning: false`. The Azure branch already identifies these as reasoning models:

```ts
// Azure branch (line 671)
const isLikelyReasoningModel = isAzure && /\b(o[134]|gpt-([5-9]|\d{2,}))\b/i.test(modelId);
```

A user who onboards an `o3` or `o4` model on a non-Azure custom endpoint (e.g., a self-hosted proxy or a third-party provider) would therefore get `reasoning: false` despite the model being a reasoning model. Consider mirroring the Azure logic for non-Azure as well, or at minimum document the intentional gap in a comment.
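A minimal sketch of the mirrored check, assuming the Azure regex quoted above is simply applied without the `isAzure` gate (the standalone helper shape here is hypothetical):

```typescript
// Hypothetical helper: the Azure branch's reasoning pattern, applied
// unconditionally so non-Azure custom endpoints get the same detection.
function isLikelyReasoningModel(modelId: string): boolean {
  // o1 / o3 / o4 families, and gpt-5 or later
  return /\b(o[134]|gpt-([5-9]|\d{2,}))\b/i.test(modelId);
}
```

The non-Azure `nextModel` block could then set `reasoning: isLikelyReasoningModel(modelId)` instead of a hardcoded `false`.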


Comment on lines +597 to +609

```ts
it("sets image input for likely vision non-Azure OpenAI-compatible models", () => {
  const result = applyCustomApiConfig({
    config: {},
    baseUrl: "https://llm.example.com/v1",
    modelId: "gpt-4o-mini",
    compatibility: "openai",
    apiKey: "key123",
    providerId: "custom",
  });
  const model = result.config.models?.providers?.custom?.models?.[0];
  expect(model?.input).toEqual(["text", "image"]);
  expect(model?.reasoning).toBe(false);
});
```

P2: Test only exercises one of four heuristic branches

The new test only verifies `gpt-4o-mini` (the OpenAI branch). Consider adding cases for the other branches so regressions surface immediately:

  • A Claude 3 model (e.g. `claude-3-sonnet`) → `["text", "image"]`
  • A `claude-opus-4-5`-style model → `["text", "image"]`
  • A Gemini model (e.g. `gemini-1.5-pro`) → `["text", "image"]`
  • An `o3-mini` model → `["text", "image"]`
  • A generic custom model (e.g. `my-llm`) → `["text"]` (negative case)


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 30e0039cb2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

```ts
if (/\b(o3|o4)\b/i.test(modelId)) {
  return true;
}
if (/\bclaude-3\b/i.test(modelId) || isClaude45OrHigher(modelId)) {
```

P1: Recognize Claude 4.x IDs as image-capable

This branch only treats `claude-3*` and Claude 4.5+ names as multimodal, so onboarding `claude-sonnet-4`, `claude-opus-4`, or `claude-opus-4-1-*` on a non-Azure custom endpoint still stores `input: ["text"]`. Downstream image flows gate on `model.input.includes("image")` (for example `src/media-understanding/providers/image.ts:56`), which means those Anthropic 4/4.1 models will continue rejecting image uploads in exactly the scenario this change is trying to fix.
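One possible shape for a widened check (a sketch with a hypothetical name, not code from the PR) is a single pattern that accepts any Claude major version from 3 upward, which covers the `claude-sonnet-4` and `claude-opus-4-1-*` ids flagged above:

```typescript
// Hypothetical widened check: accept any Claude id whose major version is
// 3 or higher, so claude-sonnet-4, claude-opus-4, and claude-opus-4-1-*
// are treated as image-capable alongside claude-3* and 4.5+.
function isVisionCapableClaude(modelId: string): boolean {
  return /\bclaude-(?:[a-z]+-)*[3-9]\b/i.test(modelId);
}
```

This intentionally excludes `claude-2` and `claude-instant` ids, which predate vision support.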



Labels

commands (Command implementations), size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: onboard-custom hardcodes input: ["text"] for non-Azure custom providers, silently disabling image/vision support

1 participant