Onboarding: heuristic vision inputs for non-Azure custom models (#51869) (#51869)#51965
Onboarding: heuristic vision inputs for non-Azure custom models (#51869) (#51869)#51965lurui1997 wants to merge 1 commit intoopenclaw:mainfrom
Conversation
…claw#51869) Mark common GPT-4o/5, o3/o4, Claude 3+, and Gemini-style model ids as supporting image input when applying custom OpenAI-compatible provider config. Made-with: Cursor
Greptile SummaryThis PR adds a
Confidence Score: 4/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/commands/onboard-custom.ts
Line: 56-60
Comment:
**`isClaude45OrHigher` third alternative matches `claude-3-5-*` models**
The third regex alternative `-[5-9](?:\b|[.-])` can fire for `claude-3-5-sonnet` because the non-greedy `[^\s/]*?` consumes `3` and then `-5` satisfies `-[5-9]`, followed by `-` which satisfies `[.-]`. In practice this is harmless here because the call site guards with `\bclaude-3\b || isClaude45OrHigher(…)` (so the result is still `true`), but the function's semantic contract ("4-5 or higher") no longer holds in isolation. Anchoring the third alternative with a preceding version separator makes the intent explicit:
```suggestion
function isClaude45OrHigher(id: string): boolean {
// Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors.
return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-(?:[5-9])(?:\b|[.-]))/i.test(
id,
);
}
```
Alternatively, the simplest fix is to only test the third branch when the version number directly follows `claude-`:
```
/\bclaude-(?:[a-z]+-)*(?:4-?(?:[5-9]|[1-9]\d)|[5-9])(?:\b|[.-])/i
```
Either way, the edge case should be covered by a dedicated test for `claude-3-5-sonnet` to confirm it is *not* matched by `isClaude45OrHigher` in isolation.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/commands/onboard-custom.ts
Line: 68-70
Comment:
**`o3`/`o4` vision flag inconsistent with Azure `reasoning` detection**
This PR explicitly adds `o3` and `o4` to the vision heuristic, but the non-Azure `nextModel` block always sets `reasoning: false`. The Azure branch already identifies these as reasoning models:
```ts
// Azure branch (line 671)
const isLikelyReasoningModel = isAzure && /\b(o[134]|gpt-([5-9]|\d{2,}))\b/i.test(modelId);
```
A user who onboards an `o3` or `o4` model on a non-Azure custom endpoint (e.g., a self-hosted proxy or a third-party provider) would therefore get `reasoning: false` despite the model being a reasoning model. Consider mirroring the Azure logic for non-Azure as well, or at minimum document the intentional gap in a comment.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/commands/onboard-custom.test.ts
Line: 597-609
Comment:
**Test only exercises one of four heuristic branches**
The new test only verifies `gpt-4o-mini` (the OpenAI branch). Consider adding cases for the other branches so regressions surface immediately:
- A Claude 3 model (e.g. `claude-3-sonnet`) → `["text", "image"]`
- A `claude-opus-4-5`-style model → `["text", "image"]`
- A Gemini model (e.g. `gemini-1.5-pro`) → `["text", "image"]`
- An `o3-mini` model → `["text", "image"]`
- A generic custom model (e.g. `my-llm`) → `["text"]` (negative case)
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "Onboarding: heuristi..." |
| function isClaude45OrHigher(id: string): boolean { | ||
| // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors. | ||
| return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-[5-9](?:\b|[.-]))/i.test( | ||
| id, | ||
| ); |
There was a problem hiding this comment.
isClaude45OrHigher third alternative matches claude-3-5-* models
The third regex alternative -[5-9](?:\b|[.-]) can fire for claude-3-5-sonnet because the non-greedy [^\s/]*? consumes 3 and then -5 satisfies -[5-9], followed by - which satisfies [.-]. In practice this is harmless here because the call site guards with \bclaude-3\b || isClaude45OrHigher(…) (so the result is still true), but the function's semantic contract ("4-5 or higher") no longer holds in isolation. Anchoring the third alternative with a preceding version separator makes the intent explicit:
| function isClaude45OrHigher(id: string): boolean { | |
| // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors. | |
| return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-[5-9](?:\b|[.-]))/i.test( | |
| id, | |
| ); | |
| function isClaude45OrHigher(id: string): boolean { | |
| // Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors. | |
| return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-(?:[5-9])(?:\b|[.-]))/i.test( | |
| id, | |
| ); | |
| } |
Alternatively, the simplest fix is to only test the third branch when the version number directly follows claude-:
/\bclaude-(?:[a-z]+-)*(?:4-?(?:[5-9]|[1-9]\d)|[5-9])(?:\b|[.-])/i
Either way, the edge case should be covered by a dedicated test for claude-3-5-sonnet to confirm it is not matched by isClaude45OrHigher in isolation.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/onboard-custom.ts
Line: 56-60
Comment:
**`isClaude45OrHigher` third alternative matches `claude-3-5-*` models**
The third regex alternative `-[5-9](?:\b|[.-])` can fire for `claude-3-5-sonnet` because the non-greedy `[^\s/]*?` consumes `3` and then `-5` satisfies `-[5-9]`, followed by `-` which satisfies `[.-]`. In practice this is harmless here because the call site guards with `\bclaude-3\b || isClaude45OrHigher(…)` (so the result is still `true`), but the function's semantic contract ("4-5 or higher") no longer holds in isolation. Anchoring the third alternative with a preceding version separator makes the intent explicit:
```suggestion
function isClaude45OrHigher(id: string): boolean {
// Match claude-*-4-5+, claude-*-45+, claude-*4.5+, or future 5.x+ majors.
return /\bclaude-[^\s/]*?(?:-4-?(?:[5-9]|[1-9]\d)\b|4\.(?:[5-9]|[1-9]\d)\b|-(?:[5-9])(?:\b|[.-]))/i.test(
id,
);
}
```
Alternatively, the simplest fix is to only test the third branch when the version number directly follows `claude-`:
```
/\bclaude-(?:[a-z]+-)*(?:4-?(?:[5-9]|[1-9]\d)|[5-9])(?:\b|[.-])/i
```
Either way, the edge case should be covered by a dedicated test for `claude-3-5-sonnet` to confirm it is *not* matched by `isClaude45OrHigher` in isolation.
How can I resolve this? If you propose a fix, please make it concise.| if (/\b(o3|o4)\b/i.test(modelId)) { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
o3/o4 vision flag inconsistent with Azure reasoning detection
This PR explicitly adds o3 and o4 to the vision heuristic, but the non-Azure nextModel block always sets reasoning: false. The Azure branch already identifies these as reasoning models:
// Azure branch (line 671)
const isLikelyReasoningModel = isAzure && /\b(o[134]|gpt-([5-9]|\d{2,}))\b/i.test(modelId);A user who onboards an o3 or o4 model on a non-Azure custom endpoint (e.g., a self-hosted proxy or a third-party provider) would therefore get reasoning: false despite the model being a reasoning model. Consider mirroring the Azure logic for non-Azure as well, or at minimum document the intentional gap in a comment.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/onboard-custom.ts
Line: 68-70
Comment:
**`o3`/`o4` vision flag inconsistent with Azure `reasoning` detection**
This PR explicitly adds `o3` and `o4` to the vision heuristic, but the non-Azure `nextModel` block always sets `reasoning: false`. The Azure branch already identifies these as reasoning models:
```ts
// Azure branch (line 671)
const isLikelyReasoningModel = isAzure && /\b(o[134]|gpt-([5-9]|\d{2,}))\b/i.test(modelId);
```
A user who onboards an `o3` or `o4` model on a non-Azure custom endpoint (e.g., a self-hosted proxy or a third-party provider) would therefore get `reasoning: false` despite the model being a reasoning model. Consider mirroring the Azure logic for non-Azure as well, or at minimum document the intentional gap in a comment.
How can I resolve this? If you propose a fix, please make it concise.| it("sets image input for likely vision non-Azure OpenAI-compatible models", () => { | ||
| const result = applyCustomApiConfig({ | ||
| config: {}, | ||
| baseUrl: "https://llm.example.com/v1", | ||
| modelId: "gpt-4o-mini", | ||
| compatibility: "openai", | ||
| apiKey: "key123", | ||
| providerId: "custom", | ||
| }); | ||
| const model = result.config.models?.providers?.custom?.models?.[0]; | ||
| expect(model?.input).toEqual(["text", "image"]); | ||
| expect(model?.reasoning).toBe(false); | ||
| }); |
There was a problem hiding this comment.
Test only exercises one of four heuristic branches
The new test only verifies gpt-4o-mini (the OpenAI branch). Consider adding cases for the other branches so regressions surface immediately:
- A Claude 3 model (e.g.
claude-3-sonnet) →["text", "image"] - A
claude-opus-4-5-style model →["text", "image"] - A Gemini model (e.g.
gemini-1.5-pro) →["text", "image"] - An
o3-minimodel →["text", "image"] - A generic custom model (e.g.
my-llm) →["text"](negative case)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/onboard-custom.test.ts
Line: 597-609
Comment:
**Test only exercises one of four heuristic branches**
The new test only verifies `gpt-4o-mini` (the OpenAI branch). Consider adding cases for the other branches so regressions surface immediately:
- A Claude 3 model (e.g. `claude-3-sonnet`) → `["text", "image"]`
- A `claude-opus-4-5`-style model → `["text", "image"]`
- A Gemini model (e.g. `gemini-1.5-pro`) → `["text", "image"]`
- An `o3-mini` model → `["text", "image"]`
- A generic custom model (e.g. `my-llm`) → `["text"]` (negative case)
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30e0039cb2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (/\b(o3|o4)\b/i.test(modelId)) { | ||
| return true; | ||
| } | ||
| if (/\bclaude-3\b/i.test(modelId) || isClaude45OrHigher(modelId)) { |
There was a problem hiding this comment.
Recognize Claude 4.x IDs as image-capable
This branch only treats claude-3* and claude-4.5+ names as multimodal, so onboarding claude-sonnet-4, claude-opus-4, or claude-opus-4-1-* on a non-Azure custom endpoint still stores input: ["text"]. Downstream image flows gate on model.input.includes("image") (for example src/media-understanding/providers/image.ts:56), which means those Anthropic 4/4.1 models will continue rejecting image uploads in exactly the scenario this change is trying to fix.
Useful? React with 👍 / 👎.
Summary
See commit message on branch
fix/51869-onboard-custom-vision-heuristic.Test plan
Fixes #51869
Made with Cursor