Skip to content

fix(model-capabilities): harden runtime capability handling#2829

Merged
code-yeongyu merged 3 commits intocode-yeongyu:devfrom
RaviTharuma:fix/model-capabilities-review-followup
Mar 25, 2026
Merged

fix(model-capabilities): harden runtime capability handling#2829
code-yeongyu merged 3 commits intocode-yeongyu:devfrom
RaviTharuma:fix/model-capabilities-review-followup

Conversation

@RaviTharuma
Copy link
Copy Markdown
Contributor

@RaviTharuma RaviTharuma commented Mar 25, 2026

Summary

Why a follow-up PR

#2826 merged before these review fixes landed on the branch, so this PR carries the one remaining delta on top of current dev.

Verification

  • bun test src/shared/model-capabilities.test.ts src/shared/model-capabilities-cache.test.ts src/shared/connected-providers-cache.test.ts src/shared/model-settings-compatibility.test.ts --bail
  • bun test src/plugin/chat-params.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

Summary by cubic

Hardens runtime model capability handling and adds guardrails to surface capability diagnostics and auto-refresh the bundled capabilities snapshot for accuracy over time.

  • New Features

    • Add explicit model ID alias resolver (exact and pattern rules) with diagnostics; omo doctor shows capability resolution mode and warns when models rely on alias/heuristic/unknown fallback.
    • Add scheduled GitHub workflow to refresh src/generated/model-capabilities.generated.json weekly and open a PR.
  • Bug Fixes

    • Read SDK v2 runtime shape: parse booleans (reasoning, temperature, toolCall/topP), modalities (arrays or boolean maps, lowercased), variants, and limit from runtimeModel.capabilities or root; keep variant arrays intact.
    • Honor root-level thinking flags (thinking/supportsThinking) when providers don’t nest capabilities.
    • Merge repeated snapshot entries without materializing empty modalities or limit objects; treat arrays as non-records.
    • Detect prefixed o‑series IDs (e.g., openai/o3-mini) in heuristics.
    • Update GPT‑5 variants (remove max); compatibility downgrades requested max to xhigh.
    • Preserve normalized model IDs when provider metadata id is not a string.

Written for commit a15f607. Summary will update on new commits.

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@RaviTharuma
Copy link
Copy Markdown
Contributor Author

@code-yeongyu follow-up PR for the six Cubic review fixes from #2826.

This is just the post-merge delta from 613ef8ee on top of current dev.

Verification:

  • bun test src/shared/model-capabilities.test.ts src/shared/model-capabilities-cache.test.ts src/shared/connected-providers-cache.test.ts src/shared/model-settings-compatibility.test.ts --bail
  • bun test src/plugin/chat-params.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 8 files

Confidence score: 4/5

  • This PR is likely safe to merge, with a focused medium-risk gap rather than a broad stability concern.
  • In src/shared/model-capabilities.ts, readRuntimeModelThinkingSupport currently only reads runtimeModel.capabilities, so providers using root-level thinking/supportsThinking can be misclassified and lose thinking-support behavior.
  • Given the issue is moderate severity (5/10) and isolated to capability detection logic, the overall merge risk looks limited but worth a follow-up fix.
  • Pay close attention to src/shared/model-capabilities.ts - root-level thinking flags may be ignored, causing incorrect provider capability detection.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/shared/model-capabilities.ts">

<violation number="1" location="src/shared/model-capabilities.ts:273">
P2: readRuntimeModelThinkingSupport ignores root-level thinking/supportsThinking flags; it only checks runtimeModel.capabilities, so providers that set these at the root will be missed.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@RaviTharuma
Copy link
Copy Markdown
Contributor Author

@code-yeongyu Cubic follow-up addressed in 7c0289d7.

Additional fix:

  • root-level thinking / supportsThinking flags are now honored in runtime model metadata

Verification rerun:

  • bun test src/shared/model-capabilities.test.ts src/plugin/chat-params.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

@cubic-dev-ai please re-review.

@cubic-dev-ai
Copy link
Copy Markdown

cubic-dev-ai bot commented Mar 25, 2026

@code-yeongyu Cubic follow-up addressed in 7c0289d7.

Additional fix:

  • root-level thinking / supportsThinking flags are now honored in runtime model metadata

...

@RaviTharuma I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/shared/model-capabilities.ts">

<violation number="1" location="src/shared/model-capabilities.ts:279">
P2: Custom agent: **Opencode Compatibility**

The fallback loop checking `runtimeCapabilities` is now unreachable dead code since `readRuntimeModelBoolean` already checks the `capabilities` object.</violation>
</file>

<file name="src/shared/model-capabilities.test.ts">

<violation number="1" location="src/shared/model-capabilities.test.ts:121">
P2: The new test is tautological: it uses a model whose snapshot already enables reasoning, so it does not actually verify root-level runtime `supportsThinking` handling.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 8 files

Confidence score: 4/5

  • This PR looks safe to merge overall: the reported issues are moderate-to-low severity (5/10 and 4/10) and don’t indicate a clear runtime breakage on their own.
  • The most important risk is in src/shared/model-capabilities.test.ts: the root-level thinking-flag test can pass via snapshot fallback (gpt-5.4 with reasoning: true), so it may miss a real regression in runtime root-level parsing behavior.
  • In src/shared/model-capabilities.ts, the extra loop appears to be redundant/dead code because readRuntimeModelBoolean already handles those keys; this is mainly a maintainability/clarity concern rather than an immediate user-facing bug.
  • Pay close attention to src/shared/model-capabilities.test.ts and src/shared/model-capabilities.ts - strengthen the test to verify actual runtime parsing and remove or refactor the unreachable return path.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/shared/model-capabilities.ts">

<violation number="1" location="src/shared/model-capabilities.ts:289">
P2: Redundant dead code: this loop re-checks `runtimeCapabilities` keys already handled by `readRuntimeModelBoolean`, making its return path effectively unreachable.</violation>
</file>

<file name="src/shared/model-capabilities.test.ts">

<violation number="1" location="src/shared/model-capabilities.test.ts:87">
P2: The new root-level thinking-flag test can pass due to snapshot fallback (`gpt-5.4` has `reasoning: true`), so it does not actually prove runtime root-level parsing works.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@RaviTharuma
Copy link
Copy Markdown
Contributor Author

@code-yeongyu I pushed a follow-up hardening commit on top of the current model-capabilities PR: a15f6076.

This extends the earlier review-fix work with long-term maintenance guardrails so model capability drift does not silently accumulate:

  • extracted alias normalization into a dedicated alias layer with named rules and tests
  • added capability provenance / resolution diagnostics (snapshot-backed, alias-backed, heuristic-backed, unknown)
  • extended doctor model-resolution output to surface capability resolution mode
  • added warnings when configured models rely on compatibility fallback
  • added coverage tests asserting every built-in OmO agent/category/fallback model is snapshot-backed
  • added a scheduled GitHub Actions workflow to refresh src/generated/model-capabilities.generated.json from models.dev and open a PR when it changes

Fresh verification on this head:

  • bun test src/shared/model-capability-aliases.test.ts src/shared/model-capabilities.test.ts src/shared/model-capabilities-cache.test.ts src/cli/doctor/checks/model-resolution.test.ts src/cli/doctor/checks/config.test.ts src/cli/refresh-model-capabilities.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

One maintainer decision I want to make explicit: the new scheduled workflow auto-opens a PR for bundled models.dev snapshot drift. If you prefer not to have repo-level scheduled PR automation, I can switch that piece to a doctor/manual-refresh-only strategy instead.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/shared/model-capability-aliases.ts">

<violation number="1" location="src/shared/model-capability-aliases.ts:45">
P2: Thinking-suffix alias matching fails for provider-prefixed Claude model IDs, causing alias resolution to be skipped.</violation>

<violation number="2" location="src/shared/model-capability-aliases.ts:56">
P2: Direct indexing into a plain object allows prototype keys (e.g. `constructor`) to bypass alias checks and return an invalid `exact-alias` resolution with undefined fields.</violation>
</file>

<file name=".github/workflows/refresh-model-capabilities.yml">

<violation number="1" location=".github/workflows/refresh-model-capabilities.yml:32">
P2: Automated PRs created by this workflow use the default `GITHUB_TOKEN`, which can prevent downstream `pull_request` CI workflows from auto-running.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


export function resolveModelIDAlias(modelID: string): ModelIDAliasResolution {
const normalizedModelID = normalizeLookupModelID(modelID)
const exactRule = EXACT_ALIAS_RULES[normalizedModelID]
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Direct indexing into a plain object allows prototype keys (e.g. constructor) to bypass alias checks and return an invalid exact-alias resolution with undefined fields.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/model-capability-aliases.ts, line 56:

<comment>Direct indexing into a plain object allows prototype keys (e.g. `constructor`) to bypass alias checks and return an invalid `exact-alias` resolution with undefined fields.</comment>

<file context>
@@ -0,0 +1,84 @@
+
+export function resolveModelIDAlias(modelID: string): ModelIDAliasResolution {
+  const normalizedModelID = normalizeLookupModelID(modelID)
+  const exactRule = EXACT_ALIAS_RULES[normalizedModelID]
+  if (exactRule) {
+    return {
</file context>
Fix with Cubic

const PATTERN_ALIAS_RULES: ReadonlyArray<PatternAliasRule> = [
{
ruleID: "anthropic-thinking-suffix",
match: (normalizedModelID) => normalizedModelID.startsWith("claude-") && normalizedModelID.endsWith("-thinking"),
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Thinking-suffix alias matching fails for provider-prefixed Claude model IDs, causing alias resolution to be skipped.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/model-capability-aliases.ts, line 45:

<comment>Thinking-suffix alias matching fails for provider-prefixed Claude model IDs, causing alias resolution to be skipped.</comment>

<file context>
@@ -0,0 +1,84 @@
+const PATTERN_ALIAS_RULES: ReadonlyArray<PatternAliasRule> = [
+  {
+    ruleID: "anthropic-thinking-suffix",
+    match: (normalizedModelID) => normalizedModelID.startsWith("claude-") && normalizedModelID.endsWith("-thinking"),
+    canonicalize: (normalizedModelID) => normalizedModelID.replace(/-thinking$/i, ""),
+  },
</file context>
Suggested change
match: (normalizedModelID) => normalizedModelID.startsWith("claude-") && normalizedModelID.endsWith("-thinking"),
match: (normalizedModelID) => /(?:^|\/)claude-.*-thinking$/i.test(normalizedModelID),
Fix with Cubic

run: bun run build:model-capabilities

- name: Create refresh pull request
uses: peter-evans/create-pull-request@v7
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Automated PRs created by this workflow use the default GITHUB_TOKEN, which can prevent downstream pull_request CI workflows from auto-running.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/refresh-model-capabilities.yml, line 32:

<comment>Automated PRs created by this workflow use the default `GITHUB_TOKEN`, which can prevent downstream `pull_request` CI workflows from auto-running.</comment>

<file context>
@@ -0,0 +1,43 @@
+        run: bun run build:model-capabilities
+
+      - name: Create refresh pull request
+        uses: peter-evans/create-pull-request@v7
+        with:
+          commit-message: "chore: refresh model capabilities snapshot"
</file context>
Fix with Cubic

@code-yeongyu code-yeongyu merged commit 76420b3 into code-yeongyu:dev Mar 25, 2026
8 of 9 checks passed
@RaviTharuma
Copy link
Copy Markdown
Contributor Author

@code-yeongyu I verified the current red CI is not coming from this PR. I reproduced the same todo-sync failure on a clean upstream/dev checkout at 5e9231e, and the failing files have no diff from upstream/dev. I split that out into a separate base-branch fix here: #2832. So the current state is: #2829 remains the model-capabilities follow-up, and #2832 is the unrelated CI unblock for the existing todo-sync mismatch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants