feat(model-capabilities): add models.dev snapshot and runtime capability refresh #2826

Merged
code-yeongyu merged 1 commit into code-yeongyu:dev from RaviTharuma:feat/model-capabilities-models-dev
Mar 25, 2026

Conversation

@RaviTharuma
Contributor

@RaviTharuma RaviTharuma commented Mar 25, 2026

Summary

  • add a bundled models.dev snapshot plus a runtime cache/refresh path for model capabilities
  • introduce a central capability resolver and use it to normalize variant, reasoningEffort, thinking, maxTokens, temperature, and topP
  • add startup auto-refresh, a manual refresh-model-capabilities CLI command, config schema support, and focused regression tests
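
The normalization step described above can be pictured roughly as follows. This is a minimal sketch, not the PR's implementation: the `ModelCapabilities` and `ChatParams` shapes and the `normalizeChatParams` name are hypothetical, chosen only to illustrate dropping unsupported fields and clamping `maxTokens`.

```typescript
// Hypothetical capability shape; the real resolver output in the PR may differ.
interface ModelCapabilities {
  supportsTemperature: boolean;
  supportsTopP: boolean;
  supportsReasoning: boolean;
  maxOutputTokens?: number;
}

// Hypothetical subset of the chat parameters named in the summary.
interface ChatParams {
  temperature?: number;
  topP?: number;
  reasoningEffort?: string;
  maxTokens?: number;
}

// Returns a copy of `params` with unsupported fields removed and
// `maxTokens` clamped to the model's output limit when one is known.
function normalizeChatParams(params: ChatParams, caps: ModelCapabilities): ChatParams {
  const result: ChatParams = { ...params };
  if (!caps.supportsTemperature) delete result.temperature;
  if (!caps.supportsTopP) delete result.topP;
  if (!caps.supportsReasoning) delete result.reasoningEffort;
  if (result.maxTokens !== undefined && caps.maxOutputTokens !== undefined) {
    result.maxTokens = Math.min(result.maxTokens, caps.maxOutputTokens);
  }
  return result;
}
```

Returning a copy rather than mutating the input keeps the caller's original request intact, which makes it easier to log what was dropped.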

Verification

  • bun test src/shared/model-capabilities-cache.test.ts src/plugin/chat-params.test.ts --bail
  • bun test src/shared/model-capabilities.test.ts src/shared/model-capabilities-cache.test.ts src/shared/model-settings-compatibility.test.ts src/shared/connected-providers-cache.test.ts src/plugin/chat-params.test.ts src/hooks/auto-update-checker/hook.test.ts src/cli/refresh-model-capabilities.test.ts src/config/schema.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

Note

  • bun test --bail still hits src/tools/task/todo-sync.test.ts, but that failure is already present on clean upstream/dev and is unrelated to this PR.

Summary by cubic

Adds a bundled models.dev snapshot and a runtime cache/refresh for model capabilities, then uses it to normalize chat params and provider metadata. This drops unsupported settings, clamps limits, and enables auto-refresh on startup.

  • New Features

    • Bundled src/generated/model-capabilities.generated.json and a cache store with refresh from https://models.dev/api.json (script: build:model-capabilities).
    • Central resolver getModelCapabilities plus updated compatibility logic to handle variant, reasoningEffort, thinking, maxTokens, temperature, and topP. chat-params now drops unsupported fields and clamps maxTokens.
    • Startup auto-refresh via the update checker with a timeout and config control.
    • New CLI: refresh-model-capabilities (options: --directory, --source-url, --json).
    • Config schema: model_capabilities.{enabled,auto_refresh_on_start,refresh_timeout_ms,source_url}.
    • Provider cache now stores rich model metadata and exposes findProviderModelMetadata.
  • Migration

    • No action required. Defaults keep working; capabilities auto-refresh on start.
    • To refresh manually: run refresh-model-capabilities (optionally --source-url <URL>).
    • Optional config under model_capabilities: set enabled, auto_refresh_on_start, refresh_timeout_ms, or source_url.
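
To make the `model_capabilities` options concrete, here is a small sketch of how such a config block might be resolved. The four keys and the source URL come from the summary above; the default values for `enabled`, `auto_refresh_on_start`, and `refresh_timeout_ms`, and both function names, are assumptions for illustration and are not taken from the PR.

```typescript
// The four keys listed under model_capabilities in the summary.
interface ModelCapabilitiesConfig {
  enabled?: boolean;
  auto_refresh_on_start?: boolean;
  refresh_timeout_ms?: number;
  source_url?: string;
}

// ASSUMED defaults for illustration only; the PR's actual defaults may differ.
const ASSUMED_DEFAULTS: Required<ModelCapabilitiesConfig> = {
  enabled: true,
  auto_refresh_on_start: true,
  refresh_timeout_ms: 5000, // hypothetical value
  source_url: "https://models.dev/api.json", // URL named in the summary
};

// Fills in any missing (or explicitly undefined) option with its assumed default.
function resolveModelCapabilitiesConfig(
  user: ModelCapabilitiesConfig = {},
): Required<ModelCapabilitiesConfig> {
  return {
    enabled: user.enabled ?? ASSUMED_DEFAULTS.enabled,
    auto_refresh_on_start: user.auto_refresh_on_start ?? ASSUMED_DEFAULTS.auto_refresh_on_start,
    refresh_timeout_ms: user.refresh_timeout_ms ?? ASSUMED_DEFAULTS.refresh_timeout_ms,
    source_url: user.source_url ?? ASSUMED_DEFAULTS.source_url,
  };
}

// A startup refresh would only run when both switches are on.
function shouldRefreshOnStart(user: ModelCapabilitiesConfig = {}): boolean {
  const config = resolveModelCapabilitiesConfig(user);
  return config.enabled && config.auto_refresh_on_start;
}
```

Under these assumptions, setting only `auto_refresh_on_start: false` would leave the bundled snapshot and any existing cache in use while skipping the network request at startup.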

Written for commit 2af9324. Summary will update on new commits.

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@RaviTharuma
Contributor Author

@code-yeongyu this is ready for review. It adds the separate models.dev-backed model capability layer we discussed, keeps the old PR stack untouched, and includes focused verification in the PR body. The only remaining full-suite failure is the pre-existing src/tools/task/todo-sync.test.ts failure already reproducible on clean upstream/dev.

@cubic-dev-ai cubic-dev-ai bot left a comment

6 issues found across 29 files

Confidence score: 2/5

  • I’m marking this as high risk because there are multiple high-severity, high-confidence compatibility issues (7–8/10) that can cause incorrect capability detection and provider behavior at runtime.
  • The most severe issue is in src/shared/model-capabilities.ts: Opencode SDK v2 structured capabilities fields (e.g., reasoning/temperature/toolcall) are ignored, which can lead to wrong capability resolution for supported models.
  • src/shared/model-capability-heuristics.ts and src/shared/connected-providers-cache.ts include concrete regression risks: unsupported gpt-5 "max" handling, overly strict openai-reasoning matching, and metadata id overwrite that can break normalized lookups.
  • Pay close attention to src/shared/model-capabilities.ts, src/shared/model-capability-heuristics.ts, src/shared/connected-providers-cache.ts, and src/shared/model-capabilities-cache.ts: capability resolution, variant handling, and ID/normalization invariants are at risk.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/shared/model-capability-heuristics.ts">

<violation number="1" location="src/shared/model-capability-heuristics.ts:27">
P2: `openai-reasoning` regex is too strict (`^` anchor) and misses valid prefixed model IDs, preventing reasoning-family capability resolution.</violation>

<violation number="2" location="src/shared/model-capability-heuristics.ts:34">
P1: Custom agent: **Opencode Compatibility**

The `max` variant is not supported for `gpt-5` models in the Opencode SDK. It is Anthropic-specific (used via `adaptiveEfforts` for models like `opus-4.6`). Remove `"max"` from the `gpt-5` variants list to ensure Opencode compatibility.</violation>
</file>

<file name="src/shared/model-capabilities.ts">

<violation number="1" location="src/shared/model-capabilities.ts:75">
P2: isRecord treats arrays as records, so normalizeVariantKeys will accept array variants and turn them into numeric string keys ("0", "1"), corrupting runtime variants.</violation>

<violation number="2" location="src/shared/model-capabilities.ts:190">
P1: Custom agent: **Opencode Compatibility**

The capability resolver ignores the Opencode SDK v2's structured `capabilities` object.

In the Opencode SDK v2, properties like `reasoning`, `temperature`, and `toolcall` (no underscore), as well as input modalities, are nested under a `capabilities` property rather than at the root level. The current resolver only checks root-level properties (e.g., `runtimeModel["tool_call"]`, `runtimeModel["reasoning"]`) and expects string arrays for `modalities`, which breaks compatibility with the v2 SDK structure.</violation>
</file>

<file name="src/shared/model-capabilities-cache.ts">

<violation number="1" location="src/shared/model-capabilities-cache.ts:90">
P2: merge logic always creates `modalities`/`limit` objects, turning absent optional fields into empty `{}` and violating the snapshot normalization invariant.</violation>
</file>

<file name="src/shared/connected-providers-cache.ts">

<violation number="1" location="src/shared/connected-providers-cache.ts:201">
P1: Spread order allows `rawMetadata.id` to overwrite normalized `id`, breaking metadata ID normalization and lookups.</violation>
</file>
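
The `isRecord` finding for src/shared/model-capabilities.ts is easy to reproduce in isolation. The sketch below is illustrative only: `isRecordLoose` mimics the reported behavior, `isRecord` adds an array check, and `variantKeys` is a hypothetical stand-in for `normalizeVariantKeys`, whose real signature is not shown in this thread.

```typescript
// Loose guard, as described in the finding: arrays pass because typeof [] === "object".
function isRecordLoose(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Stricter guard that rejects arrays.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical stand-in for normalizeVariantKeys: returns variant names,
// or an empty list when the input is not a plain object.
function variantKeys(
  variants: unknown,
  guard: (value: unknown) => value is Record<string, unknown>,
): string[] {
  return guard(variants) ? Object.keys(variants) : [];
}

const asArray: unknown = ["low", "high"];
const loose = variantKeys(asArray, isRecordLoose); // ["0", "1"]: numeric string keys leak through
const strict = variantKeys(asArray, isRecord);     // []: array input is rejected
```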

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@code-yeongyu code-yeongyu merged commit 99b3980 into code-yeongyu:dev Mar 25, 2026
7 of 8 checks passed
@RaviTharuma
Contributor Author

@code-yeongyu addressed Cubic review in 613ef8ee.

Fixed:

  • removed unsupported GPT-5 max heuristic variant
  • made o-series heuristic match provider-prefixed IDs
  • hardened runtime variant parsing for arrays
  • added SDK v2 capabilities support (reasoning, temperature, toolcall, nested variants, capability-based modalities)
  • fixed provider metadata id spread order
  • preserved undefined for absent snapshot modalities / limit
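
For readers unfamiliar with the spread-order issue in the list above, the following sketch shows why the position of `id` matters. The function names and the model IDs are hypothetical; only the ordering behavior of object spread is the point.

```typescript
type Metadata = Record<string, unknown>;

// Problematic order: properties from `raw` are applied last,
// so a raw `id` overwrites the normalized one.
function withRawLast(normalizedId: string, raw: Metadata): Metadata {
  return { id: normalizedId, ...raw };
}

// Safer order: the normalized `id` is written last and always wins.
function withIdLast(normalizedId: string, raw: Metadata): Metadata {
  return { ...raw, id: normalizedId };
}

// Hypothetical raw provider metadata carrying its own, non-normalized id.
const raw: Metadata = { id: "Provider/Example-Model", name: "Example Model" };

const overwritten = withRawLast("example-model", raw); // id is "Provider/Example-Model"
const preserved = withIdLast("example-model", raw);    // id is "example-model"
```

Lookups keyed on the normalized ID would miss the first result and find the second.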

Verification:

  • bun test src/shared/model-capabilities.test.ts src/shared/model-capabilities-cache.test.ts src/shared/connected-providers-cache.test.ts src/shared/model-settings-compatibility.test.ts --bail
  • bun test src/plugin/chat-params.test.ts --bail
  • bunx tsc --noEmit
  • bun run build
  • git diff --check

All six Cubic threads are replied to and resolved.

@RaviTharuma
Contributor Author

Correction on the follow-up commit: #2826 had already merged before 613ef8ee landed on the branch. I opened #2829 to carry the six Cubic review fixes on top of current dev.

2 participants