refactor(agents): unify canonical codex model facts #38173
who96 wants to merge 3 commits into openclaw:main
Conversation
Greptile Summary
This PR completes phase 1 of RFC #37804 by extracting all canonical codex forward-compat model facts into a single codex facts layer.
Key observations:
Confidence Score: 4/5
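To make "a single codex facts layer" concrete, here is a minimal sketch of what such a shared table could look like. Every name and value here is an illustrative assumption (the real openclaw types are not shown on this page), except the spark `input: ["text", "image"]` fact, which is cited in the review comment below.

```typescript
// Illustrative sketch only: one shared table of codex model facts that both
// the catalog and the runtime fallback path consume, instead of each
// hardcoding its own copy.
type ModelInput = "text" | "image";

interface CodexModelFacts {
  id: string;
  name: string;
  reasoning: boolean;      // whether the model supports thinking/reasoning
  input: ModelInput[];     // accepted input modalities
  contextWindow?: number;  // optional: only set when canonically known
  maxTokens?: number;
}

const CANONICAL_CODEX_FACTS: readonly CodexModelFacts[] = [
  {
    id: "gpt-5.3-codex-spark",
    name: "GPT-5.3 Codex Spark", // assumed display name
    reasoning: true,             // assumed
    input: ["text", "image"],    // fact cited in the review comment below
  },
  {
    id: "gpt-5.4",
    name: "GPT-5.4",             // assumed display name
    reasoning: true,             // assumed
    input: ["text", "image"],    // assumed
  },
];

// Consumers look facts up by id rather than duplicating them.
const spark = CANONICAL_CODEX_FACTS.find((f) => f.id === "gpt-5.3-codex-spark");
console.log(spark?.input.join(","));
```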
```ts
models.push({
  ...baseModel,
  id: facts.id,
  name: facts.name,
  ...(typeof facts.contextWindow === "number" ? { contextWindow: facts.contextWindow } : {}),
  ...(typeof facts.maxTokens === "number" ? { maxTokens: facts.maxTokens } : {}),
} as T);
```
`reasoning` and `input` canonical facts not applied to catalog entries
`applyCanonicalForwardCompatCatalogEntries` overrides `contextWindow` and `maxTokens` from canonical facts but silently inherits `reasoning` and `input` from the base model via the spread. This diverges from the full runtime fallback path (lines 58-69 in `pi-embedded-runner/model.ts`), which correctly uses `facts.reasoning` and `facts.input`.
Concrete example: the `gpt-5.3-codex-spark` canonical facts declare `input: ["text", "image"]` (model-facts.ts line 84), but the catalog test mocks the base `gpt-5.3-codex` with `input: ["text"]` only (model-catalog.test.ts line 111). When `applyCanonicalForwardCompatCatalogEntries` spreads the base model, the derived spark entry will inherit `input: ["text"]`, silently contradicting the canonical fact. The test doesn't assert on `input` for the spark entry (lines 131-133), so this gap goes undetected. The same issue applies to `gpt-5.4`: the test doesn't verify that `reasoning` and `input` match the canonical facts (lines 177-183).
Consider applying canonical `reasoning` and `input` explicitly, just as they are applied in the full runtime fallback:
```diff
 models.push({
   ...baseModel,
   id: facts.id,
   name: facts.name,
+  reasoning: facts.reasoning,
+  input: [...facts.input],
   ...(typeof facts.contextWindow === "number" ? { contextWindow: facts.contextWindow } : {}),
   ...(typeof facts.maxTokens === "number" ? { maxTokens: facts.maxTokens } : {}),
 } as T);
```
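To see the inheritance gap end-to-end, here is a self-contained sketch (types and values deliberately simplified; this is not the actual openclaw code) showing how the spread silently carries the base model's `input` into the derived entry, and how applying the facts explicitly fixes it:

```typescript
// Minimal demonstration of the spread-inheritance gap described above.
type ModelInput = "text" | "image";

interface Model {
  id: string;
  name: string;
  reasoning: boolean;
  input: ModelInput[];
}

// Mocked base model, text-only, mirroring the catalog test setup.
const baseModel: Model = {
  id: "gpt-5.3-codex",
  name: "GPT-5.3 Codex",
  reasoning: true,
  input: ["text"],
};

// Canonical facts for the derived model declare image input.
const facts = {
  id: "gpt-5.3-codex-spark",
  name: "GPT-5.3 Codex Spark",
  reasoning: true,
  input: ["text", "image"] as ModelInput[],
};

// Buggy derivation: overrides only id/name, so input is inherited from the base.
const buggy: Model = { ...baseModel, id: facts.id, name: facts.name };

// Fixed derivation: applies canonical reasoning/input explicitly.
const fixed: Model = {
  ...baseModel,
  id: facts.id,
  name: facts.name,
  reasoning: facts.reasoning,
  input: [...facts.input],
};

console.log(buggy.input.join(",")); // -> "text" (contradicts the canonical fact)
console.log(fixed.input.join(",")); // -> "text,image"
```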
Path: `src/agents/model-facts.ts`, lines 134-140.
Addressed in 61aa0b365d8fd63f660c7b2e97e3b0ffbf28e8d5. That commit applies canonical `reasoning` and `input` to catalog-derived entries and extends the spark / gpt-5.4 assertions to lock the behavior.
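The reply above mentions extending the spark and gpt-5.4 assertions. A hypothetical sketch of what locking `reasoning`/`input` might look like (all identifiers here are invented for illustration; the real assertions live in `model-catalog.test.ts`, and bare checks are used instead of vitest so the snippet stands alone):

```typescript
// Hypothetical shape of the extended assertions: check the capability fields,
// not just id/name and the numeric limits.
interface CatalogEntry {
  id: string;
  reasoning: boolean;
  input: string[];
}

// Stand-in for the derived entry the real test would pull from the catalog.
const sparkEntry: CatalogEntry = {
  id: "gpt-5.3-codex-spark",
  reasoning: true,
  input: ["text", "image"],
};

// The review's point: without these checks, a spread that drops the canonical
// facts would still pass the test suite.
function assertCanonicalFactsApplied(entry: CatalogEntry): void {
  if (entry.reasoning !== true) throw new Error(`${entry.id}: canonical reasoning not applied`);
  if (entry.input.join(",") !== "text,image") throw new Error(`${entry.id}: canonical input not applied`);
}

assertCanonicalFactsApplied(sparkEntry);
console.log("canonical fact assertions passed");
```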
@greptile-apps review - New commit applies canonical `reasoning` and `input` facts explicitly in `applyCanonicalForwardCompatCatalogEntries`, matching the runtime fallback path. Test assertions for spark and gpt-5.4 now verify `reasoning` and `input` values.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 61aa0b365d
```ts
...(typeof facts.contextWindow === "number" ? { contextWindow: facts.contextWindow } : {}),
...(typeof facts.maxTokens === "number" ? { maxTokens: facts.maxTokens } : {}),
```
Apply canonical `reasoning`/`input` in codex template fallback
When a Codex fallback is built from a discovered template, this branch only overrides `contextWindow`/`maxTokens`, so `reasoning` and `input` still come from the template model. If the template metadata is stale (for example, reports text-only or non-reasoning), `resolveModel("openai-codex", "gpt-5.4", ...)` will return capabilities that contradict the canonical facts layer, which can disable thinking/vision behavior for these forward-compat models.
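A hedged sketch of the fix this comment asks for (function and type names are assumptions, not the real openclaw module): overlay the canonical capability facts on the discovered template so stale template metadata cannot win, while still only overriding the numeric limits when the facts declare them.

```typescript
// Illustrative template-fallback derivation: canonical capabilities always
// win; numeric limits fall back to the template when facts omit them.
type ModelInput = "text" | "image";

interface Model {
  id: string;
  reasoning: boolean;
  input: ModelInput[];
  contextWindow: number;
  maxTokens: number;
}

interface Facts {
  id: string;
  reasoning: boolean;
  input: ModelInput[];
  contextWindow?: number;
  maxTokens?: number;
}

function buildForwardCompatFallback(template: Model, facts: Facts): Model {
  return {
    ...template,
    id: facts.id,
    // Canonical capabilities override (possibly stale) template metadata.
    reasoning: facts.reasoning,
    input: [...facts.input],
    // Numeric limits are only overridden when the facts actually declare them.
    ...(typeof facts.contextWindow === "number" ? { contextWindow: facts.contextWindow } : {}),
    ...(typeof facts.maxTokens === "number" ? { maxTokens: facts.maxTokens } : {}),
  };
}

// A stale template that wrongly reports text-only, non-reasoning:
const staleTemplate: Model = {
  id: "gpt-5.3-codex",
  reasoning: false,
  input: ["text"],
  contextWindow: 200_000, // assumed limits for illustration
  maxTokens: 32_000,
};
const facts: Facts = { id: "gpt-5.4", reasoning: true, input: ["text", "image"] };

const fallback = buildForwardCompatFallback(staleTemplate, facts);
console.log(fallback.reasoning, fallback.input.join(","), fallback.contextWindow);
```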
Addressed in 8b666a6d8bb695ba601a14daaa07a8f19f12e1c5. That change renames the helper and applies canonical `reasoning` and `input` in the template fallback path so the runtime fallback matches the canonical facts layer.
@greptile-apps review - New commit 8b666a6 renames `resolveOpenAICodexGpt53FallbackModel` to `resolveOpenAICodexForwardCompatModel` and applies canonical `reasoning`/`input` in the template fallback path, addressing both Greptile comments.
User Impact
Users trying newer Codex models such as `openai-codex/gpt-5.4` could see capability drift between what the catalog listed and what the runtime actually used, especially around reasoning/image capability and forward-compat fallback behavior. This phase makes catalog and runtime agree on the same canonical codex facts.

Summary
- `model-catalog` and embedded runtime resolution in `src/agents` consume the same codex facts layer
- Adds `gpt-5.4` facts and codex transport/limit ordering

Context
- Phase 1 covers the `src/agents` fact-source layer; CLI / auto-reply / picker consumer migration is deferred to later phases

Test Plan
- `pnpm exec vitest run src/agents/model-catalog.test.ts src/agents/pi-embedded-runner/model.test.ts src/agents/model-selection.test.ts`
- `pnpm tsgo`

Related: #37804 (RFC: treat models.providers-injected models as first-class forward-compat models)
Ref: #37623 (openai-codex/gpt-5.4 configurable but not supported in runtime)