feat(core): reasoning for all providers (auto fallback) #90
Conversation
Findings
- [Blocker] Silent fallback path introduced for most models — this conflicts with the project constraint that fallbacks/errors must be surfaced. The new behavior applies `reasoning: 'auto'` to all non-Claude-4 models while explicitly documenting that unsupported models are silently ignored. Evidence: `packages/core/src/index.ts:408`, `packages/core/src/index.ts:411`, `packages/providers/src/index.ts:16`
Suggested fix:

```ts
function reasoningForModel(model: ModelRef): ReasoningLevel | undefined {
  if (model.provider === 'anthropic' && CLAUDE_4_MODEL_RE.test(model.modelId)) {
    return 'high';
  }
  if (isKnownReasoningModel(model)) {
    return 'auto';
  }
  return undefined;
}
```
Summary
- Review mode: initial
- 1 blocker found in added/modified lines.
- `docs/VISION.md` and `docs/PRINCIPLES.md`: not found in repo/docs; policy validation used `CLAUDE.md` in this checkout.
Testing
- Not run (automation)
```ts
// Other reasoning-capable models (OpenAI o-series / GPT-5, DeepSeek R1,
// Qwen QwQ, OpenRouter pass-through, etc.) — let pi-ai's adapter decide
// based on the model capability flag. Non-reasoning models drop it.
return 'auto';
```
Applying `return 'auto'` for all non-Claude-4 models introduces a silent fallback path for unsupported models. The repo constraints require surfaced failures instead of silent behavior. Consider gating `'auto'` to known reasoning-capable models and returning `undefined` otherwise (or surfacing an explicit warning/error path).
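One possible shape of that "surface instead of silently ignore" path, as a sketch: the `warn` callback, the type stand-ins, and the regex here are assumptions for illustration, not the PR's actual code.

```typescript
// Minimal stand-ins for the PR's types (assumed shapes, for illustration only).
type ReasoningLevel = 'low' | 'medium' | 'high';
interface ModelRef { provider: string; modelId: string; }

// Hypothetical: return a level only for known reasoning models, and report
// everything else through an explicit channel rather than dropping silently.
function reasoningForModel(
  model: ModelRef,
  warn: (msg: string) => void,
): ReasoningLevel | undefined {
  if (model.provider === 'anthropic' && /claude-(?:opus|sonnet)-4/i.test(model.modelId)) {
    return 'high';
  }
  warn(`reasoning not enabled for ${model.provider}:${model.modelId} (not a known reasoning model)`);
  return undefined;
}
```

Whether the warning goes to a logger, a diagnostics array, or a hard error is a policy choice; the point is that the caller can see the decision.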
… sentinel

Codex review on PR #90 flagged `'auto'` as a silent fallback: pi-ai's `ThinkingLevel` does not include `'auto'`, so OpenAI / OpenRouter non-reasoning models silently drop it, while older Anthropic Claudes coerce it to `'high'` and unintentionally enable extended thinking. Both behaviours violate the project's "no silent fallbacks" rule. Replace the sentinel with an explicit whitelist:

- Anthropic Claude 4.x -> 'high' (extended thinking)
- OpenAI o1/o3/o4 + GPT-5 -> 'high'
- DeepSeek R1 -> 'high'
- Qwen QwQ -> 'high'
- everything else -> undefined (pi-ai sees no reasoning field)

Also revert the providers `ReasoningLevel` union back to the named effort levels pi-ai actually understands.
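The whitelist described in that commit message could be sketched as data plus a lookup. The provider names and regex patterns below (especially for DeepSeek and Qwen) are illustrative guesses, not the PR's actual implementation.

```typescript
type ReasoningLevel = 'high';
type Rule = { provider: string; pattern: RegExp; level: ReasoningLevel };

// The whitelist, expressed as data (patterns are illustrative):
const REASONING_WHITELIST: Rule[] = [
  { provider: 'anthropic', pattern: /claude-(?:opus|sonnet)-4/i, level: 'high' },
  { provider: 'openai',    pattern: /^(?:o[134]|gpt-5)(?:$|[-:])/i, level: 'high' },
  { provider: 'deepseek',  pattern: /r1/i, level: 'high' },
  { provider: 'qwen',      pattern: /qwq/i, level: 'high' },
];

function reasoningFor(provider: string, modelId: string): ReasoningLevel | undefined {
  const rule = REASONING_WHITELIST.find(
    (r) => r.provider === provider && r.pattern.test(modelId),
  );
  // Everything off the whitelist: undefined, so pi-ai sees no reasoning field.
  return rule?.level;
}
```

A table-driven gate keeps the capability knowledge in one place, which makes the later tightening (provider scoping, anchoring) a data change rather than a control-flow change.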
Findings
- [Major] Over-broad reasoning-model detection can reintroduce silent fallback behavior — `reasoningForModel` currently enables reasoning based on substring matches regardless of provider, so pass-through model IDs (especially on `openrouter`) that happen to contain `o1`/`o3`/`o4`/`gpt-5`/`r1`/`qwq` can receive `reasoning` unintentionally, which conflicts with the no-silent-fallback constraint. Evidence: `packages/core/src/index.ts:412`.
Suggested fix:

```ts
function reasoningForModel(model: ModelRef): ReasoningLevel | undefined {
  const id = model.modelId.toLowerCase();
  if (model.provider === 'anthropic' && /claude-(?:opus|sonnet)-4/.test(id)) return 'high';
  if (
    model.provider === 'openai' &&
    (/^gpt-5(?:$|[-:])/i.test(model.modelId) || /^o(?:1|3|4)(?:$|[-:])/i.test(model.modelId))
  ) {
    return 'high';
  }
  if (model.provider === 'groq' && /deepseek.*r1|r1.*deepseek/i.test(model.modelId)) return 'high';
  if (model.provider === 'qwen' && /\bqwq\b/i.test(model.modelId)) return 'high';
  return undefined;
}
```
Summary
- Review mode: initial
- 1 issue found (Major) in reasoning capability detection.
- `docs/VISION.md` and `docs/PRINCIPLES.md` were not found in repo/docs in this checkout.
Testing
- Not run (automation)
open-codesign Bot
```ts
if (model.provider === 'anthropic' && CLAUDE_4_MODEL_RE.test(model.modelId)) {
  return 'high';
}
if (/\b(o1|o3|o4|gpt-5)/i.test(model.modelId)) return 'high';
```
[Major] This regex is provider-agnostic and substring-based, so model IDs from pass-through providers can accidentally match and receive reasoning, reintroducing silent fallback behavior. Consider provider-scoped, anchored allowlists (e.g., openai-only for `gpt-5`/`o1`/`o3`/`o4`) and default to `undefined` elsewhere.
Suggested fix:

```ts
if (
  model.provider === 'openai' &&
  (/^gpt-5(?:$|[-:])/i.test(model.modelId) || /^o(?:1|3|4)(?:$|[-:])/i.test(model.modelId))
) {
  return 'high';
}
```
Codex flagged that the previous whitelist still ran substring checks (o1/o3/r1/qwq) against `model.modelId` for ANY provider. OpenRouter, Groq, Cerebras and other pass-through providers happily forward an arbitrary upstream id like `mystery-lab/o1-lookalike` or `anthropic/claude-sonnet-4`, so reasoning got silently enabled on models the upstream provider does not advertise as reasoning-capable. That re-introduced the silent-fallback behavior PR #90 set out to kill. Tighten the gate to a (provider, modelId) pair:

- anthropic + claude-(opus|sonnet)-4 → 'high'
- openai + ^(o1|o3|o4|gpt-5)(...)$ → 'high'
- everything else (incl. all pass-through providers) → undefined

Add tests for the OpenRouter pass-through cases and the o3-mini / claude-opus-4-7 first-party cases.
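A quick self-contained illustration of the over-match: `loose` is the flagged substring check from the diff above, `anchored` mirrors the suggested fix's pattern, and `mystery-lab/o1-lookalike` is the hypothetical pass-through id from the commit message.

```typescript
// Old gate: provider-agnostic substring match (the flagged behavior).
const loose = /\b(o1|o3|o4|gpt-5)/i;

// Tightened gate: anchored at the start of the id, so vendor-prefixed
// pass-through ids no longer match. Provider scoping (not shown here)
// is still required on top of this.
const anchored = /^(?:o1|o3|o4|gpt-5)(?:$|[-:])/i;

console.log(loose.test('mystery-lab/o1-lookalike'));    // true: false positive
console.log(anchored.test('mystery-lab/o1-lookalike')); // false
console.log(anchored.test('o3-mini'));                  // true
```

The `\b` boundary fires after any non-word character, including `/` and `-`, which is exactly why vendor-prefixed ids slip through the loose pattern.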
Force-pushed from cf033d1 to 4469c23
Addressing Codex [Major] re: substring matching across all providers.
Tightened the gate to require both axes (provider and model id) to match a first-party provider we trust.

Decision rationale: pass-through providers serve arbitrary upstream model ids, so the model id alone is not a trustable capability signal. Users who want reasoning on a pass-through provider should configure the first-party provider directly (BYOK is already the model). Better to under-enable than to silently drop or mistranslate the reasoning knob. Added tests for: openrouter
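A hedged sketch of what the pass-through tests might check. The gate, regexes, and model ids below are illustrative stand-ins for the PR's final shape, not its actual test file.

```typescript
// Illustrative (provider, modelId) gate: both axes must match a trusted
// first-party provider before reasoning is enabled.
type ModelRef = { provider: string; modelId: string };

function reasoningForModel(m: ModelRef): 'high' | undefined {
  const id = m.modelId.toLowerCase();
  if (m.provider === 'anthropic' && /^claude-(?:opus|sonnet)-4/.test(id)) return 'high';
  if (m.provider === 'openai' && /^(?:o[134]|gpt-5)(?:$|[-:])/.test(id)) return 'high';
  return undefined;
}

// Pass-through ids must never enable reasoning, even when they look first-party:
for (const id of ['anthropic/claude-sonnet-4', 'mystery-lab/o1-lookalike']) {
  if (reasoningForModel({ provider: 'openrouter', modelId: id }) !== undefined) {
    throw new Error(`reasoning silently enabled for openrouter:${id}`);
  }
}
```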
Findings
- None.
Summary
- Review mode: follow-up after new commits
- No issues found in added/modified lines of the latest PR diff.
- Residual risk: `docs/VISION.md` and `docs/PRINCIPLES.md` were not found in this checkout (Not found in repo/docs), so principle-level validation was limited to `CLAUDE.md` plus diff evidence.
Testing
- Not run (automation)
open-codesign Bot
Claude 4 still gets an explicit 'high'; other providers switch to 'auto' and let pi-ai decide. This lets OpenRouter elephant-alpha / OpenAI o-series / DeepSeek R1 and friends also benefit from reasoning.