
fix: preserve unsigned thinking blocks for DeepSeek Anthropic endpoint#18509

Open
SivanCola wants to merge 3 commits into NousResearch:main from SivanCola:fix/deepseek-kimi-thinking-blocks

Conversation

@SivanCola SivanCola commented May 1, 2026

Summary

  • Fix HTTP 400 errors when using DeepSeek /anthropic endpoint with thinking mode enabled
  • Widen DeepSeek V-series model regex to accept bracket suffixes like [1m] (e.g. deepseek-v4-flash[1m] for 1M context)

Problem

When replaying multi-turn conversations through DeepSeek's /anthropic endpoint, the API returns:

HTTP 400: The `content[].thinking` in the thinking mode must be passed back to the API.

Root cause: _extract_preserved_thinking_blocks() extracts signed thinking blocks from reasoning_details first, setting _already_has_thinking=True. This prevents the reasoning_content path from creating unsigned blocks. The signed blocks are then stripped by the thinking-block management pass (since DeepSeek cannot validate Anthropic signatures), resulting in no thinking blocks at all.

Fix

In convert_messages_to_anthropic(), skip extracting signed thinking blocks from reasoning_details when the target endpoint is DeepSeek /anthropic. This allows the reasoning_content path to create unsigned blocks that DeepSeek accepts.
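The control flow described above can be sketched roughly as follows. Field names (`reasoning_details`, `reasoning_content`) follow the PR text, but the message shape and the helper function are assumptions, not the repository's actual code:

```python
# Hedged sketch of the signed-first ordering and the fix. Before the fix,
# the signed path always ran first; with the fix, it is skipped for the
# DeepSeek /anthropic endpoint.

def build_thinking_blocks(msg: dict, is_deepseek_anthropic: bool) -> list[dict]:
    blocks: list[dict] = []
    already_has_thinking = False

    # Signed path: skipped for DeepSeek /anthropic, since DeepSeek cannot
    # validate Anthropic signatures and would strip these blocks later,
    # leaving no thinking blocks at all (the HTTP 400 described above).
    if not is_deepseek_anthropic:
        for detail in msg.get("reasoning_details", []):
            if detail.get("signature"):
                blocks.append({
                    "type": "thinking",
                    "thinking": detail["text"],
                    "signature": detail["signature"],
                })
                already_has_thinking = True

    # Unsigned path: creates the plain thinking blocks DeepSeek accepts.
    if not already_has_thinking and msg.get("reasoning_content"):
        blocks.append({"type": "thinking", "thinking": msg["reasoning_content"]})

    return blocks
```

With `is_deepseek_anthropic=True`, a message carrying both signed `reasoning_details` and `reasoning_content` yields a single unsigned block instead of a signed block destined to be stripped.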

Also widen the _DEEPSEEK_V_SERIES_RE regex from ^deepseek-v\d+([-.].+)?$ to ^deepseek-v\d+.*$ so model names with bracket suffixes (like deepseek-v4-flash[1m]) are correctly recognized as V-series models instead of being normalized to deepseek-chat.
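A minimal sketch of the widened check; the constant name mirrors the PR, while the normalization helper around it is an assumption:

```python
import re

# Widened V-series pattern from the PR description.
_DEEPSEEK_V_SERIES_RE = re.compile(r"^deepseek-v\d+.*$")

def normalize_deepseek_model(model: str) -> str:
    # V-series IDs (including bracket suffixes like [1m]) pass through
    # unchanged; anything else falls back to deepseek-chat.
    if _DEEPSEEK_V_SERIES_RE.match(model):
        return model
    return "deepseek-chat"
```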

Test plan

  • Verified with DeepSeek deepseek-v4-flash[1m] via /anthropic endpoint — multi-turn conversations now work without HTTP 400
  • Verified model normalization: deepseek-v4-flash[1m] → deepseek-v4-flash[1m] (not deepseek-chat)

…dpoints

DeepSeek /anthropic and Kimi /coding endpoints require thinking blocks
to be passed back in subsequent API requests when thinking mode is
enabled. Previously, signed thinking blocks from reasoning_details were
extracted first, which set _already_has_thinking=True and prevented the
unsigned reasoning_content path from creating blocks. The signed blocks
were then stripped by the thinking-block management pass (since
DeepSeek/Kimi cannot validate Anthropic signatures), resulting in no
thinking blocks at all and HTTP 400 errors.

Fix: skip extracting signed thinking blocks from reasoning_details when
the target endpoint is DeepSeek /anthropic or Kimi /coding, allowing the
reasoning_content path to create unsigned blocks that these endpoints
accept.

Also widen the DeepSeek V-series model regex to accept bracket suffixes
like [1m] (e.g. deepseek-v4-flash[1m] for 1M context), which were
previously falling back to deepseek-chat.
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API labels May 1, 2026
@alt-glitch
Collaborator

Related to #17991 (DeepSeek /anthropic thinking blocks) and #17210 (Kimi thinking blocks) — this PR combines both fixes with a different approach (skip signed extraction rather than preserve all).

Scope the fix to DeepSeek /anthropic endpoint only, as Kimi has not
been verified to have the same issue.
@SivanCola SivanCola changed the title fix: preserve unsigned thinking blocks for DeepSeek/Kimi Anthropic endpoints fix: preserve unsigned thinking blocks for DeepSeek Anthropic endpoint May 1, 2026
@alt-glitch
Collaborator

Related to #17991 and #17210

…canonical models

Add [1m] context-window variants to _DEEPSEEK_CANONICAL_MODELS and
_PROVIDER_MODELS so these model IDs resolve explicitly (not just via
the V-series regex fallback).
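What that explicit registration might look like: the dict and regex names follow the commit message, but the entries and the resolver shown here are illustrative assumptions, not the repository's actual contents.

```python
import re

_DEEPSEEK_V_SERIES_RE = re.compile(r"^deepseek-v\d+.*$")

# Hypothetical entries: the [1m] variant is registered explicitly so it
# resolves via table lookup rather than the regex fallback.
_DEEPSEEK_CANONICAL_MODELS = {
    "deepseek-v4-flash": "deepseek-v4-flash",
    "deepseek-v4-flash[1m]": "deepseek-v4-flash[1m]",
}

def resolve_deepseek_model(model: str) -> str:
    # Explicit table hit resolves directly; otherwise the V-series regex
    # fallback keeps the ID, and everything else becomes deepseek-chat.
    if model in _DEEPSEEK_CANONICAL_MODELS:
        return _DEEPSEEK_CANONICAL_MODELS[model]
    if _DEEPSEEK_V_SERIES_RE.match(model):
        return model
    return "deepseek-chat"
```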

Related: NousResearch#18509
