refactor(anthropic): fix N+1 thinking message storage issue #7958
DOsinga merged 1 commit into block:main from
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c95de17474
Thinking delta handling was initially proposed in this PR, but block#7626 merged first, so this change adds the remaining optimizations proposed here.

Because each thinking_delta is yielded and stored individually, and the complete block is then re-emitted at content_block_stop, N+1 entries per thinking block end up in both the database and session memory. Streaming partial thinking deltas serves no purpose: without a signature they cannot be sent back to the API (format_messages drops them), and the UI renders thinking as a complete block rather than incrementally. Emit the block once at content_block_stop.

Replace JSON delta comparisons with a typed ContentDelta enum. Add streaming tests.

Change-Id: Id42262e30f1eea8d24f1ddb87381395714a91821
Signed-off-by: Rabi Mishra <[email protected]>
Reviewed commit: fc16a5e473
DOsinga left a comment
The fix is correct — buffering thinking deltas and emitting a single complete message at content_block_stop is the right approach. The N+1 storage issue is real and this resolves it cleanly. The ContentDelta enum is a nice improvement over stringly-typed JSON checks. Tests follow the existing run_streaming_test pattern and test real behaviour. Good work!
Signed-off-by: Rabi Mishra <[email protected]>
Signed-off-by: esnyder <[email protected]>
* origin/main: (62 commits)
  - Tweak the release process: no more merge to main (#7994)
  - fix: gemini models via databricks (#8042)
  - feat(apps): Pass toolInfo to MCP Apps via hostContext (#7506)
  - fix: remove configured marker when deleting oauth provider configuration (#7887)
  - docs: add vmware-aiops MCP extension documentation (#8055)
  - Show setup instructions for ACP providers in settings modal (#8065)
  - deps: replace sigstore-verification with sigstore-verify to kill vulns (#8064)
  - feat(acp): add session/set_config and stabilize list, delete and close (#7984)
  - docs: Correct `gosoe` typo to `goose` (#8062)
  - fix: use default provider and model when provider in session no longer exists (#8035)
  - feat: add GOOSE_SHELL env var to configure preferred shell (#7909)
  - fix(desktop): fullscreen header bar + always-visible close controls (#8033)
  - docs: add Claude Code approve mode permission routing documentation (#7949)
  - chatgpt_codex: Support reasoning and gpt-5.4 (#7941)
  - refactor(anthropic): fix N+1 thinking message storage issue (#7958)
  - fix: handle mid-stream error events in OpenAI SSE streaming (#8031)
  - Fix apps extension: coerce string arguments from inner LLM responses (#8030)
  - feat: ability to expand sidebar to see chats names (#7816)
  - Fix config for GOOSE_MAX_BACKGROUND_TASKS (#7940)
  - set MACOSX_DEPLOYMENT_TARGET=12.0 (#7947)
  - ...
Summary
Thinking delta handling was initially proposed in this PR, but #7626 merged first and added it; this change brings in the remaining optimizations proposed here.
Because each thinking_delta is yielded and stored individually, and the complete block is then re-emitted at content_block_stop, N+1 entries per thinking block end up in both the database and session memory, even though they are filtered out in format_messages when replaying.
Streaming partial thinking deltas serves no purpose: without a signature they cannot be sent back to the API, and the UI renders thinking as a complete block rather than incrementally. Emit once at content_block_stop.
Replace JSON delta comparisons with a typed ContentDelta enum. Add streaming tests.
Testing
Locally tested with anthropic provider.