
fix: MOIM injection crashes session when conversation ends with assistant message #7421

Closed

angiejones wants to merge 1 commit into main from fix/moim-trailing-assistant

Conversation

angiejones (Collaborator) commented Feb 22, 2026

Summary

MOIM injection kills sessions when the conversation ends with an assistant message (e.g. text-only response during retry). The conversation gets sent to the API ending with an assistant message, and models that don't support prefill reject it with a 400.

The bug: inject_moim always inserts the MOIM before the last assistant message:

[0]  User:      "clean up comments in check.sh"
[1]  Assistant: "Let me first look at the current state of check.sh."
[2]  Assistant: [tool_request: text_editor view check.sh]
...
[9]  User:      [tool_response: <file contents>]
[10] Assistant: "Now I'll clean this up..."   ← text-only, no tool call

Old code inserts MOIM at index 10 (before the last assistant):

let mut messages = conversation.messages().clone();
let idx = messages
    .iter()
    .rposition(|m| m.role == Role::Assistant)  // finds message [10]
    .unwrap_or(0);
messages.insert(idx, Message::user().with_text(moim));  // inserts BEFORE [10]

This creates:

[9]  User:      [tool_response]
[10] User:      <MOIM>              ← inserted here
[11] Assistant:  "Now I'll clean this up..."  ← trailing

fix_conversation removes the trailing assistant → has_unexpected_issues fires → falls back to the original conversation, which still ends with an assistant message → 400 from Anthropic: "This model does not support assistant message prefill. The conversation must end with a user message." Retried 3x, all retries fail, and the session dies.

The fix is to append MOIM after the last message when the conversation ends with an assistant message.

Type of Change

  • Bug fix
  • Tests

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

New test test_moim_injection_trailing_assistant reproduces the exact scenario. Verified that it fails on the old code and passes with the fix.

All existing tests pass (4 moim + 38 conversation). Clippy clean.

Copilot AI review requested due to automatic review settings February 22, 2026 20:57
Copilot AI (Contributor) left a comment


Pull request overview

This PR fixes a critical bug where MOIM (Model-Oriented Information Message) injection crashes the session when conversations end with an assistant message. The bug occurred because the old logic always inserted MOIM before the last assistant message, creating a trailing assistant message that would be removed by fix_conversation, causing API validation failures with a 400 error from providers like Anthropic.

Changes:

  • Modified inject_moim to append MOIM after the last message when conversation ends with assistant
  • Added test case validating the fix for trailing assistant message scenarios

tlongwell-block (Collaborator) commented
Hey, @angiejones 👋

The MOIM never moves a real user message back, and it never pushes an assistant message forward.

It would be better to fix this in the reply loop, I think. This is definitely a real issue, but the MOIM injection is not the cause and we probably shouldn't try to fix it there.

tlongwell-block (Collaborator) commented Feb 22, 2026

The conversation being sent back to the inference provider should not end with an assistant message ever.

I think we're seeing that error crop up when a tool call request exceeds the max tokens for a single reply, which then gets sent half baked to the inference provider instead of correctly processed. I haven't RCA'ed it, just my initial thoughts

This case used to fail silently, but since Anthropic stopped allowing prefills, it now fails loudly (again, just a guess)

angiejones (Collaborator, Author) commented

> The conversation being sent back to the inference provider should not end with an assistant message ever.
>
> I think we're seeing that error crop up when a tool call request exceeds the max tokens for a single reply, which then gets sent half baked to the inference provider instead of correctly processed. I haven't RCA'ed it, just my initial thoughts
>
> This case used to fail silently, but since Anthropic stopped allowing prefills, it now fails loudly (again, just a guess)

ok sending you my session for debugging

tlongwell-block (Collaborator) commented

Working in #7424
