Skip to content

fix(session): detect post-compaction no-text degradation and trigger recovery (fixes #2232)#2668

Merged
code-yeongyu merged 3 commits intocode-yeongyu:devfrom
MoerAI:fix/session-degradation-detection
Mar 24, 2026
Merged

fix(session): detect post-compaction no-text degradation and trigger recovery (fixes #2232)#2668
code-yeongyu merged 3 commits intocode-yeongyu:devfrom
MoerAI:fix/session-degradation-detection

Conversation

@MoerAI
Copy link
Copy Markdown
Contributor

@MoerAI MoerAI commented Mar 18, 2026

Summary

  • Detect post-compaction session degradation (no-text assistant messages) and trigger recovery

Problem

Long-running sessions degrade after compaction. OpenCode core fails to restore agent configuration (agent.variant becomes undefined in prompt.ts:947), causing assistant messages to contain only structural parts (step-start/step-finish) with no text. Sessions become unusable.

Fix

Added post-compaction tail monitoring that watches the next assistant messages after session.compacted. When 3 consecutive assistant messages contain only step-start/step-finish parts without text, the hook logs a warning and triggers automatic recovery by re-running compaction summarize with the resolved compaction model.

Changes

File Change
src/hooks/preemptive-compaction.ts Wired in post-compaction degradation monitor and routed session.compacted/assistant update events into it
src/hooks/preemptive-compaction-degradation-monitor.ts Added dedicated no-text tail monitor state machine and recovery trigger logic
src/hooks/preemptive-compaction-no-text-tail.ts Added helper for detecting step-only/no-text assistant outputs from event parts or SDK message history
src/hooks/preemptive-compaction.degradation-monitor.test.ts Added regression tests covering detection threshold and streak reset behavior

Fixes #2232


Summary by cubic

Detects and auto-recovers from post-compaction “no-text” assistant responses so long-running sessions stay usable. Monitors messages after compaction, retries summarize when degradation is detected, and adds SDK compatibility and race-condition safeguards. Fixes #2232.

  • Bug Fixes
    • Post-compaction monitor: watch the next 5 assistant messages; trigger recovery after 3 step-only/no-text messages; reset streak on valid text.
    • Recovery: re-resolve the compaction model, run session.summarize with a timeout, show a warning toast; prevent duplicate runs while compaction is active.
    • Reliability: normalize SDK message responses; align session.messages() call with existing parameter shape; tolerate partial message.updated (only cache tokens when present); support both session.compacted payload shapes; clear monitor state on session.deleted; cross-runtime safe timeout clearing; epoch-based cleanup to avoid races.
    • Tests added for detection threshold and streak reset.

Written for commit 071fab1. Summary will update on new commits.

@MoerAI
Copy link
Copy Markdown
Contributor Author

MoerAI commented Mar 18, 2026

I have read the CLA Document and I hereby sign the CLA

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8f3186d65

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +173 to 180
if (info.providerID && info.tokens) {
tokenCache.set(info.sessionID, {
providerID: info.providerID,
modelID: info.modelID ?? "",
tokens: info.tokens,
})
}
compactedSessions.delete(info.sessionID)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve the post-compaction cooldown until token cache is refreshed

If the first assistant message.updated after session.compacted arrives without providerID/tokens (the degradation case this patch is trying to observe can do that), this branch still clears compactedSessions but leaves the old pre-compaction token cache in place. The next tool.execute.after then sees the stale 78%+ token snapshot and immediately calls session.summarize() again, so a single degraded update can start an unintended compaction loop. Only dropping the cooldown after refreshing tokenCache avoids that regression.

Useful? React with 👍 / 👎.

})
if (!Array.isArray(messages) || messages.length === 0) return false

const target = findMessageByID(messages, messageID) ?? messages[messages.length - 1]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Do not treat the previous history entry as the current assistant reply

When session.messages() is slightly behind the live message.updated event, findMessageByID() returns undefined and this falls back to messages[messages.length - 1]. If that last persisted entry is the prior step-only reply, a brand-new assistant message that already has real text gets counted as another no-text tail, and three such stale reads will trigger an unnecessary recovery compaction. Returning false when the requested messageID is missing would avoid these false positives.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 issues found across 4 files

Confidence score: 2/5

  • High-confidence Opencode SDK compatibility mismatches in src/hooks/preemptive-compaction-no-text-tail.ts and src/hooks/preemptive-compaction.ts (nested session.messages() config and unsupported AssistantMessage.parts) are likely to cause runtime failures or broken compaction behavior.
  • src/hooks/preemptive-compaction-degradation-monitor.ts includes unsupported auto: true usage in both summarize payload and type shape, which creates a concrete regression risk against v1 SDK contracts.
  • This is scored as high risk because multiple issues are severe (8–9/10) with very high confidence, and they affect core session/compaction flows rather than isolated edge cases.
  • Pay close attention to src/hooks/preemptive-compaction-no-text-tail.ts, src/hooks/preemptive-compaction.ts, src/hooks/preemptive-compaction-degradation-monitor.ts - SDK contract mismatches and cleanup race conditions may break monitoring and summarization flows.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/hooks/preemptive-compaction.ts">

<violation number="1" location="src/hooks/preemptive-compaction.ts:168">
P1: Custom agent: **Opencode Compatibility**

The `parts` field does not exist on `AssistantMessage` in the Opencode SDK.</violation>
</file>

<file name="src/hooks/preemptive-compaction-no-text-tail.ts">

<violation number="1" location="src/hooks/preemptive-compaction-no-text-tail.ts:40">
P0: Custom agent: **Opencode Compatibility**

The Opencode SDK requires a flat configuration object for `session.messages()` (i.e. `{ sessionID, directory }`), rather than a nested `{ path, query }` object. Because the client type is manually redefined here, TypeScript won't catch this mismatch, leading to runtime failures when querying the session.</violation>
</file>

<file name="src/hooks/preemptive-compaction-degradation-monitor.ts">

<violation number="1" location="src/hooks/preemptive-compaction-degradation-monitor.ts:19">
P1: Custom agent: **Opencode Compatibility**

Remove `auto: true` from the `summarize` body type definition. The v1 Opencode SDK (`SessionSummarizeData`) does not support the `auto` property, which was only introduced in v2. Including it will cause a TypeScript error when injecting the actual v1 client.</violation>

<violation number="2" location="src/hooks/preemptive-compaction-degradation-monitor.ts:121">
P1: Custom agent: **Opencode Compatibility**

Remove `auto: true` from the `summarize` payload. The v1 Opencode SDK does not accept this field in `SessionSummarizeData`.</violation>

<violation number="3" location="src/hooks/preemptive-compaction-degradation-monitor.ts:136">
P2: Unconditional cleanup in `triggerRecovery` can race with `session.compacted` re-initialization and silently disable post-compaction monitoring.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@MoerAI
Copy link
Copy Markdown
Contributor Author

MoerAI commented Mar 18, 2026

Fixed all 5 issues from cubic review: (1) removed invalid AssistantMessage parts access and now resolve tail content via session message lookup, (2) updated no-text-tail session.messages call to flat { sessionID, directory } shape, (3) removed unsupported auto field from degradation-monitor summarize body type, (4) removed auto from degradation-monitor summarize payload, and (5) added an epoch guard in recovery cleanup to avoid racing with session.compacted re-initialization.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a3db64b931

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +182 to +185
await postCompactionMonitor.onAssistantMessageUpdated({
sessionID: info.sessionID,
id: info.id,
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Pass live assistant parts into the degradation detector

This recovery path throws away the parts already present on message.updated and re-classifies the reply only from session.messages(). Right after compaction that history call can lag behind the live event, so three real step-only replies can be missed and the 5-message window expires without ever triggering recovery. Fresh evidence: the new preemptive-compaction.degradation-monitor test already sends step-only parts, but summarize() never fires unless session.messages() is mocked to contain those replies too.

Useful? React with 👍 / 👎.

Comment on lines +141 to +145
} finally {
compactionInProgress.delete(sessionID)
if ((postCompactionEpoch.get(sessionID) ?? 0) === recoveryEpoch) {
clear(sessionID)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep the post-compaction monitor armed after recovery failures

If the recovery session.summarize() call rejects or times out, this finally block still clears the monitor state for the current epoch. After that, later no-text assistant replies in the same degraded session are ignored because postCompactionRemaining has been removed, so one transient provider/SDK failure permanently disables the new auto-recovery path until another compaction happens.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/hooks/preemptive-compaction-no-text-tail.ts">

<violation number="1" location="src/hooks/preemptive-compaction-no-text-tail.ts:54">
P0: Custom agent: **Opencode Compatibility**

The Opencode SDK v1 client requires parameters to be nested under `path` and `query`. Flattened arguments will cause the API request to fail at runtime.</violation>
</file>

<file name="src/hooks/preemptive-compaction-degradation-monitor.ts">

<violation number="1" location="src/hooks/preemptive-compaction-degradation-monitor.ts:26">
P0: Custom agent: **Opencode Compatibility**

The `ClientLike.messages` interface was modified to use flat parameters (`sessionID`, `directory`) matching the OpenCode v2 SDK. However, this codebase uses the v1 SDK (as seen in the `summarize` method above and across the project), which requires nested parameters: `{ path: { id: string }, query?: { directory: string } }`. This mismatch will cause the `client.session.messages()` call in `resolveNoTextTailFromSession` to throw a runtime error, breaking the degradation monitor entirely. Please revert to the v1 signature and update the call site in `resolveNoTextTailFromSession` to match.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@MoerAI
Copy link
Copy Markdown
Contributor Author

MoerAI commented Mar 18, 2026

Fixed P0: session.messages() now uses the exact parameter shape found in existing codebase hooks.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 071fab1618

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +126 to +128
client.session.summarize({
path: { id: sessionID },
body: { providerID: targetProviderID, modelID: targetModelID },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep the recovery summarize in auto compaction mode

When the new no-text-tail detector fires during an in-flight assistant turn, this recovery call sends session.summarize() without auto: true. Every other automatic compaction path in this repo includes that flag (src/hooks/preemptive-compaction.ts, src/hooks/anthropic-context-window-limit-recovery/summarize-retry-strategy.ts), and OpenCode's SessionCompaction.process() only replays the synthetic continue message when input.auto is true. In practice, the session can compact successfully here but still never resume the interrupted reply, so the user is left with a recovered session and no answer.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 3 files (changes from recent commits).

Requires human review: PR introduces a potential regression by removing null-safety on cached.tokens in preemptive-compaction.ts and adds complex state logic that exceeds the 100% certainty bar.

@code-yeongyu code-yeongyu added the triage:bug-fix PR: Bug fix label Mar 24, 2026
@code-yeongyu code-yeongyu merged commit 935995d into code-yeongyu:dev Mar 24, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

triage:bug-fix PR: Bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: long-running OMO sessions degrade into no-text tails and become non-resumable

2 participants