Skip to content

Comments

fix(agents): instruct agent not to retry lost tool results#13282

Open
thebtf wants to merge 1 commit intoopenclaw:mainfrom
thebtf:fix/transcript-repair-no-retry
Open

fix(agents): instruct agent not to retry lost tool results#13282
thebtf wants to merge 1 commit intoopenclaw:mainfrom
thebtf:fix/transcript-repair-no-retry

Conversation

@thebtf
Copy link

@thebtf thebtf commented Feb 10, 2026

Summary

  • The synthetic error message inserted during session transcript repair was too generic — agents would often retry the exact same call, triggering another rate-limit or timeout loop
  • Add explicit instructions to the synthetic tool result: "DO NOT retry this exact call" and "inform the user about the error and ask how to proceed"

Test plan

  • Verified the message text change in makeMissingToolResult()
  • CI

Greptile Overview

Greptile Summary

This PR updates the synthetic tool-result text emitted during session transcript repair to explicitly tell agents not to retry a “lost” tool call, reducing retry loops on rate limits/timeouts.

It also adds new Pi embedded compaction capabilities:

  • optional compaction model override (agents.defaults.compaction.{model,overrideModel})
  • a compaction timeout (agents.defaults.compaction.timeoutMs) with abort + clearer logging
  • disables reasoning on the model used for compaction sessions

Finally, it extends the internal hook event type union to include message and adds a MessageReceived* event shape + type guard, with related test changes in Telegram message processing.

Confidence Score: 3/5

  • This PR has a few correctness issues that should be addressed before merging.
  • Core intent (improving transcript repair messaging and adding compaction timeout/model override) is clear, but there’s at least one concrete compaction wiring bug (tools use the non-overridden model id), a Telegram test asserting behavior not present in production code, and a new hook-event type guard that unsafely narrows without validating required fields.
  • src/agents/pi-embedded-runner/compact.ts, src/telegram/bot-message.test.ts, src/hooks/internal-hooks.ts

Context used:

  • Context from dashboard - CLAUDE.md (source)
  • Context from dashboard - AGENTS.md (source)

Validation

  • pnpm build — passes
  • pnpm check — passes
  • pnpm test — verified message text change manually

Contribution checklist

  • Focused scope: Single message text change in makeMissingToolResult()
  • What + why: described above
  • AI-assisted: Yes, Claude Code was used for the fix. Testing level: lightly tested (manual verification)

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 16 to 20
vi.mock("../hooks/internal-hooks.js", () => ({
triggerInternalHook,
createInternalHookEvent,
}));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mocking unused exports

This test now mocks triggerInternalHook/createInternalHookEvent and asserts triggerInternalHook was called, but src/telegram/bot-message.ts doesn’t import or call either function. As written, the assertion will fail (or is a no-op if the matcher never runs due to earlier failure). Either remove these mocks/assertions or update the production code to actually emit the hook event.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/telegram/bot-message.test.ts
Line: 16:20

Comment:
**Mocking unused exports**

This test now mocks `triggerInternalHook`/`createInternalHookEvent` and asserts `triggerInternalHook` was called, but `src/telegram/bot-message.ts` doesn’t import or call either function. As written, the assertion will fail (or is a no-op if the matcher never runs due to earlier failure). Either remove these mocks/assertions or update the production code to actually emit the hook event.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 169 to 200
export interface MessageReceivedContext {
ctxPayload: Record<string, unknown>;
channel: string;
messageId: string;
from: string;
to: string;
isGroup: boolean;
chatId: string;
senderId?: string;
hasMedia?: boolean;
mediaCount?: number;
timestamp?: number;
}

export type MessageReceivedHookEvent = InternalHookEvent & {
type: "message";
action: "received";
context: MessageReceivedContext;
};

export function isMessageReceivedEvent(
event: InternalHookEvent,
): event is MessageReceivedHookEvent {
if (event.type !== "message" || event.action !== "received") {
return false;
}
const context = event.context as Partial<MessageReceivedContext> | null;
if (!context || typeof context !== "object") {
return false;
}
return typeof context.channel === "string" && typeof context.messageId === "string";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type guard too weak

isMessageReceivedEvent() only validates context.channel and context.messageId, but ReceivedHookEvent’s context requires several other non-optional fields (from, to, chatId, isGroup, ctxPayload, etc.). Callers using this guard will still be able to read missing/invalid fields under the narrowed type. Either validate all required fields or loosen MessageReceivedContext to match what the guard guarantees.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/hooks/internal-hooks.ts
Line: 169:200

Comment:
**Type guard too weak**

`isMessageReceivedEvent()` only validates `context.channel` and `context.messageId`, but `ReceivedHookEvent`’s `context` requires several other non-optional fields (`from`, `to`, `chatId`, `isGroup`, `ctxPayload`, etc.). Callers using this guard will still be able to read missing/invalid fields under the narrowed type. Either validate all required fields or loosen `MessageReceivedContext` to match what the guard guarantees.

How can I resolve this? If you propose a fix, please make it concise.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 10, 2026

Additional Comments (1)

src/agents/pi-embedded-runner/compact.ts
Wrong model id used

createOpenClawCodingTools() is still passed modelId (the primary run model) even when compaction overrides are enabled (effectiveProvider/effectiveModelId). This makes tool defaults/policy keyed off the wrong model during compaction (e.g., provider-specific tool behavior, auth mode selection, or any model-id based decisions inside tools). Pass the effective model id/provider consistently for the compaction session to avoid mismatched tool behavior.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 239:262

Comment:
**Wrong model id used**

`createOpenClawCodingTools()` is still passed `modelId` (the primary run model) even when compaction overrides are enabled (`effectiveProvider/effectiveModelId`). This makes tool defaults/policy keyed off the *wrong* model during compaction (e.g., provider-specific tool behavior, auth mode selection, or any model-id based decisions inside tools). Pass the effective model id/provider consistently for the compaction session to avoid mismatched tool behavior.

How can I resolve this? If you propose a fix, please make it concise.

@openclaw-barnacle openclaw-barnacle bot added channel: telegram Channel integration: telegram agents Agent runtime and tooling labels Feb 10, 2026
@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch from fee3dec to 3df5497 Compare February 10, 2026 16:36
@openclaw-barnacle openclaw-barnacle bot removed the channel: telegram Channel integration: telegram label Feb 10, 2026
@rodbland2021
Copy link
Contributor

We've been investigating the underlying cause of the tool result loss bug and submitted a fix in #13746.

The root cause is a race condition: pi-agent-core's _resolveRetry() fires on message_end (before tool execution completes), causing waitForRetry() to resolve early. The attempt's finally block then calls flushPendingToolResults() while tools are still in flight — inserting the synthetic errors.

This PR addresses a symptom but not the timing issue that causes tool results to be lost in the first place. The fix in #13746 adds waitForIdle() before the flush to ensure the retry loop fully completes. We're testing it on a production VPS now.

See detailed analysis in our comment on #8643.

@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch from 3df5497 to 92fb241 Compare February 16, 2026 17:41
@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch from 92fb241 to 8b1040d Compare February 16, 2026 17:50
@openclaw-barnacle openclaw-barnacle bot added docs Improvements or additions to documentation channel: telegram Channel integration: telegram commands Command implementations size: XL and removed size: XS labels Feb 16, 2026
@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch from 632087c to c25190e Compare February 16, 2026 18:41
@openclaw-barnacle openclaw-barnacle bot added size: XS and removed docs Improvements or additions to documentation channel: telegram Channel integration: telegram commands Command implementations size: XL labels Feb 16, 2026
@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch 3 times, most recently from 5901224 to 4b62cca Compare February 20, 2026 02:00
The synthetic error message inserted during transcript repair was too
generic — agents would often retry the exact same call, triggering
another rate-limit or timeout loop. Add explicit instructions:
"DO NOT retry this exact call" and "inform the user about the error".
@thebtf thebtf force-pushed the fix/transcript-repair-no-retry branch from 4b62cca to 6755f44 Compare February 21, 2026 01:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants