
fix(auto-reply): preserve session text when message tool sends only media #37900

Closed
Sid-Qin wants to merge 1 commit into openclaw:main from Sid-Qin:fix/37841-whatsapp-text-media-drop

Conversation


@Sid-Qin Sid-Qin commented Mar 6, 2026

Summary

  • Problem: When an agent writes session text and also calls the message tool to send media in the same turn, the session text is silently dropped on WhatsApp and other messaging channels. No error is raised.
  • Why it matters: Users lose the agent's text explanation and only receive the media. Neither the agent nor the user knows information was lost.
  • What changed: Gated shouldSuppressMessagingToolReplies on messagingToolSentTexts.length > 0 in buildReplyPayloads. Media-only sends no longer trigger reply suppression, preserving the agent's session text.
  • What did NOT change: Text deduplication behavior when the message tool sends text, cross-target suppression logic, or media URL deduplication.
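The gate described above can be illustrated with a minimal sketch. It assumes a simplified input shape: `sameTargetSend` is a hypothetical boolean standing in for the result of `shouldSuppressMessagingToolReplies`, and the real `buildReplyPayloads` takes more parameters than shown here.

```typescript
// Hypothetical, simplified inputs; not the real buildReplyPayloads signature.
interface SuppressionInputs {
  // Stand-in for the result of the same-target check.
  sameTargetSend: boolean;
  // Texts the message tool sent during this turn.
  messagingToolSentTexts: string[];
}

// Before the fix: any same-target tool send dropped the session text.
function suppressBefore(inputs: SuppressionInputs): boolean {
  return inputs.sameTargetSend;
}

// After the fix: suppress only when the tool also sent text.
function suppressAfter(inputs: SuppressionInputs): boolean {
  return inputs.messagingToolSentTexts.length > 0 && inputs.sameTargetSend;
}

const mediaOnly: SuppressionInputs = {
  sameTargetSend: true,
  messagingToolSentTexts: [],
};
const withText: SuppressionInputs = {
  sameTargetSend: true,
  messagingToolSentTexts: ["Here is the chart"],
};

console.log(suppressBefore(mediaOnly), suppressAfter(mediaOnly)); // true false
console.log(suppressBefore(withText), suppressAfter(withText)); // true true
```

Only the media-only case changes: the session text is no longer suppressed, while a tool send that includes text still suppresses the reply.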

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • When the message tool sends only media (no text/caption) to the same target, the agent's session text is now delivered as a separate message
  • Text deduplication still suppresses replies when the message tool sends text

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: any
  • Runtime: Node.js 22+
  • Integration/channel: WhatsApp (also affects Telegram, Discord, etc.)

Steps

  1. Configure an agent on WhatsApp
  2. Send a request that produces both text and media (e.g., "analyze this and send a chart")
  3. Observe both text and media delivery

Expected

  • User receives both the text explanation and the media

Actual

  • Before fix: only media arrives, text silently dropped
  • After fix: both text and media are delivered

Evidence

 ✓ src/auto-reply/reply/agent-runner-payloads.test.ts (10 tests) 4ms

 Test Files  1 passed (1)
      Tests  10 passed (10)

New test verifies: media-only message tool send does not suppress session text for same-target WhatsApp delivery.

Human Verification (required)

  • Verified scenarios: media-only send (text preserved), text send (text suppressed), cross-target (text preserved), text+media send
  • Edge cases checked: empty messagingToolSentTexts, empty messagingToolSentTargets
  • What I did not verify: live WhatsApp media delivery with session text

Compatibility / Migration

  • Backward compatible? Yes — only un-suppresses previously dropped text
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: revert this commit; text is suppressed again when tool sends media
  • Files/config to restore: src/auto-reply/reply/agent-runner-payloads.ts
  • Known bad symptoms: if the agent writes the same text as the media caption, both text and caption are delivered (minor duplicate)

Risks and Mitigations

  • Risk: agents that intentionally use NO_REPLY with media-only tools may see unexpected text delivery → mitigated by: NO_REPLY produces empty text payloads which are already filtered out
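A rough sketch of why the mitigation holds, assuming (as stated above) that a NO_REPLY turn reaches the payload stage as an empty text payload. The `ReplyPayload` shape and the `dropEmptyPayloads` helper are hypothetical names used for illustration, not the project's actual filter.

```typescript
// Hypothetical payload shape for illustration only.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
}

// A payload is deliverable if it carries non-blank text or a media URL.
function hasDeliverableContent(payload: ReplyPayload): boolean {
  const hasText = (payload.text ?? "").trim().length > 0;
  const hasMedia = (payload.mediaUrl ?? "").length > 0;
  return hasText || hasMedia;
}

// Hypothetical helper: removes payloads with nothing to deliver.
function dropEmptyPayloads(payloads: ReplyPayload[]): ReplyPayload[] {
  return payloads.filter(hasDeliverableContent);
}

// Assumption: a NO_REPLY turn yields an empty text payload.
const noReplyTurn: ReplyPayload[] = [{ text: "" }];
console.log(dropEmptyPayloads(noReplyTurn).length); // 0
```

Under that assumption, lifting suppression for media-only sends leaves nothing to deliver on a NO_REPLY turn, so no unexpected text goes out.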

…edia

When the message tool sends media to the same target in the same turn,
shouldSuppressMessagingToolReplies returns true and all final payloads
(including the agent's text response) are dropped. This silently loses
the session text with no error or warning.

Gate the suppression on messagingToolSentTexts.length > 0 so that
media-only sends do not trigger reply suppression. Text deduplication
still works when the tool sends text content to the same target.

Closes openclaw#37841

aisle-research-bot bot commented Mar 6, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

| # | Severity | Title |
| --- | --- | --- |
| 1 | 🟠 High | Messaging-tool suppression bypass when sends contain no tracked text (media/caption/components) causing duplicate outbound replies |

1. 🟠 Messaging-tool suppression bypass when sends contain no tracked text (media/caption/components) causing duplicate outbound replies

| Property | Value |
| --- | --- |
| Severity | High |
| CWE | CWE-840 |
| Location | src/auto-reply/reply/agent-runner-payloads.ts:92-127 |

Description

buildReplyPayloads now suppresses final replies only if messagingToolSentTexts.length > 0. However, the messaging tool can successfully send content without populating messagingToolSentTexts (e.g., media-only sends, attachment captions, interactive/component-only messages, or tools whose text param is not content/message).

As a result:

  • A same-target messaging tool send can occur (tracked in messagingToolSentTargets / messagingToolSentMediaUrls) while messagingToolSentTexts remains empty.
  • suppressMessagingToolReplies becomes false, so the normal auto-reply is still emitted.
  • Additionally, dedupe is disabled in this case (dedupeMessagingToolPayloads becomes false when targets are present), so media/text duplicates that would otherwise be filtered may be sent again.

This creates an abuse/spam/billing amplification vector: a malicious prompt (or a user prompt that the agent follows) can instruct the agent to send via the messaging tool using non-text fields (e.g., media + caption) to keep messagingToolSentTexts empty, thereby forcing the system to also emit an additional auto-reply (and potentially resend the same media).

Vulnerable logic:

const messagingToolSentText = messagingToolSentTexts.length > 0;
const suppressMessagingToolReplies =
  messagingToolSentText &&
  shouldSuppressMessagingToolReplies({ ... });

const dedupeMessagingToolPayloads =
  suppressMessagingToolReplies || messagingToolSentTargets.length === 0;
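To make the failure mode concrete, the quoted logic can be evaluated for a same-target, media-only send. This is a sketch under simplifying assumptions: `sameTargetSend` stands in for the elided `shouldSuppressMessagingToolReplies({ ... })` call, targets are modelled as plain strings, and the target value is made up.

```typescript
interface Flags {
  suppressMessagingToolReplies: boolean;
  dedupeMessagingToolPayloads: boolean;
}

// Mirrors the quoted logic; sameTargetSend replaces the elided
// shouldSuppressMessagingToolReplies({ ... }) call (an assumption).
function flagsInThisPr(
  sameTargetSend: boolean,
  messagingToolSentTexts: string[],
  messagingToolSentTargets: string[],
): Flags {
  const messagingToolSentText = messagingToolSentTexts.length > 0;
  const suppressMessagingToolReplies = messagingToolSentText && sameTargetSend;
  const dedupeMessagingToolPayloads =
    suppressMessagingToolReplies || messagingToolSentTargets.length === 0;
  return { suppressMessagingToolReplies, dedupeMessagingToolPayloads };
}

// Same-target send with media only: no tracked text, one tracked target.
const mediaOnlySameTarget = flagsInThisPr(true, [], ["+15550100"]);

// Both flags come out false, so the reply is kept AND dedupe is skipped.
console.log(mediaOnlySameTarget);
```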

Recommendation

Decouple suppression (whether to drop all final replies) from dedupe (removing duplicates) and base both on whether the messaging tool actually sent anything to the same origin target, not on sentTexts alone.

Safer options:

  1. Always compute same-target send and use it to enable dedupe, even when you choose not to fully suppress replies:

const sentSameTarget = shouldSuppressMessagingToolReplies({
  messageProvider: resolveOriginMessageProvider({
    originatingChannel: params.originatingChannel,
    provider: params.messageProvider,
  }),
  messagingToolSentTargets,
  originatingTo: resolveOriginMessageTo({ originatingTo: params.originatingTo }),
  accountId: resolveOriginAccountId({ originatingAccountId: params.accountId }),
});
// Keep the new behavior (don't drop the whole reply) if the tool only sent media,
// but still dedupe against tool-sent media/text.
const suppressMessagingToolReplies = sentSameTarget && messagingToolSentTexts.length > 0;
const dedupeMessagingToolPayloads = sentSameTarget || messagingToolSentTargets.length === 0;

  2. Track a single boolean like messagingToolSentAny based on any of:
  • messagingToolSentTargets.length > 0
  • messagingToolSentTexts.length > 0
  • messagingToolSentMediaUrls.length > 0
  • (optionally) other send metadata (templates/components)

and use that for suppression/guardrails.

  3. Improve tracking so that non-message/content fields that produce user-visible text (e.g., caption) are added to messagingToolSentTexts for correct suppression/dedupe behavior.

Add tests asserting that when a same-target tool send occurs with only media/caption, duplicates are still deduped (and/or suppression is applied according to the intended policy).
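The kind of assertion recommended above might look like the following sketch. `filterSentMedia` is a hypothetical stand-in for `filterMessagingToolMediaDuplicates`, whose real signature is not shown in this PR; the payload shape and URL are also made up.

```typescript
// Hypothetical payload shape for illustration only.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
}

// Hypothetical stand-in for filterMessagingToolMediaDuplicates: strips media
// the tool already delivered, keeping any accompanying text.
function filterSentMedia(
  payloads: ReplyPayload[],
  sentMediaUrls: string[],
): ReplyPayload[] {
  const sent = new Set(sentMediaUrls);
  const kept: ReplyPayload[] = [];
  for (const payload of payloads) {
    if (payload.mediaUrl === undefined || !sent.has(payload.mediaUrl)) {
      kept.push(payload); // nothing to dedupe
      continue;
    }
    // Media was already sent by the tool; keep only non-blank text.
    if (payload.text !== undefined && payload.text.trim().length > 0) {
      kept.push({ text: payload.text });
    }
  }
  return kept;
}

const chartUrl = "https://example.com/chart.png";
const sessionPayloads: ReplyPayload[] = [
  { text: "Here is the analysis.", mediaUrl: chartUrl },
];

const deduped = filterSentMedia(sessionPayloads, [chartUrl]);
console.log(deduped.length, deduped[0].text, deduped[0].mediaUrl);
// 1 Here is the analysis. undefined
```

The text survives, matching the intent of this PR, while the already-delivered media is not sent a second time.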


Analyzed PR: #37900 at commit 21c7dc0

Last updated on: 2026-03-06T14:47:18Z


greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR fixes a silent data-loss bug where session text was dropped whenever the message tool sent media to the same target in a single agent turn. The fix gates shouldSuppressMessagingToolReplies behind messagingToolSentTexts.length > 0, so media-only tool sends no longer trigger reply suppression.

Core fix is correct — the suppressMessagingToolReplies gate on messagingToolSentText precisely targets the described bug and is backed by a new unit test.

Media deduplication regression: dedupeMessagingToolPayloads derives from suppressMessagingToolReplies as a same-target proxy. After this fix, when the message tool sends media-only to the same target, suppressMessagingToolReplies becomes false, which causes dedupeMessagingToolPayloads to also become false. This disables filterMessagingToolMediaDuplicates, so if a session payload carries a mediaUrl that was already delivered by the message tool to the same target, the media would be sent twice. This contradicts the PR's claim that media URL deduplication did not change. Separating the same-target detection (isSameTargetSend) from the text-existence check would preserve both the bug fix and existing media dedup behavior.

Confidence Score: 3/5

  • The core bug fix is safe and correct; however, it introduces a subtle media duplication regression in the edge case where a session payload carries a mediaUrl matching an already-sent message tool media URL for same-target sends.
  • Score 3/5 reflects that this is safe to merge for the common case (preserving session text when message tool sends media-only), but carries an identified regression. The core fix is sound and well-tested for its intended purpose. The media deduplication regression only manifests when: (1) message tool sends only media to same target, (2) session payload carries a mediaUrl matching a tool-sent URL. While this scenario is uncommon, it's a real regression relative to the PR's claim that media URL deduplication was unchanged.
  • src/auto-reply/reply/agent-runner-payloads.ts (lines 114-115) — dedupeMessagingToolPayloads logic needs to separate same-target detection from text-existence checking to preserve media deduplication for same-target media-only sends.

Comments Outside Diff (1)

  1. src/auto-reply/reply/agent-runner-payloads.ts, line 114-115 (link)

    Media deduplication is bypassed for same-target media-only sends

    When the message tool sends only media (no text) to the same target, suppressMessagingToolReplies is false because of the gate on messagingToolSentTexts.length > 0 (line 94). This causes dedupeMessagingToolPayloads to become false (line 114-115), which disables filterMessagingToolMediaDuplicates (lines 122-127).

    If a session payload carries a mediaUrl that was already delivered by the message tool to the same target, that media will be sent a second time. This contradicts the PR description claim that "media URL deduplication" did not change.

    To fix this while preserving the main bug fix, separate the same-target detection from the text-existence check:

    const messagingToolSentText = messagingToolSentTexts.length > 0;
    const isSameTargetSend = shouldSuppressMessagingToolReplies({
      messageProvider: resolveOriginMessageProvider({
        originatingChannel: params.originatingChannel,
        provider: params.messageProvider,
      }),
      messagingToolSentTargets,
      originatingTo: resolveOriginMessageTo({
        originatingTo: params.originatingTo,
      }),
      accountId: resolveOriginAccountId({
        originatingAccountId: params.accountId,
      }),
    });
    const suppressMessagingToolReplies = messagingToolSentText && isSameTargetSend;
    // ...
    const dedupeMessagingToolPayloads =
      isSameTargetSend || messagingToolSentTargets.length === 0;

    This preserves both the main bug fix (no text suppression for media-only sends) and the existing media URL deduplication behavior for same-target sends.
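As a quick check of that claim, the decoupled flags can be tabulated across the scenarios listed under Human Verification. This sketch reduces the same-target check to a boolean field (`isSameTargetSend`) and models targets as strings; both are simplifying assumptions, and the scenario values are made up.

```typescript
interface Scenario {
  name: string;
  isSameTargetSend: boolean; // stand-in for the same-target check
  sentTexts: string[];
  sentTargets: string[];
}

// Decoupled logic as proposed in the review comment above.
function decoupledFlags(s: Scenario): { suppress: boolean; dedupe: boolean } {
  const suppress = s.sentTexts.length > 0 && s.isSameTargetSend;
  const dedupe = s.isSameTargetSend || s.sentTargets.length === 0;
  return { suppress, dedupe };
}

const scenarios: Scenario[] = [
  { name: "media-only, same target", isSameTargetSend: true, sentTexts: [], sentTargets: ["chat-a"] },
  { name: "text, same target", isSameTargetSend: true, sentTexts: ["hi"], sentTargets: ["chat-a"] },
  { name: "text, cross target", isSameTargetSend: false, sentTexts: ["hi"], sentTargets: ["chat-b"] },
  { name: "no tool send", isSameTargetSend: false, sentTexts: [], sentTargets: [] },
];

for (const s of scenarios) {
  const { suppress, dedupe } = decoupledFlags(s);
  console.log(`${s.name}: suppress=${suppress} dedupe=${dedupe}`);
}
// media-only, same target: suppress=false dedupe=true
// text, same target: suppress=true dedupe=true
// text, cross target: suppress=false dedupe=false
// no tool send: suppress=false dedupe=true
```

The first row is the case this PR targets: the session text is kept and media dedupe stays on.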

Last reviewed commit: 21c7dc0


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21c7dc0e89


Comment on lines +94 to +96
const messagingToolSentText = messagingToolSentTexts.length > 0;
const suppressMessagingToolReplies =
messagingToolSentText &&

P2: Preserve same-target media dedupe for media-only sends

Conditioning suppression on messagingToolSentTexts.length > 0 breaks the same-target dedupe path when a tool send has no tracked text (for example media-only sends, or caption-based sends where text is not recorded): suppressMessagingToolReplies becomes false, messagingToolSentTargets is non-empty, and dedupeMessagingToolPayloads becomes false, so filterMessagingToolMediaDuplicates is skipped and duplicate attachments can be sent again to the same chat.


@MaxHermez

To avoid the regression mentioned above, maybe you could check the message content and deduplicate the caption against the other messages that the session intends to send?
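One possible reading of this suggestion, as a sketch only: compare each session payload's text against the captions the message tool already sent, after light normalization. The `ReplyPayload` shape, the `sentCaptions` list, and both helper names are hypothetical; the PR does not show captions being tracked this way.

```typescript
// Hypothetical payload shape for illustration only.
interface ReplyPayload {
  text?: string;
  mediaUrl?: string;
}

// Collapse whitespace and case so trivially different strings compare equal.
function normalizeText(text: string): string {
  return text.trim().replace(/\s+/g, " ").toLowerCase();
}

// Hypothetical helper: drops session text that repeats a tool-sent caption,
// while keeping any media attached to the same payload.
function dropCaptionDuplicates(
  payloads: ReplyPayload[],
  sentCaptions: string[],
): ReplyPayload[] {
  const captions = new Set(
    sentCaptions.map(normalizeText).filter((c) => c.length > 0),
  );
  const kept: ReplyPayload[] = [];
  for (const payload of payloads) {
    const isDuplicate =
      payload.text !== undefined && captions.has(normalizeText(payload.text));
    if (!isDuplicate) {
      kept.push(payload);
    } else if (payload.mediaUrl) {
      kept.push({ mediaUrl: payload.mediaUrl }); // keep media, drop repeated text
    }
  }
  return kept;
}

const sessionTexts: ReplyPayload[] = [
  { text: "  Monthly  revenue chart " },
  { text: "Revenue rose in March." },
];
const remaining = dropCaptionDuplicates(sessionTexts, ["Monthly revenue chart"]);
console.log(remaining.map((p) => p.text)); // [ 'Revenue rose in March.' ]
```

Exact-match comparison is the conservative choice here: it removes verbatim repeats of a caption without risking the loss of session text that merely resembles one.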


Development

Successfully merging this pull request may close these issues.

[Bug]: Session text invisible to user when message tool sends media on WhatsApp
