fix(auto-reply): add unified outbound text sanitizer for NO_REPLY and streaming leaks #38232

who96 wants to merge 5 commits into openclaw:main
Conversation
Greptile Summary: This PR introduces a unified outbound text sanitizer. Key changes and observations:
Confidence Score: 4/5
Last reviewed commit: 797360d
src/auto-reply/reply/reply-utils.ts
Outdated
```ts
if (/^Reasoning:?$/i.test(text.trim())) {
  return null;
}
```
Reasoning regex makes colon optional and case-insensitive
The regex /^Reasoning:?$/i uses :? (optional colon) and the i case-insensitive flag. The documented leak pattern is specifically the streaming prefix "Reasoning:" — always capitalized, always with a colon. Making the colon optional means the bare word "Reasoning" (no colon) sent by a user as their entire reply would also be silently suppressed. Similarly, the i flag suppresses "reasoning" or "REASONING" as standalone replies.
Consider requiring the colon and keeping the match case-sensitive to scope the suppression precisely to the known leak pattern:
```suggestion
if (/^Reasoning:$/.test(text.trim())) {
  return null;
}
```
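For clarity, the behavioral difference between the two patterns can be checked directly; this standalone snippet is illustrative, not code from the PR:

```typescript
// Compare the flagged pattern against the suggested strict one.
const loosePattern = /^Reasoning:?$/i; // optional colon, case-insensitive
const strictPattern = /^Reasoning:$/;  // exact leak token only

const samples = ["Reasoning:", "Reasoning", "reasoning", "REASONING:", "Reasoning: done"];
for (const s of samples) {
  console.log(`${JSON.stringify(s)} loose=${loosePattern.test(s)} strict=${strictPattern.test(s)}`);
}
// Only "Reasoning:" matches the strict pattern; the loose pattern also
// swallows "Reasoning", "reasoning", and "REASONING:".
```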
Already exact at the current head; this was locked in b3ad9ed68f58e39c2264693702205acb93b91bdc. The regression coverage now proves `Reasoning` and `REASONING:` still pass through while only the exact `Reasoning:` is suppressed.
src/auto-reply/reply/reply-utils.ts
Outdated
```ts
function isJsonWrappedNoReply(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed.length === 0 || trimmed.length > 200) {
```
Magic number for JSON payload length threshold
The value 200 is used as a hard cutoff for what counts as a "lightweight" JSON NO_REPLY payload, but it is not documented or extracted into a named constant. A reader has no way to know where 200 came from or whether it was intentional. There is also an off-by-one: a 200-character string passes the trimmed.length > 200 check (it equals 200, not greater), while a 201-character {"action":"NO_REPLY",...} slips through unsuppressed.
Consider extracting the constant with a comment explaining the rationale:
```suggestion
if (trimmed.length === 0 || trimmed.length > MAX_NO_REPLY_JSON_BYTES) {
  return false;
}
```
and at module level:
```ts
/** Maximum byte length of a JSON payload that can be treated as a bare NO_REPLY sentinel. */
const MAX_NO_REPLY_JSON_BYTES = 200;
```
Addressed in 5b1bc7ed48850f54d4ebecba1c8b323fdbbfc7f8. That commit extracts NO_REPLY_JSON_SENTINEL_MAX_LENGTH and documents why only tiny control-like JSON blobs are treated as sentinels.
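A minimal sketch of how the named constant could gate the sentinel check, assuming the JSON-parse shape implied by the review; the actual helper in reply-utils.ts may differ in details:

```typescript
/** Maximum length of a JSON payload that can be treated as a bare NO_REPLY sentinel (assumed value). */
const NO_REPLY_JSON_SENTINEL_MAX_LENGTH = 200;

// Illustrative reconstruction: large payloads are real content, never a sentinel.
function isJsonWrappedNoReply(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed.length === 0 || trimmed.length > NO_REPLY_JSON_SENTINEL_MAX_LENGTH) {
    return false;
  }
  try {
    const parsed: unknown = JSON.parse(trimmed);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { action?: unknown }).action === "NO_REPLY"
    );
  } catch {
    return false; // not JSON at all — pass through untouched
  }
}

console.log(isJsonWrappedNoReply('{"action":"NO_REPLY"}')); // true
console.log(isJsonWrappedNoReply("hello"));                 // false
```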
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: abbf9dcea3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
});
provider = modelState.provider;
model = modelState.model;
resolvedReasoningLevel ??= await modelState.resolveDefaultReasoningLevel();
```
Persist explicit reasoning off overrides across turns
This line now backfills resolvedReasoningLevel from the model default, which can be "on", but /reasoning off still clears sessionEntry.reasoningLevel instead of storing an explicit off override (src/auto-reply/reply/directive-handling.impl.ts, lines 321-323). In that setup, disabling reasoning only affects the current turn; the next message has no stored override, so this fallback re-enables reasoning, making user intent non-persistent on reasoning-capable models.
This is scoped to #37940, not this sanitizer PR. #37940 fixes the catalog-aware resolvedReasoning path for session_status and the synchronous status surfaces.
src/auto-reply/status.ts
Outdated
```ts
const reasoningLevel =
  args.resolvedReasoning ??
  (args.config ? resolveReasoningDefault({ cfg: args.config, provider, model }) : "off");
```
Catalog-less fallback misses catalog-only reasoning models
The new resolveReasoningDefault fallback here checks user config but receives no catalog, so any model whose reasoning: true flag comes purely from Pi SDK catalog discovery (not from an explicit entry in the user's models.providers config) will return "off".
The most important production callsite that hits this path is session-status-tool.ts (line 433), which calls buildStatusMessage without supplying resolvedReasoning. The updated system prompt (also in this PR) now tells agents to "ALWAYS call session_status first" for reasoning/model questions — yet the session_status tool will report Reasoning: off for any user running gpt-5.4 via catalog-based discovery rather than an explicit config entry.
The /status command path already does the right thing (resolvedReasoning: resolvedReasoningLevel ?? (await resolveDefaultReasoningLevel())), so the inconsistency is specifically in the session_status tool path. To close the gap, session-status-tool.ts should resolve and pass resolvedReasoning the same way commands-status.ts does.
This is also scoped to #37940, not this sanitizer PR. #37940 threads catalog-aware reasoning resolution through session_status and the other synchronous status surfaces.
23b07b1 to 5b1bc7e (force-push)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b1bc7ed48
```ts
}

function isDocumentedReasoningLeak(text: string): boolean {
  return text.trim() === DOCUMENTED_REASONING_STREAM_LEAK;
```
Restrict bare `Reasoning:` suppression to leak contexts
This check suppresses any outbound message whose trimmed text is exactly Reasoning:, regardless of whether it came from a leaked reasoning stream. In normal chat flows, a legitimate user-requested reply like “send exactly Reasoning:” will now be dropped by buildReplyPayloads (text becomes undefined and the payload is filtered out), which is a behavior regression from prior commits; consider gating this rule on an explicit reasoning/leak signal instead of raw text alone.
This PR intentionally matches the documented leak token at the shared outbound boundary. Adding explicit leak-context metadata would require a broader reply-pipeline change and is outside the scope of this sanitizer PR.
@greptile-apps review - New commit b3ad9ed adds negative test cases locking the exact `Reasoning:` matcher. The implementation already uses strict equality (`text.trim() === 'Reasoning:'`), not the regex pattern flagged in the previous review. `REASONING:`/`reasoning:` variants are confirmed to pass through. The other two comments (session-status resolvedReasoning and duplicate constants) are scoped to separate PRs #37940 and #38173 respectively.
```ts
const text = sanitizeOutboundText(normalized.payload.text);
return {
  ...normalized.payload,
  text: text ? text : undefined,
```
Falsy check discards empty-string text that sanitizer explicitly preserves
`sanitizeOutboundText` has a dedicated early-return for empty strings — `if (text === "") return text` — specifically to distinguish "preserve as-is" (`""`) from "actively suppress" (`null`). The three-valued contract is:
- `null` → suppress (caller should drop the field)
- `undefined` → input was undefined (pass-through)
- `string` (including `""`) → keep unchanged
Using text ? text : undefined collapses the null and "" cases into the same outcome (undefined). A payload arriving with text: "" plus a mediaUrl or channelData will silently lose its (admittedly empty) text field, even though the sanitizer said to preserve it.
The straightforward fix is a null-specific check:
```suggestion
text: text !== null ? text : undefined,
```
Or equivalently, text ?? undefined works too, since ?? only short-circuits on null/undefined, not on "".
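The difference between the falsy check and the nullish check can be demonstrated in isolation; the helper names here are illustrative, not from the PR:

```typescript
// collapseFalsy mirrors `text ? text : undefined`: "" and null both collapse.
function collapseFalsy(text: string | null | undefined): string | undefined {
  return text ? text : undefined;
}

// collapseNullish mirrors `text ?? undefined`: only null/undefined collapse.
function collapseNullish(text: string | null | undefined): string | undefined {
  return text ?? undefined;
}

console.log(collapseFalsy(""));    // undefined — empty string is lost
console.log(collapseNullish(""));  // ""        — empty string preserved
console.log(collapseNullish(null)); // undefined — suppression still works
```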
I left that collapse intentional in this PR. The sanitizer only needs null to suppress leaked text at the shared outbound boundary, and preserving empty-string payload shape exactly would be a separate payload-normalization follow-up.
Maintainer note: #38659 and #38689 overlap with this PR at the symptom level, but both are Telegram-specific fixes. #38232 is intentionally scoped one layer lower at the shared outbound reply boundary, so the NO_REPLY / bare Reasoning leak suppression is inherited by existing and future channels instead of being patched per channel.
Note: this PR provides a unified sanitizer that covers the same NO_REPLY suppression pattern fixed in #32183 (suppress NO_REPLY token in chat transcript rendering by @ademczuk), but applied at the shared reply-utils layer so all channels (Telegram, Discord, WhatsApp, webchat) benefit without per-channel patches. Related per-channel fixes this supersedes: #37904, #37891, #37892.
b3ad9ed to 1e50878 (force-push)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e50878b3a
```ts
blockReplyPipeline: null,
replyToMode: "first",
...overrides,
}).replyPayloads;
```
Await buildReplyPayloads before reading replyPayloads
buildReplyPayloads returns a Promise, but this helper accesses .replyPayloads synchronously, so build(...) becomes undefined instead of an array. That means the new outbound-sanitizer tests in this block are not exercising the intended code path (and will fail/assert on wrong values), which undermines coverage for the regression this commit is trying to prevent; the helper and related assertions should await the async call.
fd4dc6a to 3652483 (force-push)
The async issue flagged by the Codex review on commit 1e50878 has been fixed in the current head (fd4dc6a). @chatgpt-codex-connector review — please re-evaluate against the latest commit.
Codex Review: Didn't find any major issues. Can't wait for the next one!
3652483 to 7384372 (force-push)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 738437206c
```diff
- payload: parsed,
+ payload: {
+   ...normalized.payload,
+   text: text ? text : undefined,
```
Preserve streamed payload keys when sanitizing final text
When block streaming aborts (didStream() === true and isAborted() === true), buildReplyPayloads falls back to final payloads and de-duplicates against blockReplyPipeline.hasSentPayload(...), which uses a content key that includes text (createBlockReplyContentKey in block-reply-pipeline.ts). Sanitizing normalized.payload.text here can change a streamed payload like {"action":"NO_REPLY"} (or bare Reasoning:) into undefined, so the fallback key no longer matches what was already streamed and the same media/channelData payload can be sent again.
Codex Review: Didn't find any major issues. 🎉
7384372 to 433e867 (force-push)
@greptileai review
```ts
const mediaFilteredPayloadEntries = dedupedPayloadEntries.flatMap((entry, index) => {
  const payload = mediaFilteredPayloads[index];
  if (!payload || !isRenderablePayload(payload)) {
    return [];
  }
  return [{ ...entry, payload }];
});
```
Index misalignment when media duplicates are removed mid-array
mediaFilteredPayloadEntries iterates dedupedPayloadEntries by position and indexes into mediaFilteredPayloads using the same index. But mediaFilteredPayloads is a filtered subset of dedupedPayloads — if any payload is removed from the middle (e.g. payload at index 1 of 3 is a media duplicate), every subsequent entry picks up the wrong payload and the last entry silently disappears.
Concrete example:
- `dedupedPayloads = [payA, payB, payC]` — payB has a media URL already sent
- `mediaFilteredPayloads = [payA, payC]` — payB removed
- Iteration: `dedupedPayloadEntries[1]` (entryB) gets `mediaFilteredPayloads[1]` = payC (wrong), while entryC's `mediaFilteredPayloads[2]` is `undefined` → dropped entirely
The effect is that entryC is silently lost and payC is emitted with entryB's rawText, which can cause the content-key dedup in the block-streaming path to compare the wrong raw text against the pipeline's sentContentKeys.
The dedupedPayloadSet approach used a few lines above is the correct pattern — match by object identity rather than position:
```ts
const mediaFilteredPayloadSet = new Set(mediaFilteredPayloads);
const mediaFilteredPayloadEntries = dedupedPayloadEntries.filter(
  (entry) => mediaFilteredPayloadSet.has(entry.payload),
);
```
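The reviewer's concrete example can be reproduced as a standalone sketch; the payload shapes and names here are illustrative, not the real types:

```typescript
type Payload = { id: string; mediaUrl?: string };
type Entry = { rawText: string; payload: Payload };

const payA: Payload = { id: "A" };
const payB: Payload = { id: "B", mediaUrl: "dup" }; // media duplicate, removed by filtering
const payC: Payload = { id: "C" };

const dedupedPayloadEntries: Entry[] = [
  { rawText: "a", payload: payA },
  { rawText: "b", payload: payB },
  { rawText: "c", payload: payC },
];
const mediaFilteredPayloads: Payload[] = [payA, payC]; // payB stripped mid-array

// Positional pairing: entryB wrongly picks up payC, and entryC is dropped.
const byIndex = dedupedPayloadEntries.flatMap((entry, index) => {
  const payload = mediaFilteredPayloads[index];
  return payload ? [{ ...entry, payload }] : [];
});
console.log(byIndex.map((e) => `${e.rawText}:${e.payload.id}`)); // [ 'a:A', 'b:C' ]

// Identity-based filtering keeps the correct entry/payload pairs.
const keep = new Set(mediaFilteredPayloads);
const byIdentity = dedupedPayloadEntries.filter((e) => keep.has(e.payload));
console.log(byIdentity.map((e) => `${e.rawText}:${e.payload.id}`)); // [ 'a:A', 'c:C' ]
```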
```swift
public let transport: String

public init(
    ok: Bool,
    status: Int,
    apnsid: String?,
    reason: String?,
    tokensuffix: String,
    topic: String,
    environment: String,
    transport: String)
{
    self.ok = ok
    self.status = status
    self.apnsid = apnsid
    self.reason = reason
    self.tokensuffix = tokensuffix
    self.topic = topic
    self.environment = environment
    self.transport = transport
}

private enum CodingKeys: String, CodingKey {
    case ok
    case status
    case apnsid = "apnsId"
    case reason
    case tokensuffix = "tokenSuffix"
    case topic
    case environment
    case transport
}
```
Unrelated model change included in sanitizer PR
The addition of transport: String to PushTestResult (and the identical change in apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift) appears unrelated to the outbound text sanitizer described in this PR. The PR description, linked issues, and background context all focus on the auto-reply NO_REPLY/Reasoning leak fix; there is no mention of push test transport metadata.
Because transport is a non-optional String it is a source-breaking change for any Swift call-sites that construct PushTestResult directly. If this change belongs to a different fix or feature, landing it here makes it harder to revert either change independently and adds noise to the sanitizer's review surface.
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/macos/Sources/OpenClawProtocol/GatewayModels.swift
Line: 1109-1141
Comment:
**Unrelated model change included in sanitizer PR**
The addition of `transport: String` to `PushTestResult` (and the identical change in `apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift`) appears unrelated to the outbound text sanitizer described in this PR. The PR description, linked issues, and background context all focus on the auto-reply NO_REPLY/Reasoning leak fix; there is no mention of push test transport metadata.
Because `transport` is a non-optional `String` it is a source-breaking change for any Swift call-sites that construct `PushTestResult` directly. If this change belongs to a different fix or feature, landing it here makes it harder to revert either change independently and adds noise to the sanitizer's review surface.
433e867 to 0e77ecf (force-push)
Fixed both issues from the last review:

@greptileai review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fb91f562e0
```ts
const mediaFilteredPayloadSet = new Set(mediaFilteredPayloads);
const mediaFilteredPayloadEntries = dedupedPayloadEntries.filter((entry) =>
  mediaFilteredPayloadSet.has(entry.payload),
```
Preserve entries when media dedupe rewrites payload objects
buildReplyPayloads rebuilds mediaFilteredPayloadEntries by checking object identity (new Set(mediaFilteredPayloads) + has(entry.payload)), but filterMessagingToolMediaDuplicates creates new payload objects when it strips duplicate media. That means rewritten payloads never match their original entry.payload references and get dropped entirely, so a reply like {text:"hello", mediaUrl:"..."} with a deduped media URL can become no final payload instead of preserving the text-only reply.
```ts
const mediaFilteredPayloadSet = new Set(mediaFilteredPayloads);
const mediaFilteredPayloadEntries = dedupedPayloadEntries.filter((entry) =>
  mediaFilteredPayloadSet.has(entry.payload),
);
```
Set membership check breaks for media-stripped payloads
filterMessagingToolMediaDuplicates is implemented with payloads.map(...). When a payload has a duplicate media URL to strip it returns a new object ({...payload, mediaUrl: undefined, ...}); only unchanged payloads return the original reference (return payload; // No change at reply-payloads.ts:152).
Because mediaFilteredPayloadSet is built from those potentially-new references, and dedupedPayloadEntries still holds the original entry.payload references, the has() check silently fails for every payload that had its media URL stripped:
```
dedupedPayloadEntries[i].payload                → originalPayloadRef
mediaFilteredPayloads[i]                        → newPayloadRef (different object after .map spread)
mediaFilteredPayloadSet.has(originalPayloadRef) → false  ← entry dropped!
```
The consequence is that any payload whose media URL was deduplicated — but still has non-empty text or channelData that should be delivered — is completely silently dropped from replyPayloads. Before this refactor, mediaFilteredPayloads flowed directly to filteredPayloads, so the modified payload (with stripped media URL) was still sent. After this refactor it is lost.
The filterMessagingToolDuplicates path does not have this problem because it uses .filter() and always preserves object references.
One fix: track entries by content key rather than object reference for the media-filter step, or perform the media-dedup directly on the entries array so the same object always flows through:
```ts
// Instead of a separate media-filtered payloads array, apply the media filter
// directly on entries and let each entry carry a potentially-updated payload.
const mediaFilteredPayloadEntries = dedupeMessagingToolPayloads
  ? applyMediaFilterToEntries(dedupedPayloadEntries, messagingToolSentMediaUrls)
  : dedupedPayloadEntries;
```
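The identity-mismatch failure mode, and the entry-carrying fix, can be sketched in isolation; the types, URL, and helper shapes here are illustrative:

```typescript
type Payload = { text?: string; mediaUrl?: string };
type Entry = { rawText: string; payload: Payload };

const sentMediaUrls = new Set(["https://example.invalid/a.png"]); // illustrative
const entries: Entry[] = [
  { rawText: "hello", payload: { text: "hello", mediaUrl: "https://example.invalid/a.png" } },
  { rawText: "plain", payload: { text: "plain" } },
];

// map() + spread creates a NEW object when stripping the duplicate media URL.
const filtered = entries
  .map((e) => e.payload)
  .map((p) => (p.mediaUrl && sentMediaUrls.has(p.mediaUrl) ? { ...p, mediaUrl: undefined } : p));

// Identity check: the rewritten payload no longer matches its entry → entry dropped.
const set = new Set(filtered);
const byIdentity = entries.filter((e) => set.has(e.payload));
console.log(byIdentity.length); // 1 — the "hello" entry is silently lost

// Filtering on entries directly keeps the entry and carries the updated payload.
const onEntries = entries.map((e) =>
  e.payload.mediaUrl && sentMediaUrls.has(e.payload.mediaUrl)
    ? { ...e, payload: { ...e.payload, mediaUrl: undefined } }
    : e,
);
console.log(onEntries.length); // 2 — nothing dropped, media URL stripped in place
```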
Fixed the media dedup regression: applied `filterMessagingToolMediaDuplicates` directly on the entries array so each entry carries the potentially-updated payload reference. No more Set-based identity mismatch when `.map()` + spread creates new objects. @greptileai review
797360d to 006ce8b (force-push)
Rebased onto latest main (c4265a5) to resolve merge conflicts. Two earlier commits dropped as already obsoleted by upstream changes. All 5 substantive fix commits preserved. CI re-running.
006ce8b to 80ef3e8 (force-push)
80ef3e8 to 4608103 (force-push)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4608103a7c
```ts
payload: normalized.payload,
normalizeMediaPaths: params.normalizeMediaPaths,
});
const text = sanitizeOutboundText(rawPayload.text);
```
Sanitize block-streamed text before emitting replies
This sanitizer is only executed in buildReplyPayloads, but when block streaming succeeds the same function drops all final payloads (shouldDropFinalPayloads), so this path is skipped and streamed chunks are emitted from createBlockReplyDeliveryHandler via normalizeStreamingText in src/auto-reply/reply/agent-runner-execution.ts without sanitizeOutboundText. In block-streaming environments, leaked control text like {"action":"NO_REPLY"} or bare Reasoning: will still be delivered to users, so the fix does not cover the primary streamed delivery path.
Good catch. The streaming path (normalizeStreamingText in agent-runner-execution.ts) does bypass this sanitizer. This PR only covers the final payload path in buildReplyPayloads. Fixing the block streaming path would also require changes in createBlockReplyDeliveryHandler, so I kept that out of scope here. I can open a follow-up for that.
Addressing remaining bot review conversations. Latest Codex review: no major issues 🎉 (agent-runner-payloads.ts).
daff254 to 7306ac9 (force-push)
Positional indexing into mediaFilteredPayloads broke when filterMessagingToolMediaDuplicates removed elements mid-array, causing subsequent entries to pair with wrong payloads and the last entry to be silently dropped. Use Set-based object identity filtering (matching the dedupedPayloadSet pattern) instead.
…yload refs

`filterMessagingToolMediaDuplicates` uses `.map()` + spread, creating new object references when stripping duplicate media URLs. The previous Set-based identity check (`Set.has(entry.payload)`) silently dropped these rewritten payloads because the new references didn't match. Now apply media filtering on entries directly so each entry carries the potentially-updated payload, then filter out non-renderable entries.
7306ac9 to 0f2b02e (force-push)
User Impact
Users on chat surfaces such as Telegram and Discord could see leaked control text like `{"action":"NO_REPLY"}` or a bare `Reasoning:` fragment show up as visible replies. This change suppresses those leaks once in the shared outbound reply path so existing and future channels inherit the same protection.

Closes #37727
Closes #37890
What this PR does
This PR adds a unified outbound text sanitizer at the shared auto-reply reply exit, instead of adding more per-channel ad-hoc patches.
It suppresses:
- `NO_REPLY` payloads in lightweight forms
- `NO_REPLY` sentinels
- `Reasoning:` leaks with no body
buildReplyPayloads, so media-only and channelData payloads are preserved while text-only silent/leaked payloads are suppressed.Background: #37904, #37891, #37892, and #37900 addressed this in scattered places. This PR unifies the behavior at the shared outbound reply layer.
Newer overlapping fixes such as #38659 and #38689 address the same symptom in Telegram-specific paths. This PR intentionally fixes the shared outbound reply layer so the suppression rules are inherited across existing and future channels instead of being patched per channel.
AI Disclosure