WhatsApp media, image generation, and gateway patches by guiramos · Pull Request #3 · butley/openclaw

guiramos · 2026-03-05T02:40:37Z

Summary

Merge upstream openclaw v2026.3.2
Add image_generate tool (Gemini) — saves to ~/.openclaw/media/ for WhatsApp outbound compatibility
Gateway media endpoint serving TTS audio and generated images via GET /media/{filename}
Chat media pipeline: imageReady events, audioUrl/imageUrl extraction from history
WhatsApp paragraph streaming, outbound mentions, login tool dedup
TUI dark theme, logs pretty formatter, verbose light mode, status card redesign
19 custom patches documented with verify script and 001.patch files

Test plan

Agent generates image and sends via WhatsApp (assertLocalMediaAllowed fixed)
Frontend displays images via /media proxy
TTS audio playback works in webchat
All 19 patches pass verify-patches.sh

🤖 Generated with Claude Code

@hsiaoa

…bundle splitting

Without this fix, the bundler can emit multiple copies of internal-hooks into separate chunks. registerInternalHook writes to one Map instance while triggerInternalHook reads from another, so hooks silently fire with zero handlers regardless of how many were registered.

Reproduce: load a hook via hooks.external.entries (loader reads one chunk), then send a message:transcribed event (get-reply imports a different chunk). The handler list is empty; the hook never runs.

Fix: use globalThis.__openclaw_internal_hook_handlers__ as a shared singleton. All module copies check for and reuse the same Map, ensuring registrations are always visible to triggers.

Addresses greptile review: collapses the if-guard + assignment into a single ??= expression so TypeScript can narrow the type without a non-null assertion.
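A minimal sketch of the shared-singleton pattern described above, assuming a simple handler type. Only the global key name comes from the commit message; `getHookRegistry`, `registerHook`, and `triggerHook` are hypothetical names for illustration and are not the project's actual functions.

```typescript
type HookHandler = (event: unknown) => void;
type HookRegistry = Map<string, HookHandler[]>;

// The key name is taken from the commit message; everything else is assumed.
type GlobalWithHooks = typeof globalThis & {
  __openclaw_internal_hook_handlers__?: HookRegistry;
};

function getHookRegistry(): HookRegistry {
  const g = globalThis as GlobalWithHooks;
  // ??= assigns only when the property is null or undefined, so every
  // module copy ends up holding the same Map. The expression is typed as
  // HookRegistry, so no non-null assertion is needed.
  return (g.__openclaw_internal_hook_handlers__ ??= new Map());
}

// Hypothetical registration helper: appends a handler under an event name.
function registerHook(name: string, handler: HookHandler): void {
  const registry = getHookRegistry();
  const list = registry.get(name) ?? [];
  list.push(handler);
  registry.set(name, list);
}

// Hypothetical trigger helper: runs all handlers and returns how many ran.
function triggerHook(name: string, event: unknown): number {
  const list = getHookRegistry().get(name) ?? [];
  for (const handler of list) handler(event);
  return list.length;
}
```

Because the Map lives on `globalThis` rather than in module scope, two bundled copies of this module would still register into and trigger from the same registry.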

@Drickon

…anks @Drickon)

When gateway.restart is triggered with a reason but no separate note, the payload sets both message and stats.reason to the same text. formatRestartSentinelMessage() then emits both the message line and a redundant 'Reason: <same text>' line, doubling the restart reason in the notification delivered to the agent session. Skip the 'Reason:' line when stats.reason matches the already-emitted message text. Add regression tests for both duplicate and distinct reason scenarios.
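The dedupe rule can be sketched as follows. This is an illustration under assumed types, not the actual `formatRestartSentinelMessage()` implementation; the `RestartPayload` shape and the function name `formatRestartNotice` are hypothetical.

```typescript
// Hypothetical payload shape: only `message` and `stats.reason` are
// mentioned in the commit message.
interface RestartPayload {
  message?: string;
  stats?: { reason?: string };
}

function formatRestartNotice(payload: RestartPayload): string {
  const lines: string[] = [];
  const message = payload.message?.trim();
  const reason = payload.stats?.reason?.trim();

  if (message) lines.push(message);
  // Emit the "Reason:" line only when it adds information beyond the
  // message line that was already emitted.
  if (reason && reason !== message) lines.push(`Reason: ${reason}`);

  return lines.join("\n");
}
```

The two regression scenarios from the commit map onto this sketch directly: identical message and reason yield one line, distinct values yield two.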

…velamints2)

…32128)
- Remove vi.hoisted() wrapper from exported mock in shared module (Vitest cannot export hoisted variables)
- Inline vi.hoisted + vi.mock in startup test so Vitest's per-file hoisting registers mocks before production imports

Co-authored-by: Claude Opus 4.6 <[email protected]>

) Fixes openclaw#32293: Discord voice message plays at ~0.5x speed with 24kHz TTS source

When TTS providers (like mlx-audio Qwen3-TTS) output audio at 24kHz, Discord voice messages play at half speed because Discord expects 48kHz. This fix adds explicit sample rate conversion to 48kHz in the ensureOggOpus function, ensuring voice messages always play at the correct speed regardless of the input audio's sample rate.

Co-authored-by: Kevin Shenghui <[email protected]>
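One way to force a 48kHz Ogg/Opus output is to pass an explicit output sample rate to ffmpeg. The sketch below only builds an argument list and assumes ffmpeg is the conversion tool; the commit message does not say how `ensureOggOpus` performs the conversion, and `buildOggOpusArgs` is a hypothetical helper.

```typescript
// Assumption: conversion is done with the ffmpeg CLI. The flags used here
// are standard ffmpeg options: -y (overwrite), -i (input), -c:a (audio
// codec), -ar (output audio sample rate in Hz).
function buildOggOpusArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",
    "-i", inputPath,
    "-c:a", "libopus",
    "-ar", "48000", // resample to 48kHz regardless of the source rate
    outputPath,
  ];
}
```

The resulting array could be handed to a process spawner such as Node's `child_process.spawn("ffmpeg", args)`.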

… runs (openclaw#32358) The `forceFlushTranscriptBytes` path (introduced in d729ab2) bypasses the `memoryFlushCompactionCount` guard that prevents repeated flushes within the same compaction cycle. Once the session transcript exceeds 2 MB, memory flush fires on every single message — even when token count is well under the compaction threshold. Extract `hasAlreadyFlushedForCurrentCompaction()` from the inline guard in `shouldRunMemoryFlush` and apply it to both the token-based and the transcript-size trigger paths. Fixes openclaw#32317 Signed-off-by: HCL <[email protected]>
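The shared guard can be sketched like this. The names `hasAlreadyFlushedForCurrentCompaction`, `shouldRunMemoryFlush`, `memoryFlushCompactionCount`, and `forceFlushTranscriptBytes` appear in the commit message; the state shape, the other field names, and the comparison logic are assumptions made for illustration.

```typescript
// Hypothetical session state; only memoryFlushCompactionCount is named
// in the commit message.
interface SessionState {
  compactionCount: number;             // compactions completed so far (assumed)
  memoryFlushCompactionCount?: number; // cycle in which the last flush ran
  totalTokens: number;                 // assumed
  transcriptBytes: number;             // assumed
}

interface FlushThresholds {
  tokenThreshold: number;              // assumed name
  forceFlushTranscriptBytes: number;
}

function hasAlreadyFlushedForCurrentCompaction(s: SessionState): boolean {
  return s.memoryFlushCompactionCount === s.compactionCount;
}

function shouldRunMemoryFlush(s: SessionState, t: FlushThresholds): boolean {
  // The guard runs first, so it covers both trigger paths below.
  if (hasAlreadyFlushedForCurrentCompaction(s)) return false;
  const tokenTrigger = s.totalTokens >= t.tokenThreshold;
  const sizeTrigger = s.transcriptBytes >= t.forceFlushTranscriptBytes;
  return tokenTrigger || sizeTrigger;
}
```

With the guard applied before either trigger is evaluated, an oversized transcript causes at most one flush per compaction cycle instead of one per message.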

) The sticker code path called ctx.getFile() directly without retry, unlike the non-sticker media path which uses resolveTelegramFileWithRetry (3 attempts with jitter). This made sticker downloads vulnerable to transient Telegram API failures, particularly in group topics where file availability can be delayed. Refs openclaw#32326 Co-authored-by: Claude Opus 4.6 <[email protected]>
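A generic retry-with-jitter helper along the lines described above might look like the following. This is a sketch, not the project's `resolveTelegramFileWithRetry`; the helper names, the base delay, and the delay formula are assumptions. Only the "3 attempts with jitter" behavior comes from the commit message.

```typescript
// Delay before the next attempt: linear backoff plus a random jitter
// component. `random` is expected to be in [0, 1). Formula is assumed.
function retryDelayMs(attempt: number, baseDelayMs: number, random: number): number {
  return baseDelayMs * attempt + random * baseDelayMs;
}

async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Sleep only if another attempt will follow.
      if (attempt < attempts) {
        const delay = retryDelayMs(attempt, baseDelayMs, Math.random());
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A sticker download could then be wrapped as `withRetry(() => ctx.getFile())`, so a transient failure on the first call no longer fails the whole download.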

…32320) Top-level channel messages were creating isolated per-message sessions because roomThreadId fell through to threadContext.messageTs whenever replyToMode was not off. Introduced in openclaw#10686, every new channel message got its own session key (agent:...:thread:<messageTs>), breaking conversation continuity. Fix: only derive thread-specific session keys for actual thread replies. Top-level channel messages stay on the per-channel session key regardless of replyToMode. Fixes openclaw#32285
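The session-key rule can be illustrated with a small sketch. The `:thread:<ts>` suffix follows the key format quoted in the commit message; the context shape, the function name, the placeholder base key, and the way a thread reply is detected (a thread timestamp that differs from the message's own timestamp) are assumptions.

```typescript
// Hypothetical message context.
interface ChannelMessageContext {
  baseSessionKey: string; // per-channel session key (placeholder value in tests)
  messageTs: string;      // timestamp of this message
  threadTs?: string;      // timestamp of the thread root, if any
}

function resolveSessionKey(ctx: ChannelMessageContext): string {
  // Assumption: a message is a thread reply only when it points at a
  // thread root other than itself.
  const isThreadReply =
    ctx.threadTs !== undefined && ctx.threadTs !== ctx.messageTs;

  // Top-level messages stay on the per-channel key regardless of
  // replyToMode; only real thread replies get a thread-specific key.
  return isThreadReply
    ? `${ctx.baseSessionKey}:thread:${ctx.threadTs}`
    : ctx.baseSessionKey;
}
```

Note that `replyToMode` does not appear in the sketch at all, which mirrors the fix: it no longer influences which session a top-level message lands in.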

)

* fix: add session-memory hook support for Feishu provider

Issue openclaw#31275: Session-memory hook not triggered when using /new command in Feishu
- Added command handler to Feishu provider
- Integrated with OpenClaw's before_reset hook system
- Ensures session memory is saved when /new or /reset commands are used

* Changelog: note Feishu session-memory hook parity

---------

Co-authored-by: Tak Hoffman <[email protected]>

…ps (openclaw#30315)

* fix(feishu): guard against false-positive @mentions in multi-app groups

When multiple Feishu bot apps share a group chat, Feishu's WebSocket event delivery remaps the open_id in mentions[] per-app. This causes checkBotMentioned() to return true for ALL bots when only one was actually @mentioned, making requireMention ineffective.

Add a botName guard: if the mention's open_id matches this bot but the mention's display name differs from this bot's configured botName, treat it as a false positive and skip. botName is already available via account.config.botName (set during onboarding).

Closes openclaw#24249

* fix(feishu): support @ALL mention in multi-bot groups

When a user sends @ALL (@_all in Feishu message content), treat it as mentioning every bot so all agents respond when requireMention is true. Feishu's @ALL does not populate the mentions[] array, so this needs explicit content-level detection.

* fix(feishu): auto-fetch bot display name from API for reliable mention matching

Instead of relying on the manually configured botName (which may differ from the actual Feishu bot display name), fetch the bot's display name from the Feishu API at startup via probeFeishu(). This ensures checkBotMentioned() always compares against the correct display name, even when the config botName doesn't match (e.g. config says 'Wanda' but Feishu shows '绯红女巫').

Changes:
- monitor.ts: fetchBotOpenId → fetchBotInfo (returns both openId and name)
- monitor.ts: store botNames map, pass botName to handleFeishuMessage
- bot.ts: accept botName from params, prefer it over config fallback

* Changelog: note Feishu multi-app mention false-positive guard

---------

Co-authored-by: Teague Xiao <[email protected]>
Co-authored-by: Tak Hoffman <[email protected]>
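The combined mention rules (open_id match, display-name guard, and @_all detection) can be sketched as below. This is not the actual `checkBotMentioned()` code; the message and mention shapes and the function name `isBotMentioned` are assumptions, and only the `@_all` marker and the open_id/name comparison come from the commit messages.

```typescript
// Hypothetical shapes for a Feishu mention and message.
interface Mention {
  openId: string;
  name: string;
}

interface IncomingMessage {
  content: string;
  mentions: Mention[];
}

function isBotMentioned(
  msg: IncomingMessage,
  botOpenId: string,
  botName?: string,
): boolean {
  // @ALL does not populate mentions[], so detect it in the content.
  if (msg.content.includes("@_all")) return true;

  return msg.mentions.some((m) => {
    if (m.openId !== botOpenId) return false;
    // If the bot's display name is known, a differing name marks a
    // remapped open_id that really targets another bot.
    return botName === undefined || m.name === botName;
  });
}
```

When no display name is available the guard is skipped, so the sketch falls back to the open_id-only behavior rather than rejecting every mention.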

- 3 conflicts resolved: server-methods-list, process-message, monitor
- Refactor: extract wa-streaming-utils.ts + wa-verbose-utils.ts (Butley patches) so future upstream merges never conflict on these again
- monitor.ts: adopt upstream enrichInboundMessage/enqueueInboundMessage structure; re-apply WA Streaming (sendAvailable) + WA Mentions (processOutboundMentions, noteContactName) as 3 surgical edits in enqueueInboundMessage

Co-authored-by: Bob

…m-config refactor channelId→channel, body→ctx.BodyForCommands??ctx.Body, conversationId→chatId Variables renamed/scoped differently in upstream v2026.3.2 auto-merge. Co-authored-by: Bob

Fix A: Suppress noisy plugin table logged at gateway startup via console.log. Lines matching 'Plugins (N/M loaded)', 'Source roots:', source path entries, and the renderTable() output (starts with ┌, contains ┬ and 'Status') are now silently dropped. Use `openclaw plugins list` to see the plugin table on demand. Fix B: Multi-line log entries (e.g. ASCII tables, multi-line tool output) are now rendered with proper indentation (15 chars) instead of collapsing all newlines into one mangled line. Also: minor style cleanup in wa-verbose-utils.ts (curly braces). Co-authored-by: Bob
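Both fixes can be sketched as small pure functions. The matching rules and the 15-character indent come from the commit message; the function names and the prefix parameter are hypothetical, and the source-path filter is omitted because the commit does not say how those entries are recognized.

```typescript
// Fix A sketch: decide whether a console.log line belongs to the
// startup plugin table and should be dropped.
function isPluginTableNoise(line: string): boolean {
  if (/^Plugins \(\d+\/\d+ loaded\)/.test(line)) return true;
  if (line.startsWith("Source roots:")) return true;
  // renderTable() output: starts with a box corner and has a Status column.
  return line.startsWith("┌") && line.includes("┬") && line.includes("Status");
}

// Fix B sketch: keep newlines, indenting continuation lines by 15 chars.
const CONTINUATION_INDENT = " ".repeat(15);

function formatLogEntry(prefix: string, message: string): string {
  const [first, ...rest] = message.split("\n");
  const continued = rest.map((line) => CONTINUATION_INDENT + line);
  return [`${prefix}${first}`, ...continued].join("\n");
}
```

A single-line message passes through `formatLogEntry` unchanged apart from the prefix, so only multi-line entries are affected.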

…a endpoint re-applied

…Ready

…rendering
- image-generate tool now returns imageUrl in details (like TTS audioUrl)
- gateway extracts imageUrl from toolResult details in chat.history
- gateway emits imageReady (renamed from mediaReady) with imageUrl field
- refactor captureMediaFromPayload into shared helper

…mission resolveToolDeliveryPayload was filtering out image_generate tool results because they had no mediaUrl/mediaUrls fields (only MEDIA: markers in text). TTS worked because maybeApplyTtsToPayload adds mediaUrls explicitly. Now extracts MEDIA:/ paths from tool result text into mediaUrls so the deliver callback can collect them for imageReady event emission.
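The marker extraction can be sketched as follows, assuming each marker sits on its own line in the form `MEDIA:<path>`. The exact marker grammar is not given in the commit message, and `extractMediaUrls`, `withMediaUrls`, and the payload shape are hypothetical.

```typescript
// Hypothetical payload shape for a tool result.
interface ToolPayload {
  text: string;
  mediaUrls?: string[];
}

// Collect the targets of MEDIA: markers, one marker per line (assumed).
function extractMediaUrls(text: string): string[] {
  const urls: string[] = [];
  for (const line of text.split("\n")) {
    const match = /^MEDIA:\s*(\S+)\s*$/.exec(line.trim());
    if (match && match[1]) urls.push(match[1]);
  }
  return urls;
}

// Populate mediaUrls from the text only when none were set explicitly,
// so payloads like TTS results that already carry mediaUrls are untouched.
function withMediaUrls(payload: ToolPayload): ToolPayload {
  if (payload.mediaUrls && payload.mediaUrls.length > 0) return payload;
  const found = extractMediaUrls(payload.text);
  return found.length > 0 ? { ...payload, mediaUrls: found } : payload;
}
```

After this step a filter that keeps only payloads with `mediaUrls` would no longer drop image results whose media reference lived solely in the text.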

…s it The ACP dispatch path formats tool results as summaries, so the raw MEDIA markers from image_generate never reach the deliver callback. After agent run completes, fall back to scanning the session transcript for toolResult details.imageUrl to emit imageReady events.
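The transcript fallback might be sketched like this. The `toolResult` role and the `details.imageUrl` field are named in the commit message; the entry shape and the function name `collectImageUrls` are assumptions.

```typescript
// Hypothetical transcript entry shape.
interface TranscriptEntry {
  role: string;
  details?: { imageUrl?: string; audioUrl?: string };
}

// Scan the transcript for tool results that carry an image URL,
// returning each URL once, in order of first appearance.
function collectImageUrls(entries: TranscriptEntry[]): string[] {
  const seen = new Set<string>();
  for (const entry of entries) {
    if (entry.role !== "toolResult") continue;
    const url = entry.details?.imageUrl;
    if (url) seen.add(url);
  }
  return Array.from(seen);
}
```

Each returned URL would then correspond to one imageReady event; deduplicating keeps a repeated transcript entry from emitting the same event twice.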

…#15-openclaw#22) Each patch dir now has a README.md (human context) and 001.patch (git format-patch for mechanical re-application via git apply --3way). Updated registry, verify script, and re-application order docs.

…outbound Images saved to /root/clawd/ failed assertLocalMediaAllowed because that path is only in agent-scoped media roots (requires agentId flow). ~/.openclaw/media/ is always in default roots, bypassing the issue. Also add ~/.openclaw/media/ to gateway media handler search paths.
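The effect of the save-location change can be illustrated with a simple allowed-roots check. This is a sketch of the general technique, not the real `assertLocalMediaAllowed`; the function names are hypothetical, the check handles absolute POSIX paths only, and the root list in any real deployment would come from configuration.

```typescript
// Resolve "." and ".." segments in an absolute POSIX path.
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const part of p.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// True when filePath is one of the roots or lies inside one of them.
function isUnderAllowedRoot(filePath: string, roots: string[]): boolean {
  const file = normalizePosix(filePath);
  return roots.some((root) => {
    const r = normalizePosix(root);
    // Compare with a trailing slash so "/a/media-x" is not inside "/a/media".
    return file === r || file.startsWith(r === "/" ? "/" : r + "/");
  });
}
```

With a default root list containing only the expanded `~/.openclaw/media/` directory, a file saved under `/root/clawd/` fails the check while the same file saved under the media directory passes, which matches the behavior described above.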

Regenerated 001.patch files and updated READMEs to reflect image save location change from /root/clawd/ to ~/.openclaw/media/ and the gateway media handler's three search paths.

CRITICAL:
- #1: Fix InboxMessage.from type from number to string (Pilot address)
- #2: Fix parseNodeIdFromPilotAddress to extract only last hex group
- #3: Rebuild hostnameToNodeId cache from Convex on startup
- #4: Fix AgentNetworkConfig → AgentRegistryConfig type reference
- #5: Track processed message hashes to prevent re-processing on clear failure

HIGH:
- #7: Change default pollIntervalSeconds from 300 to 15
- #8: Fix startDaemon regex to match Pilot address format (0:xxxx.xxxx.xxxx)
- #9: Use registerAgent() for re-registration to preserve enriched metadata
- #10: Fix search query param from 'query' to 'q'
- #12: handleRefresh supports optional accountId, refreshes all if omitted
- #13: Move spawn import to top-level, remove 7 inline dynamic imports
- #14: Mark inbox tool action as debug-only (poll loop handles processing)

MEDIUM:
- #15: Add installation_id to PilotPeer interface and agentToPeer
- #16: Use exact /by-hostname/ endpoint for get_agent action
- #17: Add LRU-style eviction to hostnameToNodeId (max 1000 entries)
- #18: Add mutex (isPolling flag) to executePollCycle
- #19: Add lifecycle comments to startDaemon/stopDaemon (container vs local)
- #20: Pass gatewayToken in fetchNetworkMetadata Convex query
- #21: Wrap sendText fallback in try/catch with error handling
- #23: Document sendMessage JSON vs raw text design intent
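Item #17 describes LRU-style eviction for a bounded cache. One common way to get that behavior in TypeScript is to rely on Map's insertion order, as sketched below. The class name `BoundedCache` is hypothetical and this is not the actual hostnameToNodeId implementation; only the idea of a capped cache (max 1000 entries in the commit) comes from the source.

```typescript
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the
      // least recently used one.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A hostname cache with the cap from the commit would be created as `new BoundedCache<string, string>(1000)`; the value type here is a guess.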

hsiaoa and others added 30 commits March 3, 2026 00:25

models-config: apply config env vars before implicit provider discovery

65dc3ee

fix: apply config env vars before model discovery (openclaw#32295) (thanks @hsiaoa)

1e8afa1

refactor: use ??= operator for cleaner globalThis singleton init

d0a3743

Addresses greptile review: collapses the if-guard + assignment into a single ??= expression so TypeScript can narrow the type without a non-null assertion.

fix: stabilize internal hooks singleton registry (openclaw#32292) (thanks @Drickon)

68e982e

refactor(voice-call): extract twilio twiml policy and status mapping

a96b3b4

refactor(voice-call): split webhook server and tailscale helpers

439a773

test(voice-call): split call manager tests by scenario

82101b1

fix(discord): add missing accountId to reaction routing params

d76ddd6

fix: dedupe restart sentinel reason output (openclaw#32083) (thanks @velamints2)

f6233cf

feat(cli): add configurable banner tagline mode

1b5ac8b

feat(models): support minimax highspeed across onboarding

77ecef1

docs(models): refresh minimax kimi glm provider docs

86090b0

test(perf): speed up cron, memory, and secrets hotspots

282b107

test(perf): trim slack, hook, and plugin-validation test overhead

9657ded

chore(test): add vitest hotspot reporter script

5966219

fix(agents): harden openai ws tool call id handling

6649c22

feat(acp): enable dispatch by default

36dfd46

docs(acp): document sandbox limitation

7de4204

refactor: unify inbound debounce policy and split gateway/models helpers

4708346

fix(ci): resolve docs lint and test typing regressions

e930517

fix(test): tighten websocket and runner fixture typing

f3e6578

CLI: dedupe config validate errors and expose allowed values

f26853f

feat(acp): add kimi harness support surfaces

287606e

Linux2010 and others added 28 commits March 2, 2026 22:19

CI: start push test lanes earlier and drop check gating

be5de30

CI: disable flaky sticky disk mount for Windows pnpm setup

d45aa68

chore(release): cut 2026.3.2

85377a2

fix(tts): retain generated audio files for 7 days

284bf39

fix(butley-patch): update WS Inbound Push after upstream dispatch-from-config refactor

ce623f2

channelId→channel, body→ctx.BodyForCommands??ctx.Body, conversationId→chatId. Variables renamed/scoped differently in upstream v2026.3.2 auto-merge. Co-authored-by: Bob

feat(media): serve image artifacts and attach mediaUrl in chat final

4beab8e

feat(tools): add Gemini image_generate tool for webchat media output

af6be10

fix(image_generate): use available Gemini image model by default

0972dc7

Merge alpha into feat/whatsapp-48-53 with imageGenerateTool and /media endpoint re-applied

3902c8e

fix: restore TTS/audio features from feat/whatsapp-48-53

09f06c0

fix(image_generate): default to gemini-3.1-flash-image-preview

ae3085f

fix(chat): emit structured mediaReady event for image outputs

03f1dd8

fix(chat): collect MEDIA markers from tool text to emit mediaReady

924c31b

fix(chat): capture singular payload.mediaUrl for mediaReady emission

f74d61a

debug(chat): add temporary logging for media URL collection

902906a

fix(chat): capture MEDIA URLs from agent tool result events for mediaReady

b692665

debug(chat): add comprehensive logging for media capture flow

374ccd4

fix(chat): log every deliver call and capture MEDIA before early-return

6a5eda1

docs(patches): update patch #15 and #16 for ~/.openclaw/media/ save path

bb33e63

Regenerated 001.patch files and updated READMEs to reflect image save location change from /root/clawd/ to ~/.openclaw/media/ and the gateway media handler's three search paths.

guiramos merged commit c15ee81 into work Mar 5, 2026
1 of 9 checks passed

guiramos merged 2267 commits into work from feat/whatsapp-48-53

20 participants