WhatsApp media, image generation, and gateway patches#3

Merged
guiramos merged 2267 commits into work from feat/whatsapp-48-53
Mar 5, 2026

Conversation

@guiramos guiramos commented Mar 5, 2026

Summary

  • Merge upstream openclaw v2026.3.2
  • Add image_generate tool (Gemini) — saves to ~/.openclaw/media/ for WhatsApp outbound compatibility
  • Gateway media endpoint serving TTS audio and generated images via GET /media/{filename}
  • Chat media pipeline: imageReady events, audioUrl/imageUrl extraction from history
  • WhatsApp paragraph streaming, outbound mentions, login tool dedup
  • TUI dark theme, logs pretty formatter, verbose light mode, status card redesign
  • 19 custom patches documented with verify script and 001.patch files

Test plan

  • Agent generates image and sends via WhatsApp (assertLocalMediaAllowed fixed)
  • Frontend displays images via /media proxy
  • TTS audio playback works in webchat
  • All 19 patches pass verify-patches.sh

🤖 Generated with Claude Code

hsiaoa and others added 30 commits March 3, 2026 00:25
…bundle splitting

Without this fix, the bundler can emit multiple copies of internal-hooks
into separate chunks. registerInternalHook writes to one Map instance
while triggerInternalHook reads from another — resulting in hooks that
silently fire with zero handlers regardless of how many were registered.

Reproduce: load a hook via hooks.external.entries (loader reads one chunk),
then send a message:transcribed event (get-reply imports a different chunk).
The handler list is empty; the hook never runs.

Fix: use globalThis.__openclaw_internal_hook_handlers__ as a shared
singleton. All module copies check for and reuse the same Map, ensuring
registrations are always visible to triggers.
Addresses greptile review: collapses the if-guard + assignment into
a single ??= expression so TypeScript can narrow the type without
a non-null assertion.
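
A minimal sketch of the shared-singleton pattern described above, assuming a simple handler signature. Only the `globalThis.__openclaw_internal_hook_handlers__` key and the two function names come from the commit message; the `HookHandler` type and the `getHandlerMap` helper are hypothetical.

```typescript
// Hypothetical handler shape; the real signature in openclaw may differ.
type HookHandler = (payload: unknown) => void;

type HookGlobal = typeof globalThis & {
  __openclaw_internal_hook_handlers__?: Map<string, HookHandler[]>;
};

// The Map lives on globalThis rather than in module scope, so every bundled
// copy of this module resolves to the same instance.
function getHandlerMap(): Map<string, HookHandler[]> {
  const g = globalThis as HookGlobal;
  // `??=` creates the Map on first use and narrows the type without `!`.
  return (g.__openclaw_internal_hook_handlers__ ??= new Map<string, HookHandler[]>());
}

function registerInternalHook(event: string, handler: HookHandler): void {
  const handlers = getHandlerMap();
  handlers.set(event, [...(handlers.get(event) ?? []), handler]);
}

// Returns the number of handlers that ran, which makes the
// "fires with zero handlers" symptom easy to observe.
function triggerInternalHook(event: string, payload: unknown): number {
  const list = getHandlerMap().get(event) ?? [];
  for (const handler of list) handler(payload);
  return list.length;
}
```

Because both functions go through `getHandlerMap()`, a registration made by one chunk is visible to a trigger issued from another.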
When gateway.restart is triggered with a reason but no separate note,
the payload sets both message and stats.reason to the same text.
formatRestartSentinelMessage() then emits both the message line and a
redundant 'Reason: <same text>' line, doubling the restart reason in
the notification delivered to the agent session.

Skip the 'Reason:' line when stats.reason matches the already-emitted
message text. Add regression tests for both duplicate and distinct
reason scenarios.
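
The deduplication rule can be sketched as follows. The payload shape is a hypothetical reduction to the two fields the commit message names (`message` and `stats.reason`); the real formatter likely emits additional lines.

```typescript
// Hypothetical payload shape, reduced to the fields mentioned in the commit.
interface RestartSentinelPayload {
  message?: string;
  stats?: { reason?: string };
}

function formatRestartSentinelMessage(payload: RestartSentinelPayload): string {
  const lines: string[] = [];
  const message = payload.message?.trim();
  const reason = payload.stats?.reason?.trim();
  if (message) lines.push(message);
  // Emit the Reason line only when it adds information beyond the message.
  if (reason && reason !== message) lines.push(`Reason: ${reason}`);
  return lines.join("\n");
}
```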
…32128)

- Remove vi.hoisted() wrapper from exported mock in shared module
  (Vitest cannot export hoisted variables)
- Inline vi.hoisted + vi.mock in startup test so Vitest's per-file
  hoisting registers mocks before production imports

Co-authored-by: Claude Opus 4.6 <[email protected]>
)

Fixes openclaw#32293: Discord voice message plays at ~0.5x speed with 24kHz TTS source

When TTS providers (like mlx-audio Qwen3-TTS) output audio at 24kHz,
Discord voice messages play at half speed because Discord expects 48kHz.

This fix adds explicit sample rate conversion to 48kHz in the ensureOggOpus
function, ensuring voice messages always play at correct speed regardless
of the input audio's sample rate.

Co-authored-by: Kevin Shenghui <[email protected]>
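
As an illustration of the explicit resampling step, here is a sketch that builds an ffmpeg argument list. It assumes the conversion is done by spawning ffmpeg, which the commit message does not state; `buildOggOpusArgs` is a hypothetical helper, not the actual `ensureOggOpus` code.

```typescript
// Assumption: the conversion runs ffmpeg with an argument list like this one.
function buildOggOpusArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",              // overwrite the output file if it already exists
    "-i", inputPath,   // source audio at any sample rate (for example 24 kHz)
    "-c:a", "libopus", // encode the audio stream as Opus
    "-ar", "48000",    // always resample to 48 kHz
    outputPath,
  ];
}
```

Pinning `-ar 48000` makes the output rate independent of whatever the TTS provider produced.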
… runs (openclaw#32358)

The `forceFlushTranscriptBytes` path (introduced in d729ab2) bypasses the
`memoryFlushCompactionCount` guard that prevents repeated flushes within the
same compaction cycle. Once the session transcript exceeds 2 MB, memory flush
fires on every single message — even when token count is well under the
compaction threshold.

Extract `hasAlreadyFlushedForCurrentCompaction()` from the inline guard in
`shouldRunMemoryFlush` and apply it to both the token-based and the
transcript-size trigger paths.

Fixes openclaw#32317

Signed-off-by: HCL <[email protected]>
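
A sketch of the extracted guard applied to both trigger paths. The function names and `memoryFlushCompactionCount` / `forceFlushTranscriptBytes` come from the commit message; the other field names and the threshold object are assumptions made for illustration.

```typescript
// Hypothetical session fields beyond the two named in the commit message.
interface SessionState {
  compactionCount: number;             // compaction cycles completed so far
  memoryFlushCompactionCount?: number; // cycle in which the last flush ran
  totalTokens: number;
  transcriptBytes: number;
}

interface FlushThresholds {
  flushTokenThreshold: number;
  forceFlushTranscriptBytes: number;
}

function hasAlreadyFlushedForCurrentCompaction(s: SessionState): boolean {
  return s.memoryFlushCompactionCount === s.compactionCount;
}

function shouldRunMemoryFlush(s: SessionState, t: FlushThresholds): boolean {
  // The guard runs first, so it covers the token and the size trigger alike.
  if (hasAlreadyFlushedForCurrentCompaction(s)) return false;
  const overTokens = s.totalTokens >= t.flushTokenThreshold;
  const overBytes = s.transcriptBytes >= t.forceFlushTranscriptBytes;
  return overTokens || overBytes;
}
```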
)

The sticker code path called ctx.getFile() directly without retry,
unlike the non-sticker media path which uses resolveTelegramFileWithRetry
(3 attempts with jitter). This made sticker downloads vulnerable to
transient Telegram API failures, particularly in group topics where
file availability can be delayed.

Refs openclaw#32326

Co-authored-by: Claude Opus 4.6 <[email protected]>
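
For context, a generic retry-with-jitter wrapper might look like the sketch below. It is not the actual `resolveTelegramFileWithRetry` implementation; the delay formula, the default base delay, and the injectable `sleep` parameter are assumptions.

```typescript
// Generic sketch: retry an async operation a fixed number of times,
// waiting a randomized delay between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === attempts) break;
      // Linear backoff with up to 100% random jitter on top.
      await sleep(baseDelayMs * attempt * (1 + Math.random()));
    }
  }
  throw lastError;
}
```

With such a wrapper, the sticker path would wrap its `ctx.getFile()` call the same way the non-sticker path already does.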
…32320)

Top-level channel messages were creating isolated per-message sessions because roomThreadId fell through to threadContext.messageTs whenever replyToMode was not off.

Introduced in openclaw#10686, every new channel message got its own session key (agent:...:thread:<messageTs>), breaking conversation continuity.

Fix: only derive thread-specific session keys for actual thread replies. Top-level channel messages stay on the per-channel session key regardless of replyToMode.

Fixes openclaw#32285
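
The corrected key derivation can be sketched like this. The message shape and the `resolveSessionKey` name are hypothetical, and the sketch assumes `threadTs` is set only for replies inside a thread; the `:thread:` suffix mirrors the key format quoted above.

```typescript
// Hypothetical, simplified shape of an inbound channel message.
interface InboundMessage {
  messageTs: string; // timestamp of this message
  threadTs?: string; // assumed to be set only for replies inside a thread
}

function resolveSessionKey(channelSessionKey: string, msg: InboundMessage): string {
  // Never fall back to messageTs: that is what isolated every top-level message.
  if (msg.threadTs === undefined) return channelSessionKey;
  return `${channelSessionKey}:thread:${msg.threadTs}`;
}
```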
Linux2010 and others added 28 commits March 2, 2026 22:19
)

* fix: add session-memory hook support for Feishu provider

Issue openclaw#31275: Session-memory hook not triggered when using /new command in Feishu

- Added command handler to Feishu provider
- Integrated with OpenClaw's before_reset hook system
- Ensures session memory is saved when /new or /reset commands are used

* Changelog: note Feishu session-memory hook parity

---------

Co-authored-by: Tak Hoffman <[email protected]>
…ps (openclaw#30315)

* fix(feishu): guard against false-positive @mentions in multi-app groups

When multiple Feishu bot apps share a group chat, Feishu's WebSocket
event delivery remaps the open_id in mentions[] per-app. This causes
checkBotMentioned() to return true for ALL bots when only one was
actually @mentioned, making requireMention ineffective.

Add a botName guard: if the mention's open_id matches this bot but the
mention's display name differs from this bot's configured botName, treat
it as a false positive and skip.

botName is already available via account.config.botName (set during
onboarding).

Closes openclaw#24249

* fix(feishu): support @ALL mention in multi-bot groups

When a user sends @ALL (@_all in Feishu message content), treat it as
mentioning every bot so all agents respond when requireMention is true.

Feishu's @ALL does not populate the mentions[] array, so this needs
explicit content-level detection.

* fix(feishu): auto-fetch bot display name from API for reliable mention matching

Instead of relying on the manually configured botName (which may differ
from the actual Feishu bot display name), fetch the bot's display name
from the Feishu API at startup via probeFeishu().

This ensures checkBotMentioned() always compares against the correct
display name, even when the config botName doesn't match (e.g. config
says 'Wanda' but Feishu shows '绯红女巫').

Changes:
- monitor.ts: fetchBotOpenId → fetchBotInfo (returns both openId and name)
- monitor.ts: store botNames map, pass botName to handleFeishuMessage
- bot.ts: accept botName from params, prefer it over config fallback

* Changelog: note Feishu multi-app mention false-positive guard

---------

Co-authored-by: Teague Xiao <[email protected]>
Co-authored-by: Tak Hoffman <[email protected]>
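
The mention checks described in these commits can be sketched together as below. The type shapes are hypothetical reductions of the Feishu event, and this `checkBotMentioned` signature is an assumption; only the `@_all` content marker, the open_id comparison, and the display-name guard come from the commit messages.

```typescript
// Hypothetical, reduced shapes; the real Feishu event carries more fields.
interface Mention { openId: string; name: string }
interface FeishuMessage { text: string; mentions: Mention[] }
interface BotIdentity { openId: string; name?: string }

function checkBotMentioned(msg: FeishuMessage, bot: BotIdentity): boolean {
  // @ALL does not populate mentions[], so it is detected in the content.
  if (msg.text.includes("@_all")) return true;
  return msg.mentions.some((m) => {
    if (m.openId !== bot.openId) return false;
    // open_id may be remapped per app, so also require the display name
    // to match whenever the bot's name is known.
    return bot.name === undefined || m.name === bot.name;
  });
}
```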
- 3 conflicts resolved: server-methods-list, process-message, monitor
- Refactor: extract wa-streaming-utils.ts + wa-verbose-utils.ts (Butley patches)
  so future upstream merges never conflict on these again
- monitor.ts: adopt upstream enrichInboundMessage/enqueueInboundMessage structure;
  re-apply WA Streaming (sendAvailable) + WA Mentions (processOutboundMentions,
  noteContactName) as 3 surgical edits in enqueueInboundMessage

Co-authored-by: Bob
…m-config refactor

channelId→channel, body→ctx.BodyForCommands??ctx.Body, conversationId→chatId
Variables renamed/scoped differently in upstream v2026.3.2 auto-merge.

Co-authored-by: Bob
Fix A: Suppress noisy plugin table logged at gateway startup via
console.log. Lines matching 'Plugins (N/M loaded)', 'Source roots:',
source path entries, and the renderTable() output (starts with ┌,
contains ┬ and 'Status') are now silently dropped. Use
`openclaw plugins list` to see the plugin table on demand.

Fix B: Multi-line log entries (e.g. ASCII tables, multi-line tool
output) are now rendered with proper indentation (15 chars) instead
of collapsing all newlines into one mangled line.

Also: minor style cleanup in wa-verbose-utils.ts (curly braces).

Co-authored-by: Bob
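
A sketch of both fixes under stated assumptions: the noise patterns are transcribed from the description above (the source-path pattern is omitted because its format is not given), and `formatLogEntry` with its `prefix` parameter is a hypothetical stand-in for the real formatter.

```typescript
const CONTINUATION_INDENT = " ".repeat(15);

// Fix A: recognise the plugin table that is logged at gateway startup.
function isPluginTableNoise(entry: string): boolean {
  return (
    /^Plugins \(\d+\/\d+ loaded\)/.test(entry) ||
    entry.startsWith("Source roots:") ||
    (entry.startsWith("┌") && entry.includes("┬") && entry.includes("Status"))
  );
}

// Fix B: keep newlines and indent every continuation line by 15 characters.
function formatLogEntry(prefix: string, message: string): string {
  const [first, ...rest] = message.split("\n");
  return [`${prefix}${first}`, ...rest.map((line) => CONTINUATION_INDENT + line)].join("\n");
}
```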
…rendering

- image-generate tool now returns imageUrl in details (like TTS audioUrl)
- gateway extracts imageUrl from toolResult details in chat.history
- gateway emits imageReady (renamed from mediaReady) with imageUrl field
- refactor captureMediaFromPayload into shared helper
…mission

resolveToolDeliveryPayload was filtering out image_generate tool results
because they had no mediaUrl/mediaUrls fields (only MEDIA: markers in text).
TTS worked because maybeApplyTtsToPayload adds mediaUrls explicitly.

Now extracts MEDIA:/ paths from tool result text into mediaUrls so the
deliver callback can collect them for imageReady event emission.
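
A minimal sketch of the marker extraction, assuming each marker has the form `MEDIA:` followed by an absolute path with no whitespace. The payload type and both function names are hypothetical.

```typescript
// Hypothetical payload shape for a tool result handed to the deliver callback.
interface ToolResultPayload {
  text: string;
  mediaUrls?: string[];
}

function extractMediaPaths(text: string): string[] {
  const pattern = /MEDIA:(\/\S+)/g;
  const paths: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    if (match[1]) paths.push(match[1]);
  }
  return paths;
}

// Copy any marker paths into mediaUrls so the payload is no longer filtered out.
function withExtractedMedia(payload: ToolResultPayload): ToolResultPayload {
  const found = extractMediaPaths(payload.text);
  if (found.length === 0) return payload;
  return { ...payload, mediaUrls: [...(payload.mediaUrls ?? []), ...found] };
}
```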
…s it

The ACP dispatch path formats tool results as summaries, so the raw
MEDIA markers from image_generate never reach the deliver callback.
After agent run completes, fall back to scanning the session transcript
for toolResult details.imageUrl to emit imageReady events.
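
The transcript fallback can be sketched as a scan over transcript entries. The entry shape is an assumption based on the `toolResult details.imageUrl` wording above, and `collectImageUrls` is a hypothetical name.

```typescript
// Hypothetical transcript entry, reduced to the fields the scan needs.
interface TranscriptEntry {
  role: string;
  details?: { imageUrl?: string };
}

// Returns each distinct imageUrl found on toolResult entries, in order.
function collectImageUrls(transcript: TranscriptEntry[]): string[] {
  const urls: string[] = [];
  for (const entry of transcript) {
    const url = entry.role === "toolResult" ? entry.details?.imageUrl : undefined;
    if (url && urls.indexOf(url) === -1) urls.push(url);
  }
  return urls;
}
```

Each returned URL would then be emitted as one `imageReady` event after the agent run completes.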
…#15-openclaw#22)

Each patch dir now has a README.md (human context) and 001.patch
(git format-patch for mechanical re-application via git apply --3way).
Updated registry, verify script, and re-application order docs.
…outbound

Images saved to /root/clawd/ failed assertLocalMediaAllowed because
that path is only in agent-scoped media roots (requires agentId flow).
~/.openclaw/media/ is always in default roots, bypassing the issue.
Also add ~/.openclaw/media/ to gateway media handler search paths.
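
To illustrate why the save location matters, here is a sketch of an allowed-roots check. It is not the real `assertLocalMediaAllowed`; the function names, the plain prefix comparison, and the example home directory are assumptions.

```typescript
// True when filePath sits below root and contains no ".." segment.
function isUnderRoot(filePath: string, root: string): boolean {
  const base = root.endsWith("/") ? root : `${root}/`;
  return filePath.startsWith(base) && filePath.split("/").indexOf("..") === -1;
}

// Default roots always apply; agent-scoped roots apply only when the caller
// went through the agentId flow and passes them in.
function isLocalMediaAllowed(
  filePath: string,
  defaultRoots: string[],
  agentRoots: string[] = [],
): boolean {
  return [...defaultRoots, ...agentRoots].some((root) => isUnderRoot(filePath, root));
}
```

A file under `/root/clawd/` passes only when the agent-scoped roots are supplied, whereas a file under the default media directory passes in every call path.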
Regenerated 001.patch files and updated READMEs to reflect image
save location change from /root/clawd/ to ~/.openclaw/media/ and
the gateway media handler's three search paths.
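
A sketch of how a `GET /media/{filename}` handler could resolve a file across several search paths. The actual three paths are not listed here, so the roots are passed in as a parameter; `resolveMediaFile` and the injected `exists` predicate are hypothetical.

```typescript
// Reject anything that is not a bare filename.
function isSafeMediaFilename(name: string): boolean {
  return (
    name.length > 0 &&
    name.indexOf("/") === -1 &&
    name.indexOf("\\") === -1 &&
    name.indexOf("..") === -1
  );
}

// Returns the first existing candidate, trying the roots in order.
function resolveMediaFile(
  filename: string,
  searchRoots: string[],
  exists: (path: string) => boolean,
): string | undefined {
  if (!isSafeMediaFilename(filename)) return undefined;
  for (const root of searchRoots) {
    const candidate = `${root.replace(/\/+$/, "")}/${filename}`;
    if (exists(candidate)) return candidate;
  }
  return undefined;
}
```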
@guiramos guiramos merged commit c15ee81 into work Mar 5, 2026
1 of 9 checks passed
guiramos added a commit that referenced this pull request Mar 31, 2026
CRITICAL:
- #1: Fix InboxMessage.from type from number to string (Pilot address)
- #2: Fix parseNodeIdFromPilotAddress to extract only last hex group
- #3: Rebuild hostnameToNodeId cache from Convex on startup
- #4: Fix AgentNetworkConfig → AgentRegistryConfig type reference
- #5: Track processed message hashes to prevent re-processing on clear failure

HIGH:
- #7: Change default pollIntervalSeconds from 300 to 15
- #8: Fix startDaemon regex to match Pilot address format (0:xxxx.xxxx.xxxx)
- #9: Use registerAgent() for re-registration to preserve enriched metadata
- #10: Fix search query param from 'query' to 'q'
- #12: handleRefresh supports optional accountId, refreshes all if omitted
- #13: Move spawn import to top-level, remove 7 inline dynamic imports
- #14: Mark inbox tool action as debug-only (poll loop handles processing)

MEDIUM:
- #15: Add installation_id to PilotPeer interface and agentToPeer
- #16: Use exact /by-hostname/ endpoint for get_agent action
- #17: Add LRU-style eviction to hostnameToNodeId (max 1000 entries)
- #18: Add mutex (isPolling flag) to executePollCycle
- #19: Add lifecycle comments to startDaemon/stopDaemon (container vs local)
- #20: Pass gatewayToken in fetchNetworkMetadata Convex query
- #21: Wrap sendText fallback in try/catch with error handling
- #23: Document sendMessage JSON vs raw text design intent
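
As an illustration of the LRU-style eviction item above, here is a small bounded cache built on the insertion order of `Map`. The class name and API are hypothetical; only the 1000-entry limit comes from the list.

```typescript
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```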