
feat(voice): add talk mode latency tracing #32974

Closed

ngutman wants to merge 2 commits into main from feat/talk-mode-latency-tracing

Conversation

@ngutman
Contributor

@ngutman ngutman commented Mar 3, 2026

Summary

  • add per-turn latency tracing in talk mode from final transcript through chat send, stream, and speech output stages
  • record first-request/first-chunk/playback timing anchors for ElevenLabs streaming paths
  • add an onStart callback to TalkSystemSpeechSynthesizer.speak(...) via AVSpeechSynthesizerDelegate.didStart
  • add incremental speech buffer tests for soft-boundary splitting and short-chunk buffering behavior
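The first-chunk anchor on the ElevenLabs path can be recorded by wrapping the byte stream before it reaches the player. A minimal sketch of that idea, assuming the stream type from the bullets above (the PR's actual `instrumentTTSStreamFirstChunk` signature may differ):

```swift
import Foundation

// Wrap a TTS byte stream so the first received chunk records a timing
// anchor exactly once, then forward every chunk downstream unchanged.
func instrumentFirstChunk(
    _ upstream: AsyncThrowingStream<Data, Error>,
    onFirstChunk: @escaping (Date) -> Void
) -> AsyncThrowingStream<Data, Error> {
    AsyncThrowingStream { continuation in
        let task = Task {
            var sawFirst = false
            do {
                for try await chunk in upstream {
                    if !sawFirst {
                        sawFirst = true
                        onFirstChunk(Date()) // first-chunk anchor
                    }
                    continuation.yield(chunk) // pass data through untouched
                }
                continuation.finish()
            } catch {
                continuation.finish(throwing: error)
            }
        }
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

Because the wrapper only observes the stream, it can be inserted between the HTTP client and the audio player without changing playback behavior.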

Testing

  • swift test --package-path apps/ios --filter TalkModeIncrementalSpeechBufferTests (fails in this repo layout: Could not find Package.swift for apps/ios)


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb90e004a9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@greptile-apps
Contributor

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR adds per-turn latency tracing to talk mode, threading timing anchors through the full pipeline from final transcript to speech output (ElevenLabs streaming and system TTS). It also introduces soft-boundary splitting in IncrementalSpeechBuffer so speech can start sooner on long stream chunks.

Key changes:

  • TalkTurnLatencyTrace struct captures timestamps for each stage (transcript → process start → chat send/ack → assistant stream → first TTS request/chunk → playback start → system speech start).
  • instrumentTTSStreamFirstChunk wraps AsyncThrowingStream<Data, Error> to record the first-chunk timing anchor before forwarding data downstream.
  • isSoftBoundary triggers a segment split at any whitespace once the buffer reaches 72 characters, enabling earlier speech start on long chunks.
  • TalkSystemSpeechSynthesizer.speak(text:language:onStart:) gains an onStart callback wired to AVSpeechSynthesizerDelegate.didStart.
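The soft-boundary rule above can be sketched as a small pure function (a hedged reconstruction from the review summary; the PR's `isSoftBoundary` implementation and exact splitting rules may differ):

```swift
// Once the pending buffer reaches 72 characters, split at the next
// whitespace so speech can start before a sentence boundary arrives.
let softBoundaryThreshold = 72

func softSplit(_ buffer: String) -> (segment: String, remainder: String)? {
    guard buffer.count >= softBoundaryThreshold else { return nil }
    let threshold = buffer.index(buffer.startIndex, offsetBy: softBoundaryThreshold)
    // Split at the first whitespace at or after the threshold.
    guard let split = buffer[threshold...].firstIndex(where: { $0.isWhitespace }) else {
        return nil // no whitespace yet; keep buffering
    }
    let segment = String(buffer[..<split])
    let remainder = String(buffer[buffer.index(after: split)...])
    return (segment, remainder)
}
```

Short chunks stay buffered (`nil` result), which matches the short-chunk buffering behavior the new tests cover.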

Issues found:

  • The didStart delegate in TalkSystemSpeechSynthesizer does not validate that the firing utterance is the currently-active one. When speech is interrupted and restarted rapidly, a queued Task { @MainActor in } for the old utterance's didStart event can consume the new onStart callback before the new speech actually starts, causing firstSystemSpeechStartAt to be recorded prematurely or not at all.

Confidence Score: 2/5

  • Safe to merge for user-facing behavior, but latency data from the system TTS path may be inaccurate due to the didStart callback race condition.
  • One real functional correctness issue exists: the didStart handler in TalkSystemSpeechSynthesizer lacks an utterance-identity guard, which can cause the onStart callback to fire for the wrong speech instance when speech is interrupted and restarted quickly. This does not affect playback, only latency measurement accuracy for the system TTS fallback path. The ElevenLabs and incremental-speech tracing paths are correct.
  • apps/shared/OpenClawKit/Sources/OpenClawKit/TalkSystemSpeechSynthesizer.swift — the new speechSynthesizer(_:didStart:) delegate method needs an utterance identity check before firing the callback.
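The identity guard the review asks for might look like the following (names reconstructed from the review text, not the PR's actual code):

```swift
import AVFoundation

// Sketch: track the currently-active utterance and ignore didStart
// events fired for a superseded one, so a rapid interrupt-and-restart
// cannot consume the new turn's onStart callback.
final class TalkSystemSpeechSynthesizer: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private var activeUtterance: AVSpeechUtterance?
    private var onStart: (() -> Void)?

    func speak(text: String, language: String, onStart: @escaping () -> Void) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        self.activeUtterance = utterance
        self.onStart = onStart
        self.synthesizer.delegate = self
        self.synthesizer.speak(utterance)
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didStart utterance: AVSpeechUtterance) {
        // Utterance-identity check: only the active utterance may fire
        // the callback; stale didStart events are dropped.
        guard utterance === self.activeUtterance else { return }
        self.onStart?()
        self.onStart = nil
    }
}
```

Reference identity (`===`) is enough here because each `speak` call allocates a fresh `AVSpeechUtterance`.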

Last reviewed commit: eb90e00

Contributor

@greptile-apps greptile-apps bot left a comment


3 files reviewed, 1 comment


@openclaw-barnacle openclaw-barnacle bot added the app: ios, size: M, and maintainer (Maintainer-authored PR) labels Mar 3, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d3f1fcaf4


}

private func ensureLatencyTrace(transcriptChars: Int) {
    guard self.activeLatencyTrace == nil else { return }

P2: Reset latency traces when starting a new transcript turn

ensureLatencyTrace bails out if activeLatencyTrace is already set, but this commit introduced paths where a trace can be started and never finished (for example push-to-talk offline/cancel exits that do not call finishLatencyTrace). In that case, the next turn can reuse stale timestamps/anchors from a previous turn, which merges separate interactions into one trace and corrupts per-turn latency metrics.
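One way to address this, sketched under assumed names reconstructed from the comment (not the PR's actual declarations): close out any unfinished trace before seeding a new one, instead of bailing out.

```swift
import Foundation

struct TalkTurnLatencyTrace {
    let startedAt = Date()
    let transcriptChars: Int
}

final class TalkModeController {
    private(set) var activeLatencyTrace: TalkTurnLatencyTrace?

    func ensureLatencyTrace(transcriptChars: Int) {
        if activeLatencyTrace != nil {
            // A prior turn exited without finishing its trace (e.g. an
            // offline or cancel path); close it so its stale anchors
            // cannot bleed into the new turn's measurements.
            finishLatencyTrace(outcome: "abandoned", fields: [])
        }
        activeLatencyTrace = TalkTurnLatencyTrace(transcriptChars: transcriptChars)
    }

    func finishLatencyTrace(outcome: String, fields: [String]) {
        // Real code would emit the trace here; the sketch just clears it.
        activeLatencyTrace = nil
    }
}
```

Seeding a fresh trace per turn also gives the telemetry an explicit "abandoned" outcome for turns that never reach playback.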


} else {
    await self.playAssistant(text: assistantText)
}
self.finishLatencyTrace(outcome: "spoken", fields: ["runId=\(runId)"])

P2: Record playback failures as non-spoken latency outcomes

This unconditionally finalizes tracing as outcome=spoken, but playAssistant handles TTS/system-speech errors internally and can end with statusText = "Speak failed: ..." without any spoken output. When that happens, latency telemetry marks failed speech turns as successful, which skews success-rate and latency analysis for talk mode.
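A hedged sketch of the suggested fix; `playAssistant` and `finishLatencyTrace` are the review's names, and the `Bool` return plus the stubbed body are assumptions for illustration:

```swift
// Have playback report whether speech actually started, and record
// the real outcome instead of an unconditional "spoken".
final class TalkTurnFinisher {
    private(set) var lastOutcome: String?

    func playAssistant(text: String) async -> Bool {
        // Real code would stream TTS here and return false when the
        // speak attempt fails; this stub only models the contract.
        return !text.isEmpty
    }

    func finishLatencyTrace(outcome: String, fields: [String]) {
        lastOutcome = outcome
    }

    func speakTurn(assistantText: String, runId: String) async {
        let spoke = await playAssistant(text: assistantText)
        finishLatencyTrace(
            outcome: spoke ? "spoken" : "speak_failed",
            fields: ["runId=\(runId)"])
    }
}
```

With a distinct `speak_failed` outcome, failed speech turns no longer inflate the success-rate metrics the comment describes.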


@mbelinky
Contributor

mbelinky commented Mar 3, 2026

Thanks for the solid work here, @ngutman.

Closing this original PR in favor of smaller, reviewable slices that preserve the reliability/UX improvements.

This keeps risk isolated and makes it easier to validate each behavior independently.

@mbelinky mbelinky closed this Mar 3, 2026

Labels

app: ios, maintainer (Maintainer-authored PR), size: M


2 participants