**Merged**

**codingjoe** (Owner) commented on Mar 16, 2026:

- Make agent more responsive
- Improve voice response
Pull request overview
This PR aims to make the AI “agent response” loop feel more responsive by changing how RTP audio is buffered/flushed for transcription and how TTS audio is generated/sent back to the caller.
Changes:

- Made RTP payload decoding synchronous (no executor) and adjusted VAD buffering/flush timing (shorter `silence_gap`, new timer handle, new flush filtering).
- Simplified Whisper transcription triggering (removed some buffering/min-duration logic and docstrings).
- Changed agent reply assembly and switched TTS from streamed chunk delivery to single-shot audio generation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| `voip/audio.py` | Alters decode execution model and VAD buffering/flush behavior (timers, thresholds, buffering strategy). |
| `voip/ai.py` | Changes transcription triggering/docs and modifies agent response + TTS generation approach. |
`voip/audio.py` (outdated), comment on lines 195 to 199:

```python
audio = self.decode_payload(packet.payload)
if audio.size > 0:
    self.audio_received(
        audio=audio, rms=float(np.sqrt(np.mean(np.square(audio))))
    )
```
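The inline `np.sqrt(np.mean(np.square(audio)))` expression computes the frame's RMS energy, which drives the speech/silence decision downstream. A minimal standalone sketch of that computation (frame sizes here are illustrative, not taken from the PR):

```python
import numpy as np

def frame_rms(audio: np.ndarray) -> float:
    """Root-mean-square energy of a mono float PCM frame."""
    return float(np.sqrt(np.mean(np.square(audio))))

silence = np.zeros(160, dtype=np.float32)    # e.g. one 20 ms frame at 8 kHz
tone = np.full(160, 0.5, dtype=np.float32)   # constant half-scale signal

print(frame_rms(silence))  # 0.0
print(frame_rms(tone))     # 0.5
```

Note that `frame_rms` on an empty array would warn and return `nan`, which is presumably why the caller guards with `audio.size > 0` first.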
`voip/ai.py` (outdated), comment on lines +165 to +176:

```python
self.messages.append({"role": "user", "content": self.pending_text.getvalue()})
self.pending_text.seek(0)
self.pending_text.truncate(0)
response = await ollama.AsyncClient().chat(
    model=self.ollama_model,
    messages=self.messages,
)
reply = (response.message.content or "").encode("ascii", "ignore").decode()
self.messages.append({"role": "assistant", "content": reply})

logger.debug("Agent reply: %r", reply)
await self.send_speech(reply)
```
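One thing worth noting about the reply assembly above: the `encode("ascii", "ignore").decode()` round-trip silently drops every non-ASCII character from the model output (accented letters, smart quotes, emoji), which can mangle words rather than transliterate them. A quick illustration of the standard-library behavior:

```python
# The "ignore" error handler deletes unencodable characters outright.
reply = "Très bien 👍"
ascii_reply = reply.encode("ascii", "ignore").decode()
print(ascii_reply)  # 'Trs bien ' (the è and the emoji are gone)
```

If the intent is only to keep the TTS input clean, a transliterating approach (e.g. `unicodedata.normalize("NFKD", ...)` before encoding) would preserve more of the text, though that is a suggestion, not what the PR does.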
Comment on lines 178 to +196:

```diff
@@ -228,23 +185,12 @@ async def send_speech(self, text: str) -> None:
         Args:
             text: Text to synthesise and transmit.
         """
-        loop = asyncio.get_running_loop()
-        queue: asyncio.Queue[np.ndarray | None] = asyncio.Queue()
-
-        def generate() -> None:
-            for chunk in self.tts_instance.generate_audio_stream(
-                self.voice_state,
-                text,  # type: ignore[too-many-positional-arguments]
-            ):
-                asyncio.run_coroutine_threadsafe(
-                    queue.put(chunk.numpy()), loop
-                ).result()
-            asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()
-
-        future = loop.run_in_executor(None, generate)
-        while (tts_chunk := await queue.get()) is not None:
-            resampled = self.resample(
-                tts_chunk, self.tts_instance.sample_rate, self.codec.sample_rate_hz
-            )
-            await self.send_rtp_audio(resampled)
-        await future
+        audio = self.tts_instance.generate_audio(
+            self.voice_state,
+            text,  # type: ignore[too-many-positional-arguments]
+        )
+        await self.send_rtp_audio(
+            self.resample(
+                audio.numpy(), self.tts_instance.sample_rate, self.codec.sample_rate_hz
+            )
+        )
```
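The removed streaming path bridged a blocking TTS generator running in an executor thread into the event loop via `run_coroutine_threadsafe` and a sentinel-terminated queue. A minimal self-contained sketch of that pattern, with a stand-in `fake_tts_stream` replacing the real TTS engine:

```python
import asyncio

def fake_tts_stream():
    """Stand-in for a blocking TTS chunk generator (hypothetical)."""
    for i in range(3):
        yield f"chunk-{i}"

async def stream_speech() -> list[str]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue[str | None] = asyncio.Queue()

    def generate() -> None:
        # Runs in a worker thread; hand each chunk back to the event loop.
        for chunk in fake_tts_stream():
            asyncio.run_coroutine_threadsafe(queue.put(chunk), loop).result()
        # None is the end-of-stream sentinel.
        asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()

    future = loop.run_in_executor(None, generate)
    received = []
    while (chunk := await queue.get()) is not None:
        received.append(chunk)  # the real code resampled and sent RTP here
    await future
    return received

print(asyncio.run(stream_speech()))  # ['chunk-0', 'chunk-1', 'chunk-2']
```

The trade-off the PR makes is latency shape: the streamed version starts playing audio as soon as the first chunk is synthesised, while the new single-shot `generate_audio` call waits for the whole utterance before any RTP is sent, which is simpler but delays the first audible sample on long replies.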
`voip/audio.py` (outdated):

```diff
-        self.silence_handle = None
-        if not self.speech_buffer:
-            return
+        self.transcription_handle = None
```

```python
            audio=audio, rms=float(np.sqrt(np.mean(np.square(audio))))
        )

    def decode_payload(self, payload: bytes) -> np.ndarray:
```
Comment on lines 321 to 327:

```diff
     def on_audio_silence(self) -> None:
         """Arm the silence debounce timer when speech is buffered."""
-        if self.silence_handle is None and self.speech_buffer:
-            loop = asyncio.get_running_loop()
-            self.silence_handle = loop.call_later(
+        if self.transcription_handle is None:
+            loop = asyncio.get_event_loop()
+            self.transcription_handle = loop.call_later(
                 self.silence_gap,
                 self.flush_speech_buffer,
             )
```
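The debounce pattern here (arm a one-shot `call_later` timer only while none is pending, so a burst of silence frames yields a single flush) can be exercised in isolation. A sketch assuming a 50 ms gap, with hypothetical names:

```python
import asyncio

class Debouncer:
    def __init__(self, gap: float) -> None:
        self.gap = gap
        self.handle: asyncio.TimerHandle | None = None
        self.fired = 0

    def on_silence(self) -> None:
        # Arm the timer only once; subsequent silence frames are no-ops
        # until the pending timer fires and clears the handle.
        if self.handle is None:
            loop = asyncio.get_running_loop()
            self.handle = loop.call_later(self.gap, self.flush)

    def flush(self) -> None:
        self.handle = None
        self.fired += 1

async def main() -> int:
    d = Debouncer(0.05)
    for _ in range(10):      # ten silence frames in quick succession
        d.on_silence()
    await asyncio.sleep(0.1)  # let the timer fire
    return d.fired

print(asyncio.run(main()))  # 1
```

One reviewable detail in the diff itself: the new code swaps `asyncio.get_running_loop()` for `asyncio.get_event_loop()`, which is deprecated outside a running loop in modern Python; since `on_audio_silence` runs inside the loop, `get_running_loop()` would be the safer choice.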
Comment on lines 309 to 315:

```diff
     def audio_received(self, *, audio: np.ndarray, rms: float) -> None:
-        if self.collect_audio(audio, rms):
-            self.speech_buffer.append(audio)
+        self.speech_buffer.append(audio)
         if rms > self.speech_threshold:
             self.on_audio_speech()
         else:
             self.on_audio_silence()
```
`voip/audio.py` (outdated), comment on lines +330 to +339:

```python
self.transcription_handle = None
# Ensure at least one second of audio to avoid cutting words in half.
audio = np.concatenate(self.speech_buffer)
if (
    sum(len(c) for c in self.speech_buffer)
    < self.RESAMPLING_RATE_HZ * self.silence_gap
    or float(np.sqrt(np.mean(np.square(audio)))) < 0.01
):
    self.speech_buffer.clear()
    return
```
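The flush guard above discards buffers that are either shorter than `silence_gap` seconds of samples or nearly silent (overall RMS below 0.01). A standalone sketch of that gate, assuming a 16 kHz resampling rate and a 0.5 s gap (both values are assumptions for illustration, not read from the PR):

```python
import numpy as np

RATE_HZ = 16_000   # assumed resampling rate
SILENCE_GAP = 0.5  # assumed gap in seconds

def should_transcribe(speech_buffer: list[np.ndarray]) -> bool:
    """Mirror the flush filter: reject short or near-silent buffers."""
    audio = np.concatenate(speech_buffer)
    too_short = sum(len(c) for c in speech_buffer) < RATE_HZ * SILENCE_GAP
    too_quiet = float(np.sqrt(np.mean(np.square(audio)))) < 0.01
    return not (too_short or too_quiet)

short = [np.full(1_000, 0.2, dtype=np.float32)]     # well under 0.5 s
quiet = [np.full(16_000, 0.001, dtype=np.float32)]  # 1 s, RMS 0.001
speech = [np.full(16_000, 0.2, dtype=np.float32)]   # 1 s, RMS 0.2

print(should_transcribe(short), should_transcribe(quiet), should_transcribe(speech))
# False False True
```

Note the comment in the diff promises "at least one second of audio" but the threshold is actually `RESAMPLING_RATE_HZ * silence_gap` samples, i.e. `silence_gap` seconds; the comment and the code only agree when `silence_gap` is 1.0.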
`voip/ai.py` (outdated), comment on lines 75 to 77:

```python
async def speech_buffer_ready(self, audio: np.ndarray) -> None:
    """Transcribe the buffered utterance when it meets the minimum length.

    Skips utterances shorter than one second to avoid passing fragments
    to Whisper that would produce low-quality transcriptions.

    Args:
        audio: Float32 mono PCM array at `RESAMPLING_RATE_HZ` Hz.
    """
    if len(audio) < self.RESAMPLING_RATE_HZ:
        return
    await self.transcribe(audio)
```
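The minimum-length check removed here is simply "at least `RESAMPLING_RATE_HZ` samples", i.e. one second of audio at the resampling rate. A tiny sketch, assuming 16 kHz (an assumption; the PR does not state the constant's value):

```python
import numpy as np

RESAMPLING_RATE_HZ = 16_000  # assumed Whisper input rate

def long_enough(audio: np.ndarray) -> bool:
    # One second of samples is the minimum utterance length.
    return len(audio) >= RESAMPLING_RATE_HZ

print(long_enough(np.zeros(8_000, dtype=np.float32)))   # False (0.5 s)
print(long_enough(np.zeros(16_000, dtype=np.float32)))  # True  (1.0 s)
```

With this guard gone, the equivalent filtering now lives in the VAD flush path in `voip/audio.py`, where the threshold depends on `silence_gap` instead of a fixed one second.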
**Codecov Report**

❌ Patch coverage is

Additional details and impacted files:

```
@@ Coverage Diff @@
##             main      #49      +/-   ##
==========================================
+ Coverage   94.25%   94.58%   +0.32%
==========================================
  Files          24       24
  Lines        1759     1716      -43
==========================================
- Hits         1658     1623      -35
+ Misses        101       93       -8
```

View full report in Codecov by Sentry.