Refactor call classes#49

Merged
codingjoe merged 6 commits into main from agent-response
Mar 17, 2026

@codingjoe
Owner

  • Make agent more responsive
  • Improve voice response

Copilot AI review requested due to automatic review settings March 16, 2026 22:42
@codingjoe self-assigned this on Mar 16, 2026

Copilot AI left a comment

Pull request overview

This PR aims to make the AI “agent response” loop feel more responsive by changing how RTP audio is buffered/flushed for transcription and how TTS audio is generated/sent back to the caller.

Changes:

  • Made RTP payload decoding synchronous (no executor) and adjusted VAD buffering/flush timing (shorter silence_gap, new timer handle, new flush filtering).
  • Simplified Whisper transcription triggering (removed some buffering/min-duration logic and docstrings).
  • Changed Agent reply assembly and switched TTS from streamed chunk delivery to single-shot audio generation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.

File            Description
voip/audio.py   Alters decode execution model and VAD buffering/flush behavior (timers, thresholds, buffering strategy).
voip/ai.py      Changes transcription triggering/docs and modifies agent response + TTS generation approach.

voip/audio.py Outdated
Comment on lines 195 to 199
audio = self.decode_payload(packet.payload)
if audio.size > 0:
    self.audio_received(
        audio=audio, rms=float(np.sqrt(np.mean(np.square(audio))))
    )
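
The `rms` argument in the snippet above is the root-mean-square level of the decoded frame. As a rough illustration of what that expression computes, here is a minimal sketch written with the standard library so it runs without NumPy. The `frame_rms` helper, the empty-frame fallback, and the sample frames are assumptions made for this example; the PR itself uses `np.sqrt(np.mean(np.square(audio)))` and skips empty frames before computing.

```python
import math


def frame_rms(samples: list[float]) -> float:
    """Root-mean-square level of a frame of float PCM samples."""
    # Hypothetical guard: the reviewed code instead skips empty frames
    # (`audio.size > 0`) before the RMS is computed.
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Illustrative frames, not real RTP audio.
silence = [0.0] * 160
tone = [0.5, -0.5] * 80  # constant magnitude 0.5

print(frame_rms(silence))
print(frame_rms(tone))
```

A frame of constant magnitude 0.5 has an RMS of 0.5 regardless of sign, which is why the value is useful as a loudness measure for the speech threshold.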
voip/ai.py Outdated
Comment on lines +165 to +176
self.messages.append({"role": "user", "content": self.pending_text.getvalue()})
self.pending_text.seek(0)
self.pending_text.truncate(0)
response = await ollama.AsyncClient().chat(
    model=self.ollama_model,
    messages=self.messages,
)
reply = (response.message.content or "").encode("ascii", "ignore").decode()
self.messages.append({"role": "assistant", "content": reply})

logger.debug("Agent reply: %r", reply)
await self.send_speech(reply)
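
Two small idioms in the lines above are easy to misread: the `seek(0)` / `truncate(0)` pair that empties the text buffer, and the `encode("ascii", "ignore").decode()` round trip that strips non-ASCII characters from the reply. The sketch below shows both in isolation; it assumes `pending_text` is an `io.StringIO` (suggested by the `getvalue()` call, but not confirmed in this thread), and the sample strings are invented.

```python
import io

# Assumption: the PR's pending_text behaves like io.StringIO.
pending_text = io.StringIO()
pending_text.write("Hello, ")
pending_text.write("can you hear me?")

# Take the accumulated text, then reset the buffer for the next turn.
user_text = pending_text.getvalue()
pending_text.seek(0)
pending_text.truncate(0)

# Drop every character that has no ASCII encoding (dashes, accents, emoji).
raw_reply = "Sure \u2013 caf\u00e9 at 5 \U0001F44D"
reply = raw_reply.encode("ascii", "ignore").decode()

print(repr(user_text))
print(repr(pending_text.getvalue()))
print(repr(reply))
```

Note that `"ignore"` removes characters rather than transliterating them, so an accented word such as "café" comes out as "caf".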
Comment on lines 178 to +196
@@ -228,23 +185,12 @@ async def send_speech(self, text: str) -> None:
         Args:
             text: Text to synthesise and transmit.
         """
-        loop = asyncio.get_running_loop()
-        queue: asyncio.Queue[np.ndarray | None] = asyncio.Queue()
-
-        def generate() -> None:
-            for chunk in self.tts_instance.generate_audio_stream(
-                self.voice_state,
-                text,  # type: ignore[too-many-positional-arguments]
-            ):
-                asyncio.run_coroutine_threadsafe(
-                    queue.put(chunk.numpy()), loop
-                ).result()
-            asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()
-
-        future = loop.run_in_executor(None, generate)
-        while (tts_chunk := await queue.get()) is not None:
-            resampled = self.resample(
-                tts_chunk, self.tts_instance.sample_rate, self.codec.sample_rate_hz
+        audio = self.tts_instance.generate_audio(
+            self.voice_state,
+            text,  # type: ignore[too-many-positional-arguments]
         )
+        await self.send_rtp_audio(
+            self.resample(
+                audio.numpy(), self.tts_instance.sample_rate, self.codec.sample_rate_hz
+            )
-            await self.send_rtp_audio(resampled)
-        await future
+        )
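
The streamed variant in this hunk (the `generate` / `queue` lines) bridges a blocking generator running in a worker thread to the event loop through an `asyncio.Queue`, using `None` as an end-of-stream sentinel. A minimal, self-contained sketch of that pattern follows. `fake_tts_stream` and `stream_chunks` are hypothetical names standing in for the TTS generator and `send_speech`, and strings replace the audio arrays.

```python
import asyncio
from collections.abc import Iterator


def fake_tts_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a blocking, chunked TTS generator."""
    for word in text.split():
        yield word.upper()


async def stream_chunks(text: str) -> list[str]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue[str | None] = asyncio.Queue()

    def generate() -> None:
        # Runs in a worker thread; hand each chunk to the event loop.
        for chunk in fake_tts_stream(text):
            asyncio.run_coroutine_threadsafe(queue.put(chunk), loop).result()
        # None marks the end of the stream.
        asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()

    future = loop.run_in_executor(None, generate)
    sent: list[str] = []
    while (chunk := await queue.get()) is not None:
        sent.append(chunk)  # the PR resampled and sent RTP audio at this point
    await future
    return sent


print(asyncio.run(stream_chunks("hello from the agent")))
```

With this shape, the first chunk can be sent while later chunks are still being generated. The `generate_audio` version in the same hunk makes a single call instead, so nothing is sent until the whole utterance has been synthesised, and the call runs directly in the coroutine rather than in an executor.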
voip/audio.py Outdated
self.silence_handle = None
if not self.speech_buffer:
return
self.transcription_handle = None
audio=audio, rms=float(np.sqrt(np.mean(np.square(audio))))
)

def decode_payload(self, payload: bytes) -> np.ndarray:
Comment on lines 321 to 327
     def on_audio_silence(self) -> None:
-        """Arm the silence debounce timer when speech is buffered."""
-        if self.silence_handle is None and self.speech_buffer:
-            loop = asyncio.get_running_loop()
-            self.silence_handle = loop.call_later(
+        if self.transcription_handle is None:
+            loop = asyncio.get_event_loop()
+            self.transcription_handle = loop.call_later(
                 self.silence_gap,
                 self.flush_speech_buffer,
             )
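
The timer logic above is a debounce: a flush is scheduled once when silence begins, and no second timer is armed while one is pending. The following is a minimal sketch of that idea under stated assumptions. `SilenceDebouncer` is a hypothetical class, and the cancellation in `on_speech` is a guess at how `on_audio_speech` behaves, since that method is not shown in this thread.

```python
import asyncio


class SilenceDebouncer:
    """Hypothetical sketch: flush once, `gap` seconds after silence starts."""

    def __init__(self, gap: float) -> None:
        self.gap = gap
        self.handle: asyncio.TimerHandle | None = None
        self.flushes = 0

    def on_silence(self) -> None:
        # Arm the timer only if one is not already pending.
        if self.handle is None:
            loop = asyncio.get_running_loop()
            self.handle = loop.call_later(self.gap, self.flush)

    def on_speech(self) -> None:
        # Assumed behaviour: speech resumed, so cancel the pending flush.
        if self.handle is not None:
            self.handle.cancel()
            self.handle = None

    def flush(self) -> None:
        self.handle = None
        self.flushes += 1


async def demo() -> int:
    debouncer = SilenceDebouncer(gap=0.05)
    debouncer.on_silence()
    debouncer.on_silence()    # no second timer is armed
    debouncer.on_speech()     # cancels before the gap elapses
    debouncer.on_silence()
    await asyncio.sleep(0.2)  # long enough for exactly one flush
    return debouncer.flushes


print(asyncio.run(demo()))
```

The sketch uses `asyncio.get_running_loop()`; when called from a coroutine or a callback, `get_event_loop()` returns the same running loop.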
Comment on lines 309 to 315
     def audio_received(self, *, audio: np.ndarray, rms: float) -> None:
-        if self.collect_audio(audio, rms):
-            self.speech_buffer.append(audio)
+        self.speech_buffer.append(audio)
         if rms > self.speech_threshold:
             self.on_audio_speech()
         else:
             self.on_audio_silence()

voip/audio.py Outdated
Comment on lines +330 to +339
self.transcription_handle = None
# Ensure at least one second of audio to avoid cutting words in half.
audio = np.concatenate(self.speech_buffer)
if (
    sum(len(c) for c in self.speech_buffer)
    < self.RESAMPLING_RATE_HZ * self.silence_gap
    or float(np.sqrt(np.mean(np.square(audio)))) < 0.01
):
    self.speech_buffer.clear()
    return
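
The filter above discards a buffer when it is either too short or too quiet. A small worked sketch of the same two checks is shown below, written with the standard library. The values for `RESAMPLING_RATE_HZ` and `SILENCE_GAP_S` are assumptions chosen for illustration (the thread does not state them); only the 0.01 RMS threshold comes from the snippet, and `should_discard` is a hypothetical helper.

```python
import math

RESAMPLING_RATE_HZ = 16_000  # assumed value for illustration
SILENCE_GAP_S = 0.5          # assumed value for illustration
MIN_RMS = 0.01               # threshold taken from the snippet


def should_discard(samples: list[float]) -> bool:
    """Return True when the buffered audio is too short or too quiet."""
    min_samples = RESAMPLING_RATE_HZ * SILENCE_GAP_S
    if len(samples) < min_samples:
        return True
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms < MIN_RMS


short_burst = [0.2] * 4_000    # 0.25 s of audio, below the 8000-sample gate
quiet_room = [0.001] * 16_000  # long enough, but RMS is about 0.001
utterance = [0.2] * 16_000     # one second at RMS 0.2

print(should_discard(short_burst), should_discard(quiet_room), should_discard(utterance))
```

With these assumed values the length gate is 8000 samples, that is, `silence_gap` seconds of audio. It matches the "one second" in the code comment only when `silence_gap` is 1.0.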
voip/ai.py Outdated
Comment on lines 75 to 77
     async def speech_buffer_ready(self, audio: np.ndarray) -> None:
-        """Transcribe the buffered utterance when it meets the minimum length.
-
-        Skips utterances shorter than one second to avoid passing fragments
-        to Whisper that would produce low-quality transcriptions.
-
-        Args:
-            audio: Float32 mono PCM array at `RESAMPLING_RATE_HZ` Hz.
-        """
-        if len(audio) < self.RESAMPLING_RATE_HZ:
-            return
         await self.transcribe(audio)

@codecov

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 83.78378% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.58%. Comparing base (53b1491) to head (38c5004).
⚠️ Report is 1 commit behind head on main.

Files with missing lines   Patch %   Lines
voip/__main__.py           55.55%    16 Missing ⚠️
voip/ai.py                 92.59%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #49      +/-   ##
==========================================
+ Coverage   94.25%   94.58%   +0.32%     
==========================================
  Files          24       24              
  Lines        1759     1716      -43     
==========================================
- Hits         1658     1623      -35     
+ Misses        101       93       -8     


@codingjoe changed the title from "agent response" to "Refactor call classes" on Mar 17, 2026
@codingjoe merged commit 1099f97 into main on Mar 17, 2026
22 of 24 checks passed
@codingjoe deleted the agent-response branch on March 17, 2026 23:06