-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: add dump_rotate_on_finalize and enable_keepalive configs for soniox #1928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pull Request Review: Soniox Rotate DumpThanks for this PR! I've reviewed the changes and overall the implementation looks good. Here are my findings: ✅ What's Working Well
🐛 Potential Bugs & IssuesCritical: Reconnection Behavior Change (websocket.py:177-179)if self._attempt_count >= self.max_attempts:
await asyncio.sleep(self.max_delay)
returnIssue: This changes the behavior from stopping reconnection attempts after Impact: If the Soniox service is down or there's a persistent network issue, the client will retry forever instead of failing gracefully. Recommendation: Either:
Bug: Dumper not initialized properly (extension.py:247)if self.config.dump_rotate_on_finalize and self.audio_dumper:Issue: In Recommendation: Add validation in config: # In SonioxASRConfig or on_init
if self.config.dump_rotate_on_finalize and not self.config.dump:
raise ValueError("dump_rotate_on_finalize requires dump=True")Potential Race Condition (dumper.py:31-42)async def rotate(self):
if self._file:
await self._file.close()
# ... file is None here but might be accessed by push_bytes()
self._file = await aiofiles.open(self.dump_file_path, mode="wb")Issue: Between closing the old file and opening the new one, if Impact: Race condition if audio continues streaming during rotation. Recommendation: Add a lock: class Dumper:
def __init__(self, dump_file_path: str):
# ...
self._lock = asyncio.Lock()
async def rotate(self):
async with self._lock:
# existing code
async def push_bytes(self, data: bytes):
async with self._lock:
# existing codeResource Leak Risk (dumper.py:33-34)if self._file:
await self._file.close()Issue: If opening the new file fails after closing the old one, Recommendation: Add error handling: async def rotate(self):
"""Close current file and open new one with timestamp suffix."""
old_file = self._file
try:
# Generate new filename
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
base, ext = os.path.splitext(self.dump_file_base_path)
self.dump_file_path = f"{base}_{timestamp}{ext}"
# Open new file first
os.makedirs(os.path.dirname(self.dump_file_path), exist_ok=True)
new_file = await aiofiles.open(self.dump_file_path, mode="wb")
# Only close old file after new one opens successfully
if old_file:
await old_file.close()
self._file = new_file
except Exception as e:
# Keep old file handle if rotation fails
self._file = old_file
raise RuntimeError(f"Failed to rotate dump file: {e}") from eMissing Null Check (extension.py:255-256)if self.audio_dumper:
await self.audio_dumper.rotate()Issue: This check is inside the callback but Impact: Low risk but defensive programming suggests keeping the check. Status: Actually fine as-is, but consider the config validation issue above. ⚡ Performance Considerations
🔒 Security Considerations
📋 Code Quality & Best Practices
🧪 Test CoverageMissing Test Cases:
Recommendation: Add tests in 📝 Additional Recommendations
SummaryOverall Assessment: Good feature implementation with solid design patterns. The main concerns are:
Recommendation: Address the critical reconnection issue before merging. The other issues can be fixed in follow-up PRs if time is constrained, but ideally should be addressed now. Great work on the implementation! The callback pattern for rotation is elegant and the code is generally clean and well-structured. 🚀 |
PR Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new features to the Soniox ASR extension:
Code QualityStrengths:
Areas for Improvement: 1. dumper.py:60-65 - Missing file flush 2. dumper.py:35-58 - CRITICAL Race condition in rotate() 3. websocket.py:177-179 - Removed max reconnect limit 4. extension.py:64 - Task management Potential Bugs1. CRITICAL - dumper.py:25 - Directory creation edge case 2. extension.py:260 - Redundant null check Performance Considerations1. dumper.py:60-65 - No buffering configuration 2. websocket.py:162-163 - Keepalive queue behavior Security Concerns1. No path traversal validation 2. dumper.py:44 - Predictable filenames Test CoverageMissing test coverage for:
DocumentationMissing:
Recommendations
Overall AssessmentVerdict: Good feature addition with solid implementation, but needs fixes before merge. Blocking issues:
Non-blocking issues:
The code follows existing patterns well. After addressing the race condition and task management issues, this should be good to merge. Review generated with Claude Code |
PR Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new configuration options to the Soniox ASR extension:
The implementation includes creating a custom ✅ Code Quality & Best PracticesStrengths:
Suggestions:
🐛 Potential Bugs & Issues
🔒 Security ConsiderationsGood:
Concerns:
⚡ Performance ConsiderationsGood:
Suggestions:
🧪 Test CoverageMissing:
Recommendations:
📋 Additional Observations
🎯 Summary & RecommendationOverall Assessment: This is a solid feature addition with good architecture. The main concerns are:
Recommendation: ✅ Approve with changes requested The core functionality is well-implemented, but the race condition should be fixed before merging. The reconnection behavior change should be clarified (intentional vs bug). Consider adding at least basic tests for the rotation feature. Great work on the clean separation of concerns and following the repository's patterns! 👍 |
Pull Request Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new configuration options to the Soniox ASR extension:
The implementation includes a new custom ✅ Code Quality & Best PracticesStrengths:
Suggestions:
🐛 Potential Bugs & IssuesCritical Issues:
Minor Issues:
⚡ Performance ConsiderationsGood practices:
Concerns:
🔒 Security ConcernsGood practices:
Recommendations:
🧪 Test CoverageMissing test scenarios:
Recommendation: Add a new test file 📝 Additional Notes
🎯 SummaryOverall Assessment: The PR adds useful features with generally good code quality. However, there are critical concurrency issues that should be addressed before merging. Priority Issues:
Estimated Risk: Medium - The features are opt-in via configuration, but the concurrency issues could cause crashes or data corruption when enabled. Great work on the implementation! Please address the critical issues and consider the suggestions for a more robust solution. Happy to re-review once updated. 🚀 |
PR Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new configuration options to the Soniox ASR extension:
The changes also introduce a custom Dumper class to replace the base class from ten_ai_base.dumper. Code Quality and Best PracticesStrengths:
Areas for improvement:
Potential Bugs and IssuesCritical:
Medium:
Performance ConsiderationsGood:
Considerations:
Security ConcernsGood:
Considerations:
Test CoverageMissing: This PR adds significant new functionality but NO new tests are included:
Recommendations:
SummaryMust Fix (Blocking):
Should Fix (Important):
Nice to Have:
The code is well-structured and follows good async patterns, but needs test coverage and a few bug fixes before merging. |
c5c912f to
df27b4d
Compare
Code Review: Soniox Dump Rotation & Keepalive FeaturesOverviewThis PR introduces two valuable features to the Soniox ASR extension:
The implementation follows good architectural patterns with a dedicated Critical Issues1. Race Condition in Dumper (dumper.py:71-77)Severity: HIGH async def push_bytes(self, data: bytes):
async with self._lock: # ✅ Good - but this was added after initial review
if not self._file:
raise RuntimeError(...)
_ = await self._file.write(data)✅ Actually GOOD - The lock is properly used in the latest code. However, be aware that 2. Websocket Reconnection Behavior Changed (websocket.py:187-189)Severity: HIGH # Old behavior: Stop after max attempts
if self._attempt_count >= self.max_attempts:
await asyncio.sleep(self.max_delay)
return # Continues loop - infinite retries!Issue: The commit "feat: no max reconnect attempt limit" removes the hard failure after Impact:
Recommendation: If infinite retry is intentional:
3. Task Lifecycle Not Managed (extension.py:64, 150)Severity: MEDIUM self.ws_task = asyncio.create_task(ws.connect()) # Created but never cancelledIssue: The websocket task is stored but never explicitly cancelled during shutdown. While Python will cancel tasks on event loop closure, this can lead to noisy logs about pending tasks. Recommendation: async def stop_connection(self):
if self.websocket:
await self.websocket.stop()
if self.ws_task and not self.ws_task.done():
self.ws_task.cancel()
try:
await self.ws_task
except asyncio.CancelledError:
passSecurity ConcernsPath Traversal Risk (dumper.py:31-37)Severity: MEDIUM The # Config could contain: "dump_path": "../../../../etc"Recommendation: def __init__(self, dump_dir_path: str, dump_file_name: str):
# Resolve and validate path
self.dump_dir_path = os.path.abspath(dump_dir_path)
# Optional: validate it's within allowed base path
# if not self.dump_dir_path.startswith(ALLOWED_BASE_PATH):
# raise ValueError("dump_path outside allowed directory")Code Quality Issues1. Incomplete Error Recovery (extension.py:261-268)Severity: LOW try:
await self.audio_dumper.rotate()
except Exception as e:
self.ten_env.log_warn(f"Failed to rotate audio dumper, error: {str(e)}")
# What happens next? Dumper is in unknown stateThe rotation failure is logged but there's no recovery strategy. If rotation fails:
Recommendation: The current 2. Directory Creation Timing (dumper.py:33, 56)Minor: Code Smell def _ensure_dir(self):
os.makedirs(self.dump_dir_path, exist_ok=True) # Sync I/O in async contextWhile def _ensure_dir(self):
# makedirs is fast enough for local filesystems, but document the assumption
os.makedirs(self.dump_dir_path, exist_ok=True)
# Or if dealing with slow/network filesystems:
async def _ensure_dir(self):
await asyncio.to_thread(os.makedirs, self.dump_dir_path, exist_ok=True)3. Magic Number in Websocket (websocket.py:290)Minor: Maintainability self._total_sent_audio_ts_ms += 10 # Assumes 10ms audio framesRecommendation: ASSUMED_AUDIO_FRAME_DURATION_MS = 10 # Document this assumption
# ...
self._total_sent_audio_ts_ms += ASSUMED_AUDIO_FRAME_DURATION_MS4. Filename Collision Risk (dumper.py:52)Severity: LOW f"{base}_{int(time.time() * 1000)}_{secrets.token_hex(3)}{ext}"Good use of Performance Considerations1. Lock Contention on Audio Path (dumper.py:71-77)Severity: LOW Every audio chunk write acquires the lock, which could contend with rotation. Given that:
This is acceptable, but monitor if audio latency becomes an issue. 2. Keepalive Check Interval (websocket.py:178)Minor Optimization await asyncio.sleep(1.0) # Check every secondYou could optimize by sleeping for the remaining time: sleep_duration = max(0.1, self.keepalive_interval - time_since_last_audio)
await asyncio.sleep(sleep_duration)Though the current implementation is simpler and 1-second granularity is fine for keepalive purposes. Missing Test CoverageThe following scenarios lack test coverage:
Recommendation: Add tests in Configuration ValidationMissing Validation (config.py)# No validation for:
if config.dump_rotate_on_finalize and not config.dump:
# This configuration doesn't make sense
raise ValueError("dump_rotate_on_finalize requires dump=True")Add this check in Positive Observations
Summary & RecommendationsMust Fix Before Merge:
Should Fix:
Nice to Have:
Overall AssessmentGood feature addition with solid engineering. The core functionality is well-implemented, but the websocket reconnection behavior change needs clarification, and basic safeguards (config validation, task cleanup) should be added. Recommendation: ✅ Approve with minor changes requested Great work on the clean implementation and following the repository's patterns! The callback-based rotation is an elegant solution. 🚀 |
ai_agents/agents/ten_packages/extension/soniox_asr_python/websocket.py
Outdated
Show resolved
Hide resolved
ai_agents/agents/ten_packages/extension/soniox_asr_python/websocket.py
Outdated
Show resolved
Hide resolved
PR Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new configuration options to the Soniox ASR extension:
Overall, this is a well-structured PR with good test refactoring and solid implementation. Code Quality & Best PracticesStrengths:
Suggestions:
Potential Bugs & IssuesCritical:
Minor:
Performance ConsiderationsGood:
Concerns:
Security ConcernsRecommendations:
Test CoverageExcellent:
Missing Coverage:
Recommendations PriorityHigh Priority:
Medium Priority: Low Priority: SummaryThis is a high-quality PR that adds useful functionality with good code structure. The test refactoring is particularly commendable. Main concerns are around the unlimited retry behavior and missing test coverage for the new Dumper class. With the high-priority issues addressed, this will be a solid addition to the codebase. Overall Assessment: Approve with recommended changes Great work! |
699fffc to
0e61bbd
Compare
PR Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two new configuration options to the Soniox ASR extension:
The implementation includes a new custom Dumper class replacing the base ten_ai_base dumper, refactored test fixtures, and support for unlimited reconnection attempts. Strengths
Critical Issues
Medium Priority Issues
Test Coverage Gaps
Recommendations SummaryMust fix:
Should fix: VerdictApprove with minor suggestions The PR implements useful features with good code quality and proper error handling. The critical concerns are around edge cases (infinite reconnection, rotation failures) that should be addressed or explicitly documented as intended behavior. The test refactoring is well done and improves maintainability. Great work overall! The new Dumper class is well-designed and the keepalive feature addresses real connection stability issues. |
Pull Request Review: Add dump_rotate_on_finalize and enable_keepalive configs for sonioxSummaryThis PR adds two important features to the
The PR also includes significant test refactoring, moving mock infrastructure from a separate file into ✅ StrengthsCode Quality
Features
🔍 Issues & Concerns1. Critical: Race Condition in Dumper InitializationLocation: The audio dumper is started after the websocket connection begins, creating a window where audio could be sent before the dumper is ready: # Line 148: WebSocket task started
self.ws_task = asyncio.create_task(ws.connect())
# Line 161-162: Dumper started later
if self.audio_dumper:
await self.audio_dumper.start()Impact: If the WebSocket connects quickly and audio starts flowing, Recommendation: Start the dumper before creating the WebSocket connection task: if self.audio_dumper:
await self.audio_dumper.start()
self.ws_task = asyncio.create_task(ws.connect())2. Bug: Missing await in Keepalive LoopLocation: await self._send_queue.put(keepalive_message)The keepalive loop puts messages into the queue, but if the queue fills up, this could block indefinitely. While 3. Design Flaw: Keepalive Default ValueLocation: The PR description and commit messages indicate
Inconsistency: If someone instantiates Recommendation: Align the default in def __init__(
self,
url: str,
start_request: str,
base_delay: float = 0.1,
max_delay: float = 10.0,
max_attempts: int = 10,
enable_keepalive: bool = True, # Changed from False
keepalive_interval: float = 15.0,
):4. Logic Issue: No Reconnect LimitLocation: async def _exponential_backoff(self):
if self._attempt_count >= self.max_attempts:
await asyncio.sleep(self.max_delay)
returnThe code checks Recommendation: Either:
5. Potential Memory Leak: Uncancelled TaskLocation: self.ws_task = asyncio.create_task(ws.connect())The task is stored but never explicitly cancelled or awaited during shutdown. If Recommendation: Add task cancellation in async def stop_connection(self) -> None:
self.ten_env.log_info("stop_connection")
if self.audio_dumper:
await self.audio_dumper.stop()
if self.websocket:
await self.websocket.stop()
if self.ws_task and not self.ws_task.done():
self.ws_task.cancel()
try:
await self.ws_task
except asyncio.CancelledError:
pass
self.connected = False🔒 Security ConcernsLow Risk: File System AccessThe
Info: Secrets in FilenamesTimestamp + random hex (6 chars) in rotated filenames provides ~16.7M combinations. This is sufficient for preventing filename collisions but not for security purposes. ⚡ Performance ConsiderationsPositive
Concerns
🧪 Test CoverageExcellent Improvements
Gaps
RecommendationsAdd tests for:
📝 Code Style & Best PracticesExcellent
Minor Issues
🎯 Recommendations SummaryMust Fix (Critical)
Should Fix (Important)
Nice to Have
📊 Overall AssessmentQuality: Good - Well-implemented features with proper async patterns This is a valuable addition to the framework, but requires fixes to the critical issues before merging. The test infrastructure improvements are excellent and will benefit future development. Recommendation: Request changes to address critical issues (#1-3) and test coverage gaps before merging. |
Pull Request ReviewThis PR adds two new configuration options ( ✅ Strengths1. Well-Structured Dumper ImplementationThe new dumper.py file is well-designed with proper async/await patterns and thread-safety:
2. Improved Test InfrastructureThe refactoring of test fixtures (conftest.py) is excellent:
3. Proper Feature Integration
🔍 Potential Issues & ImprovementsCritical: Missing Import Path UpdateLocation: extension.py:32 from .dumper import DumperIssue: The PR creates a custom Dumper class but changes the import from ten_ai_base.dumper to a local .dumper. However, I noticed that 18 other ASR extensions in the codebase use from ten_ai_base.dumper import Dumper. This suggests:
Recommendation:
Security: Error Message Information DisclosureLocation: dumper.py:75 raise RuntimeError(
f"Dumper for {self._current_file_path()} is not opened. Please start the Dumper first."
)Issue: The error message exposes the full file path, which could reveal sensitive directory structure information in logs. Recommendation: raise RuntimeError(
"Dumper is not opened. Please start the Dumper first."
)Similarly in dumper.py:69, the exception chaining already includes the error details, so the path isn't necessary in the outer message. Bug: File Handle Leak on Rotation FailureLocation: dumper.py:45-69 async def rotate(self):
async with self._lock:
old_file = self._file
try:
# ...
new_file = await aiofiles.open(...)
if old_file:
await old_file.close()
self._file = new_file
except Exception as e:
self._file = old_file # ⚠️ old_file is still open
raise RuntimeError(...) from eIssue: If rotation fails after opening the new file but before completing, the code correctly keeps old_file, but if it fails during directory creation or filename generation, the old file remains open indefinitely without being tracked. Recommendation: Consider whether failed rotations should close the old file or keep it open. Current behavior is inconsistent - it depends on where the exception occurs. Add a comment documenting the intended behavior. Performance: Unnecessary await AssignmentLocation: dumper.py:77 _ = await self._file.write(data)Minor: The assignment to _ is unnecessary. Just use await self._file.write(data). Race Condition: Websocket Task Not AwaitedLocation: extension.py:150 self.ws_task = asyncio.create_task(ws.connect())Issue: The task is created but never awaited in stop_connection(). While websocket.stop() is called, the task itself should be properly cancelled/awaited to prevent warnings and ensure clean shutdown. Recommendation: async def stop_connection(self) -> None:
self.ten_env.log_info("stop_connection")
if self.audio_dumper:
await self.audio_dumper.stop()
if self.websocket:
await self.websocket.stop()
if self.ws_task and not self.ws_task.done():
self.ws_task.cancel()
try:
await self.ws_task
except asyncio.CancelledError:
pass
self.connected = FalseCode Quality: Type Hint InconsistencyLocation: websocket.py:296 before_send_callback: Optional[Callable] = NoneIssue: The Callable type hint is incomplete. It should specify the signature. Recommendation: before_send_callback: Optional[Callable[[int], Awaitable[None]]] = NoneThis matches the actual usage in extension.py:264 where the callback takes audio_bytes_sent: int. Test Coverage: Missing Rotation TestObservation: The tests in test_finalize.py now handle the before_send_callback parameter, but there's no explicit test verifying that:
Recommendation: Add a test case similar to test_audio_dump_functionality but with dump_rotate_on_finalize: True that:
Documentation: Config Options Need DescriptionLocation: config.py:20, 24 The new config options lack inline documentation: dump_rotate_on_finalize: bool = False
enable_keepalive: bool = TrueRecommendation: Add docstrings or inline comments: dump_rotate_on_finalize: bool = False # Rotate dump file on each finalize command
enable_keepalive: bool = True # Send keepalive messages during silence periods🔒 Security Assessment✅ Overall: Low Risk
⚡ Performance Considerations✅ Efficient Implementation:
📝 Minor Suggestions
🎯 SummaryApproval Status: This PR is well-implemented with good attention to async patterns and test quality. The main concerns are:
The code is production-ready after addressing the websocket task cleanup. Great work on the test refactoring! 🚀 Reviewed with: Claude Code |
No description provided.