Close WebRTC session when cloud WebSocket disconnects #638
When the fal WebSocket closes (e.g. MAX_DURATION_EXCEEDED timer), the
WebRTC peer connection (UDP) between the client and the scope backend
continues running independently — the pipeline keeps processing frames
and the video keeps streaming.
Add a DELETE /api/v1/webrtc/offer/{session_id} endpoint to the backend,
and call it from fal_app.py's finally block to explicitly tear down the
WebRTC session when the signaling WebSocket closes.
Signed-off-by: emranemran <[email protected]>
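The logic behind the new endpoint is small: look up the session's peer connection, close it, and drop it from the manager. A minimal sketch of that shape, assuming a manager that tracks peer connections by session id (the `FakePeerConnection` and manager internals here are illustrative, not the project's actual code):

```python
import asyncio

# Illustrative stand-in: the real backend manages aiortc peer connections
# inside its own WebRTCManager; only the shape of the logic is shown here.
class FakePeerConnection:
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class WebRTCManager:
    def __init__(self):
        self._sessions: dict[str, FakePeerConnection] = {}

    def add_session(self, session_id: str, pc: FakePeerConnection) -> None:
        self._sessions[session_id] = pc

    async def remove_session(self, session_id: str) -> bool:
        pc = self._sessions.pop(session_id, None)
        if pc is None:
            return False
        await pc.close()  # closing the peer connection stops the UDP media flow
        return True


async def delete_offer(manager: WebRTCManager, session_id: str) -> int:
    """Status a DELETE /api/v1/webrtc/offer/{session_id} handler might return."""
    removed = await manager.remove_session(session_id)
    return 204 if removed else 404


async def main():
    manager = WebRTCManager()
    pc = FakePeerConnection()
    manager.add_session("abc", pc)
    print(await delete_offer(manager, "abc"))  # 204: session closed and removed
    print(pc.closed)                           # True: media stops flowing
    print(await delete_offer(manager, "abc"))  # 404: already gone

asyncio.run(main())
```

Making the second DELETE return 404 keeps the endpoint idempotent-friendly: a retry after a successful teardown fails loudly but harmlessly.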
📝 Walkthrough

These changes implement proper WebRTC session lifecycle management by adding a server-side HTTP DELETE endpoint for closing sessions and client-side logic to invoke it during WebSocket disconnect cleanup, ensuring symmetric session teardown.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as WebSocket Client<br/>(fal_app.py)
    participant Server as Scope Server<br/>(app.py)
    participant Manager as WebRTCManager
    Client->>Client: WebSocket disconnect event
    Client->>Server: DELETE /api/v1/webrtc/offer/{session_id}
    Server->>Manager: remove_session(session_id)
    Manager->>Manager: Clean up session resources
    Manager-->>Server: Success
    Server-->>Client: 204 No Content
    Client->>Client: Cleanup complete
```
Estimated Code Review Effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 3 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/scope/cloud/fal_app.py (1)
Lines 1154-1189: ⚠️ Potential issue | 🟠 Major

Move `_cleanup_event.clear()` before starting the WebRTC DELETE. Because lines 591-595 wait on `_cleanup_event` before sending `ready`, leaving the event set until after this DELETE returns lets a fast reconnect bypass the gate while `remove_session()` is still closing the previous peer connection. Clear the event before the teardown starts, then set it again after the rest of cleanup finishes.

Suggested fix:
```diff
     finally:
+        event = _get_cleanup_event()
+        event.clear()
+        try:
             # Cancel log forwarder task
             if log_forwarder_task is not None:
                 log_forwarder_task.cancel()
                 try:
                     await log_forwarder_task
@@
             if kafka_publisher and kafka_publisher.is_running:
                 end_time = time.time()
                 elapsed_ms = int((end_time - connection_start_time) * 1000)
                 await kafka_publisher.publish(
                     "websocket_disconnected",
@@
             if session_id:
                 import httpx
                 try:
                     async with httpx.AsyncClient() as client:
                         resp = await client.delete(
@@
                 except Exception as e:
                     print(
                         f"[{log_prefix()}] Warning: Failed to close WebRTC "
                         f"session {session_id}: {e}"
                     )
-        # Clean up session data to prevent data leakage between users.
-        # Block the next connection's "ready" message until cleanup finishes.
-        event = _get_cleanup_event()
-        event.clear()
-        try:
+            # Clean up session data to prevent data leakage between users.
+            # Block the next connection's "ready" message until cleanup finishes.
             await cleanup_installed_plugins()
             cleanup_session_data()
         finally:
             event.set()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scope/cloud/fal_app.py` around lines 1154-1189: move the cleanup gating so `_cleanup_event` is cleared before beginning the WebRTC teardown. Call `_get_cleanup_event()` and `event.clear()` immediately when entering the cleanup block (before the HTTP DELETE that uses `session_id`), then perform the DELETE to `SCOPE_LOCAL_URL` for the WebRTC session (the existing `session_id`/`httpx.AsyncClient` block), followed by `cleanup_installed_plugins()` and `cleanup_session_data()`, and finally call `event.set()` in the `finally` block. This ensures `_cleanup_event` prevents a fast reconnect from sending "ready" while `remove_session()`/peer-connection teardown is still in progress.
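The ordering the reviewer asks for reduces to a small pattern: clear the gate event first, perform the slow teardown, then set the event so a waiting reconnect may proceed. A minimal self-contained sketch with `asyncio.Event` (the names `cleanup`, `wait_ready`, and the log list are illustrative, not from the codebase):

```python
import asyncio

async def cleanup(event: asyncio.Event, log: list[str]) -> None:
    # Clear the gate BEFORE any slow teardown so a fast reconnect
    # cannot slip past while the old session is still closing.
    event.clear()
    try:
        log.append("teardown-start")
        await asyncio.sleep(0.05)  # stands in for the WebRTC DELETE + cleanup
        log.append("teardown-done")
    finally:
        event.set()  # unblock the next connection's "ready"

async def wait_ready(event: asyncio.Event, log: list[str]) -> None:
    await event.wait()  # the next connection blocks here until cleanup finishes
    log.append("ready-sent")

async def main() -> list[str]:
    event = asyncio.Event()
    event.set()  # steady state: no cleanup in progress
    log: list[str] = []
    # Start cleanup, then immediately simulate a fast reconnect.
    cleanup_task = asyncio.create_task(cleanup(event, log))
    await asyncio.sleep(0)  # yield so cleanup clears the event first
    await wait_ready(event, log)
    await cleanup_task
    return log

print(asyncio.run(main()))  # ['teardown-start', 'teardown-done', 'ready-sent']
```

With the clear issued up front, "ready-sent" can only appear after "teardown-done"; clearing the event late (the bug being flagged) would let the reconnect's `wait()` return while teardown is still running.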
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 10e86be0-85d4-470a-befa-249de4880f52
📒 Files selected for processing (2)
- src/scope/cloud/fal_app.py
- src/scope/server/app.py
🚀 fal.ai Preview Deployment
Testing: connect to this preview deployment by running this on your branch. 🧪 E2E tests will run automatically against this deployment.
✅ E2E Tests passed
Test Artifacts: check the workflow run for screenshots.
Tested by overriding the timeout to 5 minutes for a quick test, and it worked as expected: the stream stopped and the backend shut down cleanly.
Problem
When the fal WebSocket closes (e.g. `MAX_DURATION_EXCEEDED` after 60 minutes), the video stream keeps playing indefinitely instead of stopping.

Why this happens
There are 3 actors on the fal machine:
The WebRTC peer connection uses UDP and is completely independent of the WebSocket (TCP). When fal_app.py closes the WebSocket, the scope backend has no idea — nobody tells it to stop. The pipeline keeps running and frames keep flowing over UDP.
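The fix is for the signaling side to tell the backend explicitly. Per the review excerpt, `fal_app.py` issues this request with `httpx.AsyncClient`; the sketch below uses only the standard library (with a toy in-process server standing in for the scope backend) so it runs on its own, but the request shape is the same:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy stand-in for the scope backend: accepts the DELETE and replies 204.
class Handler(BaseHTTPRequestHandler):
    def do_DELETE(self):
        # A real handler would call webrtc_manager.remove_session() here.
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):  # keep output quiet
        pass

def close_webrtc_session(base_url: str, session_id: str) -> int:
    """Issue the teardown request the finally block sends (sketch:
    the project itself uses httpx.AsyncClient for this call)."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/webrtc/offer/{session_id}", method="DELETE"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
status = close_webrtc_session(f"http://127.0.0.1:{port}", "abc123")
print(status)  # 204
server.shutdown()
```

Because the DELETE travels over plain HTTP (TCP) to the backend, it works regardless of the state of the UDP media path it is tearing down.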
Current (broken) sequence
Fix
Fixed sequence
Changes
- `DELETE /api/v1/webrtc/offer/{session_id}` — new endpoint on the scope backend that closes and removes a WebRTC session via `webrtc_manager.remove_session()`
- `fal_app.py` `finally` block — calls the DELETE endpoint with the session ID before cleaning up files/plugins, so the WebRTC peer connection is torn down when the WebSocket closes for any reason (`MAX_DURATION_EXCEEDED`, client disconnect, errors)

Test plan
- Shorten the timeout (`MAX_CONNECTION_DURATION_SECONDS`) and verify the stream stops when the timer fires
- Check for `Closed WebRTC session <id>` in fal output