PROJECT: Online Coding Interview Platform
=========================================
Overview
--------
An Online Coding Interview Platform enabling interviewers and candidates to conduct
live coding interviews with real-time video/audio (WebRTC), collaborative code
editing and chat (WebSockets), whiteboarding, code execution, and session
management. The platform emphasizes low latency, resilience to network issues, and
secure, sandboxed execution.
Objectives
----------
1) Demonstrate hybrid real-time architecture: WebRTC for A/V + data channels;
WebSockets for signaling, editor sync, chat, whiteboard, and presence.
2) Deliver a recruiter-ready UX: polished UI/UX, reliability under adverse
conditions, auditability via recordings and logs.
3) Ensure security and scalability: encrypted comms, sandboxed execution,
horizontal scalability for concurrent interviews.
Core Features
-------------
1) Authentication & Authorization
- Email/password, OAuth (Google/LinkedIn), JWT sessions (access + refresh).
- Roles: Interviewer, Candidate; optional Admin.
- Invite-based session access (expiring links, room codes).
2) Real-Time Video/Audio (WebRTC)
- P2P A/V with adaptive bitrate; simulcast/SVC when an SFU is in use.
- Mute/unmute, camera toggle, device selection, echo cancellation, noise
suppression.
- Screen sharing (entire screen/app/window).
- Optional recording (client-side via MediaRecorder, or SFU-based server
recording).
- Network health indicators: bitrate, packet loss, RTT, jitter.
3) Collaborative Code Editor (WebSockets)
- Monaco/CodeMirror with syntax highlighting & language modes (JS/TS, Python,
Java, C++).
- Real-time synchronization with OT/CRDT to resolve conflicts.
- Cursor presence, selections, and line highlights.
- File tabs and basic project structure (multiple files).
- Autosave and version history (snapshots/diffs).
4) Code Execution Engine
- Isolated sandbox per run (Docker, Firecracker, or remote code-run API).
- Compile/run with standard input, time/memory limits, stdout/stderr capture.
- Pre-bundled runtime images for supported languages.
- Resource quotas; kill on timeouts or runaway processes.
5) Whiteboard & Notes
- Canvas whiteboard with shapes, pen, eraser, text, sticky notes.
- Real-time sync over WebSockets; undo/redo; export as PNG/PDF.
- Shared notes panel (markdown) with live sync and history.
6) Chat & Collaboration
- Real-time text chat, code snippets, lightweight attachments.
- Emojis/reactions, typing indicators, read receipts.
- System messages for joins/leaves, role changes, recording start/stop.
7) Interview Controls & Workflow
- Session scheduling with timezone support and calendar invites (ICS).
- Timers (round timer, overall), prompts/problem statements.
- Feedback forms and scorecards; tagging competencies.
- Post-interview transcript/summary (auto-generated notes optional).
8) Admin/Operations (Optional but valuable)
- Session list, metrics dashboard, recording library.
- User management, rate limiting, abuse prevention.
- Audit logs for compliance.
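
The invite-based access in feature 1 (expiring links, room codes) could be backed by a signed token. The following is a minimal sketch, assuming an HMAC-SHA256 signature over the room id and expiry; the `SECRET` value, the token layout, and the function names are illustrative assumptions, not part of the specification.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical secret; a real deployment would load it from configuration.
SECRET = b"change-me"


def _sign(payload: str) -> str:
    digest = hmac.new(SECRET, payload.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")


def make_invite(room_id: str, ttl_seconds: int, now=None) -> str:
    """Return 'room_id.expiry.signature' for use in an invite link."""
    issued = time.time() if now is None else now
    payload = f"{room_id}.{int(issued + ttl_seconds)}"
    return f"{payload}.{_sign(payload)}"


def verify_invite(token: str, now=None):
    """Return the room id if the token is authentic and unexpired, else None."""
    try:
        room_id, expires_at, signature = token.rsplit(".", 2)
        expiry = int(expires_at)
    except ValueError:
        return None
    expected = _sign(f"{room_id}.{expires_at}")
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    return room_id if current < expiry else None
```

A stateless token like this cannot be revoked before it expires; one-time or revocable invites would need a server-side record as well.
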
Architecture (High-Level)
-------------------------
- Frontend: SPA (React/Next.js) using WebRTC for A/V; WebSockets for signaling,
editor, chat, whiteboard, presence.
- Signaling Server: WebSocket server for exchanging SDP/ICE, room management, and
presence.
- STUN/TURN: STUN for NAT discovery; TURN for relaying when P2P fails (coturn).
- Media Topology: Start P2P; optional SFU (e.g., mediasoup/Jitsi) for multi-party,
recording, and bandwidth optimization.
- Realtime Services: Editor/whiteboard/chat services using OT/CRDT and pub/sub
channels.
- Execution Service: Isolated containers/VMs with language runtimes; jobs queued
and executed with quotas.
- API Backend: REST/GraphQL for auth, scheduling, storage, recordings metadata,
feedback, history.
- Storage: DB (PostgreSQL/MongoDB), Object storage for recordings and artifacts.
- Observability: Logging, metrics, tracing; error reporting; QoS monitoring.
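
The signaling server's role (room management, presence, and relaying SDP/ICE between peers) can be sketched independently of any WebSocket library. In this sketch the `SignalingHub` class, the message shapes, and the `send` callbacks (standing in for per-connection WebSocket writers) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict

Send = Callable[[dict], None]
RELAYED_TYPES = {"offer", "answer", "ice-candidate"}


class SignalingHub:
    """In-memory room registry that relays signaling messages between peers."""

    def __init__(self) -> None:
        self._rooms: Dict[str, Dict[str, Send]] = defaultdict(dict)

    def join(self, room: str, peer_id: str, send: Send) -> None:
        # Tell existing peers about the newcomer, then register it.
        for other_send in self._rooms[room].values():
            other_send({"type": "peer-joined", "from": peer_id})
        self._rooms[room][peer_id] = send

    def leave(self, room: str, peer_id: str) -> None:
        peers = self._rooms.get(room, {})
        if peers.pop(peer_id, None) is not None:
            for other_send in peers.values():
                other_send({"type": "peer-left", "from": peer_id})
        if not peers:
            self._rooms.pop(room, None)

    def relay(self, room: str, sender_id: str, message: dict) -> int:
        """Forward an SDP/ICE message to every other peer; return the count."""
        peers = self._rooms.get(room, {})
        if sender_id not in peers:
            raise PermissionError("sender is not a member of this room")
        if message.get("type") not in RELAYED_TYPES:
            raise ValueError("unsupported signaling message type")
        delivered = 0
        for peer_id, send in peers.items():
            if peer_id != sender_id:
                send({**message, "from": sender_id})
                delivered += 1
        return delivered
```

The hub never inspects the SDP or ICE payload; it only checks membership and message type, which keeps the server thin when media flows peer to peer.
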
Tasks & Implementation Plan
---------------------------
Phase 1: Foundations
- Repo setup, CI, linting/formatting, commit hooks.
- Auth service with JWT (access/refresh), OAuth provider integration.
- Room model, invites, and role-based guards.
- WebSocket gateway/service (rooms, presence, signaling).
Phase 2: A/V and Signaling
- Implement WebRTC call setup (offer/answer/ICE) over WebSockets.
- Device selection UI; basic call controls (mute/cam toggle).
- STUN/TURN configuration; network stats display.
- Screen share MVP.
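
For the STUN/TURN configuration step, the backend can hand each client a short-lived TURN credential rather than a static password. The sketch below assumes coturn running in its shared-secret (time-limited credential) mode, where the username is `expiry:user` and the password is the base64 HMAC-SHA1 of that username. The host name `turn.example.com`, the ports, and the function names are placeholders.

```python
import base64
import hashlib
import hmac
import time


def turn_credentials(user_id: str, shared_secret: str,
                     ttl_seconds: int = 3600, now=None) -> dict:
    """Build a time-limited TURN username/credential pair."""
    expiry = int((time.time() if now is None else now) + ttl_seconds)
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(),
                      hashlib.sha1).digest()
    return {"username": username,
            "credential": base64.b64encode(digest).decode()}


def ice_servers(user_id: str, shared_secret: str,
                host: str = "turn.example.com") -> list:
    """Return a list shaped like the iceServers field of an RTCPeerConnection config."""
    creds = turn_credentials(user_id, shared_secret)
    return [
        {"urls": [f"stun:{host}:3478"]},
        {"urls": [f"turn:{host}:3478?transport=udp",
                  f"turns:{host}:5349?transport=tcp"], **creds},
    ]
```

The client would pass the returned list as `iceServers` when creating its peer connection; the TURN entry is only used when a direct path cannot be established.
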
Phase 3: Editor & Chat
- Integrate Monaco/CodeMirror; OT/CRDT sync via WebSockets.
- Presence (cursors, selections), file tabs, autosave.
- Real-time chat with typing indicators and read receipts.
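
To make the OT option concrete, here is a minimal sketch of operational transformation restricted to concurrent inserts. It is not a full editor protocol (no deletes, no server ordering); the `Insert` type and the tie-break on `site` are assumptions chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where text is inserted
    text: str
    site: str   # author id, used only to break ties at equal positions


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent op `against`."""
    earlier = against.pos < op.pos
    tie_lost = against.pos == op.pos and against.site < op.site
    if earlier or tie_lost:
        # The other insert landed before ours, so shift right by its length.
        return replace(op, pos=op.pos + len(against.text))
    return op
```

Both replicas converge when each applies its own operation first and then the transformed remote one:

```python
doc = "abc"
a = Insert(1, "X", "alice")
b = Insert(1, "Y", "bob")
left = apply(apply(doc, a), transform(b, a))
right = apply(apply(doc, b), transform(a, b))
assert left == right == "aXYbc"
```
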
Phase 4: Execution
- Dockerized runners or remote code-run API integration.
- Language runtimes, resource limits, kill/timeout policies.
- UI for stdin input, a run button, and streamed stdout/stderr output.
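
The timeout and kill policy can be prototyped with a plain subprocess before container isolation is in place. This sketch is not a sandbox: it only enforces a wall-clock timeout and, on POSIX systems, a soft CPU-time limit. The `run_python` and `RunResult` names and the status strings are assumptions; real isolation would still come from Docker, Firecracker, or a remote runner as listed above.

```python
import subprocess
import sys
from dataclasses import dataclass

try:
    import resource  # POSIX only
except ImportError:
    resource = None


@dataclass
class RunResult:
    status: str   # "ok", "runtime_error", or "timeout"
    stdout: str
    stderr: str


def _text(data) -> str:
    if data is None:
        return ""
    return data.decode(errors="replace") if isinstance(data, bytes) else data


def _limit_cpu(cpu_seconds: int):
    def apply_limit():
        # Runs in the child before exec; lower only the soft limit.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        soft = cpu_seconds if hard == resource.RLIM_INFINITY else min(cpu_seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (soft, hard))
    return apply_limit


def run_python(source: str, stdin: str = "",
               wall_seconds: float = 5.0, cpu_seconds: int = 5) -> RunResult:
    """Run a Python snippet with captured output and time limits."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", source],
            input=stdin, capture_output=True, text=True,
            timeout=wall_seconds,
            preexec_fn=_limit_cpu(cpu_seconds) if resource else None,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising.
        return RunResult("timeout", _text(exc.stdout), _text(exc.stderr))
    status = "ok" if proc.returncode == 0 else "runtime_error"
    return RunResult(status, proc.stdout, proc.stderr)
```

Returning a structured result instead of raising keeps the caller's UI path uniform for success, failure, and timeout.
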
Phase 5: Whiteboard & Notes
- Canvas-based whiteboard with drawing tools; real-time sync.
- Notes panel with markdown; history/versioning; export.
Phase 6: Interview Workflow
- Scheduler with timezone and ICS.
- Timers, prompts, feedback forms, scorecards.
- Session summary view and export (PDF/markdown).
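
A calendar invite for the scheduler can be generated as a small iCalendar (ICS) document. The sketch below emits a single event in UTC with the required properties; the `PRODID` string and function names are placeholders, and line folding for long values is omitted for brevity.

```python
import uuid
from datetime import datetime, timedelta, timezone


def _utc(dt: datetime) -> str:
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def _escape(text: str) -> str:
    # iCalendar TEXT values escape backslash, semicolon, comma, and newline.
    return (text.replace("\\", "\\\\").replace(";", "\\;")
                .replace(",", "\\,").replace("\n", "\\n"))


def interview_ics(summary: str, start: datetime, duration: timedelta,
                  uid=None, now=None) -> str:
    """Return an ICS document containing one interview event."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    stamp = now or datetime.now(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Interview Platform//Scheduler//EN",
        "BEGIN:VEVENT",
        f"UID:{uid or uuid.uuid4()}",
        f"DTSTAMP:{_utc(stamp)}",
        f"DTSTART:{_utc(start)}",
        f"DTEND:{_utc(start + duration)}",
        f"SUMMARY:{_escape(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines) + "\r\n"
```

Storing and emitting times in UTC lets each participant's calendar client render the event in its own timezone.
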
Phase 7: Recording & Storage
- Local client recording (MediaRecorder) and upload; or SFU server recording.
- Recording management: metadata, search, secure playback.
Phase 8: Hardening & Deployment
- Encryption in transit for all traffic (TLS for HTTP/WSS, DTLS-SRTP for media).
- Rate limiting, input validation, CSRF where applicable.
- Horizontal scaling for WebSocket servers (sticky sessions or pub/sub).
- TURN autoscaling, health checks.
- Logging, metrics, alerting; synthetic tests and chaos drills.
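
Rate limiting is commonly implemented as a token bucket per user or per connection. The following is a minimal single-process sketch; the class name, the parameters, and the injectable `clock` are assumptions, and a multi-node deployment would need shared state (for example in a cache) rather than an in-memory counter.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A WebSocket gateway might keep one bucket per connection and drop or delay messages when `allow()` returns `False`.
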
Testing Benchmarks (Acceptance Criteria)
----------------------------------------
Functional
- Two participants establish A/V within 3 seconds on typical networks.
- Editor changes propagate and apply within 200 ms (p50), 400 ms (p95).
- Whiteboard events propagate within 250 ms (p50), 500 ms (p95).
- Code runs complete within 3 seconds (p50) and 7 seconds (p95) for standard
problems.
- Screen sharing at readable quality (720p) with legible text.
- Recording produces playable files with A/V sync and correct durations.
- Rejoin after refresh restores state (files, notes, whiteboard) within 2 seconds.
Performance
- Video: sustained ≥30 FPS with end-to-end latency <300 ms on 15 Mbps down / 2 Mbps
up.
- Editor: typing stays smooth (no visible jank) in files of 500–1,000 LOC; CPU
usage under 50% on mid-tier laptops.
- Concurrency: support ≥10 simultaneous interviews on a single node; scale linearly
with nodes.
- Message throughput: handle bursts of ≥200 msgs/sec per room without dropped
updates (using backpressure).
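
The p50/p95 targets above can be checked mechanically from collected latency samples. This sketch uses the nearest-rank percentile definition, which is one of several common conventions; the budget dictionary mirrors the editor criteria, and the function names are illustrative.

```python
import math

# Editor propagation budget from the functional criteria (milliseconds).
EDITOR_BUDGET_MS = {50: 200, 95: 400}


def percentile(samples, p: int):
    """Nearest-rank percentile: the smallest value with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def check_latency(samples, budget) -> dict:
    """Map each percentile in the budget to True if it meets its limit."""
    return {p: percentile(samples, p) <= limit for p, limit in budget.items()}
```

For example, 20 samples with two slow outliers pass the p50 target but fail p95:

```python
samples = [100] * 18 + [450, 500]
assert check_latency(samples, EDITOR_BUDGET_MS) == {50: True, 95: False}
```
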
Resilience & Stability (Must-Not-Break Situations)
--------------------------------------------------
- Network Fluctuations: Degrade A/V quality gracefully; auto-reconnect WebRTC and
WebSockets; resume streams without manual intervention.
- Page Refresh/Rejoin: On refresh or reconnect, restore room membership, editor
buffers, whiteboard state, timers, and chat history from the last snapshot.
- NAT/Firewall Restrictions: Fall back to TURN relay if direct P2P fails; maintain
call.
- Duplicate Sessions/Devices: Handle same user joining from multiple tabs; prevent
input loops and provide clear device selection.
- Large Edits/Pastes: OT/CRDT should batch/compact operations to avoid lockups;
apply backpressure.
- Malicious/Heavy Code: Sandbox enforces CPU/memory/time limits; terminate with
safe errors; never crash the runner host.
- Timeouts/Crashes: If code execution process crashes, the UI reports failure and
remains usable; the system recovers for subsequent runs.
- Storage/Upload Failures: Recording upload retries with exponential backoff;
partial uploads are resumable.
- Permissions/Access: Invalid tokens or expired invites fail gracefully with clear
UX; no data leakage.
- Version Mismatch: Client/server schema changes use feature flags or versioned
protocols to avoid breaking active sessions.
- Service Restarts/Deploys: Connection draining and session resumption (clients
reconnect and restore state) preserve continuity during rolling deploys.
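
Both the auto-reconnect and the upload-retry requirements rely on exponential backoff. Below is a minimal sketch using the "full jitter" variant, in which each delay is drawn uniformly between zero and a capped exponential bound. The default values and the choice of retryable exceptions are assumptions.

```python
import random
import time


def backoff_delays(base: float = 0.5, cap: float = 30.0,
                   attempts: int = 6, rng=random.random):
    """Yield one jittered delay per retry: rng() * min(cap, base * 2**n)."""
    for n in range(attempts):
        yield rng() * min(cap, base * 2 ** n)


def retry(operation, delays, sleep=time.sleep):
    """Call operation(); on a transient error, wait for the next delay and retry."""
    for delay in delays:
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            sleep(delay)
    # Final attempt after the delays are exhausted; errors propagate to the caller.
    return operation()
```

Jitter spreads reconnect attempts out so that many clients dropped by the same outage do not all retry at the same instant.
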
Security Requirements
---------------------
- TLS for all HTTP/WSS; DTLS-SRTP for WebRTC media.
- JWT validation on every request; short-lived access tokens; refresh rotation.
- CSRF protection where cookies are used; CORS correctly configured.
- Sandboxed code runners with seccomp/AppArmor, read-only FS, no outbound network
(unless whitelisted).
- Input validation and content limits (message size, file size, rate limiting).
- Audit logging of critical actions (room create, recording start/stop, role
changes).
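
Refresh rotation can be sketched as a store that replaces a refresh token on every use and treats reuse of a spent token as a sign of theft. The in-memory `RefreshTokenStore` below is an illustration under that assumption; the class and method names are hypothetical, and a real service would persist this state and set expiry times.

```python
import secrets


class RefreshTokenStore:
    """Single-use refresh tokens grouped into families, one family per login."""

    def __init__(self) -> None:
        self._active = {}     # current token -> family id
        self._used = {}       # spent token -> family id
        self._owner = {}      # family id -> user id
        self._revoked = set()

    def _mint(self, family: str) -> str:
        token = secrets.token_urlsafe(32)
        self._active[token] = family
        return token

    def issue(self, user_id: str) -> str:
        family = secrets.token_hex(8)
        self._owner[family] = user_id
        return self._mint(family)

    def rotate(self, token: str):
        """Exchange a refresh token; return (user_id, new_token)."""
        if token in self._used:
            # A spent token was replayed: revoke the whole family.
            self._revoked.add(self._used[token])
            raise PermissionError("refresh token reuse detected")
        family = self._active.pop(token, None)
        if family is None or family in self._revoked:
            raise PermissionError("unknown or revoked refresh token")
        self._used[token] = family
        return self._owner[family], self._mint(family)
```

Revoking the whole family on reuse forces both the legitimate user and any attacker to sign in again, which bounds the damage from a leaked token.
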
Observability & QA
------------------
- Centralized logs with correlation IDs for sessions.
- Metrics: call setup time, A/V bitrate, RTT, packet loss, editor latency, message
throughput, execution durations, error rates.
- SLOs: 99.5% successful call setup; editor p95 latency <400 ms and execution
p95 <7 s, each met in 99% of measurement windows.
- Synthetic tests: periodic bot-run interviews to validate flows.
- Chaos drills: packet loss/latency injection; TURN unavailability; runner crash
scenarios.
Deliverables
------------
- Source code (frontend, backend, infra).
- Architecture diagram(s) and README with setup instructions.
- Test plan with automated and manual test cases.
- Demo video and sample interview itinerary.
- Deployed demo (optional) with restricted access and usage quotas.
Roadmap (Nice-to-Have Enhancements)
-----------------------------------
- SFU-based multi-party interviews with breakout rooms.
- In-call code playback (timeline scrubbing of edits).
- AI-assisted interviewer: code review hints, solution checks, rubric auto-fill.
- Problem bank with tagging, difficulty, and analytics.
- Calendar integrations (Google/Microsoft), webhooks, and ATS export.