Conversation
b6e588c to
b0825a3
Compare
b0825a3 to
9cf8af1
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces a new voip package that can register to a SIP server, accept inbound calls, receive RTP Opus audio, and transcribe audio chunks using OpenAI Whisper. It also adds a CLI entrypoint and updates packaging/tests/docs to match the new project scope.
Changes:
- Add SIP/RTP/STUN building blocks plus call handling (
IncomingCall,RegisterProtocol) for receiving inbound RTP audio. - Add Whisper-based transcription (
WhisperCall) that wraps RTP Opus payloads into Ogg Opus and decodes via ffmpeg for Whisper transcription. - Add a Click-based CLI and update project metadata, CI matrix, and tests for the new
voippackage layout.
Reviewed changes
Copilot reviewed 27 out of 29 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
voip/whisper.py |
Adds WhisperCall transcription pipeline, Ogg Opus wrapping, ffmpeg decode. |
voip/call.py |
Adds inbound call handling, RTP socket setup, SIP REGISTER + digest auth + STUN integration. |
voip/rtp.py |
Adds minimal RTP parsing and RTPProtocol datagram handler. |
voip/stun.py |
Adds STUN binding discovery and response parsing. |
voip/types.py |
Adds digest auth QoP enum. |
voip/sip/messages.py |
Expands SIP message API (Message), improves parsing, auto Content-Length. |
voip/sip/protocol.py |
Updates SIP asyncio protocol dispatch (but currently contains invalid stubs). |
voip/sip/types.py |
Adds SIP status enums/utilities. |
voip/sip/lexers.py |
Adds Pygments lexer for SIP messages. |
voip/__main__.py |
Adds CLI scaffolding and transcribe command. |
voip/__init__.py |
Defines package version exports. |
tests/test_whisper.py |
Adds unit tests for transcription buffering, decoding, and Ogg Opus wrapping. |
tests/test_call.py |
Adds tests for answering/rejecting calls and REGISTER/STUN behavior. |
tests/test_stun.py |
Adds STUN unit tests. |
tests/test_rtp.py |
Adds RTP unit tests. |
tests/test_main.py |
Adds CLI tests. |
tests/test_integration.py |
Adds integration test for public STUN discovery (marked integration). |
tests/sip/test_messages.py |
Updates imports and adds parsing/serialization tests. |
tests/sip/test_lexers.py |
Updates lexer import path. |
tests/sip/test_aio.py |
Updates protocol/message imports for new module paths. |
tests/conftest.py |
Registers integration marker. |
sip/__init__.py |
Removes old sip package entrypoint module. |
pyproject.toml |
Renames project to voip, adds extras/scripts, updates pytest/coverage/ruff config. |
README.md |
Updates docs toward the new voip package. |
CONTRIBUTING.md |
Adds contributor workflow (uv-based). |
.pre-commit-config.yaml |
Updates tooling configuration (pyupgrade args, ruff rev). |
.github/workflows/ci.yml |
Runs extra-matrix tests for cli, pygments, and whisper. |
Comments suppressed due to low confidence (1)
voip/sip/protocol.py:31
request_receivedandresponse_receivedhave empty bodies (only docstrings). This makes the module invalid Python (IndentationError/SyntaxError) and prevents imports. Addpass/...or raiseNotImplementedErrorso these methods are syntactically valid stubs.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
|
@copilot address review comments |
|
@codingjoe I've opened a new pull request, #9, to work on those changes. Once the pull request is ready, I'll request review from you. |
…ut, SDP mono, contact IP, and package hygiene (#9) Addresses all automated review comments from the initial transcription implementation. ## CLI (`voip/__main__.py`) - **Entry point**: `main = voip` (was `sip`), `__main__` calls `voip()` — exposes the top-level group and `-v/--verbose` - **`ConsoleMessageProcessor`**: Fixed both `request_received` and `response_received` to forward correct args to the correct `super()` method (was calling `super().request_received()` in both, dropping args) - **ImportError message**: Fixed grammar and package name (`libsip` → `voip[cli]`) ## Transcription (`voip/whisper.py`) - **Concurrency guard**: Added `_transcribe_task: asyncio.Task | None` — `audio_received` only spawns a task when none is running; the task drains all full chunks in a `while` loop before clearing the guard - **ffmpeg timeout**: Replaced `.run()` with `.run_async()` + `proc.communicate(timeout=decode_timeout_secs)` — `decode_timeout_secs` is now enforced; includes proper subprocess cleanup on timeout and captures stderr for error messages ## SDP / RTP (`voip/call.py`) - **Mono codec**: `opus/48000/2` → `opus/48000/1` in SDP — consistent with the Ogg container (1 channel) and ffmpeg `ac=1` decode path - **Contact IP**: Added `contact_ip` kwarg to `IncomingCall`; `IncomingCallProtocol` exposes a `_contact_ip` property (default `None`), overridden in `RegisterProtocol` to return the STUN-discovered public IP — avoids advertising `0.0.0.0` in SDP ## Tests - **`test_integration.py`**: Replaced `RegisterProtocol` with a minimal `_STUNOnlyProtocol` to avoid sending unintended SIP REGISTER to `stun.l.google.com` - **`test_main.py`**: Fixed stub key `"sip.whisper"` → `"voip.whisper"` - **`test_whisper.py`**: Added `pytest.importorskip("whisper")` guard; updated ffmpeg mocks to `run_async`/`communicate` API ## Packaging / Docs - **`pyproject.toml`**: Added `ffmpeg-python` to `cli` extra (required by the `transcribe` command) - **`README.md`**: Fixed install snippet `libsip[...]` → `voip[...]` - **`voip/__init__.py`**: Re-exported `SIP`, `Request`, `Response` so `import voip; voip.SIP` works as shown in the README example <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: codingjoe <[email protected]> Co-authored-by: Johannes Maron <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 32 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
voip/sip/messages.py:45
- The
ValueErrorraised for an invalid first line embeds the entire raw SIP message ({data!r}), which can be extremely large and may include sensitive headers (e.g., Authorization) if this exception is logged/propagated. Prefer including only a sanitized/size-limited value (e.g., the decoded first line, truncated) in the error message.
voip/sip/messages.py:60 Message._first_lineis currently a stub (def _first_line(self) -> str: ...). At runtime this returnsNone, sobytes(Message(...))will serialize an invalid SIP start-line (None\r\n...). IfMessageis intended to be abstract, raiseNotImplementedError(or make it anabc.ABCwith@abstractmethod) to prevent accidental invalid serialization.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8 +/- ##
=========================================
+ Coverage 0 93.69% +93.69%
=========================================
Files 0 16 +16
Lines 0 1348 +1348
=========================================
+ Hits 0 1263 +1263
- Misses 0 85 +85 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@codingjoe I've opened a new pull request, #11, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: codingjoe <[email protected]>
|
@copilot I don't like the structure right now. The SIP handler should handle the session and should be initialized with a call (RTP) handler class for incoming or outgoing calls. The RTP handler shouldn't know what a session is. It's too many different classes right now and it's not clear what they are supposed to do. The whisper handler should be a rtp (call) handler separate from the session. The cli should create a session handler with the call hanlder of choice. |
|
@codingjoe I've opened a new pull request, #12, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@codingjoe I've opened a new pull request, #21, to work on those changes. Once the pull request is ready, I'll request review from you. |
…FC 8760) (#21) MD5 is cryptographically broken and its use in SIP Digest Auth is deprecated by RFC 8760, which mandates SHA-256 and SHA-512-256 support. ## Changes - **`voip/types.py`**: New `DigestAlgorithm` enum covering all RFC 8760 values: `MD5`, `MD5-sess`, `SHA-256`, `SHA-256-sess`, `SHA-512-256`, `SHA-512-256-sess` - **`voip/sip/protocol.py`**: - `digest_response()` now accepts an `algorithm` parameter (default: `SHA-256`) and dispatches via `hashlib.new()`. Raises `ValueError` for unrecognised algorithms. - `response_received()` reads `algorithm` from the server's `WWW-Authenticate`/`Proxy-Authenticate` challenge and echoes it in the `Authorization` header — server-advertised algorithm is honoured end-to-end. - **`tests/sip/test_protocol.py`**: Updated existing `algorithm="MD5"` assertion to `algorithm="SHA-256"`; added `TestDigestResponse` with coverage for correctness, backward-compat MD5, SHA-512-256, algorithm distinctness, and server algorithm negotiation. ```python # Server advertises SHA-256 → client responds with SHA-256 challenge = 'Digest realm="example.com", nonce="abc", algorithm="SHA-256"' # Authorization header echoes: algorithm="SHA-256", response=<sha256-digest> # No algorithm in challenge → defaults to SHA-256 (RFC 8760) digest = SessionInitiationProtocol.digest_response( username="alice", ******, realm="example.com", nonce="abc", method="REGISTER", uri="sip:example.com", # algorithm defaults to DigestAlgorithm.SHA_256 ) ``` MD5 remains supported for servers that explicitly request it. <!-- START COPILOT CODING AGENT TIPS --> --- 💬 Send tasks to Copilot coding agent from [Slack](https://gh.io/cca-slack-docs) and [Teams](https://gh.io/cca-teams-docs) to turn conversations into code. Copilot posts an update in your thread when it's finished. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: codingjoe <[email protected]>
|
@copilot if get a |
|
@codingjoe I've opened a new pull request, #22, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@copilot RFC 3261 allows the server you connect to and the domain you register to be different, right? Double-check the standard and extend the implementation. By default they would be the same. What about proxies? Make sure everything follows the correct terminology layed out in the RFC standard. |
|
@codingjoe I've opened a new pull request, #23, to work on those changes. Once the pull request is ready, I'll request review from you. |
) RFC 3261 §8.1.2 separates the **outbound proxy** (the server a UA physically connects to) from the **registrar** (§10, whose URI is derived from the AOR domain). These can differ — e.g. `proxy.carrier.com` as the TLS endpoint while `carrier.com` is the registrar domain. ## Changes ### Protocol (`voip/sip/protocol.py`) - **`SessionInitiationProtocol`**: renamed `server_address` → `outbound_proxy` (`tuple[str, int] | None`, default `None`); optional because the UA may connect directly to the registrar with no explicit proxy configured - **Class docstring**: added RFC 3261 topology overview explaining the outbound proxy / registrar / AOR distinction and when they differ - **`register()`**: docstring and debug logging now explicitly distinguish proxy from registrar (`"Sending REGISTER via outbound proxy … to registrar …"`) ```python # Proxy and registrar on the same host (most common) SIP(aor="sips:[email protected]", username="alice", ******) # Proxy differs from the registrar domain SIP( aor="sips:[email protected]", # registrar → sips:carrier.com outbound_proxy=("proxy.carrier.com", 5061), username="alice", ******, ) ``` ### CLI (`voip/__main__.py`) The `transcribe` command now takes the AOR as a **positional argument**. All connection parameters are derived from it by default; individual pieces can be overridden: ```console voip sip transcribe sips:[email protected] --password secret voip sip transcribe sip:[email protected]:5060 --password secret voip sip transcribe sips:[email protected] --proxy proxy.carrier.com --password secret ``` - **`_parse_aor()`**: new helper that extracts `(scheme, user, host, port)` from a SIP URI - **Transport auto-detection from port**: port 5060 → plain TCP; any other port → TLS. `--no-tls` still overrides explicitly. - **`sip:` scheme** defaults to port 5060 (plain TCP); **`sips:`** defaults to port 5061 (TLS). - **`--username`**: optional override for the user part from the AOR - **`--proxy HOST[:PORT]`**: optional override for the outbound proxy address when it differs from the registrar domain in the AOR - **`SIP_TCP_PORT = 5060` / `SIP_TLS_PORT = 5061`**: named constants replacing magic numbers <!-- START COPILOT CODING AGENT TIPS --> --- 📱 Kick off Copilot coding agent tasks wherever you are with [GitHub Mobile](https://gh.io/cca-mobile-docs), available on iOS and Android. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: codingjoe <[email protected]> Co-authored-by: Johannes Maron <[email protected]>
… RegistrationError on failure (#22) TLS-registered clients were receiving `SIP/2.0 403 Forbidden` and other error responses because the code used hardcoded URI schemes in REGISTER requests and Contact headers, ignoring the scheme configured in the AOR. The URI scheme (`sip:` or `sips:`) is now taken directly from the configured AOR, and any unexpected REGISTER response raises a descriptive `RegistrationError`. ## Changes - **`RegistrationError`** — new exception class (exported from `__all__`) raised when a REGISTER response is unexpected; message includes the server's status code and reason phrase (e.g. `"403 Forbidden"`) - **`registrar_uri`** — preserves the AOR scheme verbatim: `sips:[email protected]` → `REGISTER sips:example.com`, `sip:[email protected]` → `REGISTER sip:example.com` - **`_build_contact(user=None)`** — helper that builds the Contact URI using the AOR scheme: `sips:` AOR → `<sips:user@host:port>`; `sip:` AOR over TLS → `<sip:user@host:port;transport=tls>`; used by both `register()` and `_answer()` - **`response_received()`** — raises `RegistrationError("{status_code} {reason}")` immediately for any unexpected REGISTER response; no silent retries or state mutation ``` # sips: AOR (carrier requires end-to-end TLS) REGISTER sips:example.com SIP/2.0 Contact: <sips:[email protected]:5061> # sip: AOR over TLS REGISTER sip:example.com SIP/2.0 Contact: <sip:[email protected]:5061;transport=tls> ``` <!-- START COPILOT CODING AGENT TIPS --> --- 💬 Send tasks to Copilot coding agent from [Slack](https://gh.io/cca-slack-docs) and [Teams](https://gh.io/cca-teams-docs) to turn conversations into code. Copilot posts an update in your thread when it's finished. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: codingjoe <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Johannes Maron <[email protected]>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 37 out of 42 changed files in this pull request and generated 9 comments.
You can also share your feedback on Copilot code review. Take the survey.
voip/sip/protocol.py
Outdated
| if remote_audio is not None: | ||
| conn = request.body.connection if request.body else None | ||
| remote_ip = conn.connection_address if conn else "0.0.0.0" # noqa: S104 | ||
| remote_rtp_addr: tuple[str, int] | None = (remote_ip, remote_audio.port) |
voip/stun.py
Outdated
| def connection_made(self, transport: asyncio.DatagramTransport) -> None: # type: ignore[override] | ||
| """Store the transport, create :attr:`public_address`, and start STUN discovery. | ||
|
|
||
| When ``stun_server_address`` is ``None`` the socket is considered | ||
| ready immediately: :attr:`public_address` is resolved with the local | ||
| socket address and :meth:`stun_connection_made` is called right away. |
| # Instantiate the per-call handler and register it with the shared mux. | ||
| call_handler = call_class( | ||
| rtp=self._rtp_protocol, | ||
| sip=self, |
README.md
Outdated
| All signalling uses **SIP over TLS** (SIPS, RFC 3261 §26) and all media is | ||
| protected with **SRTP** ([RFC 3711](https://tools.ietf.org/html/rfc3711)) | ||
| using the `AES_CM_128_HMAC_SHA1_80` cipher suite with SDES key exchange | ||
| ([RFC 4568](https://tools.ietf.org/html/rfc4568)). Unencrypted SIP or RTP | ||
| is not supported. |
There was a problem hiding this comment.
Let's remove that from the documentation.
voip/sip/protocol.py
Outdated
| def error_received(self, exc: Exception) -> None: | ||
| """Handle a transport-level error. | ||
|
|
||
| On Windows, sending to an unreachable TCP port may raise | ||
| ``ConnectionResetError``; logging and ignoring it keeps the socket |
| payload = packet[12:] | ||
| ssrc = struct.unpack(">I", header[8:12])[0] | ||
| seq = struct.unpack(">H", header[2:4])[0] | ||
| index = (self._roc << 16) | seq | ||
|
|
voip/srtp.py
Outdated
| expected_tag = self._auth_tag(srtp_no_tag) | ||
| if received_tag != expected_tag: | ||
| return None |
voip/rtp.py
Outdated
|
|
||
| When the matched handler carries an SRTP session the packet is | ||
| authenticated and decrypted before being forwarded; packets that fail | ||
| authentication are silently discarded. |
There was a problem hiding this comment.
I'd consider a log silent handling, but we can sure more descriptive words to avoid misunderstandings.
| def __post_init__(self) -> None: | ||
| super().__post_init__() | ||
| logger.debug("Loading Whisper model %r", self.model) | ||
| self._whisper_model = WhisperModel(self.model) | ||
|
|
There was a problem hiding this comment.
Hm ... That's a tricky one. The model is cached, bit I don't know if it loads multiple instances into memory or not. We'd need to investigate and update the implementation according. If need be, we can just pass the whole model instance as an attribute.
|
@copilot address the review comments |
|
@codingjoe I've opened a new pull request, #24, to work on those changes. Once the pull request is ready, I'll request review from you. |
…otocol hardening (#24) Addresses all unresolved review comments from the call transcription PR series. ## SRTP (`voip/srtp.py`) - **Separate send/receive ROC counters** — `_send_roc`/`_last_send_seq` for encryption, `_recv_roc`/`_last_recv_seq` for decryption, per RFC 3711 §3.3.1 - **Full index estimation on receive** — `_estimate_recv_index()` implements the RFC 3711 §3.3.1 algorithm (plain arithmetic comparisons, not modular) - **Constant-time auth tag comparison** — replaces `!=` with `hmac.compare_digest` (stdlib) to prevent timing side-channels ## SIP Protocol (`voip/sip/protocol.py`) - **Remote RTP address selection** — prefers media-level `c=` → session-level `c=` → SIP peer IP (per RFC 4566 §5.7); port 0 (inactive media, RFC 4566 §5.14) skips registration and NAT hole-punching - **Early INVITE guard** — `_initialize_task` stored on `connection_made`; `_answer()` awaits it only when `_rtp_protocol` is still `None`, preventing attribute errors without breaking tests - **Remove `error_received`** — `asyncio.Protocol` has no such callback (it belongs to `DatagramProtocol`); removed method and its tests ## Audio (`voip/audio.py`) - **`WhisperModel` injection** — `WhisperCall.model` now accepts a pre-loaded `WhisperModel` instance in addition to a name string, enabling sharing across concurrent calls: ```python shared_model = WhisperModel("base") class MyCall(WhisperCall): model = shared_model # loaded once, reused per call ``` ## Docs / minor - **STUN docstring** — removed stale reference to `public_address` (attribute no longer exists on `STUNProtocol`) - **RTP docstring** — "silently discarded" → "logged at WARNING level and discarded" to match actual behavior - **README** — removed inaccurate "Unencrypted SIP or RTP is not supported" claim <!-- START COPILOT CODING AGENT TIPS --> --- ✨ Let Copilot coding agent [set things up for you](https://github.com/codingjoe/VoIP/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: codingjoe <[email protected]>
No description provided.