DAX RX: native pw_stream source on Linux — 400ms → 200ms (#1008)#2312
Conversation
a18d51d to
f7de417
Compare
|
@RiskAndReward1337 Can you split the main window thread priority change into a separate PR for testing? I don't want to introduce instability in other non-Linux DAX / TCI use cases. |
Replaces module-pipe-source with a native libpipewire-0.3 pw_stream per DAX channel. This eliminates both the kernel FIFO and the pulseaudio-pipe translation that the previous path relied on, getting WSJT-X DT from a ~400 ms baseline down to a steady ~200 ms. See issue ten9876#1008 comment thread for measurement traces. Architecture ------------ - PipeWireNativeContext (singleton) owns one pw_thread_loop + pw_context + pw_core, refcounted across the four RX channels. Locked across pw_stream creation/destruction; the loop runs on its own dedicated thread. - PipeWireNativeRxSource is one pw_stream per channel, registered as Audio/Source with PW_DIRECTION_OUTPUT. Producer (Qt audio thread) and consumer (PipeWire RT thread) communicate via a fixed-size SPSC float ring with std::atomic head/tail indices — no locks, no allocations on the RT path. Drops oldest samples on overflow so backlog cannot grow past ring size (~42 ms). Stream is created with: PW_KEY_NODE_LATENCY = 256/48000 (5.3 ms quantum request) PW_KEY_NODE_RATE = 1/48000 PW_KEY_NODE_FORCE_QUANTUM = 256 PW_KEY_NODE_FORCE_RATE = 48000 PW_KEY_NODE_ALWAYS_PROCESS = true Verified live via pw-top: DAX nodes lock at QUANT=256, B/Q ~ 0, no xruns once the ring is primed. pw-cat (native PipeWire client) captures cleanly from the source with ~5.3 ms latency end-to-end. Bridge changes -------------- - PipeWireAudioBridge::feedDaxAudio is now lock-free / atomic-only. The audio fast path no longer touches Qt-thread-affine state, which is a prerequisite for an upcoming MainWindow connect-type change (split into a separate PR per maintainer request, so the connection-type change can be tested in isolation against non-Linux DAX/TCI use cases). - m_open / m_channelGain / m_transmitting / m_lastAudioMs converted to std::atomic. - Silence timer self-stops by observing m_lastAudioMs from the main thread rather than calling QTimer::stop() from the audio thread. - Per-channel native source falls back to legacy module-pipe-source if the native open() fails, so non-PipeWire systems still work. - Audio is float32 mono 48 kHz end-to-end. Linear-interp upsample of the radio's 24 kHz stereo to 48 kHz mono runs in feedDaxAudio so the PipeWire graph does not have to insert a resampler. Build ----- - pkg_check_modules(PIPEWIRE_NATIVE libpipewire-0.3) gates the new path behind a HAVE_PIPEWIRE_NATIVE define. When the dev headers are missing, the build silently falls back to the legacy module-pipe-source path (issue ten9876#1008's PR ten9876#1901 work, still present and used). - libpipewire-0.3-dev added to .github/docker/Dockerfile for CI. Remaining gap (~200 ms, consumer-side) -------------------------------------- The remaining 200 ms is the libpulse fragment-pool buffer that pulse-compat holds for clients using the PulseAudio API (WSJT-X via Qt PulseAudio backend connects with pulse.attr.fragsize=4800 = 50 ms × ~4 fragments queued). Confirmed via pw-cat bypass test — native PipeWire clients get sub-10 ms latency from the same source. PipeWire 1.0.5 does not expose a documented server-side mechanism to clamp pulse.attr.fragsize from the source side; pulse.min.* keys are minimums (xrun protection), not maximums. See issue ten9876#1008 for full measurement trace and the failed pulse.rules attempts. Real fix paths are consumer-side (PULSE_LATENCY_MSEC env, Qt native PipeWire backend) and tracked separately. Co-Authored-By: Claude Opus 4.7 <[email protected]>
f7de417 to
244f776
Compare
|
@jensenpat done — split into #2318. The MainWindow.cpp This PR has been rebased onto current Recommended merge order: this PR first (so |
|
Claude here on Jeremy's behalf — really nice writeup, and the architectural direction (native pw_stream + DirectConnection on the audio fast path) is sound. The 400 → 200 ms result with explicit identification of where the remaining latency lives (consumer-side libpulse fragsize, not in your pipeline) is exactly the level of rigor we love to see. Account checks out, no security concerns in the diff (Dockerfile / CMake / pactl call sites are all clean, no shell-injection surface introduced). A few items to address before merge — the first is a real (rare) data race; the rest are smaller: Must-fix:
Worth fixing while you're in there:
Nice-to-have, not blocking:
Once #1 and #2 are addressed I'd be happy to give a fresh pass and recommend merge. The latency win is meaningful and the rest of the work is genuinely careful — the upsampler's cross-packet 73, Jeremy KK7GWY & Claude (AI dev partner) |
…ten9876#1008) Addresses the review on ten9876#2312: 1. SPSC ring invariant. feedAudio() previously advanced m_readIdx from the producer side on overflow ("drop oldest"), which is a real (rare) race against the consumer's in-flight memcpy when a wraparound write crosses the slot the consumer has already claimed. Switch to drop-newest: clamp toCopy to the available space and bail if zero. m_readIdx is now consumer-only as documented. The 42 ms latency cap is preserved; in steady state the ring is near-empty so the choice between drop-oldest and drop-newest only matters during consumer stalls (which in practice only occur under extreme system load). 2. Raw new for the listener hook. pw_stream_add_listener was being handed a heap-allocated spa_hook that leaked on every open(). Make the hook a by-value member of PipeWireNativeRxSource so its lifetime is tied to the source itself — no raw new/delete, and zero leaks. 3. PipeWireNativeContext init/teardown race. acquire() did m_refCount.fetch_add(1) and then read m_loop without serialisation, so a second thread arriving while the first was mid-init would see m_loop == nullptr and bail incorrectly. Add a std::mutex around the init/teardown body in acquire()/release() so the documented "safe from any thread" contract is real instead of relying on incidental main-thread serialisation. m_refCount no longer needs to be atomic under the lock; it is now a plain int. 4. reinterpret_cast on the state-changed callback. Replace the cast of our int-arg static onStateChanged into pipewire's enum-arg slot with a small anonymous-namespace trampoline that has the correct C signature and forwards via static_cast<int>. Header stays free of <pipewire/stream.h>. 5. Stale doc. The feedAudio() comment still claimed "~170 ms" backlog from the earlier 8192 ring sizing. RING_SIZE has been 2048 (~42 ms) for some time — fix the comment. Co-Authored-By: Claude Opus 4.7 <[email protected]>
|
@ten9876 Thanks Jeremy & Claude — that's a really sharp review, especially catching the SPSC violation. All five items addressed in Must-fix #1 — SPSC ring race. Switched to drop-newest in Must-fix #2 — raw Worth-fixing #3 — Worth-fixing #4 — Worth-fixing #5 — stale doc. Comment on CI. The previous push's CI run on this branch was green (#25286904608). The new push's run is currently in Builds clean locally on Qt 6.11.0 / RelWithDebInfo with no new warnings on the four touched files. Verified the binary still drives DAX correctly post-changes. Ready for a fresh pass whenever convenient. Thanks for the careful read — that ring race genuinely could have shipped. |
|
Claude here on Jeremy's behalf — merged via admin squash. All five review items cleanly addressed in 2cf2d15; the SPSC fix in particular landed exactly the way you described. Jeremy is going to run the hardware smoke test on his FLEX-8600 + Linux PipeWire setup against WSJT-X. We'll holler back here if the on-radio path surfaces anything — your fallback to legacy module-pipe-source on per-channel open() failure means worst case is "no latency improvement," not "DAX broken." Thanks for the careful workmanship and the well-reasoned writeup. The 400 → 200 ms result with explicit identification of the remaining consumer-side libpulse fragsize gap is the kind of detail that makes review easy. 73, Jeremy KK7GWY & Claude (AI dev partner) |
Summary
Replaces
module-pipe-sourcewith a native libpipewire-0.3pw_streamper DAX RX channel — the deferred Patch D from the #1008 triage thread. On my setup (FLEX-8600 fw 4.1.5, PipeWire 1.0.5, Qt 6.11.0, WSJT-X via Qt PulseAudio backend) this drops WSJT-X DT from a steady ~400 ms baseline down to a steady ~200 ms.Marked as draft. This branch is rebased off
bc47c3b(43 commits behindupstream/mainat time of push) and the diff againstupstream/mainwill look noisy because of that drift, not because of changes in this PR. Posting it draft so the upstream agents can review the architecture and decide whether to rebase + merge, or cherry-pick the new files onto a fresh branch.See issue #1008 comment for the full measurement trace, what failed pulse-side experiments were tried (so the next agent can skip them), and where the remaining 200 ms actually lives (consumer-side, not in our pipeline).
Architecture
src/core/PipeWireNativeContext.{h,cpp}— singleton owning a sharedpw_thread_loop+pw_context+pw_core, refcounted across the four DAX RX channels. Loop runs on its own dedicated RT thread.src/core/PipeWireNativeRxSource.{h,cpp}— onepw_streamper channel,Audio/Source+PW_DIRECTION_OUTPUT. Producer (Qt audio thread) ↔ consumer (PipeWire RT thread) via fixed-size SPSC float ring withstd::atomichead/tail. No locks, no allocations on the RT path. Drops oldest samples on overflow so backlog can never exceed ring size (~42 ms).Stream is created with
node.latency=256/48000,node.force-quantum=256,node.force-rate=48000,node.always-process=true. Verified viapw-topthat the DAX nodes actually lock at QUANT=256 with B/Q ≈ 0 once primed.Bridge changes
feedDaxAudio()is now fully lock-free and runs onPanadapterStream's network thread viaQt::DirectConnection(wired inMainWindow::startDax). Removes 10–50 ms of variable latency that the GUI thread was adding under waterfall paint load.m_open/m_channelGain[]/m_transmitting/m_lastAudioMsconverted tostd::atomicso the audio fast path never touches Qt-thread-affine state.m_lastAudioMsrather than callingQTimer::stop()cross-thread.PipeWireNativeRxSource::open()fails, that channel still uses the legacymodule-pipe-sourcepath. Non-PipeWire systems are unaffected.feedDaxAudioso PipeWire never inserts a resampler.Build
pkg_check_modules(PIPEWIRE_NATIVE libpipewire-0.3)gates the new path behindHAVE_PIPEWIRE_NATIVE. Build silently falls back to the existing pipe-source path when the dev headers aren't present.libpipewire-0.3-devadded to.github/docker/Dockerfilefor CI.What this does not fix
The remaining ~200 ms gap to macOS-equivalent is the libpulse fragment-pool buffer pulse-compat holds for libpulse clients (WSJT-X via Qt PA gets
pulse.attr.fragsize=4800× ~4 fragments queued). PipeWire 1.0.5 has no documented server-side mechanism to clamp this from our source —pulse.min.*keys are minimums (xrun protection), not maximums. Bypass-tested withpw-cat(native PipeWire client) which gets sub-10 ms from the same source. Real fix paths are consumer-side and tracked in the issue thread.Test plan
RelWithDebInfopw-dumpconfirms all four DAX nodes register withnode.latency=256/48000+node.force-quantum=256pw-topshows DAX nodes locked atQUANT=256,B/Q≈0, no xruns under loadpw-cat --record --target=aethersdr-dax-1captures clean float32 mono 48 kHz audiomodule-pipe-sourcepath verified by forcingopen()failurereadTxPipe, gain, RMS metering all unchanged)🤖 Generated with Claude Code