[bug] Fix Flex TCXO frequency offset calibration#2119
Conversation
30edc3e to
d3b1f2b
Compare
d3b1f2b to
1edd104
Compare
There was a problem hiding this comment.
Good fix, @jensenpat. The core change -- replacing the non-functional radio calibrate with the SmartSDR/FlexLib-compatible radio pll_start sequence -- is well-researched and correctly implemented. The race condition handling (stale pll_done=1 before pll_done=0, run-specific timeout guards) shows careful follow-up testing.
A few observations:
Code looks correct:
sendCmdPubliccallback signature matchesRadioModel.h(int code, const QString& body).statusReceivedsignal signature matches existing usage throughout the codebase (QString& object, QMap<QString, QString>& kvs).lcProtocolis declared inLogManager.hand used elsewhere -- consistent with project conventions.QPointerguards ondialog,startBtn, andcalStatuscorrectly protect the deferredpll_startresponse callback from use-after-delete.- The
calibrationRuncounter correctly scopes both the timeout and thepll_startresponse to the run that created them, while thestatusReceivedhandler (connected once at tab build) only needs the sharedcalibrationActive/pllRunningSeenflags since it's not per-run. AppSettingsis used for persistence (notQSettings) -- consistent with project convention.
Minor items worth noting (non-blocking):
-
statusReceivedhandler doesn't guard widget lifetime -- unlike thepll_startcallback which usesQPointer, thestatusReceivedlambda captures rawstartBtnandcalStatuspointers. Since the connection's receiver isthis(the dialog), Qt will disconnect on dialog destruction, so this is safe in practice. But if the tab is ever rebuilt while the dialog lives (e.g., a future "reconnect" flow), those pointers would dangle. Low risk given the current deferred-tab architecture. -
QString("Error 0x%1").arg(code, 0, 16).toUpper()--toUpper()uppercases the entire string including "Error" → "ERROR". Likely intended to uppercase the hex digits, but the visual effect is fine either way. If you wanted only hex uppercased:.arg(code, 0, 16).toUpper()could be.arg(QString::number(code, 16).toUpper())with the prefix added separately. Cosmetic only. -
Doc placement -- the agent notes in
tx-audio-signal-path.mdare explicitly "not part of the TX audio chain" per the note itself. A separatedocs/flex-calibration-notes.mdmight be a more natural home, but I understand the rationale of putting it where agents already look. Maintainer's call.
Overall this is a thorough, well-documented fix. The PR description is excellent -- the command sequence sourcing, race condition analysis, and debug workflow are all clearly explained. Thanks for the contribution. 73!
Community-driven release. WAVE Phase 2 visualization (#2124), DAX-aware TCI multi-stream routing for FlexRadio firmware 4.2.18 (#2140), TCXO frequency-offset calibration (#2119), VFO marker tri-state UX (#2141), v4.2.18 discovery beacon parsing (#2138). Bug fixes from the community: r8b heap corruption (#2114, NF0T), serial PTT triple-fix (#2125, chibondking), slice-audio mute on band change (#2128, jensenpat), CWX Live toggle (#2122, jensenpat), connect-radio dialog polish (#2121, jensenpat). Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Summary
This PR fixes the Radio Setup → RX → Frequency Offset calibration workflow for FlexRadio hardware and documents the protocol lesson so future follow-up agents do not rediscover the same command mismatch.
The existing UI exposed a Start button for manual frequency-offset calibration on radios without a GPSDO, but it sent
radio calibrate. On the reporter's FLEX-6600 running firmware v4.1.5, that command returns0x50000016(unknown command), so the button could not actually start calibration.Issue #2095 supplied a SmartSDR TCP command capture showing the working sequence:
radio set freq_error_ppb=0radio pll_startpll_done=0while calibration is running.pll_done=1 freq_error_ppb=<value> cal_freq=<value>when calibration completes.I also verified the command against a second source: the official FlexLib API v2.10.1 source. In
Radio.cs,StartOffsetEnabled=falsesendsradio pll_start, and the same radio status fields (cal_freq,freq_error_ppb, andpll_done) are parsed from status updates.Fixes #1237.
Fixes #2095.
What Changed
Calibration command path
radio calibratecall with the SmartSDR/FlexLib-compatible sequence:radio set cal_freq=<entered MHz>radio set freq_error_ppb=0radio pll_startsendCmdPublicforradio pll_startso the UI can react to the immediate command response.Completion tracking and user feedback
Starting...immediately after the user starts calibration.Calibrating...afterradio pll_startis accepted or whenpll_done=0arrives.Complete (<ppb> ppb)whenpll_done=1arrives with afreq_error_ppbvalue.Completeif the completion status arrives without a ppb value.Error 0x...ifradio pll_startreturns a non-zero response.No responseif calibration does not reportpll_done=1within 20 seconds.pll_done=0thenpll_done=1transition for the active Start press, rather than accepting the firstpll_done=1seen after the reset commands.radio pll_startcommand response so it cannot overwrite the label after the active run has already been cleared.No response.GPSDO behavior
GPSDO installed. Manual frequency offset calibration available.Debug visibility
aether.protocoldebug breadcrumbs for the calibration lifecycle:cal_freqand the reset-to-zero stepradio pll_startacceptedradio pll_startfailed, including response code/bodypll_donestatus transition observed by the dialogfreq_error_ppbandcal_freqon completionpll_done=1never arrivesLive value updates
RadioModel::infoChangedwhen the user is not actively editing them.freq_error_ppbvalue flow back into the field when the radio reports the new state, without overwriting a user in the middle of typing.Layout polish
Agent documentation
Agent Notes: 10 MHz TXO Calibrationsection todocs/tx-audio-signal-path.md.radio calibrateshould not be usedpll_donecompletion semanticsWhy This Fix
The original bug was not a UI-only issue. The button failed because AetherSDR was using a command that the radio firmware does not recognize for this workflow. The reporter's SmartSDR capture and FlexLib both point to
radio pll_startas the correct start command.The completion behavior also matters:
radio pll_startonly acknowledges that the radio accepted the command. The actual calibration result arrives later as radio status, so the dialog has to listen for the calibration state transition and report the finalfreq_error_ppbvalue.Follow-up testing showed that the radio can broadcast stale
pll_done=1status immediately afterradio set freq_error_ppb=0, before the new calibration has actually enteredpll_done=0. If the UI treats that firstpll_done=1as completion, it can show the previous or zeroed ppb value, clear its active flag too early, and then get stuck onCalibrating...when the delayedpll_startresponse arrives. This PR now waits forpll_done=0for the active run before acceptingpll_done=1as completion.Additional repeated-run testing exposed a second race: a previous run's 20-second timeout can still be pending after that run completed successfully. If the operator starts another calibration before the old timer fires, that stale callback can see the shared active flag from the newer run and incorrectly set the UI to
No response. The timeout and delayedpll_startresponse callbacks now carry the run id they were created for, so stale callbacks are ignored.Finally, GPSDO presence should not remove the user's ability to manually calibrate. A GPSDO-equipped radio can still expose manual offset calibration, and SmartSDR/Mac does so. This PR follows that model and lets the operator choose the reference source instead of making that choice implicitly in the dialog.
User Impact
pll_done=1updates are ignored, so the Start button stays disabled until the real calibration finishes or times out.No responsebefore the radio returns the real result.Validation
cmake --build build --parallel 10ctest --test-dir build --output-on-failure -j 10CHANGELOG.mdand generated What's New data are not modified; those files are left for the project maintainer.👨🏼💻 Generated with OpenAI Codex (GPT-5.5 Pro 4/23) and tested by @jensenpat