Fix and trace netCW keying for straight key and iambic CW#2336
Fix and trace netCW keying for straight key and iambic CW#2336ten9876 merged 2 commits intoten9876:mainfrom
Conversation
There was a problem hiding this comment.
Thanks for the writeup, @jensenpat — the diagnosis here is plausible and the changes are tightly scoped. A few concerns before this can land, mostly around empirical evidence rather than code quality:
Major: this PR reverts findings from #2081 without on-radio validation
Three of the changes deliberately undo decisions made in commit 2ced6ee (#2081), which were each justified with very specific empirical observations in the comments you're replacing:
- VITA payload null-padding — the prior comment claimed null padding to a 4-byte boundary caused the radio's parser to reject the packet (no key/PTT change, silent
0x50001000on TCP). This PR re-adds the padding based on Maestro captures. - TCP backstop after UDP sends — the prior comment claimed radio v4.1.5 rejected every CW command over TCP with
0x50001000, both the netcw form andcw key immediate. This PR re-adds it on the theory that dedup-by-index=Nwill save it. - Timestamp format — switching from a 32-bit epoch-derived value to a 16-bit relative counter with
0x0000resync.
The "Hardware Validation Notes" section explicitly says this hasn't been smoke-tested against a Flex radio yet. Given that prior PRs in this area have flipped between working and not-working depending on small framing details, I think on-radio validation is a hard prerequisite for a CW keying change — otherwise we risk regressing users who currently get TX on the existing path (the bug reports cite some users where local sidetone works but TX doesn't, which suggests at least some configurations are keying today). Could you stage this on a fork build and confirm at minimum:
- Straight key produces RF on a current-firmware Flex
- The TCP backstop doesn't generate
0x50001000syntax-error spam inlcProtocol - Padded VITA packets aren't silently dropped
Question on the 16-bit timestamp claim
The PR description asserts FlexLib uses a 16-bit relative ms counter, but the existing code's commit history suggests 32-bit was chosen with reference to FlexLib output. What's the source for the 16-bit reading — a FlexLib decompile, an API doc, or a Maestro/Wireshark capture? If you have a capture showing the 4-hex-digit field over the wire, linking it would settle this quickly.
Minor
kNetCwIdleResetMs = 3000is reasonable for normal CW spacing (word gap at 20 WPM is ~420 ms) but is a guess in the absence of FlexLib reference behavior. Worth a comment if you have a basis for the value.- The 16-bit timestamp wraps every 65.5s of continuous keying without a 3s gap — likely fine, but worth a one-liner that the radio is expected to handle wrap.
- The
cw_sidetone_testsetPan(0.0f)change is a clean fix, no concerns there.
The code itself looks correct (math for packetWords stays consistent with the new paddedPayloadBytes, QElapsedTimer usage is right, invalidation in onDisconnected and stream-create both reset state). The blocker is just empirical: the prior comments are too specific to overwrite without confirming the new behavior on real hardware.
|
@AetherClaude aethersdr-agent asked "The PR description asserts FlexLib uses a 16-bit relative ms counter, but the existing code's commit history suggests 32-bit was chosen with reference to FlexLib output. What's the source for the 16-bit reading — a FlexLib decompile, an API doc, or a Maestro/Wireshark capture? If you have a capture showing the 4-hex-digit field over the wire, linking it would settle this quickly." This is a Wireshark capture from my Maestro keying my Flex 8600: The timestamp in the "cw key" commands can be clearly seen as a 16 bit hex number. In addition, in this thread: https://community.flexradio.com/discussion/7703751/is-keying-possible-via-the-api-or-cat-serial-rts-or-dtr Flex employee Eric KE5DTO stated "The timestamp is a 4 digit free-running millisecond counter" in a later post in that thread he confirmed that "the values are expected to be hex formatted as 0xnnnn where n is [0-9A-F]. So it would roll over after FFFF (2^16-1)." To further support the 16 bit timestamp, I have written my own keying software that uses 16 timestamps and I have also been involved in the testing of other Flex client software and have Wireshark captures for their CW keying. All use 16 bit timestamps. I do hope that helps, I have never seen any client try to use a 32 bit timestamp (apart from AetherSDR!). Jonathan G4IVV |
|
Thanks Jonathan, this is exactly the kind of source we needed and it settles the question definitively. Three independent confirmations — your Maestro/Flex 8600 Wireshark capture, the Eric KE5DTO (Flex) statement that "the timestamp is a 4 digit free-running millisecond counter… expected to be hex formatted as 0xnnnn… would roll over after FFFF (2^16-1)", and cross-validation against your own keying software and other Flex clients — all converge on a 16-bit free-running ms counter, never 32-bit. Your capture also confirms two further details the PR matches:
The history question — why was 32-bit there in the first place — is fair. Looking at git blame on One small thing worth surfacing for the on-radio smoke test in the PR's Hardware Validation section: it'd be good to take a brief Wireshark capture of Aether keying after the fix and confirm we see Will incorporate this rationale into the PR description so the source for the 16-bit reading is explicit and cites your capture rather than implying a FlexLib decompile. 73, |
ddcfff6 to
17e3a9c
Compare
|
Rebased to main and submitting for review. Doing some development in public here. I am not strong on CW so I'd rather get this out in public, test and iterate on feedback.
Special thanks to @JonathanPerkins for validating the netCW packet approach. |
|
Claude here on Jeremy's behalf — merged via admin squash. Thanks @jensenpat for the rigour: external confirmation from Jonathan Perkins's Maestro capture + Eric KE5DTO's Flex statement nails the 16-bit timestamp finding, and the diagnostic instrumentation is going to make every CW user-report from here on out triageable from logs. The TCP-backstop restoration was the one piece I was least sure of — but the new trace log surfaces exactly which copies the radio accepts vs. the TCP path, so if Jeremy will run the on-radio smoke (straight key, iambic paddle, log review) on his FLEX-8600. Will follow up here if anything surfaces. 73, Jeremy KK7GWY & Claude (AI dev partner) |
#2336 added paramActionTrace alongside the existing paramAction and moved MainWindow's connection over to it. paramAction was still emitted but had zero remaining consumers. Drops the redundant signal declaration + emit, and refreshes the m_midiSetters comment in MainWindow.h to reference the correct signal. Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Summary
This PR fixes the shared netCW transmit path used by straight key, MIDI/serial paddles, and the local iambic keyer, then adds a focused CW diagnostic trail so we can troubleshoot remaining latency or keying problems from user logs.
The important Morse/API distinction is that Flex does not expose dedicated radio commands for
ditanddah. A dit is a dot-length key-down interval, and a dah is a dash-length key-down interval. Aether generates those timings locally for iambic paddles, then emits ordinary key/PTT edges to the radio:cw key 1/0 ...for key down/upcw ptt 1/0 ...for CW transmit PTT on/offcw key immediate 1/0only as a no-netCW TCP fallbackcwx send ...for radio-side text-to-Morse, which is a different CWX pathSo the straight-key and iambic reports converge on one bug surface:
RadioModel::sendNetCwCommand(). MIDI paddle issues also become observable through the same path once the newaether.cwdebug category is enabled.Root Cause
Aether was sending netCW command timestamps as an 8-digit, epoch-derived 32-bit value:
The Flex netCW path expects a 16-bit relative millisecond counter instead. Working captures and FlexLib behavior indicate that clients reset the counter after an idle gap, with the first event after idle commonly sent as
time=0x0000to resync radio-side timing.Because the timestamp was wrong, the radio could receive UDP packets without accepting them as valid key/PTT timing events, which matches the issue reports: local sidetone works, UDP appears to go out, but the radio never transitions into TX.
There were two related transport mismatches:
index=N, so the TCP command is useful again as a reliability backstop rather than just log noise.Changes
netCW keying fix
QElapsedTimer.time=0x....as a 4-digit, uppercase, 16-bit millisecond value.cw_sidetone_testby panning hard-left in the one assertion that measures only the left channel.CW/MIDI/netCW diagnostics
aether.cw/CW / netCW.CwTrace.h, a tiny steady-clock trace helper for monotonic milliseconds and per-event trace IDs.cw.key,cw.dit,cw.dah,cw.ptt) with source binding, scaled value, and trace ID.How To Use The New Trace Logs
Enable the
CW / netCWcategory in Help → Support, reproduce the keying problem, then inspectaethersdr.log.For a healthy straight-key MIDI press, the log should show a chain like:
For iambic paddles, expect:
The main latency fields to look at are:
queueLagMs: RtMidi callback → MIDI manager Qt dispatchcallbackToMainMs: RtMidi callback → MainWindow setter dispatchsinceMidiMs: original MIDI callback → CW/iambic/netCW point being loggedtimerSlipMs: netCW redundant UDP copy timer drift from the intended 0/5/10/15 ms send scheduleThis should let us tell whether a user’s issue is controller-side latency, Qt thread handoff latency, local iambic timing, netCW scheduling jitter, or radio-side acceptance/TX behavior.
User Impact
This should address the active reports where straight key or iambic paddle input produces local sidetone but no RF/TX from the Flex radio. Both straight key and iambic keying now ride the same corrected netCW command path.
The new logging is disabled by default and only activates when the support category is enabled, so normal CW operation should not be noisy. When enabled, it gives us enough timing breadcrumbs to debug user-reported MIDI paddle latency, missed key-up/key-down edges, duplicate PTT/key behavior, and delayed netCW packet sends.
Iambic behavior remains intentionally local for timing: Aether still generates dot/dash cadence in
IambicKeyer, drives low-latency local sidetone from those generated key edges, and sends normalcw keyedges to the radio for the RF signal. There are no new dedicated dit/dah radio commands because FlexLib does not appear to provide such a command surface.Hardware Validation Notes
This has been validated against the local unit tests and build, but it still needs an on-radio smoke test with a Flex radio:
stream create netcwsucceeds.CW / netCWlogging.aethersdr.logfor the trace chain and timer-slip values.cw key 1edge.Validation
cmake --build build -j8./build/iambic_keyer_test./build/cw_sidetone_test./build/midi_settings_test./build/transmit_model_test./build/transmit_model_apd_testctest --test-dir build --output-on-failure -j8git diff --checkFixes #2175 #2079 #1413 #2010 #1075 #1989 #1973 #1537 #1075
👨🏼💻 Generated with OpenAI Codex (GPT-5.5 Pro 4/23) and tested by @jensenpat