Fix and trace netCW keying for straight key and iambic CW #2336

Merged
ten9876 merged 2 commits into ten9876:main from jensenpat:aether/netcw-cw-keying-fix
May 5, 2026

Conversation

@jensenpat
Collaborator

@jensenpat jensenpat commented May 4, 2026

Summary

This PR fixes the shared netCW transmit path used by straight key, MIDI/serial paddles, and the local iambic keyer, then adds a focused CW diagnostic trail so we can troubleshoot remaining latency or keying problems from user logs.

The important Morse/API distinction is that Flex does not expose dedicated radio commands for dit and dah. A dit is a dot-length key-down interval, and a dah is a dash-length key-down interval. Aether generates those timings locally for iambic paddles, then emits ordinary key/PTT edges to the radio:

  • cw key 1/0 ... for key down/up
  • cw ptt 1/0 ... for CW transmit PTT on/off
  • cw key immediate 1/0 only as a no-netCW TCP fallback
  • cwx send ... for radio-side text-to-Morse, which is a different CWX path

So the straight-key and iambic reports converge on one bug surface: RadioModel::sendNetCwCommand(). MIDI paddle issues also become observable through the same path once the new aether.cw debug category is enabled.
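
To make the dit/dah point concrete, here is a minimal sketch of how a local keyer can turn a pattern of elements into plain key-down/key-up edges. It assumes the standard PARIS timing convention (one dit lasts 1200 / WPM milliseconds, a dah is three dits, elements are separated by one dit). The names KeyEdge, ditMs, and keyEdges are hypothetical and are not taken from Aether's IambicKeyer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One key transition: at `atMs` the key goes down (true) or up (false).
struct KeyEdge {
    int atMs;
    bool down;
};

// Standard Morse timing (PARIS convention): one dit lasts 1200 / WPM ms.
inline int ditMs(int wpm) { return 1200 / wpm; }

// Expand a pattern such as ".-" into key-down/key-up edges.
// A dah is three dits long; elements are separated by one dit of key-up time.
// Hypothetical helper for illustration only.
std::vector<KeyEdge> keyEdges(const std::string& pattern, int wpm) {
    std::vector<KeyEdge> edges;
    const int unit = ditMs(wpm);
    int t = 0;
    for (char c : pattern) {
        if (c != '.' && c != '-') continue;  // skip anything that is not an element
        const int len = (c == '.') ? unit : 3 * unit;
        edges.push_back({t, true});          // would be sent as "cw key 1 ..."
        edges.push_back({t + len, false});   // would be sent as "cw key 0 ..."
        t += len + unit;                     // inter-element gap
    }
    return edges;
}
```

At 20 WPM, keyEdges(".-", 20) yields down/up pairs at 0/60 ms for the dit and 120/300 ms for the dah; the radio only ever sees the edges, never a "dit" or "dah" command.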

Root Cause

Aether was sending netCW command timestamps as an 8-digit, epoch-derived 32-bit value:

cw key 1 time=0x12345678 index=N client_handle=0x...

The Flex netCW path expects a 16-bit relative millisecond counter instead. Working captures and FlexLib behavior indicate that clients reset the counter after an idle gap, with the first event after idle commonly sent as time=0x0000 to resync radio-side timing.

Because the timestamp was wrong, the radio could receive UDP packets without accepting them as valid key/PTT timing events, which matches the issue reports: local sidetone works, UDP appears to go out, but the radio never transitions into TX.
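
As a rough illustration of the corrected wire format, the sketch below builds a decorated command with a 4-digit, uppercase, 16-bit time field. It is written in plain standard C++ rather than Qt, formatNetCwCommand is a hypothetical name, the sample values are illustrative, and the zero-padded 8-digit client_handle is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Build e.g. "cw key 1 time=0x3FD6 index=53 client_handle=0x3EF669B4".
// `verb` is expected to be a short word such as "key" or "ptt".
// Hypothetical helper; the real code lives in RadioModel::sendNetCwCommand().
std::string formatNetCwCommand(const std::string& verb, bool on,
                               std::uint16_t timeMs, unsigned index,
                               std::uint32_t clientHandle) {
    char buf[128];
    std::snprintf(buf, sizeof buf,
                  "cw %s %d time=0x%04X index=%u client_handle=0x%08X",
                  verb.c_str(), on ? 1 : 0,
                  static_cast<unsigned>(timeMs),   // 16-bit value, 4 hex digits
                  index,
                  static_cast<unsigned>(clientHandle));
    return buf;
}
```

Taking a std::uint16_t for the timestamp means an epoch-derived 32-bit value cannot reach the formatter without an explicit narrowing step.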

There were two related transport mismatches:

  • The VITA/netCW datagram was sized to the raw ASCII payload even though the VRT packet size field was rounded up to a 32-bit word boundary, so the two lengths could disagree. Maestro captures show null padding to the word boundary, so this PR makes packet length and VRT size agree.
  • Aether had removed the FlexLib-style TCP send of the same decorated command after UDP delivery. With the corrected timestamp format, the radio can dedupe by index=N, so the TCP command is useful again as a reliability backstop rather than just log noise.
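
The padding rule in the first bullet can be sketched as follows. This is a minimal standard-C++ illustration, not the PR's code: paddedSize, padPayload, and payloadWords are hypothetical names, and header words are deliberately left out because their count is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Round a byte count up to the next multiple of 4 (one 32-bit word).
constexpr std::size_t paddedSize(std::size_t bytes) {
    return (bytes + 3) & ~static_cast<std::size_t>(3);
}

// Copy an ASCII command and null-pad it to a 32-bit word boundary.
std::vector<unsigned char> padPayload(const std::string& ascii) {
    std::vector<unsigned char> out(ascii.begin(), ascii.end());
    out.resize(paddedSize(out.size()), 0);  // new bytes are zero-filled
    return out;
}

// Payload length in 32-bit words. Assumption: the caller adds the
// header word count separately when filling in the VRT size field.
constexpr std::size_t payloadWords(std::size_t bytes) {
    return paddedSize(bytes) / 4;
}
```

Deriving both the datagram length and the word count from the same paddedSize() value is what keeps the two from drifting apart.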

Changes

netCW keying fix

  • Add a small per-radio netCW relative clock using QElapsedTimer.
  • Emit time=0x.... as a 4-digit, uppercase, 16-bit millisecond value.
  • Reset the netCW clock on stream creation, disconnect, and after a short idle gap.
  • Keep the existing redundant UDP sends, with incrementing VITA packet count.
  • Restore FlexLib-style TCP delivery after the UDP netCW packets.
  • Pad netCW VITA command payloads to a 32-bit word boundary.
  • Clarify cw_sidetone_test by panning hard-left in the one assertion that measures only the left channel.
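
The relative clock described in the first three bullets might look roughly like the sketch below. It is a stand-in under stated assumptions: the PR uses QElapsedTimer, whereas this version takes a monotonic millisecond reading from the caller so it can be exercised without Qt, and NetCwClock and its members are hypothetical names. The idle threshold is a constructor parameter because the value is a tuning choice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a per-radio netCW relative clock.
class NetCwClock {
public:
    explicit NetCwClock(std::int64_t idleResetMs) : m_idleResetMs(idleResetMs) {}

    // Forget the epoch, e.g. on stream creation or disconnect.
    void reset() { m_valid = false; }

    // Return the 16-bit timestamp for an event at monotonic time nowMs.
    std::uint16_t stamp(std::int64_t nowMs) {
        if (!m_valid || nowMs - m_lastEventMs > m_idleResetMs) {
            m_epochMs = nowMs;  // first event after idle goes out as 0x0000
            m_valid = true;
        }
        m_lastEventMs = nowMs;
        // Keep only the low 16 bits; the counter wraps every 65536 ms.
        return static_cast<std::uint16_t>((nowMs - m_epochMs) & 0xFFFF);
    }

private:
    std::int64_t m_idleResetMs;
    std::int64_t m_epochMs = 0;
    std::int64_t m_lastEventMs = 0;
    bool m_valid = false;
};
```

With an example threshold of 3000 ms, an event 47 ms after the first one is stamped 0x002F, while an event arriving more than 3000 ms after the previous one restarts the count at 0x0000.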

CW/MIDI/netCW diagnostics

  • Add a new support-log category: aether.cw / CW / netCW.
  • Add CwTrace.h, a tiny steady-clock trace helper for monotonic milliseconds and per-event trace IDs.
  • Log raw MIDI callback events, including status/data bytes, RtMidi delta, and callback timestamp.
  • Log MIDI dispatch from the RtMidi callback into Qt, including callback-to-dispatch queue lag.
  • Log CW MIDI binding resolution (cw.key, cw.dit, cw.dah, cw.ptt) with source binding, scaled value, and trace ID.
  • Carry trace metadata through MainWindow’s MIDI setter dispatch so CW setters can be correlated with the original controller event.
  • Log straight-key MIDI actions and iambic paddle state updates.
  • Log local iambic keyer paddle/PTT events and generated key edges.
  • Log netCW command scheduling with trace ID, source, index, Flex timestamp, payload size, packet size, UDP copy count, and TCP-backstop status.
  • Log each UDP copy send with actual delay and timer slip relative to the intended 0/5/10/15 ms schedule.
  • Log no-netCW TCP fallback use with trace/source/timing context.

How To Use The New Trace Logs

Enable the CW / netCW category in Help → Support, reproduce the keying problem, then inspect aethersdr.log.

For a healthy straight-key MIDI press, the log should show a chain like:

CW MIDI raw trace=42 ...
CW MIDI dispatch trace=42 queueLagMs=...
CW MIDI binding trace=42 param=cw.key ...
CW MIDI main trace=42 callbackToMainMs=...
CW MIDI straight-key trace=42 down=1
CW netcw schedule trace=42 source=midi:cw.key index=... time=0x.... cmd="cw ptt 1" ...
CW netcw udp-send trace=42 copy=0 delayMs=0 actualDelayMs=...
CW netcw schedule trace=42 source=midi:cw.key index=... time=0x.... cmd="cw key 1" ...

For iambic paddles, expect:

CW MIDI binding trace=43 param=cw.dit ...
CW MIDI paddle trace=43 dit=1 dah=0 localIambic=1
CW iambic paddle-event trace=43 ptt=1
CW netcw schedule trace=43 source=midi:iambic-keyer cmd="cw ptt 1" ...
CW iambic key-edge trace=43 down=1
CW netcw schedule trace=43 source=midi:iambic-keyer cmd="cw key 1" ...
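
When reading a long log, it can help to group lines by trace ID. The sketch below pulls key=value tokens out of a single line shaped like the examples above. It assumes fields are whitespace-separated, does not handle quoted values that contain spaces (such as the cmd field), and uses the hypothetical names parseFields and traceId; the real aethersdr.log lines may carry extra prefixes.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Collect key=value tokens from one log line; tokens without '=' are skipped.
// Quoted values containing spaces are not reassembled here.
std::map<std::string, std::string> parseFields(const std::string& line) {
    std::map<std::string, std::string> fields;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        const auto eq = token.find('=');
        if (eq == std::string::npos || eq == 0) continue;
        fields[token.substr(0, eq)] = token.substr(eq + 1);
    }
    return fields;
}

// Return the trace ID of a line, or -1 if it has none or it is not numeric.
long traceId(const std::string& line) {
    const auto fields = parseFields(line);
    const auto it = fields.find("trace");
    if (it == fields.end()) return -1;
    try {
        return std::stol(it->second);
    } catch (...) {
        return -1;
    }
}
```

Filtering on traceId(line) == 43 would then isolate the paddle press, the generated key edges, and the resulting netCW commands for one controller event.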

The main latency fields to look at are:

  • queueLagMs: RtMidi callback → MIDI manager Qt dispatch
  • callbackToMainMs: RtMidi callback → MainWindow setter dispatch
  • sinceMidiMs: original MIDI callback → CW/iambic/netCW point being logged
  • timerSlipMs: netCW redundant UDP copy timer drift from the intended 0/5/10/15 ms send schedule

This should let us tell whether a user’s issue is controller-side latency, Qt thread handoff latency, local iambic timing, netCW scheduling jitter, or radio-side acceptance/TX behavior.
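
As a worked example of reading timerSlipMs, the sketch below compares actual send delays against the intended 0/5/10/15 ms schedule. It assumes slip is defined as actual delay minus intended delay, which is an inference from the field description rather than something confirmed by the code; all names are hypothetical.

```cpp
#include <cassert>

// Intended send offsets for the redundant UDP copies, in milliseconds.
constexpr int kCopyScheduleMs[] = {0, 5, 10, 15};
constexpr int kCopyCount = sizeof(kCopyScheduleMs) / sizeof(kCopyScheduleMs[0]);

// Slip: how late (positive) or early (negative) a copy actually went out.
// Assumption: slip = actual delay - intended delay.
constexpr int timerSlipMs(int copyIndex, int actualDelayMs) {
    return actualDelayMs - kCopyScheduleMs[copyIndex];
}

// Largest positive slip across all copies of one command (0 if none were late).
int worstSlipMs(const int (&actualDelayMs)[kCopyCount]) {
    int worst = 0;
    for (int i = 0; i < kCopyCount; ++i) {
        const int slip = timerSlipMs(i, actualDelayMs[i]);
        if (slip > worst) worst = slip;
    }
    return worst;
}
```

For actual delays of 0, 6, 10, and 22 ms the worst slip is 7 ms on the last copy, which would point at timer or event-loop jitter rather than at the controller.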

User Impact

This should address the active reports where straight key or iambic paddle input produces local sidetone but no RF/TX from the Flex radio. Both straight key and iambic keying now ride the same corrected netCW command path.

The new logging is disabled by default and only activates when the support category is enabled, so normal CW operation should not be noisy. When enabled, it gives us enough timing breadcrumbs to debug user-reported MIDI paddle latency, missed key-up/key-down edges, duplicate PTT/key behavior, and delayed netCW packet sends.

Iambic behavior remains intentionally local for timing: Aether still generates dot/dash cadence in IambicKeyer, drives low-latency local sidetone from those generated key edges, and sends normal cw key edges to the radio for the RF signal. There are no new dedicated dit/dah radio commands because FlexLib does not appear to provide such a command surface.

Hardware Validation Notes

This has been validated against the local unit tests and build, but it still needs an on-radio smoke test with a Flex radio:

  1. Connect to a radio and verify stream create netcw succeeds.
  2. Enable Help → Support → CW / netCW logging.
  3. Try straight key input and confirm the radio enters TX and emits RF.
  4. Try MIDI or serial iambic input with local iambic enabled and confirm dot/dash RF follows the local sidetone.
  5. Review aethersdr.log for the trace chain and timer-slip values.
  6. If TX works but the first element clips, the next likely tuning knob is a small PTT lead delay before the first cw key 1 edge.

Validation

  • cmake --build build -j8
  • ./build/iambic_keyer_test
  • ./build/cw_sidetone_test
  • ./build/midi_settings_test
  • ./build/transmit_model_test
  • ./build/transmit_model_apd_test
  • ctest --test-dir build --output-on-failure -j8
  • git diff --check

Fixes #2175 #2079 #1413 #2010 #1075 #1989 #1973 #1537

👨🏼‍💻 Generated with OpenAI Codex (GPT-5.5 Pro 4/23) and tested by @jensenpat

@jensenpat jensenpat marked this pull request as ready for review May 4, 2026 05:35
Contributor

@aethersdr-agent aethersdr-agent Bot left a comment


Thanks for the writeup, @jensenpat — the diagnosis here is plausible and the changes are tightly scoped. A few concerns before this can land, mostly around empirical evidence rather than code quality:

Major: this PR reverts findings from #2081 without on-radio validation

Three of the changes deliberately undo decisions made in commit 2ced6ee (#2081), which were each justified with very specific empirical observations in the comments you're replacing:

  1. VITA payload null-padding — the prior comment claimed null padding to a 4-byte boundary caused the radio's parser to reject the packet (no key/PTT change, silent 0x50001000 on TCP). This PR re-adds the padding based on Maestro captures.
  2. TCP backstop after UDP sends — the prior comment claimed radio v4.1.5 rejected every CW command over TCP with 0x50001000, both the netcw form and cw key immediate. This PR re-adds it on the theory that dedup-by-index=N will save it.
  3. Timestamp format — switching from a 32-bit epoch-derived value to a 16-bit relative counter with 0x0000 resync.

The "Hardware Validation Notes" section explicitly says this hasn't been smoke-tested against a Flex radio yet. Given that prior PRs in this area have flipped between working and not-working depending on small framing details, I think on-radio validation is a hard prerequisite for a CW keying change — otherwise we risk regressing users who currently get TX on the existing path (the bug reports cite some users where local sidetone works but TX doesn't, which suggests at least some configurations are keying today). Could you stage this on a fork build and confirm at minimum:

  • Straight key produces RF on a current-firmware Flex
  • The TCP backstop doesn't generate 0x50001000 syntax-error spam in lcProtocol
  • Padded VITA packets aren't silently dropped

Question on the 16-bit timestamp claim

The PR description asserts FlexLib uses a 16-bit relative ms counter, but the existing code's commit history suggests 32-bit was chosen with reference to FlexLib output. What's the source for the 16-bit reading — a FlexLib decompile, an API doc, or a Maestro/Wireshark capture? If you have a capture showing the 4-hex-digit field over the wire, linking it would settle this quickly.

Minor

  • kNetCwIdleResetMs = 3000 is reasonable for normal CW spacing (word gap at 20 WPM is ~420 ms) but is a guess in the absence of FlexLib reference behavior. Worth a comment if you have a basis for the value.
  • The 16-bit timestamp wraps every 65.5s of continuous keying without a 3s gap — likely fine, but worth a one-liner that the radio is expected to handle wrap.
  • The cw_sidetone_test setPan(0.0f) change is a clean fix, no concerns there.

The code itself looks correct (math for packetWords stays consistent with the new paddedPayloadBytes, QElapsedTimer usage is right, invalidation in onDisconnected and stream-create both reset state). The blocker is just empirical: the prior comments are too specific to overwrite without confirming the new behavior on real hardware.

@jensenpat jensenpat changed the title Fix netCW keying for straight key and iambic CW Fix and trace netCW keying for straight key and iambic CW May 4, 2026
@jensenpat jensenpat marked this pull request as draft May 4, 2026 06:02
@JonathanPerkins

@AetherClaude aethersdr-agent asked "The PR description asserts FlexLib uses a 16-bit relative ms counter, but the existing code's commit history suggests 32-bit was chosen with reference to FlexLib output. What's the source for the 16-bit reading — a FlexLib decompile, an API doc, or a Maestro/Wireshark capture? If you have a capture showing the 4-hex-digit field over the wire, linking it would settle this quickly."

This is a Wireshark capture from my Maestro keying my Flex 8600:

C2990|cw key 1 time=0x3FD6 index=53 client_handle=0x3EF669B4
R2990|0|

C2991|ping ms_timestamp=0x3FE2
R2991|0|

S3EF669B4|apd slice=0 mmx=0 client_handle=0x3EF669B4 ant=ANT1 freq=21.032433 rfpower=100 rx_error_mHz=8.972169 equalizer_active=0 configurable=1
S3EF669B4|apd sampler tx_ant=ANT1 selected_sampler=INVALID
S3EF669B4|transmit tune=0 tune_mode=single_tone tx_rf_power_changes_allowed=1 max_power_level=100
S3EF669B4|transmit rfpower=100 tunepower=20 am_carrier_level=100
S3EF669B4|interlock acc_txreq_enable=0 rca_txreq_enable=0 acc_tx_enabled=0 tx1_enabled=0 tx2_enabled=0 tx3_enabled=0 tx_delay=30 acc_tx_delay=0 tx1_delay=0 tx2_delay=0 tx3_delay=0 acc_txreq_polarity=0 rca_txreq_polarity=0 timeout=0
S0|interlock tx_client_handle=0x3EF669B4 state=PTT_REQUESTED reason= source=SWCW tx_allowed=1 amplifier= 
S0|atu status=TUNE_BYPASS atu_enabled=1 memories_enabled=1 using_mem=1
S0|interlock tx_client_handle=0x3EF669B4 state=TRANSMITTING reason= source=SWCW tx_allowed=1 amplifier= 
S0|atu status=TUNE_BYPASS atu_enabled=1 memories_enabled=1 using_mem=1

C2992|cw key 0 time=0x4005 index=54 client_handle=0x3EF669B4
R2992|0|

S0|interlock tx_client_handle=0x3EF669B4 state=UNKEY_REQUESTED reason= source= tx_allowed=1 amplifier= 
S0|atu status=TUNE_BYPASS atu_enabled=1 memories_enabled=1 using_mem=1
S0|interlock tx_client_handle=0x3EF669B4 state=READY reason= source= tx_allowed=1 amplifier= 
S0|atu status=TUNE_BYPASS atu_enabled=1 memories_enabled=1 using_mem=1

The timestamp in the "cw key" commands can be clearly seen as a 16-bit hex number.

In addition, in this thread: https://community.flexradio.com/discussion/7703751/is-keying-possible-via-the-api-or-cat-serial-rts-or-dtr Flex employee Eric KE5DTO stated "The timestamp is a 4 digit free-running millisecond counter". In a later post in that thread he confirmed that "the values are expected to be hex formatted as 0xnnnn where n is [0-9A-F]. So it would roll over after FFFF (2^16-1)."

To further support the 16-bit timestamp, I have written my own keying software that uses 16-bit timestamps, and I have also been involved in the testing of other Flex client software and have Wireshark captures of their CW keying. All use 16-bit timestamps.

I do hope that helps. I have never seen any client try to use a 32-bit timestamp (apart from AetherSDR!).

Jonathan G4IVV

@aethersdr-agent
Contributor

Thanks Jonathan, this is exactly the kind of source we needed and it settles the question definitively. Three independent confirmations — your Maestro/Flex 8600 Wireshark capture, the Eric KE5DTO (Flex) statement that "the timestamp is a 4 digit free-running millisecond counter… expected to be hex formatted as 0xnnnn… would roll over after FFFF (2^16-1)", and cross-validation against your own keying software and other Flex clients — all converge on a 16-bit free-running ms counter, never 32-bit.

Your capture also confirms two further details the PR matches:

  • Counter is free-running, not reset per event. The interval between cw key 1 time=0x3FD6 index=53 and cw key 0 time=0x4005 index=54 is 0x4005 - 0x3FD6 = 0x2F = 47 ms, a plausible straight-key dot length. That matches the PR's QElapsedTimer-based per-radio relative clock — a single monotonic ms counter that emits the current value at each key/PTT edge — rather than a per-event delta.
  • Hex casing is uppercase. time=0x3FD6, index=53, client_handle=0x3EF669B4 — the radio's status messages already use uppercase (e.g. S3EF669B4|...), and FlexLib's C# ToString("X") would produce the same. The PR formats tsHex and chHex with .toUpper(), so wire format will match the capture.
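
The interval arithmetic above can be done in a way that stays correct when the 16-bit counter rolls over. A minimal sketch, with the hypothetical name elapsedMs16; it is only unambiguous for gaps shorter than 65536 ms:

```cpp
#include <cassert>
#include <cstdint>

// Elapsed ms between two 16-bit counter readings, correct across one rollover.
// The subtraction is reduced modulo 2^16 by the cast, so a wrap from
// 0xFFFF to 0x0000 still yields the true gap.
constexpr std::uint16_t elapsedMs16(std::uint16_t earlier, std::uint16_t later) {
    return static_cast<std::uint16_t>(later - earlier);
}
```

elapsedMs16(0x3FD6, 0x4005) gives 47, matching the capture, and elapsedMs16(0xFFF0, 0x0010) gives 32 even though the counter wrapped in between.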

The history question (why was 32-bit there in the first place?) is fair. Looking at git blame on RadioModel.cpp:825-826, the existing line is quint64 timeMs = static_cast<quint64>(QDateTime::currentMSecsSinceEpoch() & 0xFFFFFFFF); formatted as 8 hex digits. That was an early-implementation guess (epoch-derived, masked to 32 bits) rather than an inference from a FlexLib decompile or capture. My earlier review comment that the "commit history suggests 32-bit was chosen with reference to FlexLib output" overstated the prior justification: there isn't a captured 32-bit reference behind it. Your evidence is what we should be anchoring on, and the fix aligns with it.

One small thing worth surfacing for the on-radio smoke test in the PR's Hardware Validation section: it'd be good to take a brief Wireshark capture of Aether keying after the fix and confirm we see time=0xNNNN (4 hex digits) and a free-running counter that increments by realistic ms gaps between consecutive index=N events — basically replicating the shape of your Maestro capture. That gives us an apples-to-apples wire-format check, not just "TX engaged."

Will incorporate this rationale into the PR description so the source for the 16-bit reading is explicit and cites your capture rather than implying a FlexLib decompile.

73,
AetherClaude

@jensenpat jensenpat force-pushed the aether/netcw-cw-keying-fix branch from ddcfff6 to 17e3a9c Compare May 4, 2026 19:19
@jensenpat
Collaborator Author

Rebased to main and submitting for review. Doing some development in public here. I am not strong on CW so I'd rather get this out in public, test and iterate on feedback.

  1. Aiming to address core netCW protocol deficiencies and unblock core CW usage in our platform for straight and iambic keys, using a MIDI controller in my lab. I am both seeing and hearing CW RF after making the protocol changes.

  2. As far as CW latency is concerned, I've added full timing capture across our pipeline so we can identify bottlenecks next and ensure we have full coverage of keyer devices. I plan to fast-follow with USB HID keyboard bindings so that USB keyer devices can quickly be adopted.

  3. Third-party tooling over TCI, DAX or IQ is not in scope for these changes. I'm looking to solve issues purely across the ASDR app surface, as this is a large cluster of bugs prior to the 1.0 release. I am not looking to own third-party tooling at this time.

Special thanks to @JonathanPerkins for validating the netCW packet approach.

@jensenpat jensenpat marked this pull request as ready for review May 4, 2026 19:27
@jensenpat jensenpat linked an issue May 4, 2026 that may be closed by this pull request
2 tasks
@ten9876 ten9876 merged commit fa3c5aa into ten9876:main May 5, 2026
5 checks passed
@ten9876
Owner

ten9876 commented May 5, 2026

Claude here on Jeremy's behalf — merged via admin squash. Thanks @jensenpat for the rigour: external confirmation from Jonathan Perkins's Maestro capture + Eric KE5DTO's Flex statement nails the 16-bit timestamp finding, and the diagnostic instrumentation is going to make every CW user-report from here on out triageable from logs.

The TCP-backstop restoration was the one piece I was least sure of — but the new trace log surfaces exactly which copies the radio accepts vs. the TCP path, so if 0x50001000 spam returns we'll see it immediately and can revert just that bit while keeping the rest.

Jeremy will run the on-radio smoke (straight key, iambic paddle, log review) on his FLEX-8600. Will follow up here if anything surfaces.

73, Jeremy KK7GWY & Claude (AI dev partner)

ten9876 added a commit that referenced this pull request May 5, 2026
#2336 added paramActionTrace alongside the existing paramAction and
moved MainWindow's connection over to it.  paramAction was still
emitted but had zero remaining consumers.  Drops the redundant signal
declaration + emit, and refreshes the m_midiSetters comment in
MainWindow.h to reference the correct signal.

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>

Development

Successfully merging this pull request may close these issues.

ctr2-quad causing Iambic funtion not to be enabled

3 participants