Skip to content

Serialize Qt log file writes to fix macOS tune crash#2284

Merged
ten9876 merged 1 commit intoten9876:mainfrom
jensenpat:aether/macos-log-crash-root-cause
May 2, 2026
Merged

Serialize Qt log file writes to fix macOS tune crash#2284
ten9876 merged 1 commit intoten9876:mainfrom
jensenpat:aether/macos-log-crash-root-cause

Conversation

@jensenpat
Copy link
Copy Markdown
Collaborator

Summary

  • Serialize the global Qt message handler before it touches the shared log file.
  • Replace per-message QTextStream wrapping with a direct UTF-8 QFile::write()/flush() path.
  • Leave all tune, slice, panadapter, and radio command behavior unchanged.

Root Cause

The macOS crash report aborts in libmalloc while Qt is appending a log line:

QDebug::~QDebug() -> MainWindow::logTunePolicyDecision() -> Qt message handler -> QTextStreamPrivate::flushWriteBuffer() -> QFileDevice::writeData() -> QRingBuffer::append() -> malloc abort.

qInstallMessageHandler can receive messages from any Qt thread, but AetherSDR's handler wrote every log message through one global QFile* without synchronization. Concurrent qDebug()/qCInfo() output from the main thread and worker threads can corrupt Qt's internal file/write buffer state. The tune policy diagnostic line in the crash report is the log message that exposed the corruption; the failing object is the logging sink, not the Flex tune command path.

The fix adds one intentionally leaked mutex, mirroring the existing shutdown-safe treatment of the redaction regexes, and holds it across PII redaction, file write, and stderr write so the handler is fully serialized.

Radio Behavior

This PR does not change radio command ordering or panadapter policy. The existing applyTuneRequest() flow still sends slice tune ... autopan=0, optionally applies local pan status, and then sends the same display pan set ... center=... command when the reveal/follow policy decides it is needed. I checked the local FlexLib v4.2.18 references for slice tune and display pan set while confirming this stayed outside the radio API path.

Validation

  • cmake --build build --parallel 10
  • ctest --test-dir build --output-on-failure -j10
  • git diff --check

Local app bundle from validation build:

/private/tmp/AetherSDR-macos-log-crash-root-cause/build/AetherSDR.app

👨🏼‍💻 Generated with OpenAI Codex (GPT-5.5 Pro 4/23) and tested by @jensenpat

@jensenpat jensenpat marked this pull request as ready for review May 2, 2026 14:40
@jensenpat jensenpat requested a review from ten9876 as a code owner May 2, 2026 14:40
Copy link
Copy Markdown
Contributor

@aethersdr-agent aethersdr-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jensenpat — this is a tightly scoped, well-justified fix and the root-cause writeup matches what's in the diff.

A few things I checked and liked:

  • Leaked-mutex pattern matches existing convention. The static QMutex* mutex = new QMutex mirrors the leaked QRegularExpression* pattern already in redactPii() (around lines 74–90), with the same shutdown-ordering rationale called out in the comment. That keeps the file internally consistent.
  • Lock covers redaction + file write + stderr. Holding the lock across redactPii() is slightly stronger than strictly required (the QRegularExpression objects are documented as thread-safe for match), but it's the right call here because it also serializes the s_logFile and stderr writes from a single critical section, which is what the crash report shows is needed.
  • Drops QTextStream(s_logFile) per call. Re-wrapping the same QFile* in a fresh QTextStream on every message is exactly the shape that corrupts Qt's internal write buffer state, so going to QFile::write(lineBytes) + flush() is the right simplification, not just a serialization change.
  • lineBytes reused for stderr. Switching the stderr path from toLocal8Bit() to toUtf8() is a minor behavior change, but AetherSDR's supported platforms (Linux/macOS) are UTF-8 terminals, so this is fine and saves a re-encode.

No null/RAII concerns, the change is contained to src/main.cpp as the PR title implies, and the radio command path is correctly left untouched.

Looks good to me from a review standpoint — leaving final merge to the maintainer.

@ten9876 ten9876 merged commit 84f6701 into ten9876:main May 2, 2026
5 checks passed
ten9876 added a commit that referenced this pull request May 2, 2026
The v0.9.5.1 entry was originally cut for just the TCI TX hotfix
(#2285), but six additional fixes had already landed in main between
the v0.9.5 tag and v0.9.5.1:

- #2275 NR2 wisdom generation off the audio thread
- #2278 SmartLink disconnect teardown
- #2279 Reset RX slice tabs on disconnect (#2254)
- #2280 macOS panadapter pop-out refresh + multi-pan dock layout
- #2282 SmartLink reconnect after WAN drop
- #2284 Qt log handler serialization fixes macOS tune-time crash

CHANGELOG.md gets a per-fix entry with root cause + behavior change.
WhatsNewData.cpp regenerated via scripts/gen_whatsnew.py so the
in-app What's New dialog reflects the full v0.9.5.1 contents.

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
ten9876 added a commit that referenced this pull request May 2, 2026
Match the v0.9.5 `(#NNNN, contributor)` heading style and call out
@rfoust and @jensenpat in the summary for the bulk of this release.

Authors per PR:
- @rfoust: #2278, #2279, #2280, #2282 (SmartLink + macOS popout)
- @jensenpat: #2275, #2284 (NR2 wisdom + Qt log serialization)

WhatsNewData.cpp regenerated.

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
ten9876 added a commit that referenced this pull request May 3, 2026
When AetherSDR is launched from OpenDeck (or any other parent that
captures stderr but doesn't drain it — Stream Deck "Run Command",
systemd user services, GUI launchers), the ~64 KB pipe buffer fills
after about 60 KB of debug output and the next fprintf(stderr, ...)
blocks indefinitely.  Worse, since #2284 the entire messageHandler
ran under a global QMutex to serialize log-file writes, so once one
thread blocked on stderr every other thread that called qDebug /
qInfo / qWarning queued behind it on the mutex — full-app deadlock,
no log output, threads sitting in futex_wait.

Two changes:

1. Move the file write into a tighter scope so the mutex is only
   held across the QFile call.  The fprintf is outside the mutex —
   even an unexpected stderr stall can no longer block other threads'
   log calls.

2. Skip the stderr mirror entirely when stderr is not a TTY.  CLI
   launches keep their console output; Stream Deck / systemd-user /
   GUI launchers go straight to the log file.  The log-file path is
   unchanged in both cases.

isatty / fileno are POSIX; on Windows they live in <io.h> as
_isatty / _fileno.  Wrap accordingly.

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants