fix(logs): move logs flush after on_crash call#1474
Conversation
| } | ||
|
|
||
| // Flush logs in a crash-safe manner before crash handling | ||
| if (options->enable_logs) { | ||
| sentry__logs_flush_crash_safe(); |
There was a problem hiding this comment.
Bug: The on_crash callback is now called before logs are flushed. If the callback crashes or hangs, all crash telemetry and logs will be lost.
Severity: CRITICAL | Confidence: High
🔍 Detailed Analysis
The order of operations within the crash handler has been changed, moving the execution of the user-provided on_crash_func callback to before the sentry__logs_flush_crash_safe() function is called. The on_crash_func is executed within a signal handler context without any timeout or error handling wrapper. If the user's callback code crashes, enters an infinite loop, or otherwise hangs, the execution flow will be interrupted. As a result, the log flushing mechanism will never be reached, leading to the complete loss of all pre-crash logs and preventing the crash report from being sent to Sentry. This is a regression, as logs were previously flushed before executing the potentially unstable user callback.
💡 Suggested Fix
Revert the order of operations in the crash handler. Call sentry__logs_flush_crash_safe() before invoking the user-provided options->on_crash_func. This ensures that logs are safely written to disk before executing potentially faulty user code, preserving critical diagnostic information even if the callback fails.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/backends/sentry_backend_crashpad.cpp#L357-L361
Potential issue: The order of operations within the crash handler has been changed,
moving the execution of the user-provided `on_crash_func` callback to before the
`sentry__logs_flush_crash_safe()` function is called. The `on_crash_func` is executed
within a signal handler context without any timeout or error handling wrapper. If the
user's callback code crashes, enters an infinite loop, or otherwise hangs, the execution
flow will be interrupted. As a result, the log flushing mechanism will never be reached,
leading to the complete loss of all pre-crash logs and preventing the crash report from
being sent to Sentry. This is a regression, as logs were previously flushed before
executing the potentially unstable user callback.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 8276555
There was a problem hiding this comment.
while this is true, we explicitly warn users about on_crash being signal-safe, and if the callback crashes/hangs we would lose our crash events anyway (which we consider user error from not providing a proper on_crash function).
Fixes #1473
#skip-changelog