-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Closed
Milestone
Description
We've seen instances of FoundationDB processes crashing without triggering our crash handling mechanism. Based on some initial investigation, I believe this to be happening when the slow task profiling mechanism issues a profiling signal while the process is running somewhere in the TLS code that must not be tolerant of our signal handler.
This is based on the following evidence:
- I've only witnessed this on processes with TLS enabled
- We disable the crash handler when sending the profiling signal
- There are connection related trace events at the time of death in every instance I've seen this happen, and slow task related events showed up in many cases too
- There exist other places in our code where it is unsafe for us to run the profiling signal handler
To fix a problem like this, we need to identify where it's happening and either make it safe to use with slow task profiling or disable slow task profiling during that part of the code.
Metadata
Metadata
Assignees
Labels
No labels