
Crash when using TLS with the slow task profiler #2364

@ajbeamon

We've seen instances of FoundationDB processes crashing without triggering our crash handling mechanism. Based on some initial investigation, I believe this happens when the slow task profiler issues a profiling signal while the process is running somewhere in the TLS code that apparently does not tolerate our signal handler.

This is based on the following evidence:

  1. I've only witnessed this on processes with TLS enabled.
  2. We disable the crash handler when sending the profiling signal, which would explain why the crash handling mechanism is not triggered.
  3. There are connection-related trace events at the time of death in every instance I've seen, and slow-task-related events showed up in many cases too.
  4. There are other places in our code where it is unsafe to run the profiling signal handler.

To fix a problem like this, we need to identify where it's happening and either make it safe to use with slow task profiling or disable slow task profiling during that part of the code.
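
As a rough illustration of the second option, here is a minimal sketch of a scoped guard that blocks the profiling signal on the calling thread for the duration of a sensitive section. It assumes a POSIX platform and assumes the profiler uses `SIGPROF`, which this issue does not state. `ScopedSignalBlock` and `isSignalBlocked` are hypothetical names for illustration, not existing FoundationDB code.

```cpp
#include <csignal>
#include <pthread.h>

// Hypothetical RAII guard (not FoundationDB code): blocks one signal on the
// calling thread while the guard is alive, then restores the previous mask.
class ScopedSignalBlock {
public:
    explicit ScopedSignalBlock(int signo) {
        sigset_t toBlock;
        sigemptyset(&toBlock);
        sigaddset(&toBlock, signo);
        // Add signo to this thread's mask and remember the old mask.
        pthread_sigmask(SIG_BLOCK, &toBlock, &previous_);
    }

    ~ScopedSignalBlock() {
        // Restore the mask exactly as it was before the guard was created.
        pthread_sigmask(SIG_SETMASK, &previous_, nullptr);
    }

    ScopedSignalBlock(const ScopedSignalBlock&) = delete;
    ScopedSignalBlock& operator=(const ScopedSignalBlock&) = delete;

private:
    sigset_t previous_;
};

// Hypothetical helper: reports whether signo is currently blocked on the
// calling thread. Passing a null set makes pthread_sigmask query only.
bool isSignalBlocked(int signo) {
    sigset_t current;
    sigemptyset(&current);
    pthread_sigmask(SIG_BLOCK, nullptr, &current);
    return sigismember(&current, signo) == 1;
}
```

A caller would wrap the suspect section in a block such as `{ ScopedSignalBlock guard(SIGPROF); /* TLS call */ }`. Blocking defers the signal rather than discarding it, so a profiling signal sent during the section would be delivered once the guard goes out of scope. Whether that is acceptable, or whether the profiler should skip sending the signal altogether, depends on how the profiler targets threads, which would need to be checked against the actual code.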
