Different guard dogs for threads with different behavior #12927
Description
Different guard dogs: one guard dog for the worker threads and one for the main thread.
The main thread has a different workload than the worker threads. This causes friction in systems that set one set of thresholds for both, because those thresholds are “one size fits all”.
For example, the main thread often spends a significant amount of time parsing new config. That same amount of time would be considered excessive for a worker thread to go without finishing its event loop iteration and checking in with its watchdog.
This “one size fits all” guard dog doesn’t work. Instead we need two guard dogs: one that monitors the main thread (and possibly other auxiliary threads), and another for the worker threads. This would let us fine-tune the thresholds for each guard dog according to the threads it watches, which in turn would make it easier to deploy watchdog capabilities such as watchdog actions, kill timeouts, etc.
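To make the proposal concrete, here is a minimal sketch of two independently configured guard dogs. It is not the existing implementation: the `GuardDog` class, its `touch`/`check` methods, and the timeout values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GuardDog:
    """Hypothetical guard dog: watches a group of threads with its own thresholds."""
    name: str
    miss_timeout_s: float   # assumed threshold: thread is late checking in
    kill_timeout_s: float   # assumed threshold: thread is considered stuck
    last_touch: dict = field(default_factory=dict)

    def touch(self, thread_name: str, now: float) -> None:
        # A watched thread checks in at the end of an event loop iteration.
        self.last_touch[thread_name] = now

    def check(self, now: float) -> dict:
        # Classify every watched thread against this guard dog's own thresholds.
        result = {}
        for thread_name, touched in self.last_touch.items():
            elapsed = now - touched
            if elapsed >= self.kill_timeout_s:
                result[thread_name] = "kill"
            elif elapsed >= self.miss_timeout_s:
                result[thread_name] = "miss"
            else:
                result[thread_name] = "ok"
        return result


# Lenient thresholds for the main thread, strict ones for the workers.
main_dog = GuardDog("main_thread", miss_timeout_s=2.0, kill_timeout_s=30.0)
worker_dog = GuardDog("workers", miss_timeout_s=0.2, kill_timeout_s=1.0)

main_dog.touch("main", now=0.0)
worker_dog.touch("worker_0", now=0.0)
worker_dog.touch("worker_1", now=0.0)

# At t=1.5s the main thread is still parsing config and worker_0 has not
# checked in since t=0; worker_1 checked in at t=1.4s.
worker_dog.touch("worker_1", now=1.4)
main_status = main_dog.check(now=1.5)
worker_status = worker_dog.check(now=1.5)
```

In this sketch the same 1.5 seconds of silence is acceptable for the main thread but trips the kill threshold for `worker_0`, which is the kind of per-guard-dog tuning described above.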
The main trade-off is that it would be harder to enact cross-guard-dog policy. An event such as multikill might trigger if we aggregated across guard dogs, but won’t trigger when the stuck threads within each individual guard dog aren’t sufficient on their own to trigger it.
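A small sketch of this trade-off, assuming a hypothetical multikill policy that fires once at least two threads are stuck. The threshold value, the `multikill_triggered` helper, and the thread names are illustrative assumptions only.

```python
# Assumed policy: multikill fires when at least this many threads are stuck.
MULTIKILL_THRESHOLD = 2


def multikill_triggered(stuck_threads, threshold=MULTIKILL_THRESHOLD):
    """Return True if the given set of stuck threads meets the threshold."""
    return len(stuck_threads) >= threshold


# One stuck thread under each guard dog.
main_stuck = ["main"]
worker_stuck = ["worker_3"]

# Each guard dog evaluates the policy over only the threads it watches.
per_guard_dog = [
    multikill_triggered(main_stuck),
    multikill_triggered(worker_stuck),
]

# A single guard dog watching every thread would see both stuck threads.
aggregated = multikill_triggered(main_stuck + worker_stuck)
```

Here neither guard dog sees enough stuck threads to trigger the event on its own, while the aggregated view would.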