
Workaround for PERF_EVENT_IOC_REFRESH bug #1599

Merged: apangin merged 5 commits into master from cpuclock-bug on Nov 18, 2025

@apangin apangin commented Nov 18, 2025

Description

Implement a workaround for a Linux kernel bug that causes the entire system to hang when running the CPU profiler with the perf_events engine.

Related issues

#1578

Motivation and context

The PERF_EVENT_IOC_REFRESH control makes the kernel disable a perf_event automatically on counter overflow. Async-profiler uses this control to keep the event disabled while an overflow is being processed inside the signal handler. Without this, the event could be triggered recursively, again and again, during processing. The event is then re-enabled just before the signal handler returns.

Linux 6.16.x and 6.17.x have a kernel bug that causes a deadlock when a perf_event is automatically disabled inside the hrtimer interrupt handler. To work around this bug, we do not use the PERF_EVENT_IOC_REFRESH feature, but instead disable and re-enable the event manually inside the signal handler. Because the manual approach costs an extra syscall, the workaround is applied only when running CPU profiling on the affected kernel versions.
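For readers who want to see the shape of the two code paths, here is a minimal sketch. It is not the code from this PR: the helper names (`isAffectedKernel`, `needsWorkaround`, `pauseEvent`, `resumeEvent`) are hypothetical, the version check is a simplified assumption based on `uname`, and the sketch does not model the "CPU profiling only" condition. Only the `PERF_EVENT_IOC_*` requests themselves are real kernel API.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/utsname.h>
#include <cstdio>

// Hypothetical helper: true if the release string looks like 6.16.x or 6.17.x.
// Assumption: matching on major.minor is enough to identify affected kernels.
bool isAffectedKernel(const char* release) {
    int major = 0, minor = 0;
    if (sscanf(release, "%d.%d", &major, &minor) != 2) {
        return false;
    }
    return major == 6 && (minor == 16 || minor == 17);
}

// Hypothetical helper: checks the running kernel once, e.g. at profiler start.
bool needsWorkaround() {
    struct utsname u;
    return uname(&u) == 0 && isAffectedKernel(u.release);
}

// Intended to be called on entry to the signal handler.
void pauseEvent(int fd, bool workaround) {
    if (workaround) {
        // Manual path: one extra syscall to stop the event ourselves.
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    }
    // Normal path: nothing to do. The event was armed with
    // PERF_EVENT_IOC_REFRESH, so the kernel disabled it on overflow.
}

// Intended to be called just before the signal handler returns.
void resumeEvent(int fd, bool workaround) {
    if (workaround) {
        // Manual path: plain re-enable, no automatic disable on overflow.
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    } else {
        // Normal path: allow exactly one more overflow, then auto-disable.
        ioctl(fd, PERF_EVENT_IOC_REFRESH, 1);
    }
}
```

In this sketch the result of `needsWorkaround()` would be computed once and passed to both helpers, so unaffected kernels keep the cheaper single-syscall path.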

How has this been tested?

Ran cpu-clock and cycles profiling on Linux versions 6.14, 6.16 and 6.17.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@franz1981

Many thanks again @apangin ❤️

@apangin changed the title from "Cpuclock bug" to "Workaround for PERF_EVENT_IOC_REFRESH bug" on Nov 18, 2025
@apangin merged commit 763616a into master on Nov 18, 2025
57 checks passed
apangin added a commit that referenced this pull request Nov 18, 2025
apangin added a commit that referenced this pull request Nov 18, 2025
apangin added a commit that referenced this pull request Nov 18, 2025
apangin added a commit that referenced this pull request Nov 18, 2025
apangin added a commit that referenced this pull request Nov 18, 2025
Signed-off-by: Andrei Pangin <[email protected]>
(cherry picked from commit b855e0c)
jerrinot added a commit to questdb/questdb that referenced this pull request Nov 28, 2025
this version has a workaround for a bug in kernel 6.17 - async-profiler/async-profiler#1599
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Jan 2, 2026
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Jan 2, 2026
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Jan 6, 2026
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Jan 6, 2026
@apangin deleted the cpuclock-bug branch on January 13, 2026 01:18