Skip to content

fix(prof): incompatibility with ext-grpc#3542

Merged
realFlowControl merged 4 commits intomasterfrom
florian/fix-grpc
Jan 12, 2026
Merged

fix(prof): incompatibility with ext-grpc#3542
realFlowControl merged 4 commits intomasterfrom
florian/fix-grpc

Conversation

@realFlowControl
Copy link
Copy Markdown
Member

@realFlowControl realFlowControl commented Dec 22, 2025

Description

In ext-grpc you can register a plugin-callback for metadata. ext-grpc executes this callback function on another thread in the plugin_get_metadata() function. This is done even in NTS on a GRPC background thread which is not made a PHP thread. It only works, because the PHP main thread is blocked by a PHP function call, technically serving as a mutex for the engine.

In #3175 we fixed ext-grpc compatibility by following PHP's global model and not use thread locals in NTS builds, but it seems we introduced thread locals again in the mean time and as such we might crash in allocation profiling when collecting a sample and calculating the new sampling distance, when ext-grpc executes PHP code on a background thread.

0x7f0e8da9a95b get<rand::rngs::adapter::reseeding::ReseedingRng<rand_chacha::chacha::ChaCha12Core, rand_core::os::OsRng>> (/usr/local/lib/rustlib/src/rust/library/core/src/cell.rs:2169)
0x7f0e8da9a95b next_u64 (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/rngs/thread.rs:112)
0x7f0e8da9a95b sample<rand::rngs::thread::ThreadRng> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/integer.rs:45)
0x7f0e8da9a95b gen<rand::rngs::thread::ThreadRng, u64> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/rng.rs:95)
0x7f0e8da9a95b sample<rand::rngs::thread::ThreadRng> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/float.rs:115)
0x7f0e8da7ea33 gen<rand::rngs::thread::ThreadRng, f64> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/rng.rs:95)
0x7f0e8da7ea33 sample<f64, rand::rngs::thread::ThreadRng> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/rand_distr-0.4.3/src/poisson.rs:95)
0x7f0e8daa7f8d next_sampling_interval (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/mod.rs:61)
0x7f0e8daa7f8d should_collect_allocation (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/mod.rs:71)
0x7f0e8daa7f8d {closure#0} (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/tls_allocation_profiling_stats.rs:67)
0x7f0e8daa7f8d call_once<datadog_php_profiling::allocation::tls_allocation_profiling_stats::allocation_profiling_stats_should_collect::{closure_env#0}, (&mut core::mem::maybe_uninit::MaybeUninit<datadog_php_profiling::allocation::AllocationProfilingStats>)> (/usr/local/lib/rustlib/src/rust/library/core/src/ops/function.rs:250)
0x7f0e8daa7f8d {closure#0}<datadog_php_profiling::allocation::tls_allocation_profiling_stats::allocation_profiling_stats_should_collect::{closure_env#0}, bool> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/tls_allocation_profiling_stats.rs:49)
0x7f0e8daa7f8d try_with<core::cell::UnsafeCell<core::mem::maybe_uninit::MaybeUninit<datadog_php_profiling::allocation::AllocationProfilingStats>>, datadog_php_profiling::allocation::tls_allocation_profiling_stats::allocation_profiling_stats_mut::{closure_env#0}<datadog_php_profiling::allocation::tls_allocation_profiling_stats::allocation_profiling_stats_should_collect::{closure_env#0}, bool>, bool> (/usr/local/lib/rustlib/src/rust/library/std/src/thread/local.rs:283)
0x7f0e8daa7f8d allocation_profiling_stats_mut<datadog_php_profiling::allocation::tls_allocation_profiling_stats::allocation_profiling_stats_should_collect::{closure_env#0}, bool> (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/tls_allocation_profiling_stats.rs:42)
0x7f0e8daa7f8d allocation_profiling_stats_should_collect (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/tls_allocation_profiling_stats.rs:79)
0x7f0e8daa7f8d alloc_prof_malloc (/go/src/github.com/DataDog/apm-reliability/dd-trace-php/profiling/src/allocation/allocation_le83.rs:360)

The basic problem is that this new thread is not made a PHP thread, so it never goes through GINIT and never initialises the ALLOCATION_PROFILING_STATS which means all of the memory there is uninitialised.

What this PR does to address this:

  • use StdRng instead of ThreadRng in PHP NTS (ThreadRng uses thread locals under the hood, we need to avoid this)
  • make ALLOCATION_PROFILING_STATS a global static mut on NTS builds to avoid having non-PHP threads read from un-init memory
  • rename the file tls_allocation_profiling_stats.rs‎ to profiling_stats.rs

You can review this PR commit by commit.

An alternative to this PR could be to switch PHP's globals all together, but I believe that this step needs more exploration than just a simple PR. However, it could help to finally resolve this issue once and for all. In the meantime this PR fixes the compatibility issue with ext-grpc while also removing a thread local access in PHP NTS versions in the allocator hot path 🎉

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@github-actions github-actions Bot added profiling Relates to the Continuous Profiler tracing labels Dec 22, 2025
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Dec 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.73%. Comparing base (98a31cb) to head (6d1be1f).
⚠️ Report is 182 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3542      +/-   ##
==========================================
- Coverage   61.78%   61.73%   -0.05%     
==========================================
  Files         139      139              
  Lines       13051    13051              
  Branches     1712     1712              
==========================================
- Hits         8063     8057       -6     
- Misses       4225     4231       +6     
  Partials      763      763              

see 2 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 98a31cb...6d1be1f. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented Dec 22, 2025

Benchmarks [ profiler ]

Benchmark execution time: 2026-01-06 19:24:12

Comparing candidate commit 6d1be1f in PR branch florian/fix-grpc with baseline commit 98a31cb in branch master.

Found 2 performance improvements and 2 performance regressions! Performance is the same for 25 metrics, 7 unstable metrics.

scenario:php-profiler-exceptions-with-profiler

  • 🟩 execution_time [-5.086ms; -2.401ms] or [-5.156%; -2.434%]

scenario:php-profiler-timeline-memory-with-profiler-and-timeline

  • 🟥 cpu_user_time [+46.715ms; +85.755ms] or [+3.062%; +5.620%]
  • 🟥 execution_time [+30.788ms; +46.289ms] or [+2.524%; +3.795%]
  • 🟩 cpu_system_time [-45.760ms; -13.172ms] or [-8.344%; -2.402%]

@realFlowControl realFlowControl marked this pull request as ready for review December 22, 2025 11:33
@realFlowControl realFlowControl requested a review from a team as a code owner December 22, 2025 11:33
@realFlowControl realFlowControl changed the title Fix incompatibility with ext-grpc fix(prof): incompatibility with ext-grpc Dec 23, 2025
Copy link
Copy Markdown
Collaborator

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, would be awesome if you can make a test or even if we just need to do something similar to what grpc does, maybe our own badly behaved function gated by a function or something?

@realFlowControl realFlowControl merged commit 562e64f into master Jan 12, 2026
2006 of 2008 checks passed
@realFlowControl realFlowControl deleted the florian/fix-grpc branch January 12, 2026 13:28
@github-actions github-actions Bot added this to the 1.16.0 milestone Jan 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

profiling Relates to the Continuous Profiler tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants