Improve Telemetry #372

Merged
HenryNdubuaku merged 8 commits into cactus-compute:main from mhayes853:non-blocking-telemetry
Feb 23, 2026

@mhayes853
Contributor

Right now, recording a telemetry event blocks the calling thread inside the FFI while it makes network calls. This can be unacceptable for concurrency runtimes such as Swift Concurrency, where blocking for longer than necessary can starve threads in the cooperative thread pool.

Additionally, applications currently have no API to explicitly flush or disable telemetry. The former is useful for uploading events when an app moves to the background, and the latter is useful if apps want to provide a user-specific setting to opt out of telemetry.

This PR does the following:

  • Introduces a dedicated worker thread for processing telemetry events, so any record method now returns immediately.
  • Adds cactus_telemetry_flush and cactus_telemetry_shutdown FFIs to give applications control over the telemetry lifecycle. (cactus_telemetry_flush will block until all enqueued events are processed.)
  • Adds a new telemetry test suite, including an optional integration test that runs when the cloud API key is detected in the environment and --enable-telemetry is passed to cactus test.
  • Uses a global telemetry mutex instead of many atomics to guard the telemetry state. This should prevent higher-level data races from producing inconsistent states.
  • Renames cactus_telemetry_bridge.cpp -> cactus_telemetry.cpp and puts all telemetry-related FFIs in there for consistency.
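
For readers who want a concrete picture of the worker-thread and single-lock design described in the list above, here is a minimal C++ sketch. It is not the code from this PR: the class name TelemetryWorker, its record/flush/shutdown methods, and the Sink callback are all hypothetical stand-ins, and the real cactus_telemetry_flush and cactus_telemetry_shutdown FFIs may be shaped quite differently. It only illustrates the general pattern of a queue, one mutex guarding all shared state, and a background thread doing the slow work.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical sketch: none of these names come from the Cactus codebase.
class TelemetryWorker {
public:
    // Stand-in for whatever actually uploads an event (assumed, e.g. an HTTP call).
    using Sink = std::function<void(const std::string&)>;

    explicit TelemetryWorker(Sink sink)
        : sink_(std::move(sink)), thread_([this] { run(); }) {}

    ~TelemetryWorker() { shutdown(); }

    // Enqueue and return immediately; the caller never waits on the sink.
    // Returns false once shutdown has been requested.
    bool record(std::string event) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return false;
            queue_.push_back(std::move(event));
        }
        wake_.notify_one();
        return true;
    }

    // Block until the queue is empty and no event is in flight.
    void flush() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return queue_.empty() && !busy_; });
    }

    // Let the worker drain what is queued, then stop and join it.
    // Safe to call again afterwards (the second call is a no-op).
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        if (thread_.joinable()) thread_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping_ is set and nothing is left
            std::string event = std::move(queue_.front());
            queue_.pop_front();
            busy_ = true;
            lock.unlock();
            sink_(event);  // slow work runs without holding the lock
            lock.lock();
            busy_ = false;
            if (queue_.empty()) idle_.notify_all();
        }
    }

    Sink sink_;
    std::mutex mutex_;  // one lock guards queue_, busy_, and stopping_ together
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::deque<std::string> queue_;
    bool busy_ = false;
    bool stopping_ = false;
    std::thread thread_;  // declared last so it starts after the other members exist
};
```

In this sketch an application would call record from any thread, call flush at a point where it wants pending events delivered (for example when backgrounding), and call shutdown to stop telemetry. Keeping queue_, busy_, and stopping_ behind one mutex means flush always sees them in a consistent combination, which is the kind of guarantee separate atomics do not give on their own.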

@mhayes853 mhayes853 force-pushed the non-blocking-telemetry branch from 00fb657 to bb1482f Compare February 20, 2026 05:05
@HenryNdubuaku
Collaborator

I love this PR, @mhayes853. It fixed a lot of my concerns with telemetry.

@HenryNdubuaku HenryNdubuaku merged commit a31f1a2 into cactus-compute:main Feb 23, 2026
1 of 2 checks passed
ncylich pushed a commit that referenced this pull request Feb 24, 2026
* Dedicated telemetry worker

Signed-off-by: mhayes853 <[email protected]>

* FFIs for flush and shutdown + Move set telemetry environment init to shared cactus_telemetry file

Signed-off-by: mhayes853 <[email protected]>

* Cleanup tests

Signed-off-by: mhayes853 <[email protected]>

* Add race test

Signed-off-by: mhayes853 <[email protected]>

* Remove usage of atomics in favor of global lock + add cloud integration test

Signed-off-by: mhayes853 <[email protected]>

* Use api key env var

Signed-off-by: mhayes853 <[email protected]>

* Ensure non-duplicate project ids

Signed-off-by: mhayes853 <[email protected]>

Signed-off-by: mhayes853 <[email protected]>

* Cleanup + Skip Test if --enable-telemetry not passed to cactus test

Signed-off-by: mhayes853 <[email protected]>

Signed-off-by: mhayes853 <[email protected]>

---------

Signed-off-by: mhayes853 <[email protected]>
cattermelon1234 pushed a commit to cattermelon1234/cactus that referenced this pull request Feb 28, 2026