chore(telemetry): track parent processes across forks #16839
gh-worker-dd-mergequeue-cf854d[bot] merged 12 commits into main
Performance SLOs

Comparing candidate munir/implement-stable-runtime-id (6da2d4a) with baseline main (e061106)

🟡 Near SLO Breach (2 suites)

🟡 djangosimple - 30/30

| Scenario | Time | Time SLO (margin) | Time vs baseline | Memory | Memory SLO (margin) | Memory vs baseline |
|---|---|---|---|---|---|---|
| appsec | 19.718ms | <22.300ms (-11.6%) | ~same | 68.636MB | <73.500MB (-6.6%) | +4.7% |
| exception-replay-enabled | 1.398ms | <1.450ms (-3.6%) | +0.3% | 66.743MB | <71.500MB (-6.7%) | +4.8% |
| iast | 19.707ms | <22.250ms (-11.4%) | -0.2% | 68.637MB | <75.000MB (-8.5%) | +4.7% |
| profiler | 15.189ms | <16.550ms (-8.2%) | +0.2% | 60.380MB | <61.000MB (🟡 -1.0%) | +4.8% |
| resource-renaming | 19.628ms | <21.750ms (-9.8%) | -0.1% | 68.615MB | <73.500MB (-6.6%) | +4.8% |
| span-code-origin | 20.159ms | <28.200ms (-28.5%) | +1.6% | 68.531MB | <75.000MB (-8.6%) | +4.5% |
| tracer | 19.672ms | <21.750ms (-9.6%) | -0.1% | 68.767MB | <75.000MB (-8.3%) | +4.9% |
| tracer-and-profiler | 20.990ms | <23.500ms (-10.7%) | ~same | 70.720MB | <75.000MB (-5.7%) | +4.7% |
| tracer-dont-create-db-spans | 19.833ms | <21.500ms (-7.8%) | +0.5% | 68.716MB | <75.000MB (-8.4%) | +4.8% |
| tracer-minimal | 16.784ms | <17.500ms (-4.1%) | -0.3% | 68.633MB | <75.000MB (-8.5%) | +4.8% |
| tracer-native | 19.661ms | <21.750ms (-9.6%) | +0.2% | 68.601MB | <72.500MB (-5.4%) | +4.8% |
| tracer-no-caches | 17.598ms | <19.650ms (-10.4%) | +0.2% | 68.749MB | <75.000MB (-8.3%) | +5.0% |
| tracer-no-databases | 19.335ms | <20.100ms (-3.8%) | +0.2% | 68.721MB | <75.000MB (-8.4%) | +4.9% |
| tracer-no-middleware | 19.432ms | <21.500ms (-9.6%) | +0.2% | 68.715MB | <75.000MB (-8.4%) | +5.0% |
| tracer-no-templates | 19.821ms | <22.000ms (-9.9%) | +2.0% | 68.788MB | <73.500MB (-6.4%) | +5.1% |

🟡 recursivecomputation - 8/8

| Scenario | Time | Time SLO (margin) | Time vs baseline | Memory | Memory SLO (margin) | Memory vs baseline |
|---|---|---|---|---|---|---|
| deep | 310.764ms | <320.950ms (-3.2%) | ~same | 37.493MB | <38.750MB (-3.2%) | +5.1% |
| deep-profiled | 329.277ms | <359.150ms (-8.3%) | +0.2% | 43.942MB | <46.000MB (-4.5%) | +5.0% |
| medium | 7.302ms | <7.400ms (🟡 -1.3%) | ~same | 36.451MB | <38.000MB (-4.1%) | +5.1% |
| shallow | 1.019ms | <1.050ms (-2.9%) | +1.7% | 36.372MB | <38.000MB (-4.3%) | +5.3% |
Co-authored-by: Munir Abdinur <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e1e5f7187e
## Summary

Implements the [Stable Service Instance Identifier RFC](https://docs.google.com/document/d/1ECKj9_NnwaKYtFqm3p3Rlpicx5d-OQcdj9kI2jvRqVU) for Go instrumentation telemetry.

- **`DD-Session-ID`**: always present on every telemetry request, set to the current `runtime_id`
- **`DD-Root-Session-ID`**: present only in child processes, inherited via the `_DD_ROOT_GO_SESSION_ID` env var. Omitted when equal to the session ID; the backend infers root = self when it is absent
- **Auto-propagation**: `globalconfig.init()` sets `_DD_ROOT_GO_SESSION_ID` in `os.Environ()` so child processes spawned via `os/exec` inherit it automatically without any user-side calls

## Changes

- `internal/globalconfig/globalconfig.go`: adds `rootSessionID` field, `init()` reads/sets `_DD_ROOT_GO_SESSION_ID` (internal env var, not in supported_configurations), `RootSessionID()` getter
- `internal/telemetry/internal/writer.go`: adds `DD-Session-ID` (always) and `DD-Root-Session-ID` (child processes only) to pre-baked telemetry headers
- Tests for both globalconfig (including cross-process propagation) and writer

## Related

- System-tests PR: DataDog/system-tests#6510
- Node.js PR: DataDog/dd-trace-js#7821
- dd-trace-py fork tracking: DataDog/dd-trace-py#16839
- dd-trace-py spawn tracking: DataDog/dd-trace-py#16842

Co-authored-by: ayan.khan <[email protected]>
/merge -f
View all feedbacks in Devflow UI.
You need to provide a reason for skipping checks
/merge -f PR has been stuck in merge queue for over a week
View all feedbacks in Devflow UI.
Arguments errors:
If you need support, contact us on Slack #devflow with those details!
Description
Implements session identifier headers for instrumentation telemetry as part of the Stable Service
Instance Identifier RFC. Each telemetry request now includes:
- `DD-Session-ID`: current process runtime ID (always present)
- `DD-Root-Session-ID`: original ancestor's runtime ID (forked processes only)
- `DD-Parent-Session-ID`: immediate parent's runtime ID (forked processes only)

This enables the backend to correlate related processes across fork trees without relying on `runtime_id` regeneration.

`_PARENT_RUNTIME_ID` tracking is added to `ddtrace/internal/runtime` alongside env var seeding (`_DD_ROOT_SESSION_ID`, `_DD_PARENT_SESSION_ID`) to support `multiprocessing` spawn/forkserver start methods.
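The fork and spawn handling described above could be sketched roughly as follows. This is a minimal illustration and not the code in `ddtrace/internal/runtime`: the `SessionIds` class and its method names are hypothetical, and the exact way the env vars are seeded and read is an assumption. Only the header names and the `_DD_ROOT_SESSION_ID` / `_DD_PARENT_SESSION_ID` variable names come from this PR.

```python
import os
import uuid


class SessionIds:
    """Hypothetical tracker for the current, parent, and root runtime IDs."""

    def __init__(self, environ=None):
        environ = os.environ if environ is None else environ
        self.runtime_id = uuid.uuid4().hex
        # Assumption: spawn/forkserver children start a fresh interpreter, so
        # they learn their ancestry from env vars seeded by the parent.
        self.root_id = environ.get("_DD_ROOT_SESSION_ID")
        self.parent_id = environ.get("_DD_PARENT_SESSION_ID")

    def on_fork_in_child(self):
        """Run in a forked child: the old ID becomes the parent ID."""
        if self.root_id is None:
            self.root_id = self.runtime_id  # first fork: the parent is the root
        self.parent_id = self.runtime_id
        self.runtime_id = uuid.uuid4().hex

    def seed_environ(self, environ):
        """Write ancestry into environ so spawned children can inherit it."""
        environ["_DD_ROOT_SESSION_ID"] = self.root_id or self.runtime_id
        environ["_DD_PARENT_SESSION_ID"] = self.runtime_id

    def headers(self):
        """Build session headers; root/parent are omitted in the root process."""
        headers = {"DD-Session-ID": self.runtime_id}
        if self.root_id is not None:
            headers["DD-Root-Session-ID"] = self.root_id
        if self.parent_id is not None:
            headers["DD-Parent-Session-ID"] = self.parent_id
        return headers
```

In a real process, the fork hook would presumably be registered once, for example with `os.register_at_fork(after_in_child=ids.on_fork_in_child)` on POSIX platforms, while `seed_environ(os.environ)` would run before spawning children.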
Testing
- `tests/tracer/runtime/test_runtime_id.py::test_parent_runtime_id`: validates `get_parent_runtime_id()` is `None` in the root process and correctly tracks the immediate parent across nested forks
- `tests/telemetry/test_telemetry.py::test_session_id_headers_across_forks`: uses the test agent to capture telemetry requests from a parent → child → grandchild fork tree and asserts the correct structure of all three session headers
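The header structure expected from a three-level fork tree can be sketched as a small checker. This is not the test added in this PR: `check_fork_tree` is a hypothetical helper, and it assumes the headers of each process have already been captured as plain dicts.

```python
def check_fork_tree(parent, child, grandchild):
    """Assert the expected session headers for parent -> child -> grandchild.

    Each argument is a dict of headers captured from one process.
    """
    root_id = parent["DD-Session-ID"]

    # Root process: only DD-Session-ID is sent.
    assert "DD-Root-Session-ID" not in parent
    assert "DD-Parent-Session-ID" not in parent

    # Child: root and parent both point at the root process.
    assert child["DD-Root-Session-ID"] == root_id
    assert child["DD-Parent-Session-ID"] == root_id

    # Grandchild: same root, but the parent is the child.
    assert grandchild["DD-Root-Session-ID"] == root_id
    assert grandchild["DD-Parent-Session-ID"] == child["DD-Session-ID"]

    # Every process has its own session ID.
    session_ids = {p["DD-Session-ID"] for p in (parent, child, grandchild)}
    assert len(session_ids) == 3


# Example with made-up IDs.
check_fork_tree(
    {"DD-Session-ID": "aaa"},
    {"DD-Session-ID": "bbb", "DD-Root-Session-ID": "aaa", "DD-Parent-Session-ID": "aaa"},
    {"DD-Session-ID": "ccc", "DD-Root-Session-ID": "aaa", "DD-Parent-Session-ID": "bbb"},
)
```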
Risks
The new headers are additive: existing payload fields (`runtime_id`) are unchanged. No impact to profiling, DI, crash tracking, or remote config.
Additional Notes
`DD-Root-Session-ID` and `DD-Parent-Session-ID` are omitted from root process requests per the RFC spec: if `DD-Root-Session-ID` is absent, the backend assumes `root_session_id == session_id`.
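On the receiving side, the fallback rule could look something like the sketch below. The backend's actual implementation is not part of this PR; `resolve_session_ids` is a hypothetical name used only to illustrate the rule.

```python
def resolve_session_ids(headers):
    """Return (session_id, root_session_id, parent_session_id).

    A missing DD-Root-Session-ID means the sender is its own root.
    A missing DD-Parent-Session-ID yields None for the parent.
    """
    session_id = headers["DD-Session-ID"]
    root_session_id = headers.get("DD-Root-Session-ID", session_id)
    parent_session_id = headers.get("DD-Parent-Session-ID")
    return session_id, root_session_id, parent_session_id
```

For a root process that sends only `DD-Session-ID`, this returns the session ID as both the session and the root, with no parent.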