Skip to content

Add Otel env var telemetry#7391

Merged
cecile75 merged 10 commits intomasterfrom
cecile/otelconfMap_telemetry
Aug 30, 2024
Merged

Add Otel env var telemetry#7391
cecile75 merged 10 commits intomasterfrom
cecile/otelconfMap_telemetry

Conversation

@cecile75
Copy link
Copy Markdown
Contributor

@cecile75 cecile75 commented Aug 6, 2024

What Does This Do

Generate telemetry metrics

  • when an Datadog env var and Otel env var are both set
  • when we can't map the Otel env var value to a Datadog form.

Motivation

"Otel env var support in APM client libraries" RFC

Additional Notes

Difference from specifications:

  • DD_TRACE_DEBUG: my understanding is that the “LOG_LEVEL” for OtelEnvironment should be set to null, which is not the case. The application uses the value of DD_TRACE_DEBUG so it behaves as expected but we don't get a warning or telemetry for this.

  • If OTEL_SERVICE_NAME and DD_SERVICE_NAME are set, we get the expected behaviour, warning and telemetry. but if OTEL_SERVICE_NAME and DD_SERVICE, the application uses the value of DD_SERVICE so it behaves as expected but we don't get a warning or telemetry for this.

  • We set the value of OTEL_METRIC_EXPORTER in OtelEnvironment for RUNTIME_METRICS_ENABLED without checking if a value for DD_RUNTIME_METRICS_ENABLED was set. The application uses the value of DD_RUNTIME_METRICS_ENABLED so it behaves as expected but we don't get a warning or telemetry for this.

  • Same for DD_TRACE_ENABLED and OTEL_TRACES_EXPORTER. The application uses the value of DD_TRACE_ENABLED so it behaves as expected but we don't get a warning or telemetry for this.

Notes on the Tests:

When a metric is generated, it's duplicated in the OtelEnvMetricCollectorTest tests. We call twice setupOteEnvironment() because of the separation of the configuration done in the rebuild function of DDSpecification between datadog/trace/api/InstrumenterConfig.java and internal-api/src/main/java/datadog/trace/api/Config.java.
We are calling ConfigProvider.createDefault()) twice which leads to having 2 identical metrics instead of 1.

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented Aug 6, 2024

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master cecile/otelconfMap_telemetry
git_commit_date 1724945155 1724948422
git_commit_sha ea5731f 02438e9
release_version 1.39.0-SNAPSHOT~ea5731fd2d 1.39.0-SNAPSHOT~02438e9be8
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1724950769 1724950769
ci_job_id 621899649 621899649
ci_pipeline_id 43176876 43176876
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 47 metrics, 16 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.052 s) : 0, 1051694
Total [baseline] (8.501 s) : 0, 8500808
Agent [candidate] (1.051 s) : 0, 1050796
Total [candidate] (8.484 s) : 0, 8484078
section iast
Agent [baseline] (1.172 s) : 0, 1171588
Total [baseline] (8.983 s) : 0, 8983128
Agent [candidate] (1.177 s) : 0, 1177016
Total [candidate] (8.977 s) : 0, 8976861
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.174 s) : 0, 1173933
Total [baseline] (8.96 s) : 0, 8960228
Agent [candidate] (1.186 s) : 0, 1185609
Total [candidate] (8.978 s) : 0, 8977754
section iast_TELEMETRY_OFF
Agent [baseline] (1.17 s) : 0, 1170001
Total [baseline] (8.954 s) : 0, 8954361
Agent [candidate] (1.182 s) : 0, 1182182
Total [candidate] (8.977 s) : 0, 8977233
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.052 s -
Agent iast 1.172 s 119.894 ms (11.4%)
Agent iast_HARDCODED_SECRET_DISABLED 1.174 s 122.239 ms (11.6%)
Agent iast_TELEMETRY_OFF 1.17 s 118.307 ms (11.2%)
Total tracing 8.501 s -
Total iast 8.983 s 482.319 ms (5.7%)
Total iast_HARDCODED_SECRET_DISABLED 8.96 s 459.42 ms (5.4%)
Total iast_TELEMETRY_OFF 8.954 s 453.552 ms (5.3%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.051 s -
Agent iast 1.177 s 126.22 ms (12.0%)
Agent iast_HARDCODED_SECRET_DISABLED 1.186 s 134.813 ms (12.8%)
Agent iast_TELEMETRY_OFF 1.182 s 131.386 ms (12.5%)
Total tracing 8.484 s -
Total iast 8.977 s 492.783 ms (5.8%)
Total iast_HARDCODED_SECRET_DISABLED 8.978 s 493.676 ms (5.8%)
Total iast_TELEMETRY_OFF 8.977 s 493.155 ms (5.8%)
gantt
    title insecure-bank - break down per module: candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (671.996 ms) : 0, 671996
BytebuddyAgent [candidate] (670.604 ms) : 0, 670604
GlobalTracer [baseline] (306.505 ms) : 0, 306505
GlobalTracer [candidate] (306.543 ms) : 0, 306543
AppSec [baseline] (51.369 ms) : 0, 51369
AppSec [candidate] (51.646 ms) : 0, 51646
Remote Config [baseline] (675.474 µs) : 0, 675
Remote Config [candidate] (682.21 µs) : 0, 682
Telemetry [baseline] (7.516 ms) : 0, 7516
Telemetry [candidate] (7.678 ms) : 0, 7678
section iast
BytebuddyAgent [baseline] (777.949 ms) : 0, 777949
BytebuddyAgent [candidate] (781.379 ms) : 0, 781379
GlobalTracer [baseline] (295.492 ms) : 0, 295492
GlobalTracer [candidate] (298.022 ms) : 0, 298022
AppSec [baseline] (50.419 ms) : 0, 50419
AppSec [candidate] (51.574 ms) : 0, 51574
Remote Config [baseline] (588.32 µs) : 0, 588
Remote Config [candidate] (600.388 µs) : 0, 600
Telemetry [baseline] (8.637 ms) : 0, 8637
Telemetry [candidate] (7.427 ms) : 0, 7427
IAST [baseline] (24.928 ms) : 0, 24928
IAST [candidate] (24.363 ms) : 0, 24363
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (778.366 ms) : 0, 778366
BytebuddyAgent [candidate] (787.873 ms) : 0, 787873
GlobalTracer [baseline] (296.425 ms) : 0, 296425
GlobalTracer [candidate] (299.334 ms) : 0, 299334
AppSec [baseline] (50.619 ms) : 0, 50619
AppSec [candidate] (53.932 ms) : 0, 53932
Remote Config [baseline] (604.957 µs) : 0, 605
Remote Config [candidate] (595.477 µs) : 0, 595
Telemetry [baseline] (9.644 ms) : 0, 9644
Telemetry [candidate] (8.109 ms) : 0, 8109
IAST [baseline] (24.656 ms) : 0, 24656
IAST [candidate] (21.95 ms) : 0, 21950
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (775.571 ms) : 0, 775571
BytebuddyAgent [candidate] (783.243 ms) : 0, 783243
GlobalTracer [baseline] (296.422 ms) : 0, 296422
GlobalTracer [candidate] (300.395 ms) : 0, 300395
AppSec [baseline] (53.115 ms) : 0, 53115
AppSec [candidate] (52.707 ms) : 0, 52707
Remote Config [baseline] (588.877 µs) : 0, 589
Remote Config [candidate] (587.108 µs) : 0, 587
Telemetry [baseline] (7.807 ms) : 0, 7807
Telemetry [candidate] (7.283 ms) : 0, 7283
IAST [baseline] (22.92 ms) : 0, 22920
IAST [candidate] (24.226 ms) : 0, 24226
Loading
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.048 s) : 0, 1047798
Total [baseline] (10.345 s) : 0, 10345215
Agent [candidate] (1.052 s) : 0, 1052468
Total [candidate] (10.301 s) : 0, 10300683
section appsec
Agent [baseline] (1.184 s) : 0, 1184216
Total [baseline] (10.568 s) : 0, 10567572
Agent [candidate] (1.178 s) : 0, 1178100
Total [candidate] (10.485 s) : 0, 10485296
section iast
Agent [baseline] (1.181 s) : 0, 1180772
Total [baseline] (10.875 s) : 0, 10875037
Agent [candidate] (1.172 s) : 0, 1171797
Total [candidate] (10.764 s) : 0, 10764397
section profiling
Agent [baseline] (1.256 s) : 0, 1255614
Total [baseline] (10.623 s) : 0, 10622501
Agent [candidate] (1.247 s) : 0, 1246504
Total [candidate] (10.558 s) : 0, 10557593
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.048 s -
Agent appsec 1.184 s 136.419 ms (13.0%)
Agent iast 1.181 s 132.974 ms (12.7%)
Agent profiling 1.256 s 207.816 ms (19.8%)
Total tracing 10.345 s -
Total appsec 10.568 s 222.358 ms (2.1%)
Total iast 10.875 s 529.822 ms (5.1%)
Total profiling 10.623 s 277.286 ms (2.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.052 s -
Agent appsec 1.178 s 125.632 ms (11.9%)
Agent iast 1.172 s 119.33 ms (11.3%)
Agent profiling 1.247 s 194.037 ms (18.4%)
Total tracing 10.301 s -
Total appsec 10.485 s 184.613 ms (1.8%)
Total iast 10.764 s 463.714 ms (4.5%)
Total profiling 10.558 s 256.91 ms (2.5%)
gantt
    title petclinic - break down per module: candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (668.822 ms) : 0, 668822
BytebuddyAgent [candidate] (671.192 ms) : 0, 671192
GlobalTracer [baseline] (306.152 ms) : 0, 306152
GlobalTracer [candidate] (308.078 ms) : 0, 308078
AppSec [baseline] (51.15 ms) : 0, 51150
AppSec [candidate] (51.435 ms) : 0, 51435
Remote Config [baseline] (667.994 µs) : 0, 668
Remote Config [candidate] (669.485 µs) : 0, 669
Telemetry [baseline] (7.466 ms) : 0, 7466
Telemetry [candidate] (7.454 ms) : 0, 7454
section appsec
BytebuddyAgent [baseline] (688.458 ms) : 0, 688458
BytebuddyAgent [candidate] (682.765 ms) : 0, 682765
GlobalTracer [baseline] (301.308 ms) : 0, 301308
GlobalTracer [candidate] (302.147 ms) : 0, 302147
AppSec [baseline] (160.491 ms) : 0, 160491
AppSec [candidate] (161.459 ms) : 0, 161459
Remote Config [baseline] (612.63 µs) : 0, 613
Remote Config [candidate] (949.439 µs) : 0, 949
Telemetry [baseline] (9.519 ms) : 0, 9519
Telemetry [candidate] (7.892 ms) : 0, 7892
IAST [baseline] (20.448 ms) : 0, 20448
IAST [candidate] (18.513 ms) : 0, 18513
section iast
BytebuddyAgent [baseline] (783.063 ms) : 0, 783063
BytebuddyAgent [candidate] (778.898 ms) : 0, 778898
GlobalTracer [baseline] (298.464 ms) : 0, 298464
GlobalTracer [candidate] (295.492 ms) : 0, 295492
AppSec [baseline] (49.392 ms) : 0, 49392
AppSec [candidate] (52.581 ms) : 0, 52581
Remote Config [baseline] (616.49 µs) : 0, 616
Remote Config [candidate] (576.059 µs) : 0, 576
Telemetry [baseline] (9.611 ms) : 0, 9611
Telemetry [candidate] (7.306 ms) : 0, 7306
IAST [baseline] (26.011 ms) : 0, 26011
IAST [candidate] (23.34 ms) : 0, 23340
section profiling
BytebuddyAgent [baseline] (668.637 ms) : 0, 668637
BytebuddyAgent [candidate] (663.769 ms) : 0, 663769
GlobalTracer [baseline] (391.271 ms) : 0, 391271
GlobalTracer [candidate] (388.866 ms) : 0, 388866
AppSec [baseline] (52.829 ms) : 0, 52829
AppSec [candidate] (52.145 ms) : 0, 52145
Remote Config [baseline] (696.438 µs) : 0, 696
Remote Config [candidate] (676.455 µs) : 0, 676
Telemetry [baseline] (7.451 ms) : 0, 7451
Telemetry [candidate] (7.434 ms) : 0, 7434
ProfilingAgent [baseline] (97.156 ms) : 0, 97156
ProfilingAgent [candidate] (95.794 ms) : 0, 95794
Profiling [baseline] (97.18 ms) : 0, 97180
Profiling [candidate] (95.818 ms) : 0, 95818
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-08-29T16:32:10 2024-08-29T16:39:01
git_branch master cecile/otelconfMap_telemetry
git_commit_date 1724945155 1724948422
git_commit_sha ea5731f 02438e9
release_version 1.39.0-SNAPSHOT~ea5731fd2d 1.39.0-SNAPSHOT~02438e9be8
start_time 2024-08-29T16:31:57 2024-08-29T16:38:48
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1724949887 1724949887
ci_job_id 621899650 621899650
ci_pipeline_id 43176876 43176876
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 19 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.349 ms) : 1330, 1369
.   : milestone, 1349,
appsec (1.737 ms) : 1713, 1760
.   : milestone, 1737,
appsec_no_iast (1.733 ms) : 1709, 1757
.   : milestone, 1733,
iast (1.475 ms) : 1453, 1497
.   : milestone, 1475,
profiling (1.53 ms) : 1505, 1555
.   : milestone, 1530,
tracing (1.463 ms) : 1439, 1488
.   : milestone, 1463,
section candidate
no_agent (1.329 ms) : 1311, 1347
.   : milestone, 1329,
appsec (1.737 ms) : 1714, 1761
.   : milestone, 1737,
appsec_no_iast (1.718 ms) : 1694, 1743
.   : milestone, 1718,
iast (1.487 ms) : 1464, 1510
.   : milestone, 1487,
profiling (1.483 ms) : 1460, 1506
.   : milestone, 1483,
tracing (1.464 ms) : 1439, 1489
.   : milestone, 1464,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.349 ms [1.33 ms, 1.369 ms] -
appsec 1.737 ms [1.713 ms, 1.76 ms] 387.155 µs (28.7%)
appsec_no_iast 1.733 ms [1.709 ms, 1.757 ms] 383.826 µs (28.4%)
iast 1.475 ms [1.453 ms, 1.497 ms] 125.857 µs (9.3%)
profiling 1.53 ms [1.505 ms, 1.555 ms] 180.628 µs (13.4%)
tracing 1.463 ms [1.439 ms, 1.488 ms] 113.93 µs (8.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.329 ms [1.311 ms, 1.347 ms] -
appsec 1.737 ms [1.714 ms, 1.761 ms] 408.286 µs (30.7%)
appsec_no_iast 1.718 ms [1.694 ms, 1.743 ms] 389.193 µs (29.3%)
iast 1.487 ms [1.464 ms, 1.51 ms] 158.267 µs (11.9%)
profiling 1.483 ms [1.46 ms, 1.506 ms] 153.813 µs (11.6%)
tracing 1.464 ms [1.439 ms, 1.489 ms] 134.869 µs (10.1%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d
    dateFormat X
    axisFormat %s
section baseline
no_agent (364.772 µs) : 344, 386
.   : milestone, 365,
iast (481.631 µs) : 459, 504
.   : milestone, 482,
iast_FULL (556.314 µs) : 534, 578
.   : milestone, 556,
iast_GLOBAL (504.25 µs) : 483, 526
.   : milestone, 504,
iast_HARDCODED_SECRET_DISABLED (478.554 µs) : 456, 501
.   : milestone, 479,
iast_INACTIVE (437.07 µs) : 417, 458
.   : milestone, 437,
iast_TELEMETRY_OFF (468.674 µs) : 446, 491
.   : milestone, 469,
tracing (435.442 µs) : 415, 456
.   : milestone, 435,
section candidate
no_agent (365.121 µs) : 346, 385
.   : milestone, 365,
iast (484.463 µs) : 462, 507
.   : milestone, 484,
iast_FULL (547.492 µs) : 526, 569
.   : milestone, 547,
iast_GLOBAL (496.506 µs) : 475, 518
.   : milestone, 497,
iast_HARDCODED_SECRET_DISABLED (482.146 µs) : 461, 503
.   : milestone, 482,
iast_INACTIVE (454.232 µs) : 433, 476
.   : milestone, 454,
iast_TELEMETRY_OFF (471.066 µs) : 449, 493
.   : milestone, 471,
tracing (436.894 µs) : 416, 458
.   : milestone, 437,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 364.772 µs [343.587 µs, 385.957 µs] -
iast 481.631 µs [459.262 µs, 504.0 µs] 116.859 µs (32.0%)
iast_FULL 556.314 µs [534.341 µs, 578.288 µs] 191.542 µs (52.5%)
iast_GLOBAL 504.25 µs [482.593 µs, 525.907 µs] 139.478 µs (38.2%)
iast_HARDCODED_SECRET_DISABLED 478.554 µs [456.467 µs, 500.641 µs] 113.782 µs (31.2%)
iast_INACTIVE 437.07 µs [416.635 µs, 457.506 µs] 72.299 µs (19.8%)
iast_TELEMETRY_OFF 468.674 µs [446.448 µs, 490.901 µs] 103.902 µs (28.5%)
tracing 435.442 µs [415.367 µs, 455.517 µs] 70.671 µs (19.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 365.121 µs [345.538 µs, 384.705 µs] -
iast 484.463 µs [461.881 µs, 507.045 µs] 119.342 µs (32.7%)
iast_FULL 547.492 µs [526.185 µs, 568.799 µs] 182.37 µs (49.9%)
iast_GLOBAL 496.506 µs [475.488 µs, 517.524 µs] 131.385 µs (36.0%)
iast_HARDCODED_SECRET_DISABLED 482.146 µs [461.14 µs, 503.152 µs] 117.025 µs (32.1%)
iast_INACTIVE 454.232 µs [432.88 µs, 475.584 µs] 89.11 µs (24.4%)
iast_TELEMETRY_OFF 471.066 µs [448.682 µs, 493.451 µs] 105.945 µs (29.0%)
tracing 436.894 µs [416.172 µs, 457.617 µs] 71.773 µs (19.7%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master cecile/otelconfMap_telemetry
git_commit_date 1724945155 1724948422
git_commit_sha ea5731f 02438e9
release_version 1.39.0-SNAPSHOT~ea5731fd2d 1.39.0-SNAPSHOT~02438e9be8
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1724950414 1724950414
ci_job_id 621899651 621899651
ci_pipeline_id 43176876 43176876
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.451 ms) : 1440, 1462
.   : milestone, 1451,
appsec (2.223 ms) : 2188, 2257
.   : milestone, 2223,
iast (1.975 ms) : 1933, 2017
.   : milestone, 1975,
iast_GLOBAL (2.012 ms) : 1968, 2056
.   : milestone, 2012,
profiling (1.862 ms) : 1826, 1898
.   : milestone, 1862,
tracing (1.85 ms) : 1817, 1884
.   : milestone, 1850,
section candidate
no_agent (1.458 ms) : 1446, 1469
.   : milestone, 1458,
appsec (2.223 ms) : 2188, 2259
.   : milestone, 2223,
iast (1.971 ms) : 1929, 2014
.   : milestone, 1971,
iast_GLOBAL (2.015 ms) : 1971, 2058
.   : milestone, 2015,
profiling (1.862 ms) : 1827, 1898
.   : milestone, 1862,
tracing (1.83 ms) : 1797, 1863
.   : milestone, 1830,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.451 ms [1.44 ms, 1.462 ms] -
appsec 2.223 ms [2.188 ms, 2.257 ms] 771.989 µs (53.2%)
iast 1.975 ms [1.933 ms, 2.017 ms] 524.022 µs (36.1%)
iast_GLOBAL 2.012 ms [1.968 ms, 2.056 ms] 561.249 µs (38.7%)
profiling 1.862 ms [1.826 ms, 1.898 ms] 411.416 µs (28.4%)
tracing 1.85 ms [1.817 ms, 1.884 ms] 399.652 µs (27.5%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.458 ms [1.446 ms, 1.469 ms] -
appsec 2.223 ms [2.188 ms, 2.259 ms] 765.898 µs (52.5%)
iast 1.971 ms [1.929 ms, 2.014 ms] 513.91 µs (35.3%)
iast_GLOBAL 2.015 ms [1.971 ms, 2.058 ms] 557.2 µs (38.2%)
profiling 1.862 ms [1.827 ms, 1.898 ms] 404.98 µs (27.8%)
tracing 1.83 ms [1.797 ms, 1.863 ms] 372.823 µs (25.6%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.39.0-SNAPSHOT~02438e9be8, baseline=1.39.0-SNAPSHOT~ea5731fd2d
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.47 s) : 15470000, 15470000
.   : milestone, 15470000,
appsec (15.064 s) : 15064000, 15064000
.   : milestone, 15064000,
iast (18.627 s) : 18627000, 18627000
.   : milestone, 18627000,
iast_GLOBAL (18.275 s) : 18275000, 18275000
.   : milestone, 18275000,
profiling (15.193 s) : 15193000, 15193000
.   : milestone, 15193000,
tracing (14.94 s) : 14940000, 14940000
.   : milestone, 14940000,
section candidate
no_agent (15.431 s) : 15431000, 15431000
.   : milestone, 15431000,
appsec (15.041 s) : 15041000, 15041000
.   : milestone, 15041000,
iast (18.884 s) : 18884000, 18884000
.   : milestone, 18884000,
iast_GLOBAL (18.073 s) : 18073000, 18073000
.   : milestone, 18073000,
profiling (15.419 s) : 15419000, 15419000
.   : milestone, 15419000,
tracing (14.915 s) : 14915000, 14915000
.   : milestone, 14915000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.47 s [15.47 s, 15.47 s] -
appsec 15.064 s [15.064 s, 15.064 s] -406.0 ms (-2.6%)
iast 18.627 s [18.627 s, 18.627 s] 3.157 s (20.4%)
iast_GLOBAL 18.275 s [18.275 s, 18.275 s] 2.805 s (18.1%)
profiling 15.193 s [15.193 s, 15.193 s] -277.0 ms (-1.8%)
tracing 14.94 s [14.94 s, 14.94 s] -530.0 ms (-3.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.431 s [15.431 s, 15.431 s] -
appsec 15.041 s [15.041 s, 15.041 s] -390.0 ms (-2.5%)
iast 18.884 s [18.884 s, 18.884 s] 3.453 s (22.4%)
iast_GLOBAL 18.073 s [18.073 s, 18.073 s] 2.642 s (17.1%)
profiling 15.419 s [15.419 s, 15.419 s] -12.0 ms (-0.1%)
tracing 14.915 s [14.915 s, 14.915 s] -516.0 ms (-3.3%)

@cecile75 cecile75 force-pushed the cecile/otelconfMap_telemetry branch from 5c15e92 to 93632f4 Compare August 7, 2024 13:30
@cecile75 cecile75 marked this pull request as ready for review August 9, 2024 12:15
@cecile75 cecile75 requested review from a team as code owners August 9, 2024 12:15
@cecile75 cecile75 requested review from dougqh and nayeem-kamal August 9, 2024 12:15
@PerfectSlayer PerfectSlayer added type: enhancement Enhancements and improvements comp: telemetry Telemetry labels Aug 12, 2024
private void setMetricOtelEnvVarMetric(String metricName, final String... tags) {
if (!metricsQueue.offer(
new OtelEnvMetricCollector.OtelEnvMetric(NAMESPACE, true, metricName, "count", 1, tags))) {
log.warn("Unable to add telemetry metric {} for {}", metricName, tags[0]);
Copy link
Copy Markdown
Contributor

@mcculls mcculls Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use debug here because this isn't actionable by the user

type.equalsIgnoreCase("metrics") ? "runtime_metrics_enabled" : "trace_enabled";
otelEnvMetricCollector.setInvalidOtelEnvVarMetric(
"otel_" + type.toLowerCase() + "_exporter", "dd_" + ddEnvVar);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to do this more cleanly would be to add a second parameter to mapDataCollection which has the Datadog environment variable name - then you could call setInvalidOtelEnvVarMetric if that parameter is not-null, and use its value in the metric tag. If the parameter was null then you'd call setUnsupportedOtelEnvVarMetric instead.

Alternatively you could just call setUnsupportedOtelEnvVarMetric for all 3 cases - TBH either way we're recording that the user attempted to use a value that we didn't support or cannot map. And for these particular settings we will only really ever support none.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with setUnsupportedOtelEnvVarMetric for all 3 cases since it's not going to change in the future.

@cecile75 cecile75 requested a review from mcculls August 12, 2024 15:46
@bm1549 bm1549 requested a review from mtoffl01 August 16, 2024 20:54
Copy link
Copy Markdown
Contributor

@mtoffl01 mtoffl01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question directed at Cecile or anyone reviewing:
I assume the issues mentioned in your PR description are out of scope for this PR? But is there any momentum to get this work adjusted?
Cecile is "flagging" these issues with the PR description + the commented out tests, but there must be a better way to track this, to avoid future confusion about why these cases don't work as expected.

Comment:
Not blocking, but you probably could've formatted the OtelEnvMetricCollectorTest tests into tables/data driven tests, since there's so much repetition across tests.

setup:
injectEnvConfig('DD_SERVICE_NAME', 'DD_TEST_SERVICE', false)
injectEnvConfig('OTEL_SERVICE_NAME', 'OTEL_TEST_SERVICE', false)
injectEnvConfig('DD_TRACE_OTEL_ENABLED', 'true', false)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was confused why all the tests had DD_TRACE_OTEL_ENABLED, but then I deduced that OTEL must be enabled in order for the code that checks otel env vars to run (and ultimately for otel instrumentation to occur).
Maybe this just wasn't obvious to me because Go does not have this requirement, but it might be good to leave a comment in at least one spot stating why you have this env var set to true everywhere.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can deduce it from the test ""otel disabled - no metric" but ok, I'm adding a comment.

metrics.size() == 4
}

/// The following tests don't work because the code is not implemented in the OtelEnvironmentConfigSource
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description on your PR was helpful, maybe add a [succinct] version of that info here for quick recognition. E.g,

Suggested change
/// The following tests don't work because the code is not implemented in the OtelEnvironmentConfigSource
/// Although DD env vars take precedence as expected, no warning/telemetry is outputted when there is conflict between envvars in the below tests because the code is not implemented in the OtelEnvironmentConfigSource

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

public void setHidingOtelEnvVarMetric(String otelName, String ddName) {
setMetricOtelEnvVarMetric(
OTEL_ENV_HIDING_METRIC_NAME,
CONFIG_OTEL_KEY_TAG + otelName,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to use a cache on these concat operations to reduce allocation and ultimately garbage collection.
For now, I don't think it is mandatory.

Copy link
Copy Markdown
Contributor Author

@cecile75 cecile75 Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dougqh At the most, each CONFIG_OTEL_KEY_TAG + otelName will be done once at startup as we're only sending otel env var telemetry at startup maybe with exception of the conf key being modified by RC.

if (this.metricsQueue.isEmpty()) {
return Collections.emptyList();
}
List<OtelEnvMetricCollector.OtelEnvMetric> drained = new ArrayList<>(this.metricsQueue.size());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the sizing of the ArrayList.

Copy link
Copy Markdown
Contributor

@dougqh dougqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're likely to want to add some caches to reduce allocation from the concatenation later on. However, I'm not sure about the usage patterns of the API, so I cannot say for certain if we must. I'll leave that to your discretion.

@cecile75
Copy link
Copy Markdown
Contributor Author

cecile75 commented Aug 28, 2024

Question directed at Cecile or anyone reviewing: I assume the issues mentioned in your PR description are out of scope for this PR? But is there any momentum to get this work adjusted? Cecile is "flagging" these issues with the PR description + the commented out tests, but there must be a better way to track this, to avoid future confusion about why these cases don't work as expected.

@mtoffl01
The scope of this PR was to add telemetry to the already implemented "Otel env var support in APM client libraries" RFC.
The discrepancies I've noticed prevent me to generate telemetry for those cases as we don't check them in the current code. So the least I could do was to write down these discrepancies in this PR description. Do you have an idea on how to track this better?

@mtoffl01
Copy link
Copy Markdown
Contributor

Question directed at Cecile or anyone reviewing: I assume the issues mentioned in your PR description are out of scope for this PR? But is there any momentum to get this work adjusted? Cecile is "flagging" these issues with the PR description + the commented out tests, but there must be a better way to track this, to avoid future confusion about why these cases don't work as expected.

@mtoffl01 The scope of this PR was to add telemetry to the already implemented "Otel env var support in APM client libraries" RFC. The discrepancies I've noticed prevent me to generate telemetry for those cases as we don't check them in the current code. So the least I could do was to write down these discrepancies in this PR description. Do you have an idea on how to track this better?

Yes, system tests (Credit to @zacharycmontoya for the idea). I'll look into what exists, feel free to DM me if you already know. But no need for it to block your PR.

@mtoffl01
Copy link
Copy Markdown
Contributor

mtoffl01 commented Aug 29, 2024

The issues you noted about lack of warning/telemetry can be tracked by enabling (& marking bugs where they exist) the Test_Otel_Env_Vars system tests system tests and enabling the test_telemetry_otel_env_hiding & test_telemetry_otel_env_invalid tests in test_telemetry.py.

The former should already be passing for Java (Except maybe on logging a warning for all conditions), but it was never enabled, so I'm working on that now. The latter can only be enabled after your PR is merged. Where the tests don't pass, we'll mark the bug. And that's how we'll track these issues!

@cecile75 cecile75 merged commit dca0c49 into master Aug 30, 2024
@cecile75 cecile75 deleted the cecile/otelconfMap_telemetry branch August 30, 2024 05:44
@github-actions github-actions Bot added this to the 1.39.0 milestone Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: telemetry Telemetry type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants