[MLOS-459] Support enriched evalmetric event submission by gsvigruha · Pull Request #7503 · DataDog/dd-trace-js

gsvigruha · 2026-02-12T03:20:17Z

What does this PR do?

Adds reasoning, metadata, and assessment to the submitEvaluation endpoint.
Adds support for the JSON value type.

Motivation

Close feature gaps with the Python SDK.
Customer FR: https://datadoghq.atlassian.net/browse/MLOS-459

Test

github-actions · 2026-02-12T03:20:58Z

Overall package size

Self size: 4.61 MB
Deduped: 5.45 MB
No deduping: 5.45 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 2.0.6 | 81.92 kB | 813.08 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

codecov · 2026-02-12T03:21:00Z

Codecov Report

❌ Patch coverage is 0% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.18%. Comparing base (c589ad4) to head (27c9b5a).
⚠️ Report is 7 commits behind head on master.

Files with missing lines	Patch %	Lines
packages/dd-trace/src/llmobs/sdk.js	0.00%	21 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #7503      +/-   ##
==========================================
+ Coverage   80.13%   80.18%   +0.04%     
==========================================
  Files         730      731       +1     
  Lines       31104    31212     +108     
==========================================
+ Hits        24926    25026     +100     
- Misses       6178     6186       +8

Flag	Coverage Δ
aiguard-macos	`39.00% <0.00%> (-0.18%)`	⬇️
aiguard-ubuntu	`39.12% <0.00%> (-0.18%)`	⬇️
aiguard-windows	`38.86% <0.00%> (-0.18%)`	⬇️
apm-capabilities-tracing-macos	`48.76% <0.00%> (-0.04%)`	⬇️
apm-capabilities-tracing-ubuntu	`48.79% <0.00%> (-0.04%)`	⬇️
apm-capabilities-tracing-windows	`48.49% <0.00%> (-0.04%)`	⬇️
apm-integrations-child-process	`38.57% <0.00%> (-0.17%)`	⬇️
apm-integrations-couchbase-18	`37.47% <0.00%> (-0.17%)`	⬇️
apm-integrations-couchbase-eol	`37.96% <0.00%> (-0.02%)`	⬇️
apm-integrations-oracledb	`38.00% <0.00%> (-0.17%)`	⬇️
appsec-express	`55.37% <0.00%> (-0.16%)`	⬇️
appsec-fastify	`51.99% <0.00%> (-0.14%)`	⬇️
appsec-graphql	`52.33% <0.00%> (-0.14%)`	⬇️
appsec-kafka	`44.57% <0.00%> (-0.21%)`	⬇️
appsec-ldapjs	`44.32% <0.00%> (-0.15%)`	⬇️
appsec-lodash	`44.00% <0.00%> (-0.15%)`	⬇️
appsec-macos	`58.43% <0.00%> (-0.14%)`	⬇️
appsec-mongodb-core	`49.24% <0.00%> (-0.15%)`	⬇️
appsec-mongoose	`49.93% <0.00%> (-0.14%)`	⬇️
appsec-mysql	`51.33% <0.00%> (-0.14%)`	⬇️
appsec-node-serialize	`43.51% <0.00%> (-0.15%)`	⬇️
appsec-passport	`48.08% <0.00%> (-0.17%)`	⬇️
appsec-postgres	`51.08% <0.00%> (-0.14%)`	⬇️
appsec-sourcing	`42.86% <0.00%> (-0.15%)`	⬇️
appsec-template	`43.68% <0.00%> (-0.15%)`	⬇️
appsec-ubuntu	`58.51% <0.00%> (-0.14%)`	⬇️
appsec-windows	`58.30% <0.00%> (-0.14%)`	⬇️
instrumentations-instrumentation-bluebird	`32.27% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-body-parser	`40.78% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-child_process	`37.88% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-cookie-parser	`34.49% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-express	`34.83% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-express-mongo-sanitize	`34.63% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-express-session	`40.40% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-fs	`31.87% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-generic-pool	`30.19% <ø> (ø)`
instrumentations-instrumentation-http	`39.59% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-knex	`32.27% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-mongoose	`33.62% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-multer	`40.52% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-mysql2	`38.37% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-passport	`44.39% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-passport-http	`44.04% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-passport-local	`44.60% <0.00%> (-0.16%)`	⬇️
instrumentations-instrumentation-pg	`37.78% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-promise	`32.19% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-promise-js	`32.20% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-q	`32.24% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-url	`32.16% <0.00%> (-0.17%)`	⬇️
instrumentations-instrumentation-when	`32.21% <0.00%> (-0.17%)`	⬇️
llmobs-ai	`41.55% <0.00%> (-0.03%)`	⬇️
llmobs-anthropic	`40.60% <0.00%> (-0.16%)`	⬇️
llmobs-bedrock	`39.49% <0.00%> (-0.15%)`	⬇️
llmobs-google-genai	`40.10% <0.00%> (-0.16%)`	⬇️
llmobs-langchain	`39.64% <0.00%> (-0.13%)`	⬇️
llmobs-openai	`44.44% <0.00%> (-0.16%)`	⬇️
llmobs-vertex-ai	`40.31% <0.00%> (-0.23%)`	⬇️
platform-core	`29.71% <ø> (ø)`
platform-esbuild	`32.89% <ø> (ø)`
platform-instrumentations-misc	`40.53% <ø> (ø)`
platform-shimmer	`36.14% <ø> (ø)`
platform-unit-guardrails	`31.27% <ø> (ø)`
plugins-azure-event-hubs	`24.02% <ø> (ø)`
plugins-azure-service-bus	`23.42% <ø> (ø)`
plugins-bullmq	`43.70% <0.00%> (-0.18%)`	⬇️
plugins-cassandra	`38.04% <0.00%> (-0.31%)`	⬇️
plugins-cookie	`25.08% <ø> (ø)`
plugins-cookie-parser	`24.87% <ø> (ø)`
plugins-crypto	`24.72% <ø> (ø)`
plugins-dd-trace-api	`38.42% <0.00%> (-0.18%)`	⬇️
plugins-express-mongo-sanitize	`25.04% <ø> (ø)`
plugins-express-session	`24.83% <ø> (ø)`
plugins-fastify	`42.51% <0.00%> (-0.17%)`	⬇️
plugins-fetch	`38.57% <0.00%> (-0.16%)`	⬇️
plugins-fs	`38.67% <0.00%> (-0.18%)`	⬇️
plugins-generic-pool	`24.06% <ø> (ø)`
plugins-google-cloud-pubsub	`45.72% <0.00%> (-0.14%)`	⬇️
plugins-grpc	`41.25% <0.00%> (-0.17%)`	⬇️
plugins-handlebars	`25.08% <ø> (ø)`
plugins-hapi	`40.42% <0.00%> (-0.32%)`	⬇️
plugins-hono	`40.68% <0.00%> (-0.17%)`	⬇️
plugins-ioredis	`38.47% <0.00%> (-0.17%)`	⬇️
plugins-knex	`24.80% <ø> (ø)`
plugins-ldapjs	`22.61% <ø> (ø)`
plugins-light-my-request	`24.48% <ø> (ø)`
plugins-limitd-client	`32.56% <0.00%> (-0.17%)`	⬇️
plugins-lodash	`24.13% <ø> (ø)`
plugins-mariadb	`39.58% <0.00%> (?)`
plugins-memcached	`38.21% <0.00%> (-0.18%)`	⬇️
plugins-microgateway-core	`39.44% <0.00%> (-0.17%)`	⬇️
plugins-moleculer	`40.81% <0.00%> (-0.17%)`	⬇️
plugins-mongodb	`39.45% <0.00%> (-0.17%)`	⬇️
plugins-mongodb-core	`39.08% <0.00%> (-0.17%)`	⬇️
plugins-mongoose	`39.13% <0.00%> (-0.17%)`	⬇️
plugins-multer	`24.83% <ø> (ø)`
plugins-mysql	`39.22% <0.00%> (-0.15%)`	⬇️
plugins-mysql2	`39.32% <0.00%> (-0.17%)`	⬇️
plugins-node-serialize	`25.12% <ø> (ø)`
plugins-opensearch	`37.87% <0.00%> (-0.31%)`	⬇️
plugins-passport-http	`24.91% <ø> (ø)`
plugins-postgres	`35.73% <0.00%> (-0.14%)`	⬇️
plugins-process	`24.72% <ø> (ø)`
plugins-pug	`25.08% <ø> (ø)`
plugins-redis	`38.95% <0.00%> (-0.18%)`	⬇️
plugins-router	`43.43% <0.00%> (-0.03%)`	⬇️
plugins-sequelize	`23.66% <ø> (ø)`
plugins-test-and-upstream-amqp10	`38.55% <0.00%> (-0.02%)`	⬇️
plugins-test-and-upstream-amqplib	`43.89% <0.00%> (-0.18%)`	⬇️
plugins-test-and-upstream-apollo	`39.27% <0.00%> (-0.15%)`	⬇️
plugins-test-and-upstream-avsc	`38.81% <0.00%> (-0.18%)`	⬇️
plugins-test-and-upstream-bunyan	`33.87% <0.00%> (-0.17%)`	⬇️
plugins-test-and-upstream-connect	`41.10% <0.00%> (-0.17%)`	⬇️
plugins-test-and-upstream-graphql	`40.22% <0.00%> (-0.17%)`	⬇️
plugins-test-and-upstream-koa	`40.67% <0.00%> (-0.17%)`	⬇️
plugins-test-and-upstream-protobufjs	`39.05% <0.00%> (-0.18%)`	⬇️
plugins-test-and-upstream-rhea	`44.14% <0.00%> (-0.18%)`	⬇️
plugins-undici	`39.37% <0.00%> (-0.16%)`	⬇️
plugins-url	`24.72% <ø> (ø)`
plugins-valkey	`38.13% <0.00%> (-0.14%)`	⬇️
plugins-vm	`24.72% <ø> (ø)`
plugins-winston	`34.25% <0.00%> (-0.16%)`	⬇️
plugins-ws	`42.22% <0.00%> (-0.17%)`	⬇️
profiling-macos	`39.97% <0.00%> (-0.17%)`	⬇️
profiling-ubuntu	`40.10% <0.00%> (-0.17%)`	⬇️
profiling-windows	`41.34% <0.00%> (-0.17%)`	⬇️
serverless-azure-functions-client	`23.75% <ø> (ø)`
serverless-azure-functions-eventhubs	`23.75% <ø> (ø)`
serverless-azure-functions-servicebus	`23.75% <ø> (ø)`


pr-commenter · 2026-02-12T03:35:21Z

Benchmarks

Benchmark execution time: 2026-02-13 15:48:25

Comparing candidate commit 27c9b5a in PR branch gergely.svigruha/eval-data-model-new-fields with baseline commit c589ad4 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 231 metrics, 29 unstable metrics.

sabrenner

noting for transparency on the pr that we talked briefly offline that we might also have to include an update to use the v2 endpoint of the evaluation metrics api. this should not have an impact on the existing api here

sabrenner

just a couple more js-specific comments! thanks for attaching manual tests in the pr description 🙇

sabrenner · 2026-02-13T14:44:57Z

packages/dd-trace/src/llmobs/sdk.js

+      if (reasoning !== undefined) {
+        payload.reasoning = reasoning
+      }
+      if (metadata !== undefined) {
+        payload.metadata = metadata
+      }
+      if (assessment !== undefined) {
+        payload.assessment = assessment
+      }


these will allow null values through (in js, null !== undefined). we can do this instead

Suggested change

-      if (reasoning !== undefined) {
-        payload.reasoning = reasoning
-      }
-      if (metadata !== undefined) {
-        payload.metadata = metadata
-      }
-      if (assessment !== undefined) {
-        payload.assessment = assessment
-      }
+      if (reasoning != null) {
+        payload.reasoning = reasoning
+      }
+      if (metadata != null) {
+        payload.metadata = metadata
+      }
+      if (assessment != null) {
+        payload.assessment = assessment
+      }

as null == undefined

thanks! i can never wrap my head around TS null vs undefined and == vs ===
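For readers following the thread, here is a minimal sketch of the difference between the two guards. The helper name `buildEvaluationPayload` and the sample field values are hypothetical and are not taken from the dd-trace-js source; only the `!= null` pattern comes from the suggestion above.

```javascript
// Hypothetical helper for illustration only; not the actual sdk.js code.
function buildEvaluationPayload (base, { reasoning, metadata, assessment } = {}) {
  const payload = { ...base }
  // `x != null` (loose inequality) is false for both null and undefined,
  // and true for every other value, including '' and 0.
  if (reasoning != null) payload.reasoning = reasoning
  if (metadata != null) payload.metadata = metadata
  if (assessment != null) payload.assessment = assessment
  return payload
}

// With `!== undefined`, the null reasoning below would have been copied through.
const payload = buildEvaluationPayload(
  { label: 'quality' },
  { reasoning: null, metadata: undefined, assessment: 'pass' }
)

console.log(null !== undefined) // true: the strict check lets null through
console.log(null != undefined) // false: the loose check filters null as well
console.log(payload) // { label: 'quality', assessment: 'pass' }
```

Note that the loose check only filters `null` and `undefined`, so an empty string or `0` would still be attached to the payload.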

sabrenner

looks great! unless the diff is bugged on my end it looks like we still have

assessment?: string

instead of 'pass' | 'fail' in the index.d.ts but not a blocker

gsvigruha · 2026-02-13T15:39:25Z

I added 'pass' | 'fail' to reasoning 🤦 thanks for the catch
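The `'pass' | 'fail'` union in index.d.ts only constrains TypeScript callers at compile time. As a rough sketch of what an equivalent runtime check could look like for plain JavaScript callers, assuming a hypothetical `validateAssessment` helper (this thread does not show whether sdk.js performs such a check or how it words its error):

```javascript
// Hypothetical runtime guard mirroring the 'pass' | 'fail' type.
// The names and the error message are illustrative assumptions.
const VALID_ASSESSMENTS = new Set(['pass', 'fail'])

function validateAssessment (assessment) {
  // The field is optional, so null and undefined are treated as "not provided".
  if (assessment == null) return undefined
  if (!VALID_ASSESSMENTS.has(assessment)) {
    throw new TypeError(
      `assessment must be 'pass' or 'fail', got ${JSON.stringify(assessment)}`
    )
  }
  return assessment
}

console.log(validateAssessment('pass')) // pass
console.log(validateAssessment(undefined)) // undefined
```

Passing any other value, for example `validateAssessment('maybe')`, throws a `TypeError` in this sketch.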

* Add reasoning, assessment and metadata
* more guards
* nit
* fix syntax
* some unit tests
* more tests
* fix lint
* undefined
* address comments
* partial revert
* revert metadata
* pass / fail
* fix test
* fix test
* json
* token
* doh
* fix message
* fix doc
* fixes
* fix

gsvigruha added 2 commits February 11, 2026 22:16

Add reasoning, assessment and metadata

39b7e5f

more guards

6659f48

gsvigruha added 3 commits February 11, 2026 22:22

nit

050159b

fix syntax

ec65a22

some unit tests

27c0bbb

gsvigruha added 2 commits February 11, 2026 22:36

more tests

f5d8c0c

fix lint

efa1beb

gsvigruha changed the title ~~Gergely.svigruha/eval data model new fields~~ [MLOS-459] Support enriched evalmetric event submission Feb 12, 2026

undefined

a9d1101

gsvigruha added the semver-patch label Feb 12, 2026

gsvigruha marked this pull request as ready for review February 12, 2026 03:55

gsvigruha requested a review from a team as a code owner February 12, 2026 03:55

gsvigruha added semver-minor and removed semver-patch labels Feb 12, 2026

sabrenner self-assigned this Feb 12, 2026

sabrenner reviewed Feb 12, 2026

packages/dd-trace/src/llmobs/sdk.js (outdated, resolved)

packages/dd-trace/src/llmobs/sdk.js (outdated, resolved)

packages/dd-trace/src/llmobs/sdk.js (outdated, resolved)

packages/dd-trace/src/llmobs/sdk.js (resolved)

address comments

c28d69e

gsvigruha requested a review from a team as a code owner February 12, 2026 17:52

gsvigruha added 3 commits February 12, 2026 12:57

partial revert

b4ff637

revert metadata

c42986a

Merge branch 'master' of github.com:DataDog/dd-trace-js into gergely.…

fea646b

…svigruha/eval-data-model-new-fields

sabrenner reviewed Feb 12, 2026

index.d.ts (outdated, resolved)

gsvigruha added 4 commits February 12, 2026 15:19

pass / fail

7395f2c

fix test

9f5ab5b

fix test

f095bae

json

24ea08f

gsvigruha added 2 commits February 12, 2026 16:32

token

62572a6

doh

10dc76c

BridgeAR reviewed Feb 13, 2026

packages/dd-trace/src/llmobs/sdk.js (resolved)

fix message

f1a3609

aniszoubiramar reviewed Feb 13, 2026

index.d.ts (outdated, resolved)

aniszoubiramar reviewed Feb 13, 2026

index.d.ts (outdated, resolved)

fix doc

ef31b62

sabrenner reviewed Feb 13, 2026

fixes

7335b85

sabrenner previously approved these changes Feb 13, 2026

fix

27c9b5a

gsvigruha dismissed sabrenner’s stale review via 27c9b5a February 13, 2026 15:38

sabrenner approved these changes Feb 13, 2026

gsvigruha merged commit 4ed9593 into master Feb 13, 2026
919 of 924 checks passed

gsvigruha deleted the gergely.svigruha/eval-data-model-new-fields branch February 13, 2026 16:32

dd-octo-sts bot mentioned this pull request Feb 14, 2026

v5.87.0 proposal #7523

Merged


4 participants
