Skip to content

fix: Handle both SQS StringValue and SNS BinaryValue in context extraction#7205

Merged
robcarlan-datadog merged 2 commits intomasterfrom
rob.carlan/DSMON-901-read-sqs-attributes-b64
Jul 11, 2025
Merged

fix: Handle both SQS StringValue and SNS BinaryValue in context extraction#7205
robcarlan-datadog merged 2 commits intomasterfrom
rob.carlan/DSMON-901-read-sqs-attributes-b64

Conversation

@robcarlan-datadog
Copy link
Copy Markdown
Contributor

@robcarlan-datadog robcarlan-datadog commented Jul 10, 2025

Summary of changes

SNS -> SQS context propagation was broken, as we encode the value of the _datadog header into base64. On the consume side, we assume it is written as a string, which is only true if we send a message directly through SQS.

The fix is to decode the _datadog value based on the type, just as in this PR.

This soles issues where the DSM map doesn't show links between SNS and SQS, as the checkpoint is still created on the consumer, just without a link to the parent.

Reason for change

Implementation details

Test coverage

I tested this via system-tests, with a description of my test methodology in this PR.
The SNS system test above pass when I ran locally with this tracer build.

Other details

@dd-trace-dotnet-ci-bot
Copy link
Copy Markdown

dd-trace-dotnet-ci-bot Bot commented Jul 10, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (72ms)  : 71, 74
     .   : milestone, 72,
    master - mean (72ms)  : 71, 73
     .   : milestone, 72,

    section Baseline
    This PR (7205) - mean (68ms)  : 66, 71
     .   : milestone, 68,
    master - mean (69ms)  : 65, 73
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (1,023ms)  : 996, 1050
     .   : milestone, 1023,
    master - mean (1,023ms)  : 1000, 1045
     .   : milestone, 1023,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (105ms)  : 104, 106
     .   : milestone, 105,
    master - mean (105ms)  : 104, 106
     .   : milestone, 105,

    section Baseline
    This PR (7205) - mean (104ms)  : 102, 107
     .   : milestone, 104,
    master - mean (104ms)  : 101, 107
     .   : milestone, 104,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (703ms)  : 682, 724
     .   : milestone, 703,
    master - mean (705ms)  : 688, 721
     .   : milestone, 705,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (92ms)  : 91, 93
     .   : milestone, 92,
    master - mean (93ms)  : 91, 94
     .   : milestone, 93,

    section Baseline
    This PR (7205) - mean (92ms)  : 90, 94
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (664ms)  : 639, 690
     .   : milestone, 664,
    master - mean (666ms)  : 643, 690
     .   : milestone, 666,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (199ms)  : 193, 206
     .   : milestone, 199,
    master - mean (196ms)  : 191, 200
     .   : milestone, 196,

    section Baseline
    This PR (7205) - mean (195ms)  : 187, 204
     .   : milestone, 195,
    master - mean (192ms)  : 185, 198
     .   : milestone, 192,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (1,151ms)  : 1121, 1180
     .   : milestone, 1151,
    master - mean (1,126ms)  : 1103, 1149
     .   : milestone, 1126,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (276ms)  : 270, 282
     .   : milestone, 276,
    master - mean (272ms)  : 266, 279
     .   : milestone, 272,

    section Baseline
    This PR (7205) - mean (275ms)  : 267, 283
     .   : milestone, 275,
    master - mean (270ms)  : 262, 278
     .   : milestone, 270,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (914ms)  : 881, 947
     .   : milestone, 914,
    master - mean (897ms)  : 870, 924
     .   : milestone, 897,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (273ms)  : 265, 280
     .   : milestone, 273,
    master - mean (264ms)  : 258, 271
     .   : milestone, 264,

    section Baseline
    This PR (7205) - mean (271ms)  : 264, 278
     .   : milestone, 271,
    master - mean (266ms)  : 261, 272
     .   : milestone, 266,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (901ms)  : 879, 924
     .   : milestone, 901,
    master - mean (885ms)  : 857, 913
     .   : milestone, 885,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7205) - mean (271ms)  : 263, 279
     .   : milestone, 271,
    master - mean (267ms)  : 260, 275
     .   : milestone, 267,

    section Baseline
    This PR (7205) - mean (272ms)  : 264, 280
     .   : milestone, 272,
    master - mean (267ms)  : 258, 276
     .   : milestone, 267,

    section CallTarget+Inlining+NGEN
    This PR (7205) - mean (809ms)  : 776, 841
     .   : milestone, 809,
    master - mean (798ms)  : 773, 822
     .   : milestone, 798,

Loading

@robcarlan-datadog robcarlan-datadog marked this pull request as ready for review July 11, 2025 01:01
@robcarlan-datadog robcarlan-datadog requested review from a team as code owners July 11, 2025 01:01
Copy link
Copy Markdown
Member

@andrewlock andrewlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one perf-related suggestion, thaanks!

Copy link
Copy Markdown
Member

@andrewlock andrewlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@robcarlan-datadog
Copy link
Copy Markdown
Contributor Author

Confirmed system-tests passed locally when using the agent after my most recent commit

@robcarlan-datadog robcarlan-datadog merged commit 63ad7e2 into master Jul 11, 2025
150 checks passed
@robcarlan-datadog robcarlan-datadog deleted the rob.carlan/DSMON-901-read-sqs-attributes-b64 branch July 11, 2025 15:17
@github-actions github-actions Bot added this to the vNext-v3 milestone Jul 11, 2025
chojomok pushed a commit that referenced this pull request Jul 15, 2025
…ction (#7205)

## Summary of changes
SNS -> SQS context propagation was broken, as we encode the value of the
`_datadog` header into base64. On the consume side, we assume it is
written as a string, which is only true if we send a message directly
through SQS.

The fix is to decode the `_datadog` value based on the type, just as in
[this](DataDog/datadog-lambda-js#269) PR.

This soles issues where the DSM map doesn't show links between SNS and
SQS, as the checkpoint is still created on the consumer, just without a
link to the parent.

## Reason for change

## Implementation details

## Test coverage
I tested this via system-tests, with a description of my test
methodology in [this
PR](DataDog/system-tests#4897).
The SNS system test above pass when I ran locally with this tracer
build.

## Other details
<!-- Fixes #{issue} -->

<!-- ⚠️ Note: where possible, please obtain 2 approvals prior to
merging. Unless CODEOWNERS specifies otherwise, for external teams it is
typically best to have one review from a team member, and one review
from apm-dotnet. Trivial changes do not require 2 reviews. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants