[cDAC] Convert cDAC dump tests to run on Helix #124782

Draft
max-charlamb wants to merge 4 commits into dotnet:main from max-charlamb:cdac-dumptests-helix-2

@max-charlamb
Member

Replace the DumpCreation + DumpTest ADO stages in runtime-diagnostics.yml with a single CdacDumpTests stage that builds, prepares a Helix payload, and sends dump generation + test execution to Helix machines.

Changes:

  • Add BuildDebuggeesOnly target to DumpTests.targets for building debuggees without generating dumps
  • Add PrepareHelixPayload target + XUnitConsoleRunner package to csproj; generates debuggee-metadata.props from GetDumpTypes for dynamic discovery
  • Create cdac-dump-helix.proj Helix SDK project that imports the generated metadata, builds per-OS commands for dump generation and xunit test execution
  • Update runtime-diagnostics.yml to use Helix instead of ADO stages
  • Add windows_arm64 platform support

Copilot AI review requested due to automatic review settings February 24, 2026 03:42
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR converts the cDAC dump tests from running as ADO pipeline stages to running on Helix infrastructure. The change simplifies the pipeline by replacing separate DumpCreation and DumpTest stages with a single CdacDumpTests stage that generates dumps and runs tests on Helix machines.

Changes:

  • Adds Helix infrastructure for remote test execution
  • Enables windows_arm64 platform support
  • Consolidates dump generation and testing into a single Helix-based stage

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cdac-dump-helix.proj | New Helix SDK project that orchestrates dump generation and test execution on Helix; defines commands, queues, and work items |
| Microsoft.Diagnostics.DataContractReader.DumpTests.csproj | Adds PrepareHelixPayload target to stage test artifacts and debuggees; adds XUnitConsoleRunner package reference |
| DumpTests.targets | Adds BuildDebuggeesOnly target for building debuggees without generating dumps locally |
| runtime-diagnostics.yml | Replaces two-stage ADO workflow (DumpCreation + DumpTest) with single CdacDumpTests stage using Helix |

Comment on lines +91 to +92
<_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
<_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />

Copilot AI Feb 24, 2026

The exact equality conditions won't handle debuggees with DumpTypes="Heap;Full" (which is documented as supported in Debuggees/Directory.Build.props line 14). A debuggee with semicolon-separated dump types would be excluded from both _HeapDebuggee and _FullDebuggee lists.

To fix this, the DumpTypes metadata needs to be split into individual items first before filtering. A common MSBuild pattern for this is:

  1. Create an intermediate ItemGroup that transforms each debuggee×dumptype combination into separate items
  2. Then filter those items by dump type name

Alternatively, use Contains checks, but ensure they don't have false positives (e.g., check for the value equaling the type or containing it surrounded by semicolons).
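
As a minimal illustration of the delimiter-wrapped check described above (a sketch of the matching logic only, not the MSBuild fix itself), the following shell snippet wraps both the list and the candidate type in semicolons before matching. The `has_dump_type` name and the `HeapLite` value are hypothetical and exist only for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the semicolon-separated list in $1
# contains the dump type named in $2 as a whole entry.
has_dump_type() {
  # Wrapping both sides in ';' lets "Heap" match "Heap" and "Heap;Full"
  # but not a longer name such as "HeapLite".
  case ";$1;" in
    *";$2;"*) return 0 ;;
    *) return 1 ;;
  esac
}

has_dump_type "Heap;Full" "Full" && echo "Full dump requested"
has_dump_type "HeapLite" "Heap" || echo "no false positive for HeapLite"
```

A plain substring test would accept `HeapLite` as `Heap`; the surrounding delimiters are what rule that out.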

Suggested change
- <_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
- <_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />
+ <_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap' Or $([System.String]::Copy(';%(DumpTypes);').Contains(';Heap;'))" DumpDir="heap" MiniDumpType="2" />
+ <_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full' Or $([System.String]::Copy(';%(DumpTypes);').Contains(';Full;'))" DumpDir="full" MiniDumpType="4" />

Copilot uses AI. Check for mistakes.
-bl:$(Build.SourcesDirectory)/artifacts/log/DumpTestPayload.binlog
displayName: 'Prepare Helix Payload'
- powershell: |
$testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName

Copilot AI Feb 24, 2026

The PowerShell script uses a wildcard pattern net* to find the testhost directory and selects the first match. This could be fragile if multiple testhost directories exist (e.g., from previous builds or different configurations). Consider using a more specific pattern that includes the target architecture and OS, similar to how DumpTests.targets constructs the TestHostDir path (line 44): $(NetCoreAppCurrent)-$(HostOS)-$(_TestHostConfig)-$(_HostArch).

Suggested change
- $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*" | Select-Object -First 1 -ExpandProperty FullName
+ $testhostDir = Get-ChildItem -Directory -Path "$(Build.SourcesDirectory)/artifacts/bin/testhost/net*-$(osGroup)-$(_BuildConfig)-$(archType)" | Select-Object -First 1 -ExpandProperty FullName

<!-- Map TargetOS + TargetArchitecture to Helix queues -->
<PropertyGroup Condition="'$(HelixTargetQueues)' == ''">
<HelixTargetQueues Condition="'$(TargetOS)' == 'windows' AND '$(TargetArchitecture)' == 'x64'">Windows.11.Amd64.Client.Open</HelixTargetQueues>
Member

Can we keep all Helix queue references in the YAML files under eng? The Helix queue names get updated regularly, and having them spread throughout the repo will make those updates harder.

@max-charlamb force-pushed the cdac-dumptests-helix-2 branch from d82438f to 3f3aefe on February 24, 2026 17:32
Copilot AI review requested due to automatic review settings February 24, 2026 17:46
@max-charlamb force-pushed the cdac-dumptests-helix-2 branch from 3f3aefe to d1ce702 on February 24, 2026 17:46
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment on lines 86 to 94
<_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
<_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />
<_AllDebuggeeMetadata Include="@(_HeapDebuggee);@(_FullDebuggee)" />
</ItemGroup>

<ItemGroup>
<_MetadataLines Include="&lt;Project&gt;" />
<_MetadataLines Include=" &lt;ItemGroup&gt;" />
<_MetadataLines Include="@(_AllDebuggeeMetadata->' &lt;_Debuggee Include=&quot;%(Identity)&quot; DumpDir=&quot;%(DumpDir)&quot; MiniDumpType=&quot;%(MiniDumpType)&quot; /&gt;')" />

Copilot AI Feb 24, 2026

The condition logic doesn't correctly handle debuggees with semicolon-separated DumpTypes (e.g., "Heap;Full"). The string comparison '%(DumpTypes)' == 'Heap' will fail when DumpTypes contains "Heap;Full", causing those debuggees to be excluded from both _HeapDebuggee and _FullDebuggee.

To fix this, you need to split the DumpTypes metadata first, similar to how DumpTests.targets handles this in the _GenerateDumpsForDebuggee target (lines 105-108). One approach would be to create separate items for each dump type before filtering:

<ItemGroup>
  <_DebuggeeTypeExpanded Include="@(_DebuggeeWithTypes->Metadata('DumpTypes'))" DebuggeeName="%(Identity)" />
</ItemGroup>

Then filter based on the expanded items. Alternatively, use the MSBuild Contains function if available, though note that MSBuild item metadata conditions don't support the Contains function directly in this context.

Suggested change
- <_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
- <_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />
- <_AllDebuggeeMetadata Include="@(_HeapDebuggee);@(_FullDebuggee)" />
- </ItemGroup>
- <ItemGroup>
- <_MetadataLines Include="&lt;Project&gt;" />
- <_MetadataLines Include=" &lt;ItemGroup&gt;" />
- <_MetadataLines Include="@(_AllDebuggeeMetadata->' &lt;_Debuggee Include=&quot;%(Identity)&quot; DumpDir=&quot;%(DumpDir)&quot; MiniDumpType=&quot;%(MiniDumpType)&quot; /&gt;')" />
+ <_DebuggeeTypeExpanded Include="@(_DebuggeeWithTypes->'%(DumpTypes)')" DebuggeeName="%(_DebuggeeWithTypes.Identity)" />
+ </ItemGroup>
+ <ItemGroup>
+ <_HeapDebuggee Include="@(_DebuggeeTypeExpanded)" Condition="'%(Identity)' == 'Heap'">
+ <DebuggeeName>%(DebuggeeName)</DebuggeeName>
+ <DumpDir>heap</DumpDir>
+ <MiniDumpType>2</MiniDumpType>
+ </_HeapDebuggee>
+ <_FullDebuggee Include="@(_DebuggeeTypeExpanded)" Condition="'%(Identity)' == 'Full'">
+ <DebuggeeName>%(DebuggeeName)</DebuggeeName>
+ <DumpDir>full</DumpDir>
+ <MiniDumpType>4</MiniDumpType>
+ </_FullDebuggee>
+ <_AllDebuggeeMetadata Include="@(_HeapDebuggee);@(_FullDebuggee)" />
+ </ItemGroup>
+ <ItemGroup>
+ <_MetadataLines Include="&lt;Project&gt;" />
+ <_MetadataLines Include=" &lt;ItemGroup&gt;" />
+ <_MetadataLines Include="@(_AllDebuggeeMetadata->' &lt;_Debuggee Include=&quot;%(DebuggeeName)&quot; DumpDir=&quot;%(DumpDir)&quot; MiniDumpType=&quot;%(MiniDumpType)&quot; /&gt;')" />

@max-charlamb force-pushed the cdac-dumptests-helix-2 branch 2 times, most recently from 3f93840 to 569cce9 on February 24, 2026 19:41
Copilot AI review requested due to automatic review settings February 24, 2026 19:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

<!-- Pre-commands: enable dump generation and set dump root for tests -->
<ItemGroup Condition="'$(TargetOS)' == 'windows'">
<!-- Allow heap dump generation with the unsigned locally-built DAC -->
<HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f" />

Copilot AI Feb 24, 2026

The Windows Helix pre-command that sets DisableAuxProviderSignatureCheck can fail on machines without admin rights. Unlike the existing MSBuild target in DumpTests.targets (which is opt-in and ignores failures), Helix will treat a failing pre-command as a work item failure. Consider making this opt-in (pipeline-controlled) and/or ensuring the command cannot fail the work item (e.g., ignore exit code and emit a diagnostic message instead).

Suggested change
- <HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f" />
+ <HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f || echo Failed to set DisableAuxProviderSignatureCheck (non-fatal)" />

jobTemplate: /eng/pipelines/common/global-build-job.yml
buildConfig: release
platforms: ${{ parameters.cdacDumpPlatforms }}
shouldContinueOnError: true

Copilot AI Feb 24, 2026

shouldContinueOnError: true for the whole CdacDumpTests platform matrix means failures in the build/Helix submission steps can be downgraded to SucceededWithIssues instead of failing the job. This risks the pipeline not gating on dump test failures (unlike the previous DumpTest stage which explicitly failed when tests had issues). Consider removing shouldContinueOnError, or adding a final explicit failure step when the job status is SucceededWithIssues so failures still block the stage as intended.

Suggested change
- shouldContinueOnError: true

Replace the DumpCreation + DumpTest ADO stages in runtime-diagnostics.yml
with a single CdacDumpTests stage that builds debuggees on ADO and sends
dump generation + test execution to Helix machines.

Flow: ADO builds runtime + debuggees, prepares a Helix payload containing
test DLLs, debuggee binaries, and auto-generated dump type metadata. On
Helix, each debuggee is run to produce a crash dump, then xunit tests
validate the dumps.

Changes:
- Add BuildDebuggeesOnly target to DumpTests.targets using Exec with
  dotnet build (ensures implicit NuGet restore, matching _GenerateLocalDump)
- Add PrepareHelixPayload target + XUnitConsoleRunner package to csproj;
  copies tests + debuggees, generates debuggee-metadata.props from
  GetDumpTypes for dynamic debuggee discovery
- Create cdac-dump-helix.proj Helix SDK project that imports the generated
  metadata, builds per-OS dump generation + xunit test execution commands
- Update runtime-diagnostics.yml with single CdacDumpTests stage
- Add windows_arm64 platform support

Co-authored-by: Copilot <[email protected]>
@max-charlamb force-pushed the cdac-dumptests-helix-2 branch from 569cce9 to 3d682f9 on February 25, 2026 17:57
Copilot AI review requested due to automatic review settings February 26, 2026 17:53
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment on lines +62 to +68
<PropertyGroup Condition="'$(TargetOS)' == 'windows'">
<_TestCommands>@(_SourcePlatform->'set "CDAC_DUMP_ROOT=%HELIX_WORKITEM_PAYLOAD%\dumps\%(Identity)" &amp; %HELIX_CORRELATION_PAYLOAD%\dotnet.exe exec --runtimeconfig %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json %HELIX_WORKITEM_PAYLOAD%\tests\xunit.console.dll %HELIX_WORKITEM_PAYLOAD%\tests\Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults_%(Identity).xml -nologo', ' &amp; ')</_TestCommands>
</PropertyGroup>

<PropertyGroup Condition="'$(TargetOS)' != 'windows'">
<_TestCommands>@(_SourcePlatform->'CDAC_DUMP_ROOT=$HELIX_WORKITEM_PAYLOAD/dumps/%(Identity) $HELIX_CORRELATION_PAYLOAD/dotnet exec --runtimeconfig $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.runtimeconfig.json --depsfile $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.deps.json $HELIX_WORKITEM_PAYLOAD/tests/xunit.console.dll $HELIX_WORKITEM_PAYLOAD/tests/Microsoft.Diagnostics.DataContractReader.DumpTests.dll -xml testResults_%(Identity).xml -nologo', ' %3B ')</_TestCommands>
</PropertyGroup>

Copilot AI Feb 26, 2026

The x-plat test work item chains multiple xunit runs with & (Windows) / ; (Unix). This means the overall work item exit code will be the last xunit invocation’s exit code, so failures from earlier source platforms can be masked if a later run passes. If this is intended to be caught via Helix's xUnit reporter, ensure the reporter is guaranteed to ingest all result files; otherwise aggregate failures explicitly (e.g., track an error flag per run and return non-zero at the end) or split into multiple Helix work items (one per source platform).
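
The "track an error flag per run" option could look roughly like the following shell sketch for the Unix side. The `run_all` helper is hypothetical and not part of this PR; the real work item would pass the per-platform xunit invocations rather than `true` and `false`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs every command string it is given and returns
# non-zero if any of them failed, rather than only the last exit code.
run_all() {
  local failed=0
  local cmd
  for cmd in "$@"; do
    # Keep going after a failure so later runs still write their results.
    if ! sh -c "$cmd"; then
      echo "FAILED: $cmd" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Chaining these with ';' would report success because the last one passes.
run_all "true" "false" "true" || echo "at least one run failed"
```

Every run still executes, so all result files are produced, but the work item exit code reflects any failure.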

<!-- Pre-commands: enable dump generation -->
<ItemGroup Condition="'$(TargetOS)' == 'windows'">
<!-- Allow heap dump generation with the unsigned locally-built DAC -->
<HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f" />

Copilot AI Feb 26, 2026

The Helix pre-command that writes DisableAuxProviderSignatureCheck uses reg add ... without ignoring failures. Since this is a machine-wide HKLM write that may require elevation, a failure here can abort dump generation and prevent any dumps from being uploaded. Make this step best-effort (ignore errors) or otherwise ensure it won't fail the work item when the registry write isn't permitted.

Suggested change
- <HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f" />
+ <HelixPreCommand Include="reg add &quot;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\MiniDumpSettings&quot; /v DisableAuxProviderSignatureCheck /t REG_DWORD /d 1 /f || echo DisableAuxProviderSignatureCheck registry write failed (ignored)" />

jobTemplate: /eng/pipelines/common/global-build-job.yml
buildConfig: release
platforms: ${{ parameters.cdacDumpPlatforms }}
shouldContinueOnError: true

Copilot AI Feb 26, 2026

These new Helix-based stages set shouldContinueOnError: true, which makes the send-to-helix-inner-step.yml invocation run with continueOnError: true. Unlike the previous ADO-based DumpTest flow, there’s no final “fail if tests failed” step, so Helix failures can leave the job/stage as SucceededWithIssues instead of failing the pipeline. If dump test failures are meant to be gating, consider removing shouldContinueOnError: true here or reintroducing an explicit failure step when Agent.JobStatus == SucceededWithIssues.
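
One possible shape for the explicit failure step is sketched below as a shell function. The `fail_on_issues` name is hypothetical, and the sketch assumes the agent exposes `Agent.JobStatus` to scripts as the `AGENT_JOBSTATUS` environment variable; the step would also need to be configured to run after earlier steps have reported issues.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns non-zero when the given job status is
# SucceededWithIssues, so a final step can turn it into a hard failure.
fail_on_issues() {
  if [ "$1" = "SucceededWithIssues" ]; then
    echo "Job finished with issues; failing the step." >&2
    return 1
  fi
  return 0
}

# Literal status used here for illustration only.
fail_on_issues "Succeeded" && echo "job is clean"
```

In a pipeline script step, the call would presumably be `fail_on_issues "$AGENT_JOBSTATUS" || exit 1`.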

Suggested change
- shouldContinueOnError: true

<!-- Copy xunit console runner files to tests directory -->
<ItemGroup>
<_XunitConsoleFiles Include="$([System.IO.Path]::GetDirectoryName('$(XunitConsoleNetCoreAppPath)'))\*" />

Copilot AI Feb 26, 2026

PrepareHelixPayload copies xunit console runner files using a hardcoded \* path separator (...GetDirectoryName('$(XunitConsoleNetCoreAppPath)'))\*). When this target runs on Linux/macOS build agents, the backslash can produce an invalid glob and skip copying the runner. Use NormalizePath/NormalizeDirectory (or $(MSBuildThisFileDirectory)-style joins) so the include pattern is OS-agnostic.

Suggested change
- <_XunitConsoleFiles Include="$([System.IO.Path]::GetDirectoryName('$(XunitConsoleNetCoreAppPath)'))\*" />
+ <_XunitConsoleFiles Include="$([MSBuild]::NormalizeDirectory($([System.IO.Path]::GetDirectoryName('$(XunitConsoleNetCoreAppPath)'))))*" />

Comment on lines +89 to +90
<_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
<_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />

Copilot AI Feb 26, 2026

The generated debuggee-metadata.props only emits metadata when %(DumpTypes) is exactly Heap or exactly Full. However DumpTypes is documented/used elsewhere as supporting Heap;Full (generate both). If any debuggee sets Heap;Full, it will be omitted from the Helix metadata and its dumps/tests won't run. Consider splitting %(DumpTypes) on ; and emitting one _Debuggee item per dump type.

Suggested change
- <_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Heap'" DumpDir="heap" MiniDumpType="2" />
- <_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="'%(DumpTypes)' == 'Full'" DumpDir="full" MiniDumpType="4" />
+ <_HeapDebuggee Include="@(_DebuggeeWithTypes)" Condition="$([System.String]::Copy('%(DumpTypes)').Contains('Heap'))" DumpDir="heap" MiniDumpType="2" />
+ <_FullDebuggee Include="@(_DebuggeeWithTypes)" Condition="$([System.String]::Copy('%(DumpTypes)').Contains('Full'))" DumpDir="full" MiniDumpType="4" />
