ci: fix Windows job flakiness caused by dirty workspace #3694

Merged
Leiyks merged 6 commits into master from leiyks/fix-windows-job-flakiness
Mar 17, 2026
Leiyks commented Mar 6, 2026

Summary

Fixes Windows CI job flakiness caused by leftover files from previous runs on persistent Windows runners.

Root causes fixed:

  • x64/Release/php_ddtrace.dll / .pdb: NTFS junction points/reparse points that PS 5.1 Remove-Item -Recurse and git checkout can't remove, causing git checkout to fail with "Invalid argument"
  • run-tests.php and PHP test files: still held open by processes from the previous job, so access fails with "Permission Denied"
  • Docker containers from previous runs holding php_ddtrace.dll open across jobs

Solution:

  • Extracted shared windows_git_setup() function in generate-common.php used by all three affected Windows jobs
  • Kills leftover Docker containers before cleanup
  • Uses cmd.exe rd /s /q from the parent directory (handles junction points that PS 5.1 can't)
  • Manual git clone + git checkout with $LASTEXITCODE guards (PS 5.1 ignores $PSNativeCommandUseErrorActionPreference)
  • All three jobs now use GIT_STRATEGY: none to skip GitLab's built-in checkout
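
The explicit exit-code guard in the list above can be sketched as follows. This is an illustration in Python, not the PowerShell that the jobs actually run, and `run_checked`, `StepFailed`, and `setup_steps` are hypothetical names introduced here. The idea is that each native command's status is inspected right after it runs, so a failed clone cannot be followed by a build against a missing source tree.

```python
import subprocess
import sys


class StepFailed(RuntimeError):
    """Raised when a setup step exits with a non-zero status."""


def run_checked(args, cwd=None):
    """Run one native command and stop immediately if it fails.

    Plays the role of an explicit exit-code check after each command,
    instead of trusting the shell to abort on its own.
    """
    result = subprocess.run(args, cwd=cwd)
    if result.returncode != 0:
        raise StepFailed(f"{args[0]} exited with {result.returncode}")
    return result.returncode


def setup_steps(repo_url, commit_sha, workspace):
    """Ordered commands for a manual checkout (assumed shape, for illustration)."""
    return [
        ["git", "clone", repo_url, workspace],
        ["git", "-C", workspace, "checkout", commit_sha],
    ]


if __name__ == "__main__":
    # Harmless demonstration: a command that succeeds, then one that fails.
    run_checked([sys.executable, "-c", "raise SystemExit(0)"])
    try:
        run_checked([sys.executable, "-c", "raise SystemExit(3)"])
    except StepFailed as exc:
        print(f"stopped early: {exc}")
```

In a real job, each entry returned by `setup_steps` would be passed to `run_checked` in order, so the first failing command ends the setup.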

Jobs fixed:

  • compile extension windows (generate-package.php)
  • windows test_c (generate-tracer.php)
  • verify windows (generate-package.php) — uses a variant windows_git_setup_with_packages() that saves/restores the packages/ artifact around the workspace wipe
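
The save/restore step used by the verify windows job might look roughly like the sketch below. It is written in Python for illustration only; the function name `wipe_keeping`, the `populate` callback (standing in for the manual clone), and the staging location are assumptions, not taken from the PR.

```python
import shutil
import tempfile
from pathlib import Path


def wipe_keeping(workspace, populate, keep_name="packages"):
    """Empty `workspace`, refill it via `populate`, then restore one artifact dir.

    `populate` stands in for the manual clone; all names are hypothetical.
    """
    workspace = Path(workspace)
    # Stage beside the workspace so the artifact survives the wipe.
    staging = Path(tempfile.mkdtemp(dir=str(workspace.parent)))
    saved = staging / keep_name
    try:
        if (workspace / keep_name).is_dir():
            shutil.move(str(workspace / keep_name), str(saved))
        shutil.rmtree(workspace)
        workspace.mkdir()
        populate(workspace)  # e.g. clone into the now-empty directory
        if saved.is_dir():
            shutil.move(str(saved), str(workspace / keep_name))
    finally:
        # The staging directory is empty again once the artifact is restored.
        shutil.rmtree(staging, ignore_errors=True)
```

Restoring the artifact only after `populate` has run matters because a clone into a non-empty directory would fail.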

datadog-official bot commented Mar 6, 2026

⚠️ Tests

⚠️ Warnings

🧪 1028 Tests failed

testSearchPhpBinaries from integration.DDTrace\Tests\Integration\PHPInstallerTest
DDTrace\Tests\Integration\PHPInstallerTest::testSearchPhpBinaries
Test code or tested code printed unexpected output: Searching for available php binaries, this operation might take a while.
testSimplePushAndProcess from laravel-58-test.DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest
DDTrace\Tests\Integrations\Laravel\V5_8\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b2ea3b00000000046f2ca99853ebe0
tid: 69b2ea3b00000000
hexProcessTraceId: 046f2ca99853ebe0
hexProcessSpanId: df86221a5e291fd3
processTraceId: 319523205483326432
processSpanId: 16106598613981405139

phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:106
testSimplePushAndProcess from laravel-8x-test.DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest
DDTrace\Tests\Integrations\Laravel\V8_x\QueueTest::testSimplePushAndProcess
Test code or tested code printed unexpected output: spanLinksTraceId: 69b2ea6a00000000c578b57468a1d1af
tid: 69b2ea6a00000000
hexProcessTraceId: c578b57468a1d1af
hexProcessSpanId: bfc8cd46aa21d8b7
processTraceId: 14229322534253351343
processSpanId: 13819521159972116663

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

🔗 Commit SHA: b0d2cde

codecov-commenter commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.30%. Comparing base (7d767af) to head (b0d2cde).
⚠️ Report is 1 commit behind head on master.

@@            Coverage Diff             @@
##           master    #3694      +/-   ##
==========================================
- Coverage   62.40%   62.30%   -0.11%     
==========================================
  Files         142      142              
  Lines       13586    13586              
  Branches     1775     1775              
==========================================
- Hits         8479     8465      -14     
- Misses       4301     4314      +13     
- Partials      806      807       +1     

see 3 files with indirect coverage changes



Leiyks added 2 commits March 11, 2026 15:55
The cmd.exe "for /d" loop used to clean the workspace skips directory
entries during deletion (enumerates and deletes in the same pass), leaving
artifact output dirs from previous runs. git clone then fails because the
workspace isn't empty, but $PSNativeCommandUseErrorActionPreference = $true
is silently ignored on Windows PowerShell 5.1 (it requires PS 7.3+), so the
script continues without source code and phpize.bat fails with exit code 10.

Fixes:
- Replace cmd.exe cleanup loop with PowerShell-native Get-ChildItem | Remove-Item
  which handles each entry independently and tolerates locked files
- Add WARNING log line if any items could not be removed (aids debugging)
- Remove $PSNativeCommandUseErrorActionPreference (no-op on PS 5.1)
- Add explicit $LASTEXITCODE checks after git clone, checkout, and submodule init

Applied to both generate-package.php (compile extension windows) and
generate-tracer.php (windows test_c).
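
The per-entry cleanup described in this commit can be sketched as below. This is a Python illustration of the idea, not the Get-ChildItem | Remove-Item pipeline itself, and `clean_entries` is a hypothetical name. The listing is snapshotted before anything is deleted, each entry is removed independently, and anything that cannot be removed is reported rather than aborting the whole pass.

```python
import shutil
from pathlib import Path


def clean_entries(workspace):
    """Delete each top-level entry on its own; return those that survived."""
    leftovers = []
    # Snapshot the listing first so deletions cannot disturb the iteration.
    for entry in list(Path(workspace).iterdir()):
        try:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
        except OSError as exc:
            # A locked file or directory is logged and skipped.
            leftovers.append(entry)
            print(f"WARNING: could not remove {entry}: {exc}")
    return leftovers
```

A caller can treat a non-empty return value as the signal that a later clone into the workspace is likely to fail.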
… cleanup

PowerShell 5.1's Remove-Item -Recurse throws "mismatch between the tag
specified in the request and the tag present in the reparse point" when
the workspace contains Windows junction points (created by switch-php,
e.g. /php <<===>> /php-nts) or NTFS symlinks (from core.symlinks=true
git clone). This caused the entire cleanup to fail silently, leaving
the full previous repo tree in place and making git clone fail again.

Fix: navigate to the parent directory and run cmd.exe "rd /s /q" on
the whole workspace directory. cmd.exe rd removes junction entries
without following them into their targets, avoiding the reparse point
issue entirely. The directory is then recreated empty before returning.
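
As a rough analogy for removing link entries without following them, the Python sketch below deletes a whole workspace directory and recreates it empty. It is not the cmd.exe command from the commit; `recreate_empty` is a hypothetical name, and the example relies on `shutil.rmtree` unlinking symbolic links inside the tree instead of descending into their targets.

```python
import shutil
from pathlib import Path


def recreate_empty(workspace):
    """Delete `workspace` as a whole, then recreate it empty."""
    workspace = Path(workspace)
    # Symbolic links inside the tree are unlinked, not traversed,
    # so whatever they point at outside the workspace is left alone.
    shutil.rmtree(workspace)
    workspace.mkdir()
```

The linked-to directory (for example a PHP install that a workspace link points at) keeps its contents, while the workspace itself ends up empty.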
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from f068d19 to b7d680b March 11, 2026 14:56
php_ddtrace.dll (and other workspace files) are locked with "Access is
denied" when a Docker container from a previous job run is still alive
with the workspace volume mounted. This causes rd /s /q to fail and
git clone to fail again.

Fix: force-remove all containers, running or stopped (docker rm -f $(docker ps -aq)),
before the rd /s /q workspace cleanup, releasing all file handles.
Leiyks changed the title from "ci: fix Windows workspace cleanup and fail-fast for git operations" to "ci: fix Windows job flakiness caused by dirty workspace" Mar 12, 2026
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 7f0beb4 to 181b7e6 March 12, 2026 12:37
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 181b7e6 to c286608 March 12, 2026 14:33
Leiyks marked this pull request as ready for review March 12, 2026 14:48
Leiyks requested a review from a team as a code owner March 12, 2026 14:48
Leiyks added 2 commits March 12, 2026 16:02
…leanup

Applies the same GIT_STRATEGY: none + manual clone pattern to the
verify windows job, saving/restoring the packages/ artifact around the
workspace cleanup to avoid git checkout failures on locked/junction files.
Leiyks force-pushed the leiyks/fix-windows-job-flakiness branch from 394ad5e to b0d2cde March 12, 2026 16:16
Leiyks merged commit 93187d9 into master Mar 17, 2026
2063 checks passed
Leiyks deleted the leiyks/fix-windows-job-flakiness branch March 17, 2026 13:14
github-actions bot added this to the 1.17.0 milestone Mar 17, 2026
bwoebi added a commit that referenced this pull request Mar 20, 2026
…dd-update

* 'master' of github.com:DataDog/dd-trace-php:
  feat(sidecar): add thread mode as fallback connection for restricted environments (#3573)
  Migrate deprecated GitLab runner tags (#3715)
  Adds process tags to remote config payload (#3658)
  perf(config): cache sys getenv (#3670)
  Fixes the tag name for process tags (#3709)
  Fix debugger ephemerals handling (#3685)
  Fix #3651: Prevent crash during shutdown in Frankenphp (#3662)
  Add dynamic instrumentation and exception replay to startup logging (#3667)
  chore: bump bytes crate from 1.9.0 to 1.11.1 to address CVE-2026-25541 (#3669)
  Merge pull request #3701 from DataDog/brian.marks/add-ksr-tag
  ci: fix Windows job flakiness caused by dirty workspace (#3694)
  Fixup CI owner association (#3704)
  Add Rust rewrite of the AppSec helper alongside the C++ implementation
  Remove debug instruction
  Fix script order
  debug
  Fix exploration logic
  chore(ci): add final_status property on junit XML [APMSP-2610]
  Fix DD_TRACE_SYMFONY_HTTP_ROUTE=false
  Optimize Symfony http.route caching with path map approach