Description
Summary
When a test exhausts all Auto Test Retry (ATR) attempts and still fails, the test.has_failed_all_retries tag is not set on the last retry span. This tag is correctly set for Early Flake Detection (EFD) retries and Attempt-to-Fix retries, but the ATR code path was missed.
The bug affects all framework instrumentations: Jest, Mocha, Vitest, Playwright, and Cucumber.
Steps to Reproduce
- Enable Auto Test Retries (flaky_test_retries_enabled: true in settings)
- Run a test that always fails (exhausts all 5 default retries)
- Inspect the test spans
Expected: The last retry span has test.has_failed_all_retries: true
Actual: No span has the test.has_failed_all_retries tag. Only test.is_retry: true and test.retry_reason: auto_test_retry are set.
Reproduction with mockdog
# Using Shepherd (https://github.com/nicholasgasior/shepherd) e2e test infrastructure:
# Create an always-failing Jest test
cat > atrAlwaysFail.spec.js <<TESTEOF
describe('ATR Always Fail', () => {
  it('test_always_fail', () => {
    expect(1).toBe(2);
  });
});
TESTEOF
# Run with ATR-enabled mockdog scenario (flaky_test_retries_enabled: true)
# Result: 6 spans (1 original + 5 retries), all fail, but no test.has_failed_all_retries tag on any span

Root Cause
In the instrumentation code, failedAllTests / hasFailedAllRetries is only set to true under two conditions:
- Attempt-to-Fix retries (Test Management)
- EFD retries (Early Flake Detection)
There is no code path that sets it for ATR retries.
Affected files
All five framework instrumentations have the same gap:
| File | Variable | ATR sets it? |
|---|---|---|
| packages/datadog-instrumentations/src/jest.js:579-669 | failedAllTests | No |
| packages/datadog-instrumentations/src/mocha/utils.js:260-289 | hasFailedAllRetries | No |
| packages/datadog-instrumentations/src/vitest.js:1027-1047 | hasFailedAllRetries | No |
| packages/datadog-instrumentations/src/playwright.js:388-405 | test._ddHasFailedAllRetries | No |
| packages/datadog-instrumentations/src/cucumber.js:312-368 | hasFailedAllRetries | No |
Jest example (packages/datadog-instrumentations/src/jest.js)
// Line 579: initialized to false
let failedAllTests = false
// Lines 594-599: only set for Attempt-to-Fix
if (isAttemptToFix) {
  // ...
  if (testStatuses.every(status => status === 'fail')) {
    failedAllTests = true // ← ONLY for attempt-to-fix
  }
}
// Lines 667-669: only set for EFD
if (efdRetryCount > 0 && testStatuses.length === efdRetryCount + 1 &&
    testStatuses.every(status => status === 'fail')) {
  failedAllTests = true // ← ONLY for EFD
}
// Lines 711-714: ATR retry detection, no failedAllTests logic
let isAtrRetry = false
if (this.isFlakyTestRetriesEnabled && event.test?.invocations > 1 && !isAttemptToFix && !isEfdRetry) {
  isAtrRetry = true
  // ← Missing: failedAllTests = true when all ATR retries fail
}

Comparison with Ruby (datadog-ci-rb)
Ruby's datadog-ci-rb handles this correctly in lib/datadog/ci/test_retries/component.rb:132-133:
def tag_last_retry(test_span)
  test_span&.set_tag(TAG_HAS_FAILED_ALL_RETRIES, "true") if test_span&.all_executions_failed?
end

This method is called for ALL retry strategies (ATR, EFD, attempt-to-fix) via a unified code path, ensuring consistent tagging regardless of the retry reason.
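A comparable unified path in JavaScript might look like the sketch below. It is not dd-trace-js code: the helper name tagLastRetry, the testStatuses array, and the stub span are assumptions made for illustration, and the string value 'true' is assumed by analogy with the Ruby snippet. Only the tag name comes from this report.

```javascript
'use strict'

const TEST_HAS_FAILED_ALL_RETRIES = 'test.has_failed_all_retries'

// Hypothetical helper: tag the last span of a retried test, regardless of
// which strategy (ATR, EFD, attempt-to-fix) triggered the retries.
// `testStatuses` holds the status of every execution, original run included.
function tagLastRetry (span, testStatuses) {
  const wasRetried = testStatuses.length > 1
  const allFailed = testStatuses.every(status => status === 'fail')
  if (wasRetried && allFailed) {
    span.setTag(TEST_HAS_FAILED_ALL_RETRIES, 'true')
    return true
  }
  return false
}

// Stub span for illustration; it only records the tags set on it.
function createStubSpan () {
  const tags = {}
  return { tags, setTag (key, value) { tags[key] = value } }
}

const span = createStubSpan()
tagLastRetry(span, ['fail', 'fail', 'fail'])
console.log(span.tags)
```

The caller would still have to invoke such a helper only once the final execution has finished, since the sketch cannot tell on its own whether more retries are pending.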
Suggested Fix
For each framework instrumentation, add ATR-specific logic that sets failedAllTests / hasFailedAllRetries to true when:
- ATR is enabled
- The test has exhausted all retry attempts (invocations === maxRetries + 1)
- Every execution (original + all retries) has status fail
Jest example fix
// After the existing EFD block (~line 670), add:
// ATR: check if all retries have been exhausted and all failed
if (this.isFlakyTestRetriesEnabled && !isAttemptToFix && !isEfdRetry) {
  const maxRetries = Number(this.global[RETRY_TIMES]) || 0
  if (event.test?.invocations === maxRetries + 1 && status === 'fail') {
    // All invocations failed (since ATR stops early on first pass,
    // reaching maxRetries + 1 with a fail status means all attempts failed)
    failedAllTests = true
  }
}

The same pattern should be applied to the Mocha, Vitest, Playwright, and Cucumber instrumentations.
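One way to avoid repeating this condition in five places would be a shared predicate. The following is a sketch only: hasFailedAllAtrRetries and its parameter names are hypothetical, chosen to mirror the variables in the Jest example, and each framework would need to supply its own invocation count and retry limit. The maxRetries guard is an added assumption so that a test that was never retried is not tagged.

```javascript
'use strict'

// Hypothetical predicate, not part of dd-trace-js. All field names are
// assumptions chosen to mirror the variables in the Jest example above.
function hasFailedAllAtrRetries ({
  isFlakyTestRetriesEnabled,
  isAttemptToFix,
  isEfdRetry,
  invocations,
  maxRetries,
  status
}) {
  // ATR only applies when the other retry strategies are not in play.
  if (!isFlakyTestRetriesEnabled || isAttemptToFix || isEfdRetry) return false
  // No retries configured means there is nothing to exhaust.
  if (!(maxRetries > 0)) return false
  // ATR stops at the first pass, so a failing final invocation implies that
  // every earlier invocation failed as well.
  return invocations === maxRetries + 1 && status === 'fail'
}

const base = {
  isFlakyTestRetriesEnabled: true,
  isAttemptToFix: false,
  isEfdRetry: false,
  maxRetries: 5
}

console.log(hasFailedAllAtrRetries({ ...base, invocations: 6, status: 'fail' }))
console.log(hasFailedAllAtrRetries({ ...base, invocations: 3, status: 'fail' }))
```

With the default of 5 retries, the first call describes the sixth and final failing invocation, and the second describes a failing retry with attempts still remaining.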
Integration Test Coverage
The existing integration tests only verify TEST_HAS_FAILED_ALL_RETRIES for EFD (jest.spec.js:2464) and attempt-to-fix (jest.spec.js:4746). A new test should be added for the ATR case, similar to the existing EFD test but with isFlakyTestRetriesEnabled: true instead of EFD settings.
Impact
- Severity: Low-medium. The core ATR retry behavior works correctly (retries happen, tagging with test.is_retry and test.retry_reason is correct, build status is correct). Only the informational test.has_failed_all_retries tag is missing.
- Affected users: Anyone using Auto Test Retries who filters or reports on tests that failed all retries. Datadog UI or API queries filtering on this tag will miss ATR-exhausted tests.
Environment
- dd-trace-js version: 6.0.0-pre (master branch, tested 2026-03-03)
- Node.js: v24.13.1
- Test framework: Jest 27.5.1 (but affects all frameworks)
- OS: macOS (darwin arm64)