test(agent): add timeout and error logging to checkAgentStatus#7724
The checkAgentStatus function in the test agent helper makes an HTTP request to the test agent with no timeout. If the TCP connection is established but no HTTP response is received, the promise hangs indefinitely. This is a likely cause of flaky test timeouts with no indication of the root cause (e.g. the AI Guard Windows CI job). Add a 2s timeout with a descriptive warning so hangs are caught early and the cause is visible in CI logs. Also log unexpected errors (other than ECONNREFUSED, which is the normal "no test agent" case) to aid future debugging.
Overall package size
Self size: 4.95 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.0 | 81.15 kB | 815.98 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | master | #7724 | +/- |
|----------|--------|--------|-----|
| Coverage | 80.38% | 80.39% | |
| Files | 741 | 741 | |
| Lines | 32063 | 32064 | +1 |
| Hits | 25775 | 25777 | +2 |
| Misses | 6288 | 6287 | -1 |
Benchmarks
Benchmark execution time: 2026-03-10 09:58:35
Comparing candidate commit 527b91b in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 230 metrics, 30 unstable metrics.

What does this PR do?
Improves observability and resilience of the `checkAgentStatus()` function in the test agent helper (`packages/dd-trace/test/plugins/agent.js`). This function checks whether a real test agent is running before each test, but its HTTP request previously had no timeout. If a TCP connection was established but no HTTP response was received, the promise would hang indefinitely and surface only as an opaque Mocha timeout.

Changes:
- Adds a 2s timeout to the `checkAgentStatus()` HTTP request, with a descriptive warning when hit, so the root cause is immediately visible in CI logs instead of surfacing as a generic test timeout.
- Logs unexpected errors (other than `ECONNREFUSED`, which is the normal "no test agent" case) to aid future debugging.

Motivation
The AI Guard Windows CI job has been experiencing flaky timeouts in the `beforeEach` hook, which calls `agent.load()` → `checkAgentStatus()`. The lack of a timeout on the HTTP request is a likely cause: if something on the CI runner accepts the TCP connection on port 9126 without speaking HTTP, the test hangs with no useful diagnostic output. This change ensures hangs are caught early with a clear message.
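
For reviewers who want to see the shape of the behavior described above, here is a minimal sketch. It is not the code from this PR: the option names (`host`, `port`, `timeoutMs`, `warn`), the `/info` path, the warning text, and the "HTTP 200 means an agent is present" rule are all assumptions made for illustration. Only the 2s default, port 9126, and the special handling of `ECONNREFUSED` come from the description.

```javascript
'use strict'

const http = require('node:http')

// Illustrative sketch only (hypothetical signature and messages).
// Resolves to true when something answers with HTTP 200, false otherwise,
// and never waits longer than `timeoutMs` for an answer.
function checkAgentStatus ({
  host = '127.0.0.1',
  port = 9126,
  timeoutMs = 2000,
  warn = console.warn
} = {}) {
  return new Promise((resolve) => {
    let settled = false

    // Settle exactly once and always clear the deadline timer.
    const finish = (value) => {
      if (settled) return
      settled = true
      clearTimeout(timer)
      resolve(value)
    }

    // '/info' is an assumed path; the real helper may probe a different one.
    const req = http.request({ host, port, path: '/info', method: 'GET' }, (res) => {
      res.resume() // drain the body so the socket can be released
      finish(res.statusCode === 200)
    })

    // Hard deadline: covers the "TCP connected, no HTTP response" case.
    const timer = setTimeout(() => {
      warn(`checkAgentStatus: no HTTP response from ${host}:${port} after ${timeoutMs}ms`)
      finish(false)
      req.destroy() // the resulting 'error' event is ignored because we already settled
    }, timeoutMs)

    req.on('error', (err) => {
      if (settled) return
      // ECONNREFUSED is the normal "no test agent" case, so it stays silent.
      if (err.code !== 'ECONNREFUSED') {
        warn(`checkAgentStatus: unexpected error: ${err.code || err.message}`)
      }
      finish(false)
    })

    req.end()
  })
}
```

The sketch uses a plain `setTimeout` deadline rather than the `timeout` option of `http.request`, because that option measures socket inactivity rather than total elapsed time. Either approach would catch the hang described in the motivation. Pointing the function at a plain TCP listener that never writes a response should produce a single warning and a `false` result after `timeoutMs`.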