Skip to content

src: fix libuv assertion on windows#61999

Open
liuxingbaoyu wants to merge 2 commits intonodejs:mainfrom
liuxingbaoyu:fix-56645
Open

src: fix libuv assertion on windows#61999
liuxingbaoyu wants to merge 2 commits intonodejs:mainfrom
liuxingbaoyu:fix-56645

Conversation

@liuxingbaoyu
Copy link
Copy Markdown
Contributor

Set the pointer to nullptr after calling uv_close to avoid assertions.

Fixes: #56645

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Feb 26, 2026
Set the pointer to `nullptr` after calling `uv_close`
to avoid assertions.

Fixes: nodejs#56645
@juanarbol
Copy link
Copy Markdown
Member

That handle is gonna leak, and I don't think it's gonna fix anything.

This looks pretty much a AI-generated solution... that does actually nothing regarding your intention. πŸ’”

@liuxingbaoyu
Copy link
Copy Markdown
Contributor Author

This is not an AI-generated solution, it's an inspiration I got from other code.

node/src/node_platform.cc

Lines 411 to 436 in a8eb690

void PerIsolatePlatformData::Shutdown() {
auto foreground_tasks_locked = foreground_tasks_.Lock();
auto foreground_delayed_tasks_locked = foreground_delayed_tasks_.Lock();
foreground_delayed_tasks_locked.PopAll();
foreground_tasks_locked.PopAll();
scheduled_delayed_tasks_.clear();
if (flush_tasks_ != nullptr) {
// Both destroying the scheduled_delayed_tasks_ lists and closing
// flush_tasks_ handle add tasks to the event loop. We keep a count of all
// non-closed handles, and when that reaches zero, we inform any shutdown
// callbacks that the platform is done as far as this Isolate is concerned.
self_reference_ = shared_from_this();
uv_close(reinterpret_cast<uv_handle_t*>(flush_tasks_),
[](uv_handle_t* handle) {
std::unique_ptr<uv_async_t> flush_tasks{
reinterpret_cast<uv_async_t*>(handle)};
PerIsolatePlatformData* platform_data =
static_cast<PerIsolatePlatformData*>(flush_tasks->data);
platform_data->DecreaseHandleCount();
platform_data->self_reference_.reset();
});
flush_tasks_ = nullptr;
}
}

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 26, 2026

Codecov Report

βœ… All modified and coverable lines are covered by tests.
βœ… Project coverage is 89.73%. Comparing base (488a854) to head (a53efb0).
⚠️ Report is 34 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61999      +/-   ##
==========================================
- Coverage   89.77%   89.73%   -0.05%     
==========================================
  Files         674      676       +2     
  Lines      205705   206068     +363     
  Branches    39449    39514      +65     
==========================================
+ Hits       184670   184907     +237     
- Misses      13280    13312      +32     
- Partials     7755     7849      +94     
Files with missing lines Coverage Ξ”
src/node_platform.cc 76.19% <100.00%> (-0.06%) ⬇️

... and 51 files with indirect coverage changes

πŸš€ New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • πŸ“¦ JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@juanarbol
Copy link
Copy Markdown
Member

This is not an AI-generated solution, it's an inspiration I got from other code.

My bad... but sadly still... I don't think this is gonna fix the fundamental problem at all and still leaking.

Comment thread src/node_platform.cc Outdated
double delay_in_seconds) {
auto locked = tasks_.Lock();

if (flush_tasks_ == nullptr) return;
Copy link
Copy Markdown
Member

@addaleax addaleax Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If "just not sending" is indeed a valid solution to this issue (which I have not verified), you could set a boolean flag next to flush_tasks_ instead of making it a heap-allocated variable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

v8 is calling PostDelayedTaskOnWorkerThread here, which looks like a task for wasm caching.
https://github.com/v8/v8/blob/1fb9a7b9671a724ebdcf57db3807d4af56dc6cbb/src/wasm/module-compiler.cc#L4004-L4011
Ignoring it should be safe, because anyway, the delay here is 2 seconds, and the node will have finished long before then.

@liuxingbaoyu
Copy link
Copy Markdown
Contributor Author

@juanarbol I understand now. The AI ​​is telling me that the handle wasn't deleted. AI ​​is more familiar with C++ than I am. πŸ€¦β€β™‚οΈ

Comment thread src/node_platform.cc
timers.push_back(timer);
for (uv_timer_t* timer : timers)
scheduler_->TakeTimerTask(timer);
scheduler_->has_shut_down_ = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This access doesn't look thread safe. I think adding auto locked = scheduler_->tasks_.Lock(); should do it.
Another, and maybe simpler option, could be moving the check to the Stop() method so we are sure after a StopTask is pushed, no other task will.

void Stop() {
    auto locked = tasks_.Lock();
    if (has_shut_down_) return;
    has_shut_down_ = true;
    locked.Push(std::make_unique<StopTask>(this));
    uv_async_send(&flush_tasks_);
}

sanyasamineva0x added a commit to TrueConf/trueconf-openclaw-channel that referenced this pull request Apr 30, 2026
* fix(bin): work around libuv async assertion on Windows exit

trueconf-setup crashes at the end of a successful run on Windows + Node
24.x with `assertion failed !(handle->flags & UV_HANDLE_CLOSING)` in
src/win/async.c. Root cause is the open Node issue nodejs/node#56645 β€”
Node calls `uv_async_send()` on a closing handle when `process.exit()`
short-circuits libuv teardown right after `fetch()`. PR
nodejs/node#61999 has been open since Feb 2026 with no backport.

Two cooperating workarounds, both documented in the upstream thread:

- src/probe.mjs: validateOAuthCredentials now always owns its undici
  Agent and closes it in finally, so no keep-alive sockets dangle into
  loop teardown. Cleanup is best-effort β€” errors are swallowed so the
  OAuth result classification stays unaffected.

- bin/trueconf-setup.mjs: CLI entry sets `process.exitCode` instead of
  calling `process.exit()`, letting the loop drain naturally after the
  dispatcher and clack readline release their handles. Shell exit
  contract is unchanged.

Tests:
- probe.test.mjs: assert dispatcher.close() runs on the success path
  and on the fetch-throws path.
- bin-cli-subprocess-exits.test.ts: spawn the CLI as a real subprocess
  and assert it exits within an 8s budget on success (0) and OAuth
  failure (1) β€” catches loop-hang regressions on any platform.

970 tests | 2 skipped on macOS, tsc clean, oxlint clean. Windows UAT
pending.

* fix(bin): watchdog + tighter exit-budget test on review feedback

Two review findings on the prior libuv-assertion commit:

- bin/trueconf-setup.mjs: a future regression that leaks an event-loop
  handle (clack stdin in raw mode, sharp libvips worker, an unref-missed
  timer) would now silently freeze the wizard's terminal forever instead
  of crashing. Add a 10-second .unref()'d watchdog that prints a clear
  diagnostic to stderr and forces exit, so a leak is loud, not silent.
  The timer self-defeats on the happy path because .unref() lets the
  loop drain ahead of it.

- tests/integration/bin-cli-subprocess-exits.test.ts: the prior version
  had three weaknesses. (1) The CLI entry block does not parse
  --config from argv, so passing `--config /tmp/x.json` was silently
  ignored and the success test wrote a fake-server config into the
  developer's real ~/.openclaw/openclaw.json. Sandbox via `HOME` and
  `USERPROFILE` env vars instead β€” that's the only knob the wizard's
  default `homedir()` lookup reads. (2) The 8s budget was too lax: a
  partial leak (5s timer ref) would slip under the kill timer
  unnoticed. Assert elapsedMs < 3000 explicitly so partial regressions
  fail loud. (3) Hard-coded port `1` for the failure path can SYN-drop
  on CI firewalls and time out, masking a real loop-hang as a flake.
  Reserve a closed loopback port via listen(0) β†’ close instead β€” that
  reliably ECONNREFUSEs.

970 tests | 2 skipped on macOS, tsc clean, oxlint clean. Windows UAT
still pending.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. review wanted PRs that need reviews.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

libuv assertion on Windows with Node.js 23.x

5 participants