-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Request fewer threads in the thread pool #57885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @mangod9 Issue Details
Fixes #39559
|
|
The first commit is a non-functional change if it would be easier to review the commits separately. |
Perf results14-core 28-thread x64 processor
32-core arm64 processor
|
src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPoolWorkQueue.cs
Outdated
Show resolved
Hide resolved
- Currently up to proc count yet-to-be-serviced thread requests are made when each work item is enqueued and when each work item is dequeued. Some of the conditions for making a thread request are speculative, making it difficult to reduce the cap without running into issues (see #8951 (comment)). - When the thread pool is not fully loaded, more threads are requested than necessary, causing more threads to wake up and compete for work items, and this shows a perf degradation at some point as load is decreased - Fixed by using a similar solution to https://github.com/dotnet/runtime/blob/50576e326d1015906608e3c06670344e335c3225/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEngine.Unix.cs#L209. With this change, at most one thread is requested at a time. A thread pool thread requests another thread for parallelization of work after dequeuing the first work item and does not request any more threads, leaving it up to whichever thread services the new thread request to parallelize further. - After this, there was a regression in ASP.NET benchmarks on arm64 with default number of connections, see #39559 for more info. Increased the spin-waiting duration on thread pool worker threads to compensate. Spin-waiting longer is preferrable to busy-waiting and competing with other threads for work items, and hitting a full wait and waking up frequently. - The change seems to solve the perf cliff seen at some point as load is decreased, especially on arm64. A portion of the perf improvements seen on arm64 at lower load is due to the increase in spin-waiting. Fixes #39559
|
Rebased |
|
Failure is known and unrelated |
|
Nice improvements on windows-x64: dotnet/perf-autofiling-issues#1366 and dotnet/perf-autofiling-issues#1339 |
|
Ubuntu/x64 improvements: dotnet/perf-autofiling-issues#1330 |
|
Arm64 improvements: dotnet/perf-autofiling-issues#1448 |
|
arm64 improvements: dotnet/perf-autofiling-issues#1543 |
|
arm64 improvements: dotnet/perf-autofiling-issues#1542 |
|
Can these changes explain a perf improvement on Windows for Fortunes? |
Possibly, it seems like the likely change out of the set. Fortunes queues fewer work items to the worker pool, so it would probably benefit more from this change compared to the faster platform benchmarks, but typically thread pool changes (at least on Linux) have not changed perf much on Fortunes, so not sure. Things work a bit differently on Windows though, as the IO pool threads may be competing for time with the worker pool threads. |
Fixes #39559