Fix DNS resolver object churn for multiple sessions#10897
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #10897 +/- ##
========================================
Coverage 98.75% 98.76%
========================================
Files 129 129
Lines 38931 39085 +154
Branches 2164 2169 +5
========================================
+ Hits 38447 38601 +154
Misses 336 336
Partials 148 148
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
CodSpeed Performance ReportMerging #10897 will improve performances by 8.06%Comparing Summary
Benchmarks breakdown
|
|
oh right, we support multiple event loops still so we need to account for that |
|
Something else is actually leaking them outside of this |
|
I need to move the fixture to top level and make it auto use to find it. I'm going to put it at top level, and than revert it after I fix all the test leaks |
This reverts commit fbc47e7.
|
We have far too many session leaks to fix them in this PR. I'll have to count before and after instead |
Backport to 3.12: 💔 cherry-picking failed — conflicts found❌ Failed to cleanly apply 4624fed on top of patchback/backports/3.12/4624fed82608d9b659c4284814a570e0b332b8db/pr-10897 Backporting merged PR #10897 into master
🤖 @patchback |
(cherry picked from commit 4624fed)
Resolvers were never closed when TCPConnector is closed. Thus, _DNSResolverManager.release_resolver is never called, keeping the resolver referenced from _loop_data, which prevents pycares from garbage collecting it. Regression introduced in 3.12.0 in aio-libs#10897
Resolvers were never closed when TCPConnector is closed. Thus, _DNSResolverManager.release_resolver is never called, keeping the resolver referenced from _loop_data, which prevents pycares from garbage collecting it. Regression introduced in 3.12.0 in aio-libs#10897
Resolvers were never closed when TCPConnector is closed. Thus, _DNSResolverManager.release_resolver is never called, keeping the resolver referenced from _loop_data, which prevents pycares from garbage collecting it. Regression introduced in 3.12.0 in aio-libs#10897
Since #5862 we recreate the aiohttp WebSession to force reload /etc/resolv.conf. However, since aiohttp 3.12.0 (first shipped with Supervisor 2025.05.4) aiohttp no longer initializes a separate DNSResolver instance (aio-libs/aiohttp#10897) for each WebSession. This rendered #5862 essentially useless. Force a separate DNSResolver isntance to force /etc/resolv.conf reread. Note: The automatic reload of /etc/resolv.conf doesn't work in our case since /etc is stored on a overlayfs. Inotify isn't working in this case.
Problem
As reported in #10847, aiohttp currently creates a new
DNSResolverobject for eachAsyncResolverinstance, leading to excessive resolver object churn when using multiple sessions. This causes unnecessary resource usage and potential performance degradation, as the c-ares library documentation indicates that a single resolver can handle unlimited queries.Solution
This PR introduces a shared resolver management system:
_DNSResolverManagersingleton class that maintains a single sharedaiodns.DNSResolverinstanceAsyncResolverto use the shared resolver for default arguments while still supporting custom resolver configurationsget_resolverandrelease_resolvermethodsThe key design point is requiring clients to be explicitly passed to
get_resolver, ensuring that the manager can track all users of the shared resolver and properly clean up when they're done.Implementation Details
_DNSResolverManageris a singleton that ensures only one shared resolver existsweakref.WeakSetto avoid reference cyclesBenefits
Related issue number
fixes #10847