Skip to content

Recreate aiohttp ClientSession after DNS plug-in load#5862

Merged
agners merged 6 commits intomainfrom
recreate-websession-after-dns-plug-in-load
May 6, 2025
Merged

Recreate aiohttp ClientSession after DNS plug-in load#5862
agners merged 6 commits intomainfrom
recreate-websession-after-dns-plug-in-load

Conversation

@agners
Copy link
Copy Markdown
Member

@agners agners commented May 1, 2025

Proposed change

Create a temporary ClientSession early in case we need to load version information from the internet. This doesn't use the final DNS setup and hence might fail to load in certain situations since we don't have the fallback mechanims in place yet. But if the DNS container image is present, we'll continue the setup and load the DNS plug-in. We then can recreate the ClientSession such that it uses the DNS plug-in.

This works around an issue with aiodns, which today doesn't reload resolv.conf automatically when it changes. This lead to Supervisor using the initial resolv.conf as created by Docker. It meant that we did not use the DNS plug-in (and its fallback capabilities) in Supervisor. Also it meant that changes to the DNS setup at runtime did not propagate to the aiohttp ClientSession (as observed in #5332).

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

Summary by CodeRabbit

  • Refactor

    • Improved initialization and management of network sessions for better reliability and error handling.
    • Enhanced DNS plugin setup to ensure correct DNS server usage during network operations.
    • Centralized valid API states for consistent system validation.
  • Tests

    • Updated test suites to use improved network session and connectivity mocks, increasing test reliability and clarity.
    • Added new fixtures and mock classes to streamline HTTP response handling in tests.
    • Revised test function signatures to support new mocking strategies and dependencies.
    • Simplified HTTP call mocking by directly mocking session methods in supervisor tests.

agners added 3 commits May 1, 2025 11:08
Create a temporary ClientSession early in case we need to load version
information from the internet. This doesn't use the final DNS setup
and hence might fail to load in certain situations since we don't have
the fallback mechanims in place yet. But if the DNS container image
is present, we'll continue the setup and load the DNS plug-in. We then
can recreate the ClientSession such that it uses the DNS plug-in.

This works around an issue with aiodns, which today doesn't reload
`resolv.conf` automatically when it changes. This lead to Supervisor
using the initial `resolv.conf` as created by Docker. It meant that
we did not use the DNS plug-in (and its fallback capabilities) in
Supervisor. Also it meant that changes to the DNS setup at runtime
did not propagate to the aiohttp ClientSession (as observed in #5332).
Currently in several places pytest actually uses the aiohttp
ClientSession and reaches out to the internet. This is not ideal
for unit tests and should be avoided.

This creates several new fixtures to aid this effort: The `websession`
fixture simply returns a mocked aiohttp.ClientSession, which can be
used whenever a function is tested which needs the global websession.

A separate new fixture to mock the connectivity check named
`supervisor_internet` since this is often used through the Job
decorator which require INTERNET_SYSTEM.

And the `mock_update_data` uses the already existing update json
test data from the fixture directory instead of loading the data
from the internet.
When recreating the aiohttp ClientSession, log information what
nameservers exactly are going to be used.
@agners agners added the bugfix A bug fix label May 1, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented May 1, 2025

📝 Walkthrough

"""

Walkthrough

The changes refactor how the aiohttp ClientSession is initialized and managed within the Supervisor codebase. The ClientSession is no longer created in the constructor but is instead initialized asynchronously through a new init_websession method, allowing for explicit re-creation after DNS configuration changes. The DNS plugin now reinitializes the web session after updating DNS settings to ensure correct resolver usage. Tests are adjusted to use new fixtures and mocks for the web session and supervisor internet connectivity, and a MockResponse class is introduced to facilitate HTTP response mocking in tests. No changes to public API signatures are made outside of test code.

Changes

File(s) Change Summary
supervisor/core.py, supervisor/coresys.py, supervisor/plugins/dns.py Refactored websession (aiohttp ClientSession) management: moved initialization to async init_websession, added explicit reinitialization after DNS setup, updated comments and error handling for session usage.
supervisor/const.py, supervisor/api/middleware/security.py Added VALID_API_STATES constant to represent valid API states; updated middleware to use this constant for state validation instead of explicit checks.
tests/common.py Added MockResponse class to mock aiohttp-like HTTP responses for testing.
tests/conftest.py Added fixtures: supervisor_internet (mocks connectivity check), websession (mocks aiohttp ClientSession), and mock_update_data (mocks updater JSON). Removed explicit websession close in teardown.
tests/addons/test_manager.py, tests/api/test_auth.py, tests/api/test_discovery.py, Updated test signatures to accept new websession or supervisor_internet mock fixtures, removed inline patching/mocking in favor of fixture injection.
tests/api/test_backups.py, tests/api/test_security.py, tests/api/test_store.py, Added supervisor_internet mock fixture to test signatures for backup, security, and store-related API tests.
tests/api/test_supervisor.py, tests/misc/test_tasks.py, tests/os/test_manager.py, Updated test signatures to accept new fixtures, refactored HTTP mocking to use MockResponse and fixture-based setup, removed direct patching of aiohttp methods.
tests/plugins/test_plugin_manager.py, tests/resolution/fixup/test_system_execute_integrity.py Updated test signatures to accept mock_update_data and/or supervisor_internet fixtures.
tests/test_core.py, tests/test_coresys.py, tests/test_supervisor.py, tests/test_updater.py Updated test signatures and HTTP mocking to use new fixtures and MockResponse, refactored test logic to work with new websession management.
pyproject.toml Added new pytest marker no_mock_init_websession to disable autouse mock of init_websession for specific tests.

Sequence Diagram(s)

sequenceDiagram
    participant Supervisor
    participant CoreSys
    participant DNSPlugin
    participant aiohttp.ClientSession

    Supervisor->>CoreSys: await init_websession()
    CoreSys->>aiohttp.ClientSession: Create session with current /etc/resolv.conf
    Note right of CoreSys: Session uses Docker DNS (127.0.0.11) initially

    Supervisor->>DNSPlugin: Setup DNS plugin
    DNSPlugin->>CoreSys: await init_websession()
    CoreSys->>aiohttp.ClientSession: Re-create session with updated /etc/resolv.conf
    Note right of CoreSys: Session now uses Supervisor-managed DNS

    Supervisor->>CoreSys: Use websession for HTTP requests
Loading

Assessment against linked issues

Objective Addressed Explanation
Ensure aiohttp ClientSession uses updated DNS settings after Supervisor DNS changes (#5857)
Recreate ClientSession after DNS plugin updates /etc/resolv.conf (#5857)
Avoid using outdated DNS configuration in long-lived ClientSession (#5857)
Tests reflect new websession lifecycle and DNS behavior (#5857)
"""

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddb8590 and 417f4e4.

📒 Files selected for processing (1)
  • supervisor/coresys.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • supervisor/coresys.py
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Build i386 supervisor
  • GitHub Check: Build amd64 supervisor
  • GitHub Check: Build armv7 supervisor
  • GitHub Check: Build armhf supervisor
  • GitHub Check: Build aarch64 supervisor
  • GitHub Check: Run tests Python 3.13.3
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/common.py (1)

108-134: Well-structured MockResponse implementation for aiohttp testing.

The new MockResponse class provides a clean implementation for mocking aiohttp responses in tests, with appropriate async methods for text/read operations and context manager support.

One small suggestion:

 async def __aexit__(self, exc_type, exc, tb):
     """Exit the context manager."""
+    pass
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25f93cd and 507ce8d.

📒 Files selected for processing (20)
  • supervisor/core.py (1 hunks)
  • supervisor/coresys.py (3 hunks)
  • supervisor/plugins/dns.py (1 hunks)
  • tests/addons/test_manager.py (1 hunks)
  • tests/api/test_auth.py (3 hunks)
  • tests/api/test_backups.py (8 hunks)
  • tests/api/test_discovery.py (1 hunks)
  • tests/api/test_security.py (2 hunks)
  • tests/api/test_store.py (2 hunks)
  • tests/api/test_supervisor.py (3 hunks)
  • tests/common.py (1 hunks)
  • tests/conftest.py (3 hunks)
  • tests/misc/test_tasks.py (3 hunks)
  • tests/os/test_manager.py (3 hunks)
  • tests/plugins/test_plugin_manager.py (3 hunks)
  • tests/resolution/fixup/test_system_execute_integrity.py (2 hunks)
  • tests/test_core.py (3 hunks)
  • tests/test_coresys.py (2 hunks)
  • tests/test_supervisor.py (5 hunks)
  • tests/test_updater.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
`*/**(html|markdown|md)`: - For instructional content in documentation, use a direct and authoritative tone. Avoid expressions of politeness such as 'may' or 'please', and ensure t...

*/**(html|markdown|md): - For instructional content in documentation, use a direct and authoritative tone. Avoid expressions of politeness such as 'may' or 'please', and ensure the goal of the instruction is fronted.

  • Apply the Microsoft Style Guide to ensure documentation maintains clarity and conciseness.
  • In step-by-step instructions, front the location phrase in the instructional sentence.
  • In step-by-step instructions, front the 'goal' in the instructional sentence.
  • In step-by-step instructions, if in doubt what to front, front the 'goal' before the location phrase in the instructional sentence.
  • do not hyphenate terms like 'top-right' or 'bottom-left' with 'corner'
  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
`*/**(html|markdown|md)`: - Use bold to mark UI strings. - If "" are used to mark UI strings, replace them by bold.

*/**(html|markdown|md): - Use bold to mark UI strings.

  • If "" are used to mark UI strings, replace them by bold.
  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
`*/**(html|markdown|md)`: - Be brief in your replies and don't add fluff like "thank you for..." and "Please let me know if"

*/**(html|markdown|md): - Be brief in your replies and don't add fluff like "thank you for..." and "Please let me know if"

  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
`*/**(html|markdown|md)`: - Use sentence-style capitalization also in headings.

*/**(html|markdown|md): - Use sentence-style capitalization also in headings.

  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
`*/**(html|markdown|md)`: do not comment on HTML used for icons

*/**(html|markdown|md): do not comment on HTML used for icons

  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
`*/**(html|markdown|md)`: Avoid flagging inline HTML for embedding videos in future reviews for this repository.

*/**(html|markdown|md): Avoid flagging inline HTML for embedding videos in future reviews for this repository.

  • tests/common.py
  • tests/test_coresys.py
  • supervisor/core.py
  • tests/test_core.py
  • supervisor/coresys.py
  • tests/conftest.py
  • tests/test_updater.py
  • tests/test_supervisor.py
🧬 Code Graph Analysis (11)
tests/api/test_store.py (1)
tests/conftest.py (3)
  • api_client (494-517)
  • coresys (331-410)
  • supervisor_internet (521-525)
tests/misc/test_tasks.py (2)
tests/common.py (3)
  • MockResponse (108-133)
  • text (124-126)
  • update_text (116-118)
tests/conftest.py (3)
  • coresys (331-410)
  • mock_update_data (536-542)
  • supervisor_internet (521-525)
tests/plugins/test_plugin_manager.py (2)
tests/common.py (1)
  • MockResponse (108-133)
tests/conftest.py (3)
  • coresys (331-410)
  • mock_update_data (536-542)
  • supervisor_internet (521-525)
tests/api/test_discovery.py (3)
tests/conftest.py (4)
  • api_client (494-517)
  • coresys (331-410)
  • install_addon_ssh (607-615)
  • websession (529-532)
supervisor/coresys.py (2)
  • CoreSys (60-642)
  • websession (182-186)
supervisor/addons/addon.py (1)
  • Addon (132-1608)
tests/api/test_supervisor.py (1)
tests/conftest.py (4)
  • api_client (494-517)
  • coresys (331-410)
  • supervisor_internet (521-525)
  • websession (529-532)
tests/addons/test_manager.py (2)
tests/conftest.py (3)
  • coresys (331-410)
  • install_addon_ssh (607-615)
  • websession (529-532)
supervisor/coresys.py (2)
  • CoreSys (60-642)
  • websession (182-186)
tests/api/test_auth.py (2)
tests/conftest.py (3)
  • coresys (331-410)
  • api_client (494-517)
  • websession (529-532)
tests/common.py (2)
  • MockResponse (108-133)
  • text (124-126)
tests/os/test_manager.py (3)
tests/common.py (1)
  • MockResponse (108-133)
tests/dbus_service_mocks/rauc.py (1)
  • Rauc (143-212)
tests/conftest.py (3)
  • coresys (331-410)
  • mock_update_data (536-542)
  • supervisor_internet (521-525)
tests/test_core.py (1)
tests/conftest.py (2)
  • coresys (331-410)
  • websession (529-532)
tests/conftest.py (3)
tests/common.py (2)
  • MockResponse (108-133)
  • text (124-126)
supervisor/coresys.py (3)
  • supervisor (292-296)
  • supervisor (299-303)
  • websession (182-186)
supervisor/supervisor.py (1)
  • check_connectivity (287-297)
tests/test_supervisor.py (4)
tests/common.py (2)
  • MockResponse (108-133)
  • reset_last_call (103-105)
tests/conftest.py (3)
  • websession (529-532)
  • coresys (331-410)
  • tmp_supervisor_data (425-444)
supervisor/supervisor.py (3)
  • connectivity (72-74)
  • connectivity (77-85)
  • update_apparmor (123-191)
supervisor/const.py (1)
  • UpdateChannel (443-448)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Build armv7 supervisor
  • GitHub Check: Build armhf supervisor
  • GitHub Check: Build aarch64 supervisor
  • GitHub Check: Run tests Python 3.13.3
🔇 Additional comments (54)
tests/api/test_security.py (2)

3-4: Added AsyncMock import for test parameter support.

This import is required to support the new supervisor_internet parameter in the test function.


41-43: Added supervisor_internet mock parameter to tests.

The test now accepts the supervisor_internet fixture from the test configuration to mock the Supervisor's internet connectivity check. This aligns with the new websession initialization pattern introduced in this PR.

tests/plugins/test_plugin_manager.py (3)

3-4: Added AsyncMock import to support mocking.

The import statement has been updated to include AsyncMock for testing asynchronous connectivity checks.


13-14: Added MockResponse import for HTTP response mocking.

This import allows tests to use the MockResponse class that simulates aiohttp response objects.


42-44: Added fixture parameters for web requests and internet connectivity.

The test function now accepts two important fixtures:

  1. mock_update_data - Provides mocked updater JSON data responses
  2. supervisor_internet - Simulates the Supervisor's internet connectivity check

This change supports the new websession initialization pattern implemented in this PR.

tests/api/test_store.py (2)

5-6: Updated import statement to include AsyncMock.

The import has been expanded to include AsyncMock for the new parameter in test functions.


95-97: Added supervisor_internet parameter and return type annotation.

The function signature has been updated to:

  1. Accept the supervisor_internet fixture of type AsyncMock to mock internet connectivity checks
  2. Add an explicit return type annotation of -> None

This change aligns with the broader refactoring of websession initialization in the Supervisor.

tests/test_coresys.py (1)

40-49: Appropriate refactoring to test async websession initialization.

The test correctly verifies that the custom user agent is set when initializing the ClientSession asynchronously. This change aligns with the PR objective of allowing ClientSession recreation after DNS plugin loading.

tests/api/test_auth.py (3)

3-4: Update to imports to accommodate new mocking approach.

Added support for MockResponse to replace direct aiohttp patching.


12-12: Good addition of MockResponse import.

This supports the shift to using the centralized mock response class.


81-99: Well-implemented transition to injected websession mock.

The test has been correctly updated to use the websession fixture with the new MockResponse class, replacing direct patching of aiohttp. This is consistent with the mocking approach used across the test suite.

tests/resolution/fixup/test_system_execute_integrity.py (2)

19-19: Appropriate addition of supervisor_internet parameter.

Adding the AsyncMock parameter ensures the test works with the updated fixture system that mocks internet connectivity.


45-45: Consistent addition of supervisor_internet parameter.

Similar to the previous test function, this ensures compatibility with the updated fixture system.

tests/api/test_discovery.py (1)

86-91: Properly updated test signature to use injected websession mock

The addition of the websession parameter aligns with the broader pattern of injecting mocks rather than patching or assigning them within test functions. This approach improves test consistency and reduces boilerplate code.

tests/addons/test_manager.py (1)

192-194: Good refactor to use the websession fixture

This change correctly updates the function signature to use the injected websession mock instead of manually creating it inside the test. This is consistent with the pattern used throughout the test suite and improves maintainability.

tests/misc/test_tasks.py (3)

20-20: Updated imports to include MockResponse

Good addition of the MockResponse import to support the refactored mocking approach.


174-179: Improved test setup with specialized fixtures

Excellent refactoring to use the dedicated mock_update_data and supervisor_internet fixtures instead of direct patching. This approach:

  1. Improves test readability by clearly indicating dependencies
  2. Makes test behavior more consistent across the test suite
  3. Centralizes mocking logic to reduce duplication

202-203: Updated version data manipulation to use MockResponse methods

Good adaptation of the version updating logic to work with the new MockResponse approach, properly using text() and update_text() methods.

supervisor/core.py (1)

127-139: Well-documented early websession initialization

Excellent implementation and documentation of the early websession initialization. The detailed comments:

  1. Clearly explain the purpose (getting initial version information)
  2. Document the limitations (Docker DNS without fallbacks might fail)
  3. Explain the different behavior in various installation scenarios
  4. Document the plan to reinitialize the session after DNS plugin setup

This change directly addresses the core issue described in PR #5857 where the supervisor couldn't use the DNS plugin's fallback mechanisms because the ClientSession was created too early.

tests/test_core.py (3)

37-37: Parameter addition follows test pattern refactoring.

The addition of the websession: MagicMock parameter aligns with the PR's goal to refactor how aiohttp ClientSession is initialized and managed. This ensures the test correctly handles the websession dependency that's now externally injected.


55-57: Parameter addition maintains consistent test patterns.

Adding the websession: MagicMock parameter to this test function ensures consistency with the refactored approach for managing aiohttp ClientSession. The parameter signature is properly formatted according to the project's style.


72-74: Parameter addition correctly implements dependency injection.

The addition of the websession: MagicMock parameter here completes the refactoring pattern across all related test functions, ensuring consistent dependency injection throughout the test suite.

tests/api/test_supervisor.py (3)

5-5: Import addition supports new parameter types.

The updated import correctly adds AsyncMock alongside the existing MagicMock import to support the new parameter types in the test functions.


36-38: Parameter addition ensures proper internet connectivity mocking.

Adding the supervisor_internet: AsyncMock parameter allows the test to properly mock supervisor internet connectivity, which is essential for the aiohttp ClientSession initialization changes in the PR.


234-236: Parameter additions maintain consistent testing approach.

The addition of both supervisor_internet: AsyncMock and websession: MagicMock parameters ensures this test function has all the dependencies needed after the ClientSession initialization refactoring. This creates a consistent approach to dependency injection across the test suite.

tests/os/test_manager.py (4)

3-3: Import update adds necessary mock types.

The expanded import now includes AsyncMock alongside existing mock types, properly supporting the new parameter types used in test functions.


13-13: Common test utilities import improves test infrastructure.

Adding the MockResponse import from tests.common promotes code reuse by utilizing a shared test utility for HTTP response mocking, which is needed for the websession tests.


21-23: Parameter additions follow consistent dependency injection pattern.

Adding mock_update_data: MockResponse and supervisor_internet: AsyncMock as parameters ensures this test has all the necessary mocked dependencies after the ClientSession initialization refactoring.


71-73: Parameter addition ensures consistent test approach.

Adding the supervisor_internet: AsyncMock parameter to this test function aligns with the pattern established across the test suite, providing consistent mocking of internet connectivity.

tests/api/test_backups.py (7)

279-280: Parameter addition follows established test pattern.

Adding the supervisor_internet: AsyncMock parameter ensures this test has access to the mocked internet connectivity, following the consistent pattern established throughout the test suite.


476-477: Parameter addition maintains consistent mocking approach.

The supervisor_internet: AsyncMock parameter addition ensures this test properly mocks the supervisor's internet connectivity, which is necessary after the ClientSession initialization refactoring.


1015-1016: Parameter addition ensures proper dependency injection.

The supervisor_internet: AsyncMock parameter addition follows the consistent pattern of dependency injection for mocked internet connectivity established throughout the test suite.


1065-1066: Parameter addition maintains testing pattern consistency.

Adding the supervisor_internet: AsyncMock parameter ensures consistent mocking of supervisor internet connectivity across all backup restoration tests.


1336-1337: Parameter addition follows test refactoring pattern.

The supervisor_internet: AsyncMock parameter addition aligns with the established pattern for mocking supervisor internet connectivity in tests involving backup operations.


1396-1397: Parameter addition maintains consistent test approach.

Adding the supervisor_internet: AsyncMock parameter ensures this test follows the same pattern of dependency injection established across the test suite.


1422-1425: Parameter additions align with test suite refactoring.

The additions of api_client: TestClient, coresys: CoreSys, and supervisor_internet: AsyncMock parameters follow the consistent pattern established across the test suite for dependency injection.

supervisor/coresys.py (3)

102-102: Change to websession initialization looks good!

This change initializes the _websession attribute as None instead of creating it immediately in the constructor, which is crucial for the DNS resolution fix. This supports the PR objective of recreating the session after DNS plugin load.


112-132: Strong implementation of async websession initialization

This new asynchronous method init_websession properly:

  1. Closes any existing session before creating a new one
  2. Creates an AsyncResolver for DNS resolution
  3. Logs the nameservers being used (helpful for debugging)
  4. Sets up a TCPConnector with the resolver
  5. Initializes the ClientSession with proper headers

This implementation addresses the core issue by allowing recreation of the ClientSession after DNS plugin changes, ensuring the updated DNS configuration is used.


184-185: Good defensive programming

Adding this runtime check ensures that code attempting to use the websession before it's properly initialized will fail fast with a clear error message, rather than experiencing a NoneType error later.

tests/conftest.py (4)

56-57: Good import update

Adding the MockResponse import supports the new testing approach for HTTP responses.


520-525: Well-designed supervisor_internet fixture

This fixture cleanly mocks the Supervisor's internet connectivity check, returning True by default. This allows tests to simulate internet connectivity without making actual network requests.


528-532: Well-implemented websession fixture

This fixture replaces the previous approach of patching ClientSession methods directly. Using spec_set=ClientSession ensures type compatibility with the real ClientSession class.


535-542: Good mock_update_data fixture implementation

This fixture creates a consistent mock response for update data, simplifying tests that need to verify version data processing. Using the MockResponse class ensures compatibility with async context managers used in the codebase.

tests/test_supervisor.py (6)

5-5: Import update aligns with new testing approach

Explicitly importing AsyncMock supports the new testing pattern using mock fixtures.


26-26: Good addition of testing utilities

Adding MockResponse and reset_last_call imports supports the updated testing approach.


32-36: Test updated correctly to use websession fixture

The test has been properly updated to use the injected websession mock instead of patching ClientSession directly, which aligns with the new approach to session management.

Also applies to: 42-43


57-60: Proper test adaptation for throttling checks

The connectivity check throttling test has been correctly updated to use the websession fixture, maintaining test functionality with the new session management approach.

Also applies to: 66-67


97-102: Good update to apparmor test

The test for updating apparmor has been properly modified to use the websession fixture and MockResponse, which aligns with the new approach while maintaining test coverage.

Also applies to: 108-111


115-120: Properly updated error test case

The test for apparmor update errors has been correctly adapted to use the websession fixture and MockResponse pattern, maintaining error testing coverage.

tests/test_updater.py (5)

4-4: Good addition of json module

Adding the json module import supports parsing the mock response data correctly.


15-15: Proper testing utilities import

Including MockResponse from common supports the updated testing approach.


24-26: Well-adapted test_fetch_versions function

The test has been properly updated to:

  1. Accept the new mock fixtures
  2. Parse JSON data from the mock response instead of a direct web request
  3. Maintain the same validation logic

This ensures the test continues to work correctly with the new session management approach.

Also applies to: 32-33


77-83: Properly updated test_os_update_path signature

Adding the mock fixtures to the function signature ensures this test works with the new session management approach, while maintaining the same test logic.


94-98: Good update to connectivity test

The test for delayed fetching has been properly adapted to use the websession fixture, ensuring it works with the new session management approach while maintaining test coverage.

Comment on lines +180 to +186

# Reinitializing aiohttp.ClientSession after DNS setup makes sure that
# aiodns is using the right DNS servers (see #5857).
# At this point it should be fairly safe to replace the session since
# we only use the session synchronously during setup and not thorugh the
# API which previously caused issues (see #5851).
await self.coresys.init_websession()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Reinitialize websession after DNS setup to fix resolver issues.

This is a key change that addresses the core issue of this PR. Instead of just checking connectivity, the code now explicitly reinitializes the aiohttp ClientSession to ensure that aiodns uses the correct DNS servers after the DNS plugin is loaded.

The detailed comments explain the reasoning: replacing the session at this point is safe because it's only used synchronously during setup and not through the API, which prevents issues that were previously encountered.

This fix addresses issue #5857 by ensuring that DNS resolution works correctly after the DNS plugin is loaded, allowing for proper use of fallback mechanisms.

Previous attempts to reinitialize the ClientSession have shown
use of the ClientSession after it was closed due to API requets
being handled in parallel to the reinitialization (see #5851).
Make sure this is not possible by refusing to reinitialize the
ClientSession when the API is available.
@agners agners mentioned this pull request May 1, 2025
14 tasks
@agners agners force-pushed the recreate-websession-after-dns-plug-in-load branch from ab88778 to 8a6e041 Compare May 1, 2025 13:54
Also sure we don't create aiohttp ClientSession objects unnecessarily.
@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 2, 2025

aio-libs/aiodns#145 should hopefully solve this problem upstream and avoid the need to do this.

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 2, 2025

Hopefully #5864 will take care of this 🤞

@agners
Copy link
Copy Markdown
Member Author

agners commented May 5, 2025

Hopefully #5864 will take care of this 🤞

Unfortunately, it seems not 😢 I've removed the re-init and instead read the nameservers currently reported by the DNSResolver instance of the aiohttp AsyncResolver. However, even after inserting a asyncio.sleep(10), the nameserver still seems to remain the same name server (127.0.0.11 in this case), while /etc/resolv.conf has the DNS plug-in IP listed first (172.30.32.3) 🤔 .

I've double checked, the relevant Python package versions look good:

f2baee4a1873:/# pip list | grep -e pycares -e aiodns
aiodns                     3.3.0
pycares                    4.8.0

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 5, 2025

However, even after inserting a asyncio.sleep(10), the nameserver still seems to remain the same name server (127.0.0.11 in this case),

Can you share how you are testing this?

@agners
Copy link
Copy Markdown
Member Author

agners commented May 5, 2025

Ok, I've dig into what is going on here, and it seems c-ares rely on inotify to monitor for changes:

https://github.com/c-ares/c-ares/blob/v1.34.5/src/lib/event/ares_event_configchg.c#L157-L162

So we monitor /etc for changes. However, crucially this doesn't work for two rather common situations:

  1. systemd-resolved or NetworkManager might use symlinks to store resolv.conf in a location more suitable where changes are made often. E.g. systemd-resolved symlink /etc/resolv.conf to /run/systemd/resolve/stub-resolv.conf. With that, even when opening /etc/resolv.conf, there is no "change" in /etc directly, since the file modified is located elsewhere.
  2. Containers usually have /etc part of the overlayfs file system. Since /etc is not on the overlayed file system, changes will not get monitored.

So unfortunately, the current implementation is not really reliable in detecting /etc/resolv.conf changes 😢

Ideally we just resolve through a statically configured resolver which is reliably gets DNS changes through other mechanisms. The plug-in DNS actually does that already, its just that Supervisor is a special situation since it sets up the plug-in first.

Ideally we'd make the local resolver available on OS level, such that it is there from the very beginning.

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 5, 2025

So we monitor /etc for changes. However, crucially this doesn't work for two rather common situations:

  1. systemd-resolved or NetworkManager might use symlinks to store resolv.conf in a location more suitable where changes are made often. E.g. systemd-resolved symlink /etc/resolv.conf to /run/systemd/resolve/stub-resolv.conf. With that, even when opening /etc/resolv.conf, there is no "change" in /etc directly, since the file modified is located elsewhere.
  2. Containers usually have /etc part of the overlayfs file system. Since /etc is not on the overlayed file system, changes will not get monitored.

Is this something we can get c-ares to solve upstream?

@agners
Copy link
Copy Markdown
Member Author

agners commented May 5, 2025

So we monitor /etc for changes. However, crucially this doesn't work for two rather common situations:

  1. systemd-resolved or NetworkManager might use symlinks to store resolv.conf in a location more suitable where changes are made often. E.g. systemd-resolved symlink /etc/resolv.conf to /run/systemd/resolve/stub-resolv.conf. With that, even when opening /etc/resolv.conf, there is no "change" in /etc directly, since the file modified is located elsewhere.
  2. Containers usually have /etc part of the overlayfs file system. Since /etc is not on the overlayed file system, changes will not get monitored.

Is this something we can get c-ares to solve upstream?

For 1, I guess we could follow the symlink and monitor that parent directory.

For 2, which is actual the problem here, we could monitor /etc/resolv.conf directly. From what I understand, Docker creates /etc/resolv.conf in the overlayed file system, so monitoring the file itself should trigger when it gets updated. But this relies on current Docker behavior.

@agners agners requested a review from sairon May 6, 2025 08:59
@agners
Copy link
Copy Markdown
Member Author

agners commented May 6, 2025

Since this works right now, and is really the only re-initialization we need (DNS changes afterwards are handled by the resoolver running in the DNS plug-in, the resolv.conf of Supervisor stays static), let's move forward with this PR. It also includes some nice tests cleanup etc. Once resolv.conf changes correctly get picked up, we can simply drop the ClientSession re-initialization after writing the resolv.conf.

Copy link
Copy Markdown
Member

@sairon sairon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, if the tests somehow catch how things are initialized, should be fine.

FWIW, I was thinking about another approach at this - and that's using short-lived ClientSession where relevant. So subclass ClientSession, move the code that's currently in init_websession to the class initializer, and use it as a context manager instead of sys_websession where it makes sense (namely for requests going out of the docker networks). The websession property then could initialize the session (along with an INFO logging entry maybe) instead of raising an error. Anyway with 9d869e1 implemented we might be fine with the single one without the risk of races. Probably just a food for thought now.

@agners
Copy link
Copy Markdown
Member Author

agners commented May 6, 2025

FWIW, I was thinking about another approach at this - and that's using short-lived ClientSession where relevant. So subclass ClientSession, move the code that's currently in init_websession to the class initializer, and use it as a context manager instead of sys_websession where it makes sense (namely for requests going out of the docker networks). The websession property then could initialize the session (along with an INFO logging entry maybe) instead of raising an error. Anyway with 9d869e1 implemented we might be fine with the single one without the risk of races. Probably just a food for thought now.

Yeah I assume that as long as there are no calls through the API, all code using the web session (during startup) is sequential, and hence we should be free to safely initialize. I guess we'll see if that is indeed the case on Sentry rather quickly.

Creating ClientSession for callees and let them use it as a context on their end would make sure the lifetime is controlled on caller side, which would solve the problem. But it also would not allow to share the session which brings features such as connection pooling etc.

@agners agners merged commit 85f8107 into main May 6, 2025
22 checks passed
@agners agners deleted the recreate-websession-after-dns-plug-in-load branch May 6, 2025 14:23
@github-actions github-actions bot locked and limited conversation to collaborators May 8, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Supervisor using DNS settings from host OS /etc/resolv.conf

3 participants