Skip to content

Feature: Passive PUID/PGID Support & Startup Sequence Refactor#1381

Merged
jokob-sk merged 8 commits intonetalertx:mainfrom
adamoutler:PUID
Jan 4, 2026
Merged

Feature: Passive PUID/PGID Support & Startup Sequence Refactor#1381
jokob-sk merged 8 commits intonetalertx:mainfrom
adamoutler:PUID

Conversation

@adamoutler
Copy link
Member

@adamoutler adamoutler commented Jan 3, 2026

This PR introduces support for running the container with arbitrary PUID/PGID values, allowing for better security practices while maintaining backward compatibilityfor existing setups. The goal is to enable these features passively so no configuration changes are required for current users, but those wishing to lock down permissions can now do so.

This PR is split into three distinct commits for easier review:

1. New PUID startup sequence

  • Root Entrypoint: Introduced root-entrypoint.sh to perform necessary setup tasks (like fixing permissions on specific mounts) before dropping privileges to the specified user. root-entrypoint.sh is a somewhat "optional" method to start the system and can be bypassed. It requires the following and will not fail without them:
    • Container starts as UID:0
    • Caps Granted: CHOWN, SETUID, SETGID
  • Refactored Entrypoint: The startup logic in entrypoint.d has been reorganized to handle permission checks and user switching dynamically.
  • Auditing & Safety:
    • 10-capabilities-audit.sh: Checks for required Linux capabilities. Terminates if mandatory capabilities are missing (Layer2 caps were previously baked into the python binary to support, drop-all + failure to add results in 15-mounts failing with an obscure permission error). Warns if any other cap is missing.
    • 60-expected-user-id-match.sh: Verifies the runtime user matches the requested configuration.
    • Others were shifted in numeric order to allow gaps.

2. Unit tests

  • Added a comprehensive suite of Docker-based tests in docker_tests.
  • Scenarios Covered:
    • Standard PUID/PGID usage.
    • Missing capabilities (e.g., NET_ADMIN, NET_RAW).
    • Read-only vs. Writable volume mounts.
    • Configuration file permissions.
  • These tests ensure the new startup sequence behaves correctly under various environment constraints and fails gracefully when requirements aren't met.
  • Enabled previously disabled tests
  • Removed irrelevant and now-duplicate tests

3. Docs

  • Added PUID_PGID_SECURITY.md: A detailed guide on how to use the new PUID/PGID features and the security benefits.
  • Updated docker-troubleshooting: Added troubleshooting steps for common permission and capability issues.

Key Benefits

  • Non-Breaking: Existing docker-compose setups continue to work exactly as before.
  • Security: Enables running NetAlertX as a non-root user, reducing the attack surface.
  • Diagnostics: The new startup scripts provide much clearer error messages regarding missing permissions or capabilities, aiding in faster troubleshooting.

TODO

  • Docker docs needs update for new cap requirements before release
  • Verify URLs of all documents after published (they should work but require merge to test links)

Summary by CodeRabbit

  • New Features

    • PUID/PGID runtime support with automatic permission priming and a root‑to‑service startup flow that drops privileges.
    • Runtime capability audits and explicit troubleshooting messages for missing capabilities.
  • Bug Fixes

    • Startup checks made tolerant/non‑fatal to allow operation under restricted environments.
  • Documentation

    • New security and troubleshooting docs for PUID/PGID and capability-related startup issues.
  • Chores / Tests

    • Updated Docker Compose test scenarios and improved test logging and diagnostics.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

Adds a root-priming entrypoint, shifts privilege-drop to startup, expands capability handling (CHOWN/SETUID/SETGID/NET_BIND_SERVICE), reworks mount/mandatory-folder checks and diagnostics, updates Docker/devcontainer and compose/test configs, and broadens tests and troubleshooting docs for PUID/PGID and capability scenarios.

Changes

Cohort / File(s) Summary
Root entrypoint & main entrypoint
install/production-filesystem/root-entrypoint.sh, install/production-filesystem/entrypoint.sh
New root-entrypoint.sh primes runtime paths, validates numeric PUID/PGID, attempts privilege drop (su-exec), and re-execs entrypoint.sh; entrypoint.sh updated to re-exec via root-entrypoint when appropriate and to defer fatal exits to later checks.
Entrypoint.d — additions & refactors
install/production-filesystem/entrypoint.d/10-capabilities-audit.sh, .../05-data-migration.sh, .../15-mounts.py, .../30-mandatory-folders.sh, .../35-apply-conf-override.sh, .../40-writable-config.sh, .../45-nginx-config.sh, .../60-expected-user-id-match.sh, .../90-excessive-capabilities.sh
New capability-audit script; mounts diagnostics rewritten (table render, new exit conditions); mandatory folders moved to 30-*; data-migration made non-fatal when /data missing; various scripts adjusted for non-fatal behavior and POSIX compatibility.
Entrypoint.d — removals
install/production-filesystem/entrypoint.d/0-storage-permission.sh, .../25-mandatory-folders.sh (old), .../60-user-netalertx.sh, .../85-layer-2-capabilities.sh
Deleted legacy scripts that performed earlier root-centric chown/chmod enforcement and an nmap-based capability probe (replaced by new audit/priming flow).
Dockerfiles & devcontainer images
Dockerfile, .devcontainer/Dockerfile, .devcontainer/resources/devcontainer-Dockerfile
Switches entrypoint shell to /bin/bash in build stages, adds/copies root-entrypoint.sh, ensures entrypoint executables via chmod +x, adds su-exec and extra setcap grants, sets hardened stage USER 0, and adjusts ownership/permission steps to include root-entrypoint.
docker-compose & runtime config
docker-compose.yml, .devcontainer/devcontainer.json
docker-compose.yml moves to host network_mode, expands cap_add (CHOWN, SETUID, SETGID, NET_BIND_SERVICE), introduces PUID/PGID envs, and favors root-owned/tmpfs semantics; devcontainer adds NET_BIND_SERVICE cap and updates post-create/start messages.
Services startup scripts
install/production-filesystem/services/start-nginx.sh, install/production-filesystem/services/start-php-fpm.sh
Nginx uses stderr target; PHP-FPM start uses command array, process-substitution tee for stderr logging, and propagates exit status.
Permissions & capability guidance docs
docs/PUID_PGID_SECURITY.md, docs/docker-troubleshooting/PUID_PGID_SECURITY.md, docs/docker-troubleshooting/missing-capabilities.md, install/production-filesystem/README.md
New PUID/PGID security docs (numeric-only validation, failing on malformed input), CAP_CHOWN troubleshooting, and expanded Boot Flow/security guidance describing root-entrypoint → entrypoint → entrypoint.d → services sequence.
Test infra & test configs
test/docker_tests/*, test/docker_tests/configurations/*.yml, test/docker_tests/test_docker_compose_scenarios.json, test/docker_tests/pytest.ini
Extensive test additions/rewrites: PUID/PGID tests, capability-missing scenarios, enhanced logging/capture, new helpers (_assert_contains_any, audit-stream capture), compose variants updated to add CHOWN and simplify tmpfs uid/gid usage, many docker-compose test files adjusted for new capability expectations.
Dev & editor updates
.devcontainer/scripts/setup.sh, .github/copilot-instructions.md, .gitignore, .vscode/settings.json, pyproject.toml
Setup script adjusts tmp dirs and permissions; copilot instructions reorganized; .gitignore adds .venv and test_mounts/; VSCode adds python test settings; pyproject.toml/pytest paths updated.
Misc tests & tooling
test/docker_tests/test_container_environment.py, test/docker_tests/test_entrypoint.py, additional unit tests
Tests updated to capture and print docker outputs, add new transition tests (root → user), switch compose CLI calls to docker compose, and to assert presence of troubleshooting links and diagnostic logs.

Sequence Diagram(s)

sequenceDiagram
    participant Init as PID 1
    participant RootEP as /root-entrypoint.sh
    participant Entrypoint as /entrypoint.sh
    participant EPD as entrypoint.d scripts
    participant Services as Services (nginx/php-fpm)

    Init->>RootEP: exec (as root)
    RootEP->>RootEP: validate PUID/PGID (numeric)
    RootEP->>RootEP: prime runtime paths (chown/chmod attempts)
    RootEP->>RootEP: attempt privilege drop (su-exec)
    alt drop succeeds
        RootEP->>Entrypoint: re-exec as non-root (PUID:PGID)
    else drop fails
        RootEP->>RootEP: log warning, set check-only / fallback
        RootEP->>Entrypoint: re-exec as current user
    end

    Entrypoint->>Entrypoint: check ENTRYPOINT_PRIMED flag
    Entrypoint->>Entrypoint: perform startup checks (collect FAILED_STATUS)
    Entrypoint->>EPD: source ordered entrypoint.d scripts
    EPD->>EPD: 10-capabilities-audit -> report missing caps
    EPD->>EPD: 15-mounts -> render mount table, set error flags
    EPD->>EPD: 30-mandatory-folders -> ensure volatile dirs/files
    EPD->>EPD: other checks (nginx/config/user-id matches)
    Entrypoint->>Services: launch services (nginx stderr, php-fpm tee)
    Services->>Services: run & handle signals (PID 1 lifecycle)
Loading

Possibly related PRs

Poem

🐰 Root primes the nest, then hands the crown away,
PUIDs in digits, no tricks in play—
Caps are checked, mounts are scanned with care,
Services start, logs tee out to share,
A hopping cheer for safer Docker days! 🥕

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 58.18% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: introducing PUID/PGID support with a startup sequence refactor, which aligns with the comprehensive changes across the codebase.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
install/production-filesystem/entrypoint.d/45-nginx-config.sh (2)

13-35: Consider fail-fast behavior for missing nginx config mount.

When a custom PORT is specified but the nginx config mount is missing, the container now silently falls back to the default port (Line 34). This creates a confusing user experience where:

  1. User sets PORT=8080
  2. Mount is missing, warning is printed (easily missed in logs)
  3. Container starts successfully on port 20211
  4. User expects service on port 8080 but it's unreachable

Consider failing fast with a non-zero exit when a custom PORT is set but the required mount is missing, as this indicates a configuration error that should be corrected.

Based on learnings, I understand the current approach supports test suite compatibility, but this particular case (custom PORT with missing mount) may warrant stricter handling to prevent user confusion.


38-57: Same concern: write test failure with custom PORT should fail fast.

Similar to the missing mount case, when a write test fails (Line 56), the container continues with default config despite the user requesting a custom PORT. This masks a permission issue that should be fixed.

Consider exit 1 here to enforce proper mount permissions when non-default PORT is used.

install/production-filesystem/entrypoint.sh (1)

336-340: Potential type mismatch in numeric comparison.

FAILED_STATUS is assigned string values (e.g., "", "1") but line 338 uses -eq for numeric comparison. If FAILED_STATUS is empty at this point, the comparison may fail or produce unexpected behavior.

🔎 Proposed fix
-if [ "${FAILED_STATUS}" -eq 0 ] && [ "${FAILED_NAME}" != "signal" ]; then
+if [ "${FAILED_STATUS:-1}" -eq 0 ] && [ "${FAILED_NAME}" != "signal" ]; then

Alternatively, ensure FAILED_STATUS is always numeric by initializing it to 0 instead of "" at line 79.

♻️ Duplicate comments (4)
test/docker_tests/configurations/mount-tests/docker-compose.mount-test.api_no-mount.yml (1)

38-40: Same tmpfs ownership concern as db_ramdisk configuration.

The removal of explicit uid/gid from tmpfs mounts follows the same pattern as other mount test configurations. Refer to the verification comment on docker-compose.mount-test.db_ramdisk.yml regarding tmpfs accessibility with PUID/PGID.

test/docker_tests/configurations/mount-tests/docker-compose.mount-test.active_config_no-mount.yml (1)

16-16: Same tmpfs ownership concern as other mount test configurations.

The CHOWN capability addition and mode=1700 tmpfs specification have the same implications as in other mount test configurations: the tmpfs mount defaults to root:root ownership and requires explicit chowning in the startup sequence for the application user to access it after privilege drop.

Also applies to: 34-34

test/docker_tests/configurations/mount-tests/docker-compose.mount-test.api_mounted.yml (1)

16-16: Same tmpfs ownership concern as other mount test configurations.

The CHOWN capability addition and mode=1700 tmpfs specifications follow the same pattern as other mount test configurations, with the same requirement that startup scripts must chown these tmpfs mounts for non-root user access.

Also applies to: 42-44

test/docker_tests/configurations/mount-tests/docker-compose.mount-test.active_config_mounted.yml (1)

16-16: Same tmpfs ownership concern as other mount test configurations.

The CHOWN capability addition and mode=1700 tmpfs specification follow the same pattern as other mount test configurations, requiring startup scripts to chown the tmpfs mount for non-root user access after privilege drop.

Also applies to: 38-38

🧹 Nitpick comments (20)
install/production-filesystem/README.md (1)

89-89: Minor: Clarify PID 1 wording.

Line 89 states "becomes PID 1 and / in the Docker filesystem"—the second part is slightly ambiguous. Consider rephrasing to "becomes PID 1 (the root/init process) in the container" for clarity.

install/production-filesystem/services/start-php-fpm.sh (1)

35-37: Minor: Unusual output format for "done" message.

The echo -ne " done" outputs a leading space with no trailing newline, which seems intended to append to something. However, since line 29 already printed a complete message, this " done" appears on its own line when php-fpm exits. Consider either:

  • Adding a newline: echo " done" (or echo "done")
  • Removing the leading space if standalone output is intended

Also, -e is unnecessary here since there are no escape sequences.

🔎 Suggested fix
-echo -ne " done"
+echo "done"
install/production-filesystem/entrypoint.d/35-apply-conf-override.sh (1)

7-10: Consider validating NETALERTX_CONFIG is set for clearer diagnostics.

If NETALERTX_CONFIG is unset or empty, mkdir will fail but the error message may be unclear. Adding explicit validation would provide better diagnostics for this critical misconfiguration.

🔎 Proposed validation
 # Ensure config directory exists
+if [ -z "$NETALERTX_CONFIG" ]; then
+    >&2 echo "ERROR: NETALERTX_CONFIG environment variable is not set"
+    exit 1
+fi
+
 mkdir -p "$NETALERTX_CONFIG" || {
     >&2 echo "ERROR: Failed to create config directory $NETALERTX_CONFIG"
     exit 1
 }
test/docker_tests/configurations/mount-tests/docker-compose.mount-test.db_noread.yml (1)

15-15: Potentially unnecessary capabilities when not starting as root.

The configuration adds SETUID and SETGID capabilities (lines 19-20), which are typically used to drop privileges from root to a non-root user. However, since the container starts as user 20211:20211 (line 11) rather than root, these capabilities won't have any practical effect—only processes running as UID 0 can successfully use setuid/setgid system calls.

The CHOWN capability (line 15) appears necessary for the test's stated purpose of chowning /data/db, but SETUID/SETGID may be redundant in this configuration.

If this is intentional to test the capability handling logic regardless of effectiveness, consider adding a comment to clarify the rationale.

Also applies to: 19-20

test/docker_tests/configurations/mount-tests/docker-compose.mount-test.log_unwritable.yml (1)

53-53: Unused volume definition.

The test_system_services_active_config volume is defined but not mounted in this service. Line 44 uses tmpfs for /tmp/nginx/active-config instead.

If this volume definition is for consistency with other test configurations or template purposes, it's fine. Otherwise, consider removing it to keep the configuration minimal.

install/production-filesystem/entrypoint.d/15-mounts.py (1)

382-384: Consider using ASCII dash for terminal compatibility.

The HEAVY MINUS SIGN () at line 384 may render inconsistently across terminals or logging systems. The TODO at lines 8-10 already notes plans to replace emoji indicators with plain text.

🔎 Suggested fix for consistent rendering
     CHECK_SYMBOL = "✅"
     CROSS_SYMBOL = "❌"
-    BLANK_SYMBOL = "➖"
+    BLANK_SYMBOL = "—"  # em-dash, or use "--" for pure ASCII
install/production-filesystem/entrypoint.d/30-mandatory-folders.sh (1)

31-31: The || true suppression is intentional design, but consider logging failures for observability.

This chmod error suppression is a consistent pattern across all production entrypoint scripts (30-mandatory-folders.sh, 25-first-run-db.sh, 05-data-migration.sh, root-entrypoint.sh), indicating deliberate handling for scenarios where chmod may fail due to container context constraints or filesystem limitations—while allowing the container to proceed since the directory was successfully created. However, silently ignoring the failure means permission hardening issues go undetected. Consider logging the chmod failure (without halting startup) to aid troubleshooting, especially given the complex UID/GID alignment requirements documented in the deployment guides.

Dockerfile (2)

29-29: Remove redundant hadolint ignore directive.

Based on learnings, DL3018 (Alpine package pinning) is already globally ignored in .hadolint.yaml. This inline directive is redundant and can be removed.

🔎 Proposed cleanup
-# hadolint ignore=DL3018
 RUN apk add --no-cache \

45-46: Consider pinning pip/setuptools/wheel versions for reproducible builds.

While the hadolint ignore allows unpinned upgrades, pinning these base tool versions would improve build reproducibility and prevent potential breakage from upstream changes.

🔎 Example with pinned versions
-# hadolint ignore=DL3013 
-RUN python -m pip install --no-cache-dir --upgrade pip setuptools wheel && \
+RUN python -m pip install --no-cache-dir --upgrade pip==24.0 setuptools==69.0.0 wheel==0.42.0 && \
     pip install --prefer-binary --no-cache-dir -r /tmp/requirements.txt && \

Note: Use current stable versions appropriate for your Python 3.12/Alpine 3.22 environment.

test/docker_tests/test_docker_compose_unit.py (1)

25-30: Suppress false-positive unused argument warnings.

The unused args and kwargs parameters are intentional to match the subprocess.run signature for mocking. Add a suppression comment to silence the static analysis warnings.

🔎 Proposed suppression
-    def fake_run(*args, **kwargs):
+    def fake_run(*args, **kwargs):  # noqa: ARG001
         try:
             return cps.pop(0)
test/docker_tests/test_puid_pgid.py (1)

101-114: Unnecessary env.pop call.

Line 103 removes SKIP_TESTS from env, but the env dict comes from the parametrized test data which doesn't contain SKIP_TESTS. The pop with default None is harmless but adds confusion.

🔎 Proposed simplification
 def test_invalid_puid_pgid_rejected(env: dict[str, str], expected: str) -> None:
-    env = {**env}
-    env.pop("SKIP_TESTS", None)
     result = _run_root_entrypoint(env=env)
install/production-filesystem/root-entrypoint.sh (1)

89-95: Condition at line 90 is always true.

Since PUID and PGID have defaults set at lines 23-24, the condition [ -n "${PUID:-}" ] || [ -n "${PGID:-}" ] is always true. The message will always print when running as non-root.

This may be intentional (always inform the user), but if the intent is to only warn when the user explicitly set PUID/PGID, you'd need to check before applying defaults.

test/docker_tests/test_mount_diagnostics_pytest.py (3)

67-76: Audit stream helper lacks timeout handling.

The Popen subprocess streams logs indefinitely with no mechanism to detect if the process hangs or becomes unresponsive. Consider adding a health check or documenting expected cleanup behavior.

Also note: This function signature differs from the one in test_docker_compose_scenarios.py (single container name vs list). If both are intentional, the naming similarity could cause confusion.


685-689: Silent exception swallowing during cleanup.

The bare except Exception: pass silently swallows all errors during audit process termination. While cleanup errors are often non-fatal, logging them would aid debugging per project guidelines.

🔎 Proposed fix
         if audit_proc:
             try:
                 audit_proc.terminate()
-            except Exception:
-                pass
+            except Exception as e:
+                print(f"[cleanup] audit_proc.terminate() failed: {e}")

799-804: Complex assertion condition could be simplified.

The multi-line or chain for checking capability-related messages is fragile and hard to read. Consider using a list-based approach for clarity.

🔎 Proposed simplification
-        assert (
-            "cap_chown" in logs.lower() or "cap chown" in logs.lower() or "cap_chown" in logs or "capabilities (chown" in logs.lower()
-        )
+        chown_indicators = ["cap_chown", "cap chown", "capabilities (chown"]
+        assert any(indicator in logs.lower() for indicator in chown_indicators), (
+            f"Expected CAP_CHOWN related message in logs. Got: {logs[:500]}"
+        )
test/docker_tests/test_docker_compose_scenarios.py (3)

44-63: Duplicate helper function across test files.

This capture_project_mandatory_required_audit_stream function is nearly identical to the one in test_mount_diagnostics_pytest.py but accepts a list of container names. Consider extracting to a shared test utilities module to reduce duplication.

Note: The function in test_mount_diagnostics_pytest.py (lines 67-76) accepts a single string while this accepts a list - both serve similar purposes but have different signatures.


652-656: Silent exception swallowing during process cleanup.

Same pattern as the other file - bare except Exception: pass silently swallows errors. Consider logging for consistency with the mandatory logging philosophy.

🔎 Proposed fix
     for proc in audit_streams:
         try:
             proc.terminate()
-        except Exception:
-            pass
+        except Exception as e:
+            print(f"[cleanup] audit stream terminate failed: {e}")

1027-1059: NET_RAW capability test mirrors NET_ADMIN test.

Both tests share nearly identical structure. If more capability-specific tests are added, consider parameterizing them.

These two tests could be combined using @pytest.mark.parametrize with the compose file and test name as parameters.

test/docker_tests/test_container_environment.py (2)

467-536: New test for root-to-user transition is well-structured.

The test properly validates the volume permission workflow:

  1. Phase 1: Initialize volume as root
  2. Phase 2: Restart as user 20211
  3. Cleanup in finally block with detailed logging on failure

The cleanup logic at lines 500-533 has multiple nested try-except blocks that silently pass. While cleanup errors are often non-fatal, this pattern appears frequently and could benefit from a shared cleanup helper.

Consider extracting the cleanup pattern into a reusable helper:

def _safe_cleanup_container(container_name: str | None) -> None:
    """Best-effort container cleanup with logging."""
    if not container_name:
        return
    try:
        subprocess.run(["docker", "rm", "-f", container_name], 
                      check=False, stdout=subprocess.DEVNULL, 
                      stderr=subprocess.DEVNULL, timeout=15)
    except Exception as e:
        print(f"[cleanup] Failed to remove {container_name}: {e}")

684-689: Silent exception when setting container_name attribute.

The try-except silently ignores failures when attaching container_name to the result object. This is defensive but the comment "unexpectedly frozen" suggests uncertainty about when this could happen.

🔎 Proposed improvement with logging
     try:
         result.container_name = name  # type: ignore[attr-defined]
-    except Exception:
-        # Be resilient if CompletedProcess is unexpectedly frozen.
-        pass
+    except Exception as e:
+        print(f"[warning] Could not attach container_name to result: {e}")

adamoutler and others added 2 commits January 2, 2026 20:36
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/PUID_PGID_SECURITY.md (1)

1-23: Consider consolidating duplicated security guidance.

This file duplicates most of the content from docs/docker-troubleshooting/PUID_PGID_SECURITY.md (lines 1–23 are nearly identical). If the intent is to provide both a root-level quick reference and a detailed troubleshooting guide, consider making the root version significantly shorter (e.g., a one-paragraph summary with a link to the full guide), or consolidate into a single authoritative file with appropriate cross-references.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 348002c and 850d93e.

📒 Files selected for processing (2)
  • docs/PUID_PGID_SECURITY.md
  • docs/docker-troubleshooting/PUID_PGID_SECURITY.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Devcontainer philosophy: brutal simplicity. One user, everything writable, completely idempotent. No permission checks, no conditional logic, no sudo needed.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: docker_dev
🔇 Additional comments (2)
docs/docker-troubleshooting/PUID_PGID_SECURITY.md (1)

1-43: Documentation is clear and well-structured.

The security rationale is clearly explained with good operator guidance and practical YAML examples. The past grammar correction (hyphenation of "environment-variable-based") has been applied correctly.

docs/PUID_PGID_SECURITY.md (1)

26-26: Verify the referenced external document exists.

Line 26 references docs/docker-troubleshooting/file-permissions.md, but based on the PR objectives, the exact filename and location of the permission/capability troubleshooting documentation are not fully clear. Please confirm this file exists or will be created before merging.

@adamoutler adamoutler marked this pull request as draft January 3, 2026 01:42
@adamoutler adamoutler marked this pull request as ready for review January 3, 2026 23:32
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
test/docker_tests/test_mount_diagnostics_pytest.py (1)

67-77: Function signature inconsistency and missing timeout consideration.

This function shares the same name as capture_project_mandatory_required_audit_stream in test_docker_compose_scenarios.py but has a different signature (single container vs. list, different parameters). Consider:

  1. Signature alignment: Rename this function (e.g., capture_single_container_audit_stream) or align the signatures for consistency.
  2. Timeout consideration: While subprocess.Popen doesn't accept a timeout parameter, document that this audit stream is intentionally long-running and will be terminated in test cleanup (as seen in the finally block at lines 686-689).
Proposed solution: Rename and document
-def capture_project_mandatory_required_audit_stream(container_name: str) -> subprocess.Popen[str]:
-    """Stream container logs to stdout for auditing; required to stay enabled."""
+def capture_single_container_audit_stream(container_name: str) -> subprocess.Popen[str]:
+    """
+    Stream container logs to stdout for auditing; required to stay enabled.
+    
+    Note: This process runs indefinitely and must be terminated by the caller.
+    See test cleanup (finally blocks) for proper termination.
+    """

And update the call site at line 650:

-        audit_proc = capture_project_mandatory_required_audit_stream(container_name)
+        audit_proc = capture_single_container_audit_stream(container_name)

As per coding guidelines, subprocess calls should have explicit timeouts, but this is an exception since the audit stream must run continuously during the test.

install/production-filesystem/root-entrypoint.sh (2)

61-85: Capability checking logic is correct.

The function correctly extracts and validates Linux capability bits using POSIX-compliant methods. The warning for missing SETUID/SETGID alongside NET_* capabilities is a helpful diagnostic.

Line 78 exceeds 120 characters. Consider splitting the condition for readability:

🔎 Optional refactor for line length
-    if [ $((cap_dec & (1 << 10))) -ne 0 ] || [ $((cap_dec & (1 << 12))) -ne 0 ] || [ $((cap_dec & (1 << 13))) -ne 0 ]; then
+    # Check for CAP_NET_BIND_SERVICE(10), CAP_NET_ADMIN(12), CAP_NET_RAW(13)
+    if [ $((cap_dec & (1 << 10))) -ne 0 ] || \
+       [ $((cap_dec & (1 << 12))) -ne 0 ] || \
+       [ $((cap_dec & (1 << 13))) -ne 0 ]; then
         has_net_caps=1
     fi

102-119: Path priming logic is correct with non-fatal error handling.

The best-effort approach (using || true on all permission operations) correctly implements the design goal of not blocking startup on permission issues.

Line 104 is extremely long (>200 characters) and difficult to read/maintain. Consider using a multi-line array:

🔎 Optional refactor for readability
 _prime_paths() {
     runtime_root="${NETALERTX_RUNTIME_BASE:-/tmp}"
-    paths="/tmp ${NETALERTX_DATA:-/data} ${NETALERTX_CONFIG:-/data/config} ${NETALERTX_DB:-/data/db} ${NETALERTX_LOG:-${runtime_root}/log} ${NETALERTX_PLUGINS_LOG:-${runtime_root}/log/plugins} ${NETALERTX_API:-${runtime_root}/api} ${SYSTEM_SERVICES_RUN:-${runtime_root}/run} ${SYSTEM_SERVICES_RUN_TMP:-${runtime_root}/run/tmp} ${SYSTEM_SERVICES_RUN_LOG:-${runtime_root}/run/logs} ${SYSTEM_SERVICES_ACTIVE_CONFIG:-${runtime_root}/nginx/active-config} ${runtime_root}/nginx"
+    paths="
+        /tmp
+        ${NETALERTX_DATA:-/data}
+        ${NETALERTX_CONFIG:-/data/config}
+        ${NETALERTX_DB:-/data/db}
+        ${NETALERTX_LOG:-${runtime_root}/log}
+        ${NETALERTX_PLUGINS_LOG:-${runtime_root}/log/plugins}
+        ${NETALERTX_API:-${runtime_root}/api}
+        ${SYSTEM_SERVICES_RUN:-${runtime_root}/run}
+        ${SYSTEM_SERVICES_RUN_TMP:-${runtime_root}/run/tmp}
+        ${SYSTEM_SERVICES_RUN_LOG:-${runtime_root}/run/logs}
+        ${SYSTEM_SERVICES_ACTIVE_CONFIG:-${runtime_root}/nginx/active-config}
+        ${runtime_root}/nginx
+    "
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 850d93e and f549db3.

📒 Files selected for processing (17)
  • .devcontainer/Dockerfile
  • .devcontainer/devcontainer.json
  • .devcontainer/resources/devcontainer-Dockerfile
  • .devcontainer/scripts/setup.sh
  • .gitignore
  • install/production-filesystem/README.md
  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • install/production-filesystem/root-entrypoint.sh
  • test/docker_tests/configurations/docker-compose.missing-caps.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.cap_chown_missing.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.db_ramdisk.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.db_unwritable.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.log_mounted.yml
  • test/docker_tests/conftest.py
  • test/docker_tests/test_container_environment.py
  • test/docker_tests/test_docker_compose_unit.py
  • test/docker_tests/test_mount_diagnostics_pytest.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (6)
  • .devcontainer/resources/devcontainer-Dockerfile
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.cap_chown_missing.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.db_unwritable.yml
  • test/docker_tests/configurations/docker-compose.missing-caps.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.db_ramdisk.yml
  • test/docker_tests/configurations/mount-tests/docker-compose.mount-test.log_mounted.yml
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Use mylog(level, [message]) for logging; levels are: none/minimal/verbose/debug/trace. Use none for most important messages that should always appear, such as exceptions.
Always set explicit timeouts on subprocess calls. Default to 60s minimum unless plugin config specifies otherwise. Nested subprocess calls need their own timeout.
Never hardcode ports or secrets; always use get_setting_value() to retrieve configuration values.
Use environment variables (NETALERTX_DB, NETALERTX_LOG, etc.) everywhere instead of hardcoding paths like /data/db or relative paths.
Use helper.py functions (timeNowDB, normalize_mac, sanitizers) for time/MAC/string operations. Validate MACs before DB writes.
Add/modify settings via ccd() in server/initialise.py or per-plugin manifest. Define config once and read it via helpers everywhere.
Always leave logging enabled. If there is a possibility it will be difficult to debug with current logging, add more logging.

Files:

  • test/docker_tests/test_mount_diagnostics_pytest.py
  • test/docker_tests/conftest.py
  • test/docker_tests/test_docker_compose_unit.py
  • test/docker_tests/test_container_environment.py
.devcontainer/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Never chmod or chown during operations in devcontainer. Everything is already writable. If you need permissions, the devcontainer setup is broken - fix .devcontainer/scripts/setup.sh or .devcontainer/resources/devcontainer-Dockerfile instead.

Files:

  • .devcontainer/Dockerfile
  • .devcontainer/scripts/setup.sh
  • .devcontainer/devcontainer.json
🧠 Learnings (20)
📓 Common learnings
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Devcontainer philosophy: brutal simplicity. One user, everything writable, completely idempotent. No permission checks, no conditional logic, no sudo needed.
📚 Learning: 2025-11-01T19:02:10.635Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1263
File: install/production-filesystem/entrypoint.sh:60-86
Timestamp: 2025-11-01T19:02:10.635Z
Learning: In the NetAlertX project (install/production-filesystem/entrypoint.sh), when fail-fast behavior is adopted for entrypoint checks in the future, be lenient during review because tests will be removed or changed to accommodate that behavior. The current continue-on-failure approach exists to support the existing test suite, but this is expected to change along with corresponding test adjustments.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • install/production-filesystem/README.md
  • install/production-filesystem/root-entrypoint.sh
📚 Learning: 2025-10-26T15:39:36.707Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1235
File: install/production-filesystem/services/scripts/check-first-run-config.sh:0-0
Timestamp: 2025-10-26T15:39:36.707Z
Learning: In NetAlertX startup scripts, critical initialization failures (e.g., unable to create config directory or copy default config files) should exit with non-zero status to fail fast and provide clear error messages, rather than continuing in a broken state.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • install/production-filesystem/README.md
  • install/production-filesystem/root-entrypoint.sh
📚 Learning: 2025-12-08T01:59:00.903Z
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Applies to **/*.py : Use environment variables (`NETALERTX_DB`, `NETALERTX_LOG`, etc.) everywhere instead of hardcoding paths like `/data/db` or relative paths.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • .devcontainer/scripts/setup.sh
📚 Learning: 2025-10-26T16:45:41.247Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1235
File: install/production-filesystem/services/scripts/check-ramdisk.sh:0-0
Timestamp: 2025-10-26T16:45:41.247Z
Learning: In NetAlertX check scripts (install/production-filesystem/services/scripts/check-*.sh), not all checks should exit with non-zero status. Some checks, like check-ramdisk.sh, are warning-only and exit 0 even when issues are detected, allowing the application to start despite suboptimal configuration.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • install/production-filesystem/README.md
📚 Learning: 2025-09-20T14:09:29.159Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/scripts/setup.sh:103-116
Timestamp: 2025-09-20T14:09:29.159Z
Learning: In NetAlertX devcontainer setup, the netalertx user has write permissions to /var/log/nginx/ directory as it's explicitly chowned to netalertx:www-data in the Dockerfile, so setup.sh can write to nginx log files without sudo.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • .devcontainer/Dockerfile
  • install/production-filesystem/README.md
  • .devcontainer/scripts/setup.sh
  • install/production-filesystem/root-entrypoint.sh
  • .devcontainer/devcontainer.json
📚 Learning: 2025-09-20T14:08:48.256Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/scripts/setup.sh:90-96
Timestamp: 2025-09-20T14:08:48.256Z
Learning: In the NetAlertX devcontainer setup, the setup.sh script intentionally removes user_notifications.json from the API directory during development environment initialization to prevent notification clutter that accumulates during container launches and development work.

Applied to files:

  • install/production-filesystem/entrypoint.d/35-apply-conf-override.sh
  • .devcontainer/scripts/setup.sh
  • .devcontainer/devcontainer.json
📚 Learning: 2025-09-20T03:01:19.912Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:18-19
Timestamp: 2025-09-20T03:01:19.912Z
Learning: In the NetAlertX repository, .devcontainer/Dockerfile is auto-generated and should not be reviewed directly. Review comments about dependencies and build steps should be directed at the root Dockerfile where the actual source commands are located.

Applied to files:

  • .devcontainer/Dockerfile
  • test/docker_tests/test_container_environment.py
  • .devcontainer/devcontainer.json
📚 Learning: 2025-09-20T02:56:24.501Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/devcontainer.json:5-8
Timestamp: 2025-09-20T02:56:24.501Z
Learning: In the NetAlertX devcontainer setup, the final .devcontainer/Dockerfile is generated by combining the root Dockerfile with .devcontainer/resources/devcontainer-Dockerfile using the generate-dockerfile.sh script. The devcontainer.json should reference the generated file, not the root Dockerfile.

Applied to files:

  • .devcontainer/Dockerfile
  • install/production-filesystem/README.md
  • .devcontainer/devcontainer.json
📚 Learning: 2025-12-08T01:59:00.903Z
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Applies to .devcontainer/**/* : Never `chmod` or `chown` during operations in devcontainer. Everything is already writable. If you need permissions, the devcontainer setup is broken - fix `.devcontainer/scripts/setup.sh` or `.devcontainer/resources/devcontainer-Dockerfile` instead.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/scripts/setup.sh
  • test/docker_tests/test_container_environment.py
📚 Learning: 2025-12-08T01:59:00.903Z
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Devcontainer philosophy: brutal simplicity. One user, everything writable, completely idempotent. No permission checks, no conditional logic, no sudo needed.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/scripts/setup.sh
  • test/docker_tests/test_container_environment.py
📚 Learning: 2025-10-26T17:09:18.621Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1235
File: .devcontainer/scripts/setup.sh:146-148
Timestamp: 2025-10-26T17:09:18.621Z
Learning: In `.devcontainer/scripts/setup.sh` and other devcontainer setup scripts for NetAlertX, chmod 666 on /var/run/docker.sock is acceptable because devcontainer environments are single-user development contexts where convenience can take priority over strict permission hardening.

Applied to files:

  • .devcontainer/Dockerfile
  • install/production-filesystem/README.md
  • .devcontainer/scripts/setup.sh
  • install/production-filesystem/root-entrypoint.sh
  • test/docker_tests/test_container_environment.py
  • .devcontainer/devcontainer.json
📚 Learning: 2025-09-20T14:13:33.635Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:11-16
Timestamp: 2025-09-20T14:13:33.635Z
Learning: In the NetAlertX devcontainer setup, the `python -m venv /opt/venv` command works successfully on Alpine 3.22 despite the typical Alpine behavior of not providing a /usr/bin/python symlink by default. The build completes successfully and pytest runs without issues.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/scripts/setup.sh
📚 Learning: 2025-10-19T01:40:57.095Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1230
File: .devcontainer/resources/devcontainer-overlay/services/config/nginx/netalertx.conf.template:1-2
Timestamp: 2025-10-19T01:40:57.095Z
Learning: In the NetAlertX repository, .devcontainer/resources/devcontainer-overlay/services/config/nginx/netalertx.conf.template is an auto-generated file that is intentionally committed to source control. It cannot be regenerated automatically outside the devcontainer environment and is required for the devcontainer to start, creating a bootstrap dependency.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/devcontainer.json
📚 Learning: 2025-10-10T22:16:02.770Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1214
File: install/ubuntu24/uninstall.sh:129-141
Timestamp: 2025-10-10T22:16:02.770Z
Learning: NetAlertX uninstall procedures should only remove files from the repository (specifically /app files) and should not touch system packages like PHP, nginx, avahi, or other shared system components to avoid damaging user systems.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/scripts/setup.sh
📚 Learning: 2025-09-20T14:13:33.635Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:11-16
Timestamp: 2025-09-20T14:13:33.635Z
Learning: In the NetAlertX repository with Alpine 3.22 base image, the `python -m venv` command works correctly in the devcontainer setup, likely due to symlink creation in the root Dockerfile that makes `python` available as an alias to `python3`.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/devcontainer.json
📚 Learning: 2025-10-19T15:29:46.423Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1230
File: front/plugins/dhcp_servers/script.py:44-44
Timestamp: 2025-10-19T15:29:46.423Z
Learning: In the NetAlertX dhcp_servers plugin (front/plugins/dhcp_servers/script.py), the nmap command uses both 'sudo' and '--privileged' flag to maintain cross-platform compatibility. While the hardened Docker image stubs sudo and uses capabilities, hardware installations (Debian 12, Ubuntu 24) and the Debian Dockerfile require sudo for raw socket access. This approach ensures the plugin works across all deployment targets.

Applied to files:

  • .devcontainer/Dockerfile
  • .devcontainer/devcontainer.json
📚 Learning: 2025-11-24T00:56:58.733Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1304
File: .hadolint.yaml:1-2
Timestamp: 2025-11-24T00:56:58.733Z
Learning: In the NetAlertX repository, the .hadolint.yaml config file is intentionally configured to only globally ignore DL3018 (Alpine package pinning). Rules DL3006 (explicit image tagging) and SC2114 (deleting system directories) are handled via inline `# hadolint ignore` directives in specific Dockerfiles where needed, rather than being globally ignored, to ensure new Dockerfiles are still checked for these rules.

Applied to files:

  • .devcontainer/Dockerfile
📚 Learning: 2025-09-20T14:08:44.292Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/scripts/stream-logs.sh:5-6
Timestamp: 2025-09-20T14:08:44.292Z
Learning: The .devcontainer/scripts/stream-logs.sh script in NetAlertX is designed as a diagnostic tool for troubleshooting devcontainer startup issues. When log files don't exist, this indicates that the executable/services didn't start properly, which is valuable diagnostic information. Pre-creating missing files would mask this diagnostic behavior.

Applied to files:

  • install/production-filesystem/README.md
  • .devcontainer/devcontainer.json
📚 Learning: 2025-09-20T14:08:44.152Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:82-92
Timestamp: 2025-09-20T14:08:44.152Z
Learning: In devcontainer builds, source repository files are not available during the Docker build phase. The source code gets mounted into the container after it's built and started, so COPY commands referencing source files will fail. Configuration files need to be handled at runtime (e.g., in setup scripts) rather than during the build stage.

Applied to files:

  • .devcontainer/devcontainer.json
🧬 Code graph analysis (3)
test/docker_tests/test_mount_diagnostics_pytest.py (2)
test/docker_tests/test_docker_compose_scenarios.py (1)
  • capture_project_mandatory_required_audit_stream (45-63)
front/js/modal.js (1)
  • result (564-564)
test/docker_tests/test_docker_compose_unit.py (1)
test/docker_tests/test_docker_compose_scenarios.py (1)
  • _run_docker_compose (394-658)
test/docker_tests/test_container_environment.py (1)
test/docker_tests/test_ports_available.py (2)
  • _run_container (110-171)
  • _assert_contains (174-182)
🪛 Ruff (0.14.10)
test/docker_tests/test_mount_diagnostics_pytest.py

70-70: subprocess call: check for execution of untrusted input

(S603)


71-71: Starting a process with a partial executable path

(S607)


591-591: subprocess call: check for execution of untrusted input

(S603)


592-592: Consider [*base_cmd, "down", "-v"] instead of concatenation

Replace with [*base_cmd, "down", "-v"]

(RUF005)


602-614: Consider iterable unpacking instead of concatenation

Replace with iterable unpacking

(RUF005)


615-615: subprocess call: check for execution of untrusted input

(S603)


623-623: subprocess call: check for execution of untrusted input

(S603)


624-624: Starting a process with a partial executable path

(S607)


634-634: Consider [*base_cmd, "up", "-d"] instead of concatenation

Replace with [*base_cmd, "up", "-d"]

(RUF005)


638-638: subprocess call: check for execution of untrusted input

(S603)


658-658: subprocess call: check for execution of untrusted input

(S603)


659-659: Starting a process with a partial executable path

(S607)


680-680: subprocess call: check for execution of untrusted input

(S603)


681-681: Consider [*base_cmd, "down", "-v"] instead of concatenation

Replace with [*base_cmd, "down", "-v"]

(RUF005)


688-689: try-except-pass detected, consider logging the exception

(S110)


688-688: Do not catch blind exception: Exception

(BLE001)


749-749: subprocess call: check for execution of untrusted input

(S603)


754-754: subprocess call: check for execution of untrusted input

(S603)


755-755: Starting a process with a partial executable path

(S607)


768-768: subprocess call: check for execution of untrusted input

(S603)


781-781: subprocess call: check for execution of untrusted input

(S603)


782-782: Starting a process with a partial executable path

(S607)


789-789: subprocess call: check for execution of untrusted input

(S603)


790-790: Starting a process with a partial executable path

(S607)


808-808: subprocess call: check for execution of untrusted input

(S603)

test/docker_tests/conftest.py

35-35: subprocess call: check for execution of untrusted input

(S603)


58-58: Unused function argument: cleanup_artifacts

(ARG001)

test/docker_tests/test_container_environment.py

302-302: subprocess call: check for execution of untrusted input

(S603)


303-303: Starting a process with a partial executable path

(S607)


314-314: subprocess call: check for execution of untrusted input

(S603)


315-315: Starting a process with a partial executable path

(S607)


340-340: subprocess call: check for execution of untrusted input

(S603)


398-398: subprocess call: check for execution of untrusted input

(S603)


411-411: subprocess call: check for execution of untrusted input

(S603)


412-424: Starting a process with a partial executable path

(S607)


505-505: subprocess call: check for execution of untrusted input

(S603)


506-506: Starting a process with a partial executable path

(S607)


517-518: try-except-pass detected, consider logging the exception

(S110)


517-517: Do not catch blind exception: Exception

(BLE001)


525-525: subprocess call: check for execution of untrusted input

(S603)


525-525: Starting a process with a partial executable path

(S607)


526-527: try-except-pass detected, consider logging the exception

(S110)


526-526: Do not catch blind exception: Exception

(BLE001)


532-532: subprocess call: check for execution of untrusted input

(S603)


532-532: Starting a process with a partial executable path

(S607)


533-534: try-except-pass detected, consider logging the exception

(S110)


533-533: Do not catch blind exception: Exception

(BLE001)


661-661: subprocess call: check for execution of untrusted input

(S603)


688-690: try-except-pass detected, consider logging the exception

(S110)


688-688: Do not catch blind exception: Exception

(BLE001)


719-725: Avoid specifying long messages outside the exception class

(TRY003)


1264-1264: Probable insecure usage of temporary file or directory: "/tmp/nginx:uid=20211,gid=20211,mode=755"

(S108)


1321-1321: Probable insecure usage of temporary file or directory: "/tmp/nginx:uid=20211,gid=20211,mode=755"

(S108)


1429-1429: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: docker_dev
🔇 Additional comments (23)
test/docker_tests/test_mount_diagnostics_pytest.py (5)

168-179: LGTM!

Clean helper function that enforces the presence of troubleshooting URLs in diagnostic output. The implementation correctly iterates through all possible URLs and provides a clear failure message.


597-619: LGTM!

The volume pre-initialization correctly sets up the unreadable mount scenarios. The use of chmod 0300 properly creates write-execute-only permissions, and the implementation includes proper timeout and error handling as required.


637-690: LGTM! Audit streaming properly integrated.

The audit streaming integration is well-implemented:

  • Properly captures and prints diagnostic output for debugging
  • Ensures cleanup in the finally block
  • The broad exception handling (lines 686-689) is acceptable for cleanup logic where we don't want cleanup failures to mask test failures

The static analysis warnings (S110, BLE001) about broad exception handling are false positives in this cleanup context.


728-812: LGTM! Good coverage of FAIL-SOFT behavior.

This test correctly validates that the container warns but continues running when CAP_CHOWN is dropped. The multiple assertion conditions on lines 801-804 appropriately handle variations in diagnostic message formatting, ensuring robust validation of the capability warning output.


59-64: All referenced troubleshooting documentation files already exist in the repository at docs/docker-troubleshooting/. The hardcoded URLs will resolve correctly.

install/production-filesystem/entrypoint.d/35-apply-conf-override.sh (1)

7-10: LGTM!

The change from creating the parent directory to creating the config directory itself is appropriate and simplifies the logic. The updated error message correctly reflects the actual path being created.

test/docker_tests/test_docker_compose_unit.py (1)

4-41: LGTM! Previous review concern addressed.

The test now correctly mocks the ps command to return valid container entries (Line 21), which aligns with the expected behavior of _run_docker_compose. The test structure properly validates that the function returns a CompletedProcess with combined output.

Note: The Ruff warnings about unused arguments in fake_run (Line 25) are false positives—the function signature must match subprocess.run's signature for monkeypatching to work correctly.

test/docker_tests/test_container_environment.py (8)

7-25: Excellent documentation for mandatory logging requirements.

This documentation block clearly communicates the critical nature of logging in Docker tests and aligns perfectly with the project's coding guidelines. It effectively prevents well-meaning refactors that would remove essential diagnostic output.


302-432: Consistent implementation of mandatory logging for subprocess calls.

The changes to capture output and print stdout/stderr for Docker volume operations align with the file's mandatory logging requirements. This ensures all Docker operations are visible in CI logs for debugging.


468-537: Well-structured test for root-to-user transition scenario.

This test properly validates the important use case of reusing a root-initialized volume when restarting as user 20211. The multi-phase approach with comprehensive cleanup and error logging is appropriate for this complex scenario.

The cleanup logic (lines 500-536) correctly:

  • Surfaces container logs on failure
  • Performs best-effort container removal
  • Removes the test volume

539-692: Valuable enhancements to _run_container for debugging and cleanup.

The additions improve test reliability and debuggability:

  • rm_on_exit parameter (Line 551) allows preserving failed containers for inspection
  • Attaching container_name to the result (Lines 686-690) enables callers to perform cleanup
  • Enhanced logging documentation (Lines 656-680) reinforces the mandatory logging requirements

708-726: Useful helper for resilient test assertions.

The _assert_contains_any function makes tests more maintainable by accepting multiple acceptable message variants. This is particularly valuable when diagnostic messages may evolve while maintaining the same semantic meaning.


879-886: Good use of _assert_contains_any for message resilience.

The test now accepts multiple message variants for capability warnings, making it resilient to wording changes while still validating the essential functionality.


1290-1293: Clear documentation of warning vs error behavior.

The notes correctly clarify that dataloss warnings do not cause non-zero exit codes—the container continues running after displaying warnings. This is important context for understanding the test assertions.

Also applies to: 1347-1350


1362-1383: Document the userns workaround rationale.

The helper function removes --userns host to avoid potential hangs in the devcontainer environment (Line 1366 comment). While this workaround appears functional, it changes the test's execution context.

Consider documenting why the devcontainer environment experiences hangs with --userns host and whether this affects the test's ability to validate the production behavior accurately. If this is a known limitation, a brief comment explaining the trade-off would help future maintainers.

.devcontainer/scripts/setup.sh (2)

50-52: Good addition of required subdirectories.

Creating /tmp/run/tmp and /tmp/log/plugins immediately after tmpfs mount ensures these directories exist before services start, preventing potential startup failures.


91-91: Simplified permission handling aligns with devcontainer philosophy.

The consolidated chmod -R 777 on both paths simplifies the setup script and aligns with the "brutal simplicity" principle for devcontainer environments. This approach is appropriate for single-user development contexts.

Based on learnings, devcontainer philosophy emphasizes simplicity and idempotency over granular permission control.

.devcontainer/devcontainer.json (2)

15-16: Appropriate addition of NET_BIND_SERVICE capability.

Adding NET_BIND_SERVICE alongside existing NET_RAW and NET_ADMIN capabilities enables the devcontainer to bind privileged ports (e.g., UDP 137 for NetBIOS), which aligns with the PR's enhanced network scanning capabilities.


51-55: Improved user-facing messages.

The updated messages provide clearer information about the devcontainer startup process and test container build status.

.devcontainer/Dockerfile (1)

278-280: LGTM - Build-time chmod is appropriate for devcontainer.

The chmod operations ensure entrypoint scripts remain executable in the devcontainer context, preventing 126 errors. These are build-time operations (not runtime operations), which aligns with the devcontainer philosophy.

Based on learnings, this file is auto-generated by .devcontainer/scripts/generate-configs.sh, but the changes are appropriate for ensuring script executability.

test/docker_tests/conftest.py (1)

58-58: Static analysis false positive - fixture dependency is intentional.

The cleanup_artifacts parameter is correctly used as a pytest fixture dependency to ensure cleanup runs before image building. The "unused argument" warning from Ruff (ARG001) is a false positive in this context.

install/production-filesystem/README.md (1)

89-149: Excellent documentation of the new startup architecture.

The documentation clearly explains the root-entrypoint.sh behavior, PUID/PGID handling, and the multi-step boot flow. The distinction between fatal (malformed PUID/PGID) and non-fatal (permission errors) failures is well-articulated, which will help operators troubleshoot startup issues.

install/production-filesystem/root-entrypoint.sh (1)

123-130: Excellent resilient privilege-drop implementation.

The fallback logic (lines 124-129) gracefully handles scenarios where su-exec cannot drop privileges, setting diagnostic flags (NETALERTX_PRIVDROP_FAILED, NETALERTX_CHECK_ONLY) and continuing as the current user. This resilience is appropriate for first-run scenarios and diverse host environments.

Copy link
Member Author

@adamoutler adamoutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator

@jokob-sk jokob-sk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adamoutler ! 2 small things - one maybe unrelated.

PORT: 9999 # Use non-default port to test all paths
APP_CONF_OVERRIDE: 20212
PORT: ${PORT:-9999} # Use non-default port to test all paths
APP_CONF_OVERRIDE: ${GRAPHQL_PORT:-26212}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test where port must be changed but active conf is unwritable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but the correct syntax for the APP_CONF_OVERRIDE is:

APP_CONF_OVERRIDE={"GRAPHQL_PORT":"26212"}

so I assume this should be something like this (unsure if supported)

      APP_CONF_OVERRIDE: ${APP_CONF_OVERRIDE:-{"GRAPHQL_PORT":"26212"}}

image: netalertx-test
container_name: netalertx-test-mount-api_noread
entrypoint: ["sh", "-lc", "sleep infinity"]
user: "20211:20211"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the user be hard coded since we are switching to a configurable value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is a test file whose permissions are set inappropriately for access.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
Dockerfile (1)

191-191: The entrypoint.sh script uses only POSIX sh-compatible syntax and does not require bash.

The script contains no bash-specific features (no arrays, bash-specific builtins, or process substitution). It uses only standard POSIX sh syntax including variable expansion, conditionals, and built-in commands. While bash will run the script without issues since it's installed, the change from /bin/sh to /bin/bash is unnecessary and adds overhead in a minimal container context. Consider reverting to /bin/sh for consistency with Alpine's lightweight design philosophy.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f549db3 and e359ea0.

📒 Files selected for processing (1)
  • Dockerfile
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-09-20T14:13:33.635Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:11-16
Timestamp: 2025-09-20T14:13:33.635Z
Learning: In the NetAlertX devcontainer setup, the `python -m venv /opt/venv` command works successfully on Alpine 3.22 despite the typical Alpine behavior of not providing a /usr/bin/python symlink by default. The build completes successfully and pytest runs without issues.

Applied to files:

  • Dockerfile
📚 Learning: 2025-09-20T14:09:29.159Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/scripts/setup.sh:103-116
Timestamp: 2025-09-20T14:09:29.159Z
Learning: In NetAlertX devcontainer setup, the netalertx user has write permissions to /var/log/nginx/ directory as it's explicitly chowned to netalertx:www-data in the Dockerfile, so setup.sh can write to nginx log files without sudo.

Applied to files:

  • Dockerfile
📚 Learning: 2025-10-10T22:16:02.770Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1214
File: install/ubuntu24/uninstall.sh:129-141
Timestamp: 2025-10-10T22:16:02.770Z
Learning: NetAlertX uninstall procedures should only remove files from the repository (specifically /app files) and should not touch system packages like PHP, nginx, avahi, or other shared system components to avoid damaging user systems.

Applied to files:

  • Dockerfile
📚 Learning: 2025-09-20T14:13:33.635Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1184
File: .devcontainer/Dockerfile:11-16
Timestamp: 2025-09-20T14:13:33.635Z
Learning: In the NetAlertX repository with Alpine 3.22 base image, the `python -m venv` command works correctly in the devcontainer setup, likely due to symlink creation in the root Dockerfile that makes `python` available as an alias to `python3`.

Applied to files:

  • Dockerfile
📚 Learning: 2025-10-19T15:29:46.423Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1230
File: front/plugins/dhcp_servers/script.py:44-44
Timestamp: 2025-10-19T15:29:46.423Z
Learning: In the NetAlertX dhcp_servers plugin (front/plugins/dhcp_servers/script.py), the nmap command uses both 'sudo' and '--privileged' flag to maintain cross-platform compatibility. While the hardened Docker image stubs sudo and uses capabilities, hardware installations (Debian 12, Ubuntu 24) and the Debian Dockerfile require sudo for raw socket access. This approach ensures the plugin works across all deployment targets.

Applied to files:

  • Dockerfile
📚 Learning: 2025-10-26T17:09:18.621Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1235
File: .devcontainer/scripts/setup.sh:146-148
Timestamp: 2025-10-26T17:09:18.621Z
Learning: In `.devcontainer/scripts/setup.sh` and other devcontainer setup scripts for NetAlertX, chmod 666 on /var/run/docker.sock is acceptable because devcontainer environments are single-user development contexts where convenience can take priority over strict permission hardening.

Applied to files:

  • Dockerfile
📚 Learning: 2025-11-24T00:56:58.733Z
Learnt from: adamoutler
Repo: jokob-sk/NetAlertX PR: 1304
File: .hadolint.yaml:1-2
Timestamp: 2025-11-24T00:56:58.733Z
Learning: In the NetAlertX repository, the .hadolint.yaml config file is intentionally configured to only globally ignore DL3018 (Alpine package pinning). Rules DL3006 (explicit image tagging) and SC2114 (deleting system directories) are handled via inline `# hadolint ignore` directives in specific Dockerfiles where needed, rather than being globally ignored, to ensure new Dockerfiles are still checked for these rules.

Applied to files:

  • Dockerfile
📚 Learning: 2025-12-08T01:59:00.903Z
Learnt from: CR
Repo: jokob-sk/NetAlertX PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-08T01:59:00.903Z
Learning: Applies to .devcontainer/**/* : Never `chmod` or `chown` during operations in devcontainer. Everything is already writable. If you need permissions, the devcontainer setup is broken - fix `.devcontainer/scripts/setup.sh` or `.devcontainer/resources/devcontainer-Dockerfile` instead.

Applied to files:

  • Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: docker-tests
  • GitHub Check: docker_dev
🔇 Additional comments (4)
Dockerfile (4)

29-29: LGTM - hadolint ignores follow project conventions.

The inline hadolint ignores are appropriately scoped and consistent with the project's approach of using inline directives rather than global suppressions.

Also applies to: 45-45, 169-169


137-137: LGTM - su-exec is the appropriate tool for Alpine privilege dropping.

Adding su-exec enables the root-entrypoint.sh to safely drop privileges after performing setup tasks. This is the standard Alpine approach for privilege management.


224-225: LGTM - root-entrypoint.sh properly integrated into immutability pattern.

The ownership and permissions for /root-entrypoint.sh correctly follow the established "ownership-as-a-lock" hardening pattern, ensuring the root entrypoint cannot be modified at runtime.


242-243: Backward compatibility verified — no action needed.

The entrypoint change is properly implemented with full backward compatibility:

  1. Proper chaining confirmed: root-entrypoint.sh invokes /entrypoint.sh across all code paths (non-root bypass at line 89, root/PUID=0 at line 97, su-exec with fallback at line 124).

  2. Docker-compose compatibility: Existing configurations continue to work without modification; environment variables (PUID, PGID, NETALERTX_*) are preserved and no explicit entrypoint overrides are required.

  3. Bypass mechanisms validated:

    • Non-root containers skip priming and directly invoke /entrypoint.sh
    • PUID=0 (root mode) also bypasses priming
    • su-exec failures gracefully fall through to continue as the current user

The non-fatal permission priming design ensures resilience on restricted hosts (e.g., Synology) while maintaining security through environment variable validation and proper privilege dropping via su-exec.

Comment on lines +238 to +240
# Preserve root and system identities so hardened entrypoint never needs to patch /etc/passwd or /etc/group at runtime.
printf '#!/bin/sh\n"$@"\n' > /usr/bin/sudo && chmod +x /usr/bin/sudo
USER "0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

git ls-files | head -20

Repository: jokob-sk/NetAlertX

Length of output: 801


🏁 Script executed:

# Explore the structure and find Dockerfile
find . -name "Dockerfile*" -type f | head -10

Repository: jokob-sk/NetAlertX

Length of output: 121


🏁 Script executed:

# Check for security documentation
find . -name "*SECURITY*" -o -name "*security*" -o -name "*PUID*" -o -name "*entrypoint*" | grep -E "\.(md|sh)$"

Repository: jokob-sk/NetAlertX

Length of output: 286


🏁 Script executed:

# Check Dockerfile line count
wc -l Dockerfile

Repository: jokob-sk/NetAlertX

Length of output: 76


🏁 Script executed:

# Read Dockerfile around lines 238-240
sed -n '230,250p' Dockerfile

Repository: jokob-sk/NetAlertX

Length of output: 836


🏁 Script executed:

# Read PUID_PGID_SECURITY.md to check documentation
cat docs/PUID_PGID_SECURITY.md

Repository: jokob-sk/NetAlertX

Length of output: 2038


🏁 Script executed:

# Read root-entrypoint.sh to verify privilege dropping
cat install/production-filesystem/root-entrypoint.sh

Repository: jokob-sk/NetAlertX

Length of output: 5319


PUID_PGID_SECURITY.md should expand to explain the root-startup architecture and privilege-dropping mechanism.

The hardened stage's USER "0" (line 240) and the call to /root-entrypoint.sh for privilege dropping is already documented in PUID_PGID_SECURITY.md, but the documentation focuses primarily on input validation. To provide operators with a complete security picture, expand the documentation to address:

  • Why the container starts as root (best-effort remedy for host volume ownership issues on restricted systems like Synology)
  • The privilege-dropping mechanism (su-exec to PUID/PGID after path priming)
  • Required capabilities for full functionality (CAP_SETUID, CAP_SETGID, CAP_CHOWN)
  • Security trade-offs and mitigation (non-fatal chown/chmod, numeric-only validation to prevent injection)

The implementation in root-entrypoint.sh includes comprehensive comments explaining this design; consolidate that rationale into PUID_PGID_SECURITY.md for operators.

🤖 Prompt for AI Agents
In Dockerfile around lines 238 to 240, expand PUID_PGID_SECURITY.md to fully
document the root-startup architecture and privilege-dropping mechanism: add
sections explaining why the hardened stage starts as root (to handle host volume
ownership on restricted systems like Synology), describe the path-priming then
su-exec flow in root-entrypoint.sh that drops to PUID/GID, list required Linux
capabilities (CAP_SETUID, CAP_SETGID, CAP_CHOWN) for full functionality, and
enumerate security trade-offs and mitigations (non-fatal chown/chmod,
numeric-only PUID/PGUID validation to prevent injection); reference the comments
in root-entrypoint.sh and include examples of recommended runtime flags or
policies for operators.

@jokob-sk jokob-sk merged commit 3de61dc into netalertx:main Jan 4, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants