Skip to content

[fix] Add Windows stability check to file watcher to reduce false positives#14174

Merged
lukasmasuch merged 4 commits intodevelopfrom
lukasmasuch/win-file-watcher
Mar 24, 2026
Merged

[fix] Add Windows stability check to file watcher to reduce false positives#14174
lukasmasuch merged 4 commits intodevelopfrom
lukasmasuch/win-file-watcher

Conversation

@lukasmasuch
Copy link
Copy Markdown
Collaborator

Describe your changes

Fixes #13954

On Windows, background processes (Windows Defender, Search Indexer, OneDrive) can trigger spurious file change events through the ReadDirectoryChangesW API used by the watchdog library. This can cause false "File change" notifications in Streamlit even when no actual modifications have been made.

This PR adds a Windows-specific stability check to the event-based file watcher:

  • After detecting a potential file change (MD5 differs), wait 50ms on Windows
  • Re-read the file's MD5 to verify the change is real
  • If the MD5 reverts to the original value, treat it as a spurious event and ignore it
  • If the MD5 stays changed, proceed with the normal change notification

The fix only affects Windows (no latency impact on macOS/Linux) and only runs when an MD5 difference is detected (not on every event).

Testing Plan

  • Unit Tests - Added 3 new tests:
    • test_windows_stability_check_filters_transient_changes - verifies transient changes are ignored
    • test_windows_stability_check_allows_real_changes - verifies real changes are still detected
    • test_non_windows_skips_stability_check - verifies no impact on non-Windows platforms
  • Manual testing on Windows (not available to author - requesting community verification)

On Windows, background processes (Windows Defender, Search Indexer,
OneDrive) can trigger spurious file change events through the
ReadDirectoryChangesW API used by watchdog. This causes false "File
change" notifications even when no actual modifications were made.

This adds a Windows-specific stability check:
- After detecting an MD5 change, wait 50ms on Windows
- Re-read the file's MD5 to verify the change is real
- If MD5 reverts to original, treat as spurious event and ignore
- If MD5 stays changed, proceed with normal change notification

The fix only affects Windows (no latency impact on macOS/Linux) and
only runs when an MD5 difference is detected (not on every event).

Fixes #13954
Copilot AI review requested due to automatic review settings March 2, 2026 16:29
@lukasmasuch lukasmasuch added change:bugfix PR contains bug fix implementation impact:users PR changes affect end users ai-review If applied to PR or issue will run AI review workflow labels Mar 2, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 2, 2026

✅ PR preview is ready!

Name Link
📦 Wheel file https://core-previews.s3-us-west-2.amazonaws.com/pr-14174/streamlit-1.55.0-py3-none-any.whl
📦 @streamlit/component-v2-lib Download from artifacts
🕹️ Preview app pr-14174.streamlit.app (☁️ Deploy here if not accessible)

@snyk-io
Copy link
Copy Markdown
Contributor

snyk-io bot commented Mar 2, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@github-actions github-actions bot removed the ai-review If applied to PR or issue will run AI review workflow label Mar 2, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 2, 2026

Consolidated Code Review

Summary

This PR adds a Windows-specific stability check to the event-based file watcher (EventBasedPathWatcher) to reduce false-positive "file changed" notifications. On Windows, background processes (Windows Defender, Search Indexer, OneDrive) can trigger spurious filesystem events via the ReadDirectoryChangesW API used by watchdog. The fix detects these transient events by waiting 50ms after an MD5 change is detected and re-reading the file. If the MD5 reverts to the original value, the event is ignored as spurious. The check only runs on Windows and only after an MD5 difference has already been detected, so there is no impact on other platforms.

Files changed:

  • lib/streamlit/watcher/event_based_path_watcher.py — Added stability check logic (lines 498-530)
  • lib/tests/streamlit/watcher/event_based_path_watcher_test.py — Added 3 new unit tests (lines 652-791)

Code Quality

Both reviewers agreed the implementation is clean, well-scoped, and correctly positioned in the existing event handling flow. The stability check runs only after the mtime check and initial MD5 check have been exhausted, minimizing unnecessary work.

Agreed minor issues:

  1. Slightly misleading import comment (lines 507-509): The comment says "Import at function level to avoid circular imports," but streamlit.env_util has no circular dependency on the watcher module. A more accurate rationale would reference lazy-loading or that this code path is rarely executed. Both reviewers flagged this as a minor style nit.

  2. Narrating comment (line 529): # Use the verified MD5 as the new value describes what the next line does rather than why. Per the project's coding guidelines (which explicitly discourage obvious, redundant comments), this could be removed since new_md5 = verification_md5 is self-explanatory.

  3. The import time inside the if env_util.IS_WINDOWS: block is slightly unusual but defensible. Since time is a standard library module with negligible import cost, this is a minor style preference.

  4. Optional improvement (raised by opus-4.6-thinking): Extracting the magic number 0.05 into a module-level constant (e.g., _WINDOWS_STABILITY_DELAY_SECS = 0.05) would improve discoverability and tunability.

Test Coverage

Both reviewers agreed the three new tests provide good coverage:

  • test_windows_stability_check_filters_transient_changes: Verifies transient MD5 changes (reverts after delay) are correctly ignored. Asserts time.sleep(0.05) was called and calc_md5_with_blocking_retries was called twice.
  • test_windows_stability_check_allows_real_changes: Verifies genuine changes (MD5 stays different) still trigger the callback on Windows.
  • test_non_windows_skips_stability_check: Verifies the stability check is not executed on non-Windows platforms (only one calc_md5 call).

The mocking strategy is correct — @mock.patch("streamlit.env_util.IS_WINDOWS", True) patches the module-level constant, and @mock.patch("time.sleep") intercepts the runtime import.

Agreed residual gap: No real Windows integration/e2e validation exists for watchdog behavior. Both reviewers noted this but agreed it is outside the scope of this PR.

Minor gap (opus-4.6-thinking): Tests could additionally verify that the stored changed_path_info.md5 is correctly updated to the verified value after a real change on Windows, but this is implicitly covered by the callback being triggered.

Backwards Compatibility

Both reviewers agreed: no backwards compatibility concerns. The change is additive and Windows-only. No public API, protobuf, or configuration changes. macOS/Linux behavior is completely unchanged.

Security & Risk

Both reviewers assessed this as low risk. Agreed risk factors:

  1. 50ms delay may be insufficient for particularly slow antivirus scans, but it's a reasonable starting point that can be tuned.
  2. Extra calc_md5_with_blocking_retries call on Windows could add up to ~500ms in worst-case retry scenarios, but this only occurs when an actual content change is detected.
  3. Zero regression risk for non-Windows due to the if env_util.IS_WINDOWS: guard.

Accessibility

Not applicable — entirely backend/server-side with no frontend or UI changes.

Reviewer Agreement

Both reviewers (gpt-5.3-codex-high and opus-4.6-thinking) are in full agreement on all major points:

  • The fix is well-scoped, low-risk, and adequately tested
  • No blocking issues found
  • Minor style nits are non-blocking
  • Both recommend APPROVED

No areas of disagreement were identified between the two reviews. The opus-4.6-thinking review provided more granular analysis (e.g., the magic number constant suggestion, the changed_path_info.md5 update verification gap), while gpt-5.3-codex-high raised the broader point about adding Windows CI/integration test coverage.

Recommendations

  1. Update the import comment at lines 507-509: Replace "to avoid circular imports" with a more accurate rationale (e.g., "to keep module-level imports minimal since this module is lazy-loaded and this code path is rarely executed"). (Non-blocking)
  2. Remove the narrating comment at line 529 (# Use the verified MD5 as the new value). (Non-blocking)
  3. Consider extracting 0.05 into a module-level constant like _WINDOWS_STABILITY_DELAY_SECS for discoverability. (Optional)
  4. Consider adding Windows CI/integration smoke tests for watchdog behavior in a follow-up. (Future work)

Verdict

APPROVED: Both reviewers unanimously approve. The fix is well-scoped, addresses a real Windows-specific issue (linked to #13954), has clear and adequate test coverage, and poses no risk to other platforms. All noted issues are minor style suggestions and non-blocking.


Consolidated review by opus-4.6-thinking. Based on 2 of 2 expected reviews (all models completed successfully).


📋 Review by `gpt-5.3-codex-high`

Summary

This PR adds a Windows-only stability recheck in the event-based file watcher to reduce false positive change notifications caused by transient ReadDirectoryChangesW events. It introduces a second MD5 verification after a short delay and adds targeted unit tests for transient, real, and non-Windows behavior.

Code Quality

The implementation in lib/streamlit/watcher/event_based_path_watcher.py is clear and localized, with behavior gated behind env_util.IS_WINDOWS and no API surface changes. The new logic at lib/streamlit/watcher/event_based_path_watcher.py:498-530 is readable and uses existing MD5-based change detection paths consistently; no blocking maintainability issues found.

Test Coverage

Unit coverage is good for the new behavior:

  • test_windows_stability_check_filters_transient_changes
  • test_windows_stability_check_allows_real_changes
  • test_non_windows_skips_stability_check

These tests in lib/tests/streamlit/watcher/event_based_path_watcher_test.py:652-791 validate both positive and negative branches and confirm the Windows-only gating.
Residual gap: there is still no real Windows integration/e2e validation for watchdog behavior, so environment-specific edge cases may remain.

Backwards Compatibility

No public API or config changes are introduced. Behavior changes are intentionally limited to Windows and only after an MD5 mismatch is detected, so macOS/Linux behavior remains unchanged.

Security & Risk

No direct security concerns identified.
Primary regression risk is operational: the added time.sleep(0.05) in the event handler path on Windows can add latency under high event throughput, but the scope is constrained to changed-content events and appears acceptable for this bug fix.

Accessibility

No frontend/UI changes in this PR, so accessibility impact is not applicable.

Recommendations

  1. Add a Windows CI/integration test path (or nightly smoke test) that exercises real filesystem events, since mocked unit tests cannot fully reproduce watchdog/OS event quirks.

Verdict

APPROVED: The fix is well-scoped, adequately unit-tested for the intended logic, and appears safe to merge with only minor residual platform-validation risk.


This is an automated AI review by gpt-5.3-codex-high. Please verify the feedback and use your judgment.

📋 Review by `opus-4.6-thinking`

Summary

This PR adds a Windows-specific stability check to the event-based file watcher to reduce false-positive "file changed" notifications. On Windows, background processes (Windows Defender, Search Indexer, OneDrive) can trigger spurious filesystem events via the ReadDirectoryChangesW API used by watchdog. The fix detects these transient events by waiting 50ms after an MD5 change is detected and re-reading the file. If the MD5 reverts to the original value, the event is ignored as spurious. The check only runs on Windows and only when an MD5 difference has already been detected, so there is no performance impact on other platforms.

Files changed:

  • lib/streamlit/watcher/event_based_path_watcher.py — Added stability check logic and module docstring update
  • lib/tests/streamlit/watcher/event_based_path_watcher_test.py — Added 3 new unit tests

Code Quality

The implementation is clean, well-scoped, and correctly placed in the existing event handling flow. The stability check runs only after all faster rejection paths (mtime check, initial MD5 check) have been exhausted, minimizing unnecessary work.

A few minor observations:

  1. Slightly misleading import comment (line 507-509 of event_based_path_watcher.py): The comment says "Import at function level to avoid circular imports," but streamlit.env_util has no dependencies on the watcher module — there is no circular import risk. The real reason is that this module is lazy-loaded (only when watchdog is installed), and keeping imports minimal at module level is a reasonable pattern for that. Consider updating the comment to reflect the actual rationale.

  2. Narrating comment (line 529): # Use the verified MD5 as the new value describes what the next line does rather than why. Per the project's coding guidelines, this could be removed since new_md5 = verification_md5 is self-explanatory.

  3. The import time placement inside the if env_util.IS_WINDOWS: block is slightly unusual but defensible — it avoids importing a module that will never be used on non-Windows platforms in this code path. Since time is a standard library module with negligible import cost, moving it to the top level would also be acceptable.

  4. The production code correctly updates new_md5 = verification_md5 after the stability check so the stored MD5 reflects the most recent verified state, which is important for detecting subsequent changes.

Test Coverage

The three new tests provide good coverage of the stability check:

  • test_windows_stability_check_filters_transient_changes: Verifies that a transient MD5 change (reverts after 50ms) is correctly ignored. Checks that time.sleep(0.05) was called and that calc_md5_with_blocking_retries was called twice.
  • test_windows_stability_check_allows_real_changes: Verifies that genuine changes (MD5 stays different) still trigger the callback on Windows.
  • test_non_windows_skips_stability_check: Verifies that the stability check is not executed on non-Windows platforms by asserting calc_md5 is called only once.

The mocking strategy is correct:

  • @mock.patch("streamlit.env_util.IS_WINDOWS", True) correctly patches the module-level constant that the function-level import will read.
  • @mock.patch("time.sleep") correctly intercepts the time.sleep call from the runtime import time inside the function.

Existing tests are unaffected because IS_WINDOWS defaults to False on Linux CI, so the new code path is never reached in non-Windows-specific tests.

Minor gap: The tests could additionally verify that the stored changed_path_info.md5 is correctly updated to the verified value after a real change on Windows, but this is covered implicitly by the callback being triggered with the correct path.

Backwards Compatibility

No backwards compatibility concerns. The change is additive and only affects Windows:

  • On non-Windows platforms, the behavior is completely unchanged — no new code paths are executed.
  • On Windows, the only behavioral change is that spurious file events are now filtered out. Legitimate file changes are still detected (with at most a 50ms delay per event, only when an MD5 change is detected).
  • No public API changes, no protobuf changes, no configuration changes.

Security & Risk

Low risk. The main risks are:

  1. 50ms delay may be insufficient: For particularly slow antivirus scans, the 50ms stability window might not be long enough to catch all transient events. However, this is a reasonable starting point and can be tuned if needed. The current value is conservative (short enough to not noticeably delay real change detection).

  2. Extra calc_md5_with_blocking_retries call on Windows: This function itself has up to 5 retries with 100ms delays. In the worst case, a legitimate change on Windows could see an additional ~500ms delay from the stability check's MD5 calculation. This is acceptable since it only occurs when an actual content change is detected.

  3. No regression risk for non-Windows: The if env_util.IS_WINDOWS: guard ensures zero impact on macOS and Linux.

Accessibility

Not applicable — this change is entirely backend/server-side with no frontend or UI changes.

Recommendations

  1. Update the import comment at line 507-509 of event_based_path_watcher.py: Replace "to avoid circular imports" with a more accurate rationale, e.g., "to keep module-level imports minimal since this module is lazy-loaded and this code path is rarely executed."

  2. Remove the narrating comment at line 529 (# Use the verified MD5 as the new value) — the assignment new_md5 = verification_md5 is self-explanatory.

  3. Consider adding a module-level constant for the stability delay (e.g., _WINDOWS_STABILITY_DELAY_SECS = 0.05) to make it easier to find and tune in the future. This is optional but would improve discoverability.

These are all minor suggestions and none are blocking.

Verdict

APPROVED: A well-scoped, low-risk fix that addresses a real Windows-specific issue with clear test coverage and no impact on other platforms. The minor style suggestions above are non-blocking.


This is an automated AI review by opus-4.6-thinking.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Windows-only “stability” verification step to the event-based file watcher to reduce false-positive change notifications caused by spurious filesystem events on Windows.

Changes:

  • After an MD5 mismatch on Windows, sleep briefly and re-hash to confirm the change is stable before notifying.
  • Expands module documentation with Windows-specific rationale and behavior.
  • Adds unit tests covering transient vs real Windows changes and the non-Windows behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
lib/streamlit/watcher/event_based_path_watcher.py Implements the Windows stability re-check and documents the Windows-specific motivation.
lib/tests/streamlit/watcher/event_based_path_watcher_test.py Adds unit tests validating Windows stability filtering and ensuring non-Windows behavior is unchanged.

- Add blank line before bullet list for proper reST formatting
- Add try/except for verification MD5 to prevent dropping real changes
- Extract stability delay to _WINDOWS_STABILITY_DELAY_SECS constant
Copy link
Copy Markdown
Collaborator

@sfc-gh-lwilby sfc-gh-lwilby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lukasmasuch lukasmasuch added the ai-review If applied to PR or issue will run AI review workflow label Mar 23, 2026
@lukasmasuch lukasmasuch added ai-review If applied to PR or issue will run AI review workflow and removed ai-review If applied to PR or issue will run AI review workflow labels Mar 24, 2026
@github-actions github-actions bot removed the ai-review If applied to PR or issue will run AI review workflow label Mar 24, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This PR adds a Windows-specific stability check to the event-based file watcher (EventBasedPathWatcher) to reduce false-positive "file changed" notifications caused by background processes (Windows Defender, Search Indexer, OneDrive). When an MD5 content change is detected on Windows, the watcher waits 50ms and re-reads the file hash. If the hash reverts to the original value, the event is treated as spurious and suppressed. No behavioral changes on macOS/Linux.

Changes span two files:

  • lib/streamlit/watcher/event_based_path_watcher.py — stability check logic + docstring update
  • lib/tests/streamlit/watcher/event_based_path_watcher_test.py — three new unit tests

All three reviewers (claude-4.6-opus-high-thinking, gemini-3.1-pro, gpt-5.3-codex-high) unanimously approved the PR.

Code Quality

The implementation is clean, well-scoped, and follows existing patterns in the file watcher module. All reviewers agreed on the high quality of the code:

  • The stability check is correctly positioned in the event handling pipeline — after the initial MD5 change is detected but before the callback fires.
  • Error handling is appropriate: if the verification MD5 calculation fails (StreamlitMaxRetriesError), the code gracefully falls back to using the initially computed MD5 rather than dropping the event.
  • The _WINDOWS_STABILITY_DELAY_SECS constant is properly typed as Final and prefixed with _.
  • Logging uses the existing _LOGGER at appropriate debug levels.
  • The new_md5 = verification_md5 update on line 548 correctly tracks the most recent known file state.

One reviewer (claude-4.6-opus-high-thinking) noted that time.sleep(0.05) blocks the watchdog observer's event dispatch thread, which could compound during bulk file operations (e.g., git checkout switching many files). For typical use cases this is negligible; this is a minor behavioral consideration, not a blocker.

Test Coverage

Three unit tests are added covering the primary scenarios — all reviewers agreed these are adequate:

  1. Transient change filtered — MD5 changes then reverts → callback suppressed
  2. Real change detected — MD5 changes and stays changed → callback fires
  3. Non-Windows bypass — stability check skipped entirely on Linux/macOS

Tests properly verify side effects (time.sleep called with correct delay, calc_md5 call counts, callback invocation).

Gap (raised by claude-4.6-opus-high-thinking, not mentioned by the other two reviewers): The StreamlitMaxRetriesError exception path in the verification step (lines 529–538) is not covered by any test. This is the path where the re-verification MD5 calculation fails and the code falls back to using the initially computed new_md5. A test for this path would improve confidence in the graceful degradation behavior. This is a nice-to-have rather than a blocker.

Backwards Compatibility

No breaking changes. All reviewers concur:

  • The stability check is gated behind env_util.IS_WINDOWS — no impact on macOS/Linux.
  • Only triggered after an MD5 difference is detected — no latency on events where content hasn't changed.
  • The 50ms delay is minimal and only affects the watchdog observer thread on Windows.
  • No public API or protocol changes.

Security & Risk

No security concerns identified. All reviewers agree. The changes are purely internal to the file watcher's event handling logic and do not affect WebSocket handling, authentication, session management, file serving, CORS, CSP, or any other security-sensitive surface.

Regression risk is low — the change is narrowly scoped to Windows file change event handling and has a safe fallback (proceed with the change) on any error.

External test recommendation

  • Recommend external_test: No
  • Triggered categories: None
  • Key evidence:
    • lib/streamlit/watcher/event_based_path_watcher.py: Changes are confined to the internal file watcher event handler. No routing, auth, embedding, asset serving, WebSocket, CORS, CSP, storage, or SiS runtime changes.
    • lib/tests/streamlit/watcher/event_based_path_watcher_test.py: Test-only changes.
  • Suggested external test focus areas: None
  • Confidence: High (all three reviewers independently reached the same conclusion)
  • Assumptions and gaps: None

Accessibility

No frontend changes — not applicable. All reviewers agree.

Recommendations

  1. Add a test for the StreamlitMaxRetriesError fallback path in the stability check (lines 529–538). This would verify that when the verification MD5 calculation fails, the callback still fires with the initial new_md5 value. (Raised by 1 of 3 reviewers; nice-to-have.)

  2. Update the function-level import comment (line 513) to remove the "circular imports" justification, since env_util is a leaf module with only stdlib imports. The "rarely executed" rationale is sufficient. (Cosmetic.)

Verdict

APPROVED: Well-scoped, low-risk Windows bugfix with appropriate error handling and good test coverage. All three AI reviewers unanimously approved. The only notable gap is a missing test for the StreamlitMaxRetriesError fallback path, which is a nice-to-have rather than a blocker.


This is a consolidated AI review by claude-4.6-opus-high-thinking, synthesizing reviews from claude-4.6-opus-high-thinking, gemini-3.1-pro, and gpt-5.3-codex-high.

This review also includes 2 inline comment(s) on specific code lines.

@lukasmasuch lukasmasuch enabled auto-merge (squash) March 24, 2026 01:35
@lukasmasuch lukasmasuch merged commit 0f4ddea into develop Mar 24, 2026
56 checks passed
@lukasmasuch lukasmasuch deleted the lukasmasuch/win-file-watcher branch March 24, 2026 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:bugfix PR contains bug fix implementation impact:users PR changes affect end users

Projects

None yet

Development

Successfully merging this pull request may close these issues.

False "File change" detection on Windows with no actual file modifications

3 participants