Fix race condition causing permanent loss of active rule files#10238
Conversation
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request.
Overview
This PR attempts to fix a rule-file update race by keeping path_to_rules populated while repository update processing runs asynchronously.
Concerns
- The new clone-based flow still allows overlapping update tasks to process stale independent snapshots and overwrite each other when they reinsert results, so rule additions, deletions, or content updates can still be lost.
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
```diff
- let existing_rules = me.path_to_rules.remove(&path_clone);
+ let existing_rules = me.path_to_rules.get(&path_clone).cloned();
```
When a repository update arrives, the stream handler removes rules from the path_to_rules HashMap and re-inserts them asynchronously after processing. If a second update arrives during this async gap (e.g. from a git checkout or atomic file save), the remove returns None and the update is silently dropped. If the first update was a deletion and the dropped second update was the corresponding addition, the rules are permanently lost until client restart.

Fix: clone the rules instead of removing them so they remain available in the HashMap during async processing, and serialize update processing per path by tracking in-flight tasks and queuing updates that arrive during the async gap. When a task completes, queued updates are drained and processed sequentially against the latest state, preventing both the permanent-loss bug and stale-snapshot overwrites.

Co-Authored-By: Oz <[email protected]>
Force-pushed from 9dc7fb2 to 3520824.
```rust
Self::process_repository_updates(update, current_rules, repo_path.clone())
    .await;
current_rules = updated_rules;
combined_delta
```
Looks good overall, nice find! I wonder if there's a little potential for trouble in the fact that we break updates into discovered and deleted rules? This loses the ordering of those updates, so an (add, delete) of the same rule ends up looking just like a (delete, add) in the emitted event. Might it be better to dedupe before adding to either list?
Description
Fix a race condition in `ProjectContextModel`'s file watcher stream handler that can permanently lose active rule files until client restart.

Root cause: When a repository update arrives, the stream handler calls `path_to_rules.remove()` synchronously, then re-inserts the rules asynchronously after `process_repository_updates` completes. If a second update arrives during this async gap (e.g. from a git checkout, rebase, or an editor's atomic save), the `remove()` returns `None` and the update is silently dropped. When the first update is a deletion and the dropped second update is the corresponding re-addition, the rules are permanently lost.

Fix: Change `remove()` to `get().cloned()` so rules remain in the HashMap during async processing. Queries see slightly stale (pre-update) rules during the gap instead of no rules at all. Also derives `Clone` on `RuleAtPath` and `ProjectRules` to support this.

Linked Issue
N/A - discovered via debugging
Testing
Manual verification of the race condition scenario. The change is minimal and mechanical: `remove` → `get().cloned()`.

Agent Mode
Conversation link
Co-Authored-By: Oz <[email protected]>