Skip to content

Fix race condition causing permanent loss of active rule files#10238

Open
szgupta wants to merge 1 commit intomasterfrom
suraj/fix-rule-files-race-condition
Open

Fix race condition causing permanent loss of active rule files#10238
szgupta wants to merge 1 commit intomasterfrom
suraj/fix-rule-files-race-condition

Conversation

@szgupta
Copy link
Copy Markdown
Member

@szgupta szgupta commented May 6, 2026

Description

Fix a race condition in ProjectContextModel's file watcher stream handler that can permanently lose active rule files until client restart.

Root cause: When a repository update arrives, the stream handler calls path_to_rules.remove() synchronously, then re-inserts the rules asynchronously after process_repository_updates completes. If a second update arrives during this async gap (e.g. from a git checkout, rebase, or an editor's atomic save), the remove() returns None and the update is silently dropped. When the first update is a deletion and the dropped second update is the corresponding re-addition, the rules are permanently lost.

Fix: Change remove() to get().cloned() so rules remain in the HashMap during async processing. Queries see slightly stale (pre-update) rules during the gap instead of no rules at all. Also derives Clone on RuleAtPath and ProjectRules to support this.

Linked Issue

N/A - discovered via debugging

Testing

Manual verification of the race condition scenario. The change is minimal and mechanical: removeget().cloned().

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Conversation link

Co-Authored-By: Oz [email protected]

@cla-bot cla-bot Bot added the cla-signed label May 6, 2026
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 6, 2026

@szgupta

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@szgupta szgupta requested review from cephalonaut and jefflloyd May 6, 2026 05:44
Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR attempts to fix a rule-file update race by keeping path_to_rules populated while repository update processing runs asynchronously.

Concerns

  • The new clone-based flow still allows overlapping update tasks to process stale independent snapshots and overwrite each other when they reinsert results, so rule additions, deletions, or content updates can still be lost.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread crates/ai/src/project_context/model.rs Outdated
}

let existing_rules = me.path_to_rules.remove(&path_clone);
let existing_rules = me.path_to_rules.get(&path_clone).cloned();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Cloning the current rules lets overlapping update tasks process independent stale snapshots, and whichever task finishes last overwrites the others when it reinserts the map entry. This still loses rule additions, deletions, or content updates; serialize update processing for this path or merge each update into the latest state before reinserting.

When a repository update arrives, the stream handler removes rules from
the path_to_rules HashMap and re-inserts them asynchronously after
processing. If a second update arrives during this async gap (e.g. from
a git checkout or atomic file save), the remove returns None and the
update is silently dropped. If the first update was a deletion and the
dropped second update was the corresponding addition, the rules are
permanently lost until client restart.

Fix: clone the rules instead of removing them so they remain available
in the HashMap during async processing, and serialize update processing
per path by tracking in-flight tasks and queuing updates that arrive
during the async gap. When a task completes, queued updates are drained
and processed sequentially against the latest state, preventing both the
permanent-loss bug and stale-snapshot overwrites.

Co-Authored-By: Oz <[email protected]>
@szgupta szgupta force-pushed the suraj/fix-rule-files-race-condition branch from 9dc7fb2 to 3520824 Compare May 6, 2026 07:01
Self::process_repository_updates(update, current_rules, repo_path.clone())
.await;
current_rules = updated_rules;
combined_delta
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, nice find! I wonder if there's a little potential for trouble in the fact that we break updates into discovered and deleted rules? This loses the ordering of those updates so an (add, delete) of the same rule ends up looking just like a (delete, add) in the emitted event. Might be better before adding to either list to dedupe?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants