Fix WriteOnceBlock race condition where ReceiveAsync returns null#123435
Fix WriteOnceBlock race condition where ReceiveAsync returns null#123435stephentoub merged 6 commits intomainfrom
Conversation
…null Co-authored-by: stephentoub <[email protected]>
src/libraries/System.Threading.Tasks.Dataflow/src/Blocks/WriteOnceBlock.cs
Show resolved
Hide resolved
Co-authored-by: stephentoub <[email protected]>
src/libraries/System.Threading.Tasks.Dataflow/tests/Dataflow/WriteOnceBlockTests.cs
Show resolved
Hide resolved
|
Tagging subscribers to this area: @dotnet/area-system-threading-tasks |
Co-authored-by: stephentoub <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical race condition in WriteOnceBlock<T> where ReceiveAsync could intermittently return null instead of the posted value when called concurrently with Post. The root cause was that _header was being set before _value, allowing readers to observe HasValue == true while _value was still default.
Changes:
- Reordered field assignments in
OfferMessageto set_valuebefore_header - Added
Interlocked.MemoryBarrier()between the writes to enforce ordering and prevent CPU/compiler reordering - Added comprehensive regression test with 10,000 iterations to verify the fix
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| WriteOnceBlock.cs | Fixed race condition by swapping assignment order and adding memory barrier to ensure _value is visible before _header.IsValid becomes true |
| WriteOnceBlockTests.cs | Added regression test that reproduces the concurrent Post/ReceiveAsync race condition scenario |
|
/azp list |
|
/azp run runtime-libraries-coreclr outerloop-windows |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-libraries-coreclr outerloop-windows |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/ba-g failure is unrelated timeout issue. |
| // sees HasValue as true but reads a default value because _value hasn't been set yet. | ||
| // The memory barrier ensures the write to _value is visible before _header.IsValid becomes true. | ||
| _value = messageValue; | ||
| Interlocked.MemoryBarrier(); |
There was a problem hiding this comment.
Volatile.WriteBarrier(); would be sufficient here.
Description
WriteOnceBlock<T>.ReceiveAsyncintermittently returnsnullwhen called concurrently withPost.Root cause: In
OfferMessage,_headerwas set before_value:LinkTochecksHasValue(_header.IsValid) inside a lock but reads_valueoutside it. A concurrent reader can observeHasValue == truewhile_valueis stilldefault(T).Fix: Assign
_valuebefore_header, with a memory barrier to enforce the ordering and prevent CPU/compiler reordering.Changes
_valueis set before_header.IsValidbecomes true, and addInterlocked.MemoryBarrier()between the writes to enforce the orderingTestConcurrentPostAndReceiveAsyncregression test (10K iterations), marked as[OuterLoop]since it's a stress testOriginal prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.