-
Notifications
You must be signed in to change notification settings - Fork 871
Fixed potential torn reads in EventCounters #3223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The backing fields for the counters are `long`, so there was potential torn read on 32-bit platforms. `Volatile.Read` ensures that the value is read in a atomic way, so no torn reads can occur.
YohDeadfall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Thank you!
|
Cherry picked to 4.1.6 via 3db17ea. |
|
I'm not aware of any atomicity guarantee provided by Volatile.Read - as opposed to Interlocked.Read which very clearly documents the 32-bit torn value case:
Do you have any resource explicitly pointing to Volatile.Read as a guard against torn values? I did do a quick sharplab test to see what assembly gets emitted, and on .NET Framework 32-bit Volatile.Read indeed emits a call to Interlocked.CompareExchange, making it totally safe. Unfortunately, I don't know what happens under .NET Core 32-bit (sharplab doesn't support that). Another interesting point is that Interlocked.Read emits cmpxchgon 64-bit, presumably for 8-byte values crossing cache lines (which could get therefore get torn even on 64-bit systems, thanks @damageboy). This means we'd take a theoretical, slight perf hit, but the reading of these counters happens very infrequently, so this is insignificant. |
|
It depends on the architecture (source): [Intrinsic]
[NonVersionable]
public static long Read(ref long location) =>
#if TARGET_64BIT
(long)Unsafe.As<long, VolatileIntPtr>(ref location).Value;
#else
// On 32-bit machines, we use Interlocked, since an ordinary volatile read would not be atomic.
Interlocked.CompareExchange(ref location, 0, 0);
#endif |
|
... and it's marked as
I assume that the CPU's cache protocol won't allow torn-reads, but I can't say for 100% sure. Under the assumption it can be torn, quite a lot of vector operations (say SSE 128bit data crossing a cache line or more often AVX 256bit data crossing the cache line boundary) would suffer from torn-reads too, and I'm not aware that this happens. If so, quite a lot of coreclr-code would be broken too. |
|
The ECMA CLI spec I.12.6.6 (page 128) states
so a ordinary read of This also means that "random 64-bit reads" aren't atomic by this guarantee. It's possible to allocate unmanaged memory and read non-atomically a long from there. |
|
I agree that implementation-wise, Volatilre.Read seems to generate Interlocked.CompareExchange in x86 (although I've yet to see/test this actually happening on 32-bit .NET Core). The main question in my mind, is whether this behavior is specified or documented anywhere; coming from other languages, volatile does not necessarily have any sort of atomicity guarantees - it's about caching, staleness and lifting to register (which is also a form of caching). This is why I'd rather see Interlocked.Read for atomicity - at the very least, it documents what's being done better. Re the ECMA CLI spec etc., I'm definitely not worried about 64-bit platforms here. As @damageboy helped me understand, as long as an 8-byte value fits within a cache line, operations on it are indeed guaranteed to be atomic; note, however, that for long values which straddle a cache line boundary, tearing may still occur, in which case Interlocked.Read can ensure atomicity, but, importantly, not Volatile.Read (that alone is a good reason to avoid Volatile.Read for atomicity). This doesn't concern us here because these long values are laid out by the CLI and are guaranteed to be aligned and within the cache line. To sum it up, when the goal is to guarantee atomicity, I prefer using the API which expressly does that (Interlocked.Read), rather than Volatile.Read - even if implementation-wise it may be implemented atomically. |
|
Disclaimer: I just want to share my thoughts, so if there's any misconception I'm happy to be corrected so I'll learn something.
Yep, that's a different matter. I'd say volatile has nothing to do with atomicity, just as you pointed out.
Following the ECMA spec in a strict sense, a torn-read of a long might also happen if the address isn't aligned to 8-byte boundary even if it doesn't cross a cache line boundary.
This is consistence and make sense. Notes about this -- in general: reads that cross a cache line boundary -- are stated in the Intel Developer manual, and I assume arm as well other processor won't be any different. Side note: When looking at usages of *(long*) across GH-repos there are quite a lot, it looks like the majority doesn't respect cache lines, but it's OK as the memory location from which is read isn't accessed concurrently, so torn reads won't occur even if the reading thread will be interrupted. |
I totally agree. I originally thought Volatile.Read really doesn't help with tearing at all, and was surprised to see it emit CompareExchange in 32-bit. This seems like a .NET particularity.
I agree that the words say that, and also that hardware-wise I don't think it's possible to produce tearing when the entire value is contained in a single cache-line. In that sense the ECMA wording may provide weaker guarantees than what the hardware actually provides.
Yeah, to hit an actual issue you must both do unprotected concurrency (no locking/interlocked) and read unaligned data straddling cache line boundaries. Following the conversation above, @gfoidl does it make sense for you to replace the Volatile.Read occurrences introduced by this PR with Interlocked.Exchange (and amend dotnet/efcore#22923)? |
I'm "torn" back and forth. It's a very interesting conversation, and so I'd say to use 'Interlocked.Read` makes it obvious what happens, especially as potential perf-benefits of Volatile won't show up here. So a change would be good. On the other side dotnet/aspnetcore#26630 was done just before and there it got merged with |
🤣
Agreed, I've also learned a lot from it - so thanks! To be sure, I agree that this is largely about documentation/obviousness, and has little actual impact. Let's wait a bit as you suggest. (@YohDeadfall what's your reason for preferring Volatile.Read over Interlocked.Read?) |
|
There is a difference between these two method below for x64, but obviously the first one is faster. Can't provide how much because there is no latency specified by Intel in docs, but to be sure ask @damageboy. public static void V(ref long x)
{
Volatile.Read(ref x);
}
public static void I(ref long x)
{
Interlocked.Read(ref x);
}; Volatile(Int64 ByRef)
L0000: mov rax, [rcx]
L0003: ret
; Interlocked(Int64 ByRef)
L0000: xor edx, edx
L0002: xor eax, eax
L0004: lock cmpxchg [rcx], rdx
L0009: retPlus I'm aware of the implementation for a long time thankfully for Reference Source and IL Spy. Right now .NET is fully open and anyone may take a look at the source in case of uncertainty. |
Is that meaningful in any way given that the reading happens when the counter is polled (i.e. once per second)? |
|
Related PR to fix the docs on Interlocked.Read: dotnet/dotnet-api-docs#4962 |
|
Since the .NET team decided to use |
|
I'll send (another) PR to change this to Thanks again for the intersting discussion about this topic! |
|
FYI, the runtime uses |
The backing fields for the counters are
long, so there was potential torn read on 32-bit platforms.Volatile.Readensures that the value is read in a atomic way, so no torn reads can occur.