-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Removing unsafe code in BufferExtensions #57417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Algorithmically this look fine; I want to quickly investigate the performance impact of not eliding the bounds check - will do that today; if that turns out to be relevant, there is a third option that removes the |
|
some speculative benchmarks, just intended to compare the optimized 1-3 character paths, ignoring the custom
results (code here): BenchmarkDotNet v0.14.0, Windows 11 (10.0.26120.1350)
AMD Ryzen 9 7900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.203
[Host] : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
DefaultJob : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
| Method | Number | Mean | Error | StdDev | Median |
|------- |------- |----------:|----------:|----------:|----------:|
| Fixed | 8 | 0.6012 ns | 0.0042 ns | 0.0037 ns | 0.6014 ns |
| Span | 8 | 0.5640 ns | 0.0113 ns | 0.0206 ns | 0.5706 ns |
| RefAdd | 8 | 0.5567 ns | 0.0110 ns | 0.0263 ns | 0.5670 ns |
| Fixed | 16 | 1.0129 ns | 0.0087 ns | 0.0082 ns | 1.0096 ns |
| Span | 16 | 0.7861 ns | 0.0068 ns | 0.0063 ns | 0.7857 ns |
| RefAdd | 16 | 0.7861 ns | 0.0042 ns | 0.0039 ns | 0.7868 ns |
| Fixed | 143 | 1.4058 ns | 0.0130 ns | 0.0122 ns | 1.4102 ns |
| Span | 143 | 1.2048 ns | 0.0153 ns | 0.0143 ns | 1.2042 ns |
| RefAdd | 143 | 1.1970 ns | 0.0100 ns | 0.0089 ns | 1.1949 ns |I welcome eyeballs / re-runs of these measures (especially on net9, sorry only had net8 to hand), but based on this I'm tempted to say "let's go the third way" ( Specifically, this uses managed reference addition: ref byte start = ref MemoryMarshal.GetReference(span);
// ...
start = (byte)(((uint)number) + AsciiDigitStart);
buffer.Advance(1);
// ...
start = (byte)(tens + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 1) = (byte)(val - (tens * 10) + AsciiDigitStart);
buffer.Advance(2);
// ...
start = (byte)(digit0 + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 1) = (byte)(digits01 - (digit0 * 10) + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 2) = (byte)(val - (digits01 * 10) + AsciiDigitStart);
buffer.Advance(3);Thoughts? |
|
I had similar benchmark results, I can share later today the exact numbers, but the Span one was slightly better jitted (code size was smaller). Happy to use Unsafe.AddByteOffset, however there the results are within the error to Span (correction: marginally better). I can take a look at jitted code to compare later. |
|
If it is a compromise between JIT size and performance, then my default reaction would be "choose performance"... maybe others disagree, though? |
|
For completeness, I also investigated an | Method | Number | Mean | Error | StdDev |
|------- |------- |----------:|----------:|----------:|
| Fixed | 8 | 0.6104 ns | 0.0078 ns | 0.0073 ns |
| Span | 8 | 0.5672 ns | 0.0110 ns | 0.0143 ns |
| RefAdd | 8 | 0.5653 ns | 0.0112 ns | 0.0193 ns |
| Evil | 8 | 0.9200 ns | 0.0184 ns | 0.0461 ns |
| Fixed | 16 | 1.0245 ns | 0.0133 ns | 0.0125 ns |
| Span | 16 | 0.7898 ns | 0.0038 ns | 0.0036 ns |
| RefAdd | 16 | 0.7887 ns | 0.0067 ns | 0.0060 ns |
| Evil | 16 | 0.9596 ns | 0.0069 ns | 0.0065 ns |
| Fixed | 143 | 1.4168 ns | 0.0138 ns | 0.0123 ns |
| Span | 143 | 1.2000 ns | 0.0144 ns | 0.0134 ns |
| RefAdd | 143 | 1.1963 ns | 0.0133 ns | 0.0125 ns |
| Evil | 143 | 1.1853 ns | 0.0094 ns | 0.0083 ns | |
|
@mgravell I agree, just curious on the reasons behind it. |
|
These are the results and code on my machine (I dropped the non-fast path branch compared to your tests): EDIT: To me the JITTED code for the I have updated the PR as you suggested. |
|
You both know more about this than I do, but my outside opinion is that the perf looks about the same (between span and addref), so I'd pick whichever results in more readable code. |
Further: when the aim is to remove unsafe code, we shouldn't move to When there's some perf regression that should get investigated and fixed (in runtime). |
|
@gfoidl @amcasey yes absolutely the simpler usage should be preferred by default, but in this case I'd argue that the Sorry @ladeak - not trying to mess you around with "do, undo" - maybe hold off on editing anything while we reach a consensus, but it sounds like the simpler code might be preferred. |
|
@mgravell , no need to be sorry, I am happy with both ways. |
|
@mgravell I'm in ❤️ with unsafe, hacks, whatever gives perf also, but from a .NET-PoV the JIT should be teached to avoid / elide these checks, so that the whole .NET ecosystem can profit from these improvements. So I'd go with span, log an issue in runtime so that the perf-discrepancy can be investigated and fixed over there. |
|
So I did a bench with | Method | Number | Mean | Error | StdDev |
|------- |------- |----------:|----------:|----------:|
| Fixed | 8 | 0.5952 ns | 0.0042 ns | 0.0040 ns |
| Span | 8 | 0.5462 ns | 0.0108 ns | 0.0272 ns |
| Span2 | 8 | 0.5513 ns | 0.0110 ns | 0.0190 ns |
| Fixed | 16 | 1.0047 ns | 0.0056 ns | 0.0050 ns |
| Span | 16 | 0.7688 ns | 0.0035 ns | 0.0032 ns |
| Span2 | 16 | 0.7709 ns | 0.0036 ns | 0.0032 ns |
| Fixed | 143 | 1.3807 ns | 0.0053 ns | 0.0050 ns |
| Span | 143 | 1.1812 ns | 0.0081 ns | 0.0071 ns |
| Span2 | 143 | 1.1831 ns | 0.0056 ns | 0.0052 ns |so not an improvement, but both are better than the original |
This reverts commit 1d1d790.
|
Reverted to the original proposal. |
|
Is there anything else I may follow up in this PR? cc. @mgravell |
|
@BrennanConroy and @mgravell , I suppose it might be busy times. Do you have a blocker for this PR or awaiting RC release etc.? |
|
@BrennanConroy could you please review? |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
Thanks @ladeak, nice to see removal of unsafe and a bit of perf improvement at the same time. |
Removing unsafe code in BufferExtensions
Removing unsafe code, using Span to write numbers directly to the buffer
Description
unsafekeyword required for pinningfixedkeyword doing the pinningSpan<T>to write directly to the bufferFixes #56534