Removing unsafe code in BufferExtensions #57417

ladeak · 2024-08-19T20:29:06Z

Removing unsafe code in BufferExtensions

Removing unsafe code, using Span to write numbers directly to the buffer

Description

Removing the unsafe keyword required for pinning
Removing the fixed keyword doing the pinning
Using Span<T> to write directly to the buffer
Adding a testcase for values smaller to 1000, 100 and 10.
As a result for values <1000, the JIT generates smaller code
I only seem to find usage of this in KnowHeaders's generated code

mgravell · 2024-08-20T09:46:54Z

Algorithmically this look fine; I want to quickly investigate the performance impact of not eliding the bounds check - will do that today; if that turns out to be relevant, there is a third option that removes the fixed and unsafe, but still elides; checking...

src/Shared/ServerInfrastructure/BufferExtensions.cs

mgravell · 2024-08-20T10:24:00Z

some speculative benchmarks, just intended to compare the optimized 1-3 character paths, ignoring the custom BufferWriter<> etc

Fixed = the original code
Span = the code proposed in this PR
RefAdd = the third way I mentioned above

results (code here):

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26120.1350)
AMD Ryzen 9 7900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.203
  [Host]     : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  DefaultJob : .NET 8.0.7 (8.0.724.31311), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI


| Method | Number | Mean      | Error     | StdDev    | Median    |
|------- |------- |----------:|----------:|----------:|----------:|
| Fixed  | 8      | 0.6012 ns | 0.0042 ns | 0.0037 ns | 0.6014 ns |
| Span   | 8      | 0.5640 ns | 0.0113 ns | 0.0206 ns | 0.5706 ns |
| RefAdd | 8      | 0.5567 ns | 0.0110 ns | 0.0263 ns | 0.5670 ns |
| Fixed  | 16     | 1.0129 ns | 0.0087 ns | 0.0082 ns | 1.0096 ns |
| Span   | 16     | 0.7861 ns | 0.0068 ns | 0.0063 ns | 0.7857 ns |
| RefAdd | 16     | 0.7861 ns | 0.0042 ns | 0.0039 ns | 0.7868 ns |
| Fixed  | 143    | 1.4058 ns | 0.0130 ns | 0.0122 ns | 1.4102 ns |
| Span   | 143    | 1.2048 ns | 0.0153 ns | 0.0143 ns | 1.2042 ns |
| RefAdd | 143    | 1.1970 ns | 0.0100 ns | 0.0089 ns | 1.1949 ns |

I welcome eyeballs / re-runs of these measures (especially on net9, sorry only had net8 to hand), but based on this I'm tempted to say "let's go the third way" (RefAdd) - this is justified by the existing checks against the span length. It seems marginally faster than the other two options, avoids explicit fixed / unsafe, and achieves bounds-check elision.

Specifically, this uses managed reference addition:

ref byte start = ref MemoryMarshal.GetReference(span);
// ...
start = (byte)(((uint)number) + AsciiDigitStart);
buffer.Advance(1);
// ...
start = (byte)(tens + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 1) = (byte)(val - (tens * 10) + AsciiDigitStart);
buffer.Advance(2);
// ...
start = (byte)(digit0 + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 1) = (byte)(digits01 - (digit0 * 10) + AsciiDigitStart);
Unsafe.AddByteOffset(ref start, 2) = (byte)(val - (digits01 * 10) + AsciiDigitStart);
buffer.Advance(3);

Thoughts?

ladeak · 2024-08-20T10:30:05Z

I had similar benchmark results, I can share later today the exact numbers, but the Span one was slightly better jitted (code size was smaller).

Happy to use Unsafe.AddByteOffset, however there the results are within the error to Span (correction: marginally better). I can take a look at jitted code to compare later.

mgravell · 2024-08-20T10:38:25Z

If it is a compromise between JIT size and performance, then my default reaction would be "choose performance"... maybe others disagree, though?

mgravell · 2024-08-20T10:40:36Z

For completeness, I also investigated an Evil approach that used punning to ushort to reduce two writes to one ushort write in the 2/3 digit cases; it didn't go well - bizarrely impacting even the single-digit bench:

| Method | Number | Mean      | Error     | StdDev    |
|------- |------- |----------:|----------:|----------:|
| Fixed  | 8      | 0.6104 ns | 0.0078 ns | 0.0073 ns |
| Span   | 8      | 0.5672 ns | 0.0110 ns | 0.0143 ns |
| RefAdd | 8      | 0.5653 ns | 0.0112 ns | 0.0193 ns |
| Evil   | 8      | 0.9200 ns | 0.0184 ns | 0.0461 ns |
| Fixed  | 16     | 1.0245 ns | 0.0133 ns | 0.0125 ns |
| Span   | 16     | 0.7898 ns | 0.0038 ns | 0.0036 ns |
| RefAdd | 16     | 0.7887 ns | 0.0067 ns | 0.0060 ns |
| Evil   | 16     | 0.9596 ns | 0.0069 ns | 0.0065 ns |
| Fixed  | 143    | 1.4168 ns | 0.0138 ns | 0.0123 ns |
| Span   | 143    | 1.2000 ns | 0.0144 ns | 0.0134 ns |
| RefAdd | 143    | 1.1963 ns | 0.0133 ns | 0.0125 ns |
| Evil   | 143    | 1.1853 ns | 0.0094 ns | 0.0083 ns |

ladeak · 2024-08-20T10:41:01Z

@mgravell I agree, just curious on the reasons behind it.

ladeak · 2024-08-20T20:06:40Z

These are the results and code on my machine (I dropped the non-fast path branch compared to your tests):

BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4037/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-1255U, 1 CPU, 12 logical and 10 physical cores
.NET SDK 9.0.100-preview.7.24407.12
  [Host]     : .NET 9.0.0 (9.0.24.40507), X64 RyuJIT AVX2
  DefaultJob : .NET 9.0.0 (9.0.24.40507), X64 RyuJIT AVX2


| Method     | Value | Mean      | Error     | StdDev    | Code Size | Allocated |
|----------- |------ |----------:|----------:|----------:|----------:|----------:|
| TestUnsafe | 6     | 0.1193 ns | 0.0160 ns | 0.0133 ns |     215 B |         - |
| TestSpan   | 6     | 0.1114 ns | 0.0105 ns | 0.0093 ns |     186 B |         - |
| TestRef    | 6     | 0.1130 ns | 0.0133 ns | 0.0125 ns |     186 B |         - |
| TestUnsafe | 17    | 0.3625 ns | 0.0168 ns | 0.0149 ns |     212 B |         - |
| TestSpan   | 17    | 0.1063 ns | 0.0090 ns | 0.0075 ns |     183 B |         - |
| TestRef    | 17    | 0.1078 ns | 0.0091 ns | 0.0081 ns |     183 B |         - |
| TestUnsafe | 123   | 0.6305 ns | 0.0146 ns | 0.0137 ns |     218 B |         - |
| TestSpan   | 123   | 0.3738 ns | 0.0116 ns | 0.0103 ns |     181 B |         - |
| TestRef    | 123   | 0.3548 ns | 0.0087 ns | 0.0077 ns |     181 B |         - |

EDIT: To me the JITTED code for the Span and for the AddByteOffset looks identical.

I have updated the PR as you suggested.

amcasey · 2024-08-20T23:04:52Z

You both know more about this than I do, but my outside opinion is that the perf looks about the same (between span and addref), so I'd pick whichever results in more readable code.

gfoidl · 2024-08-21T09:02:50Z

perf looks about the same (between span and addref), so I'd pick whichever results in more readable code.

Further: when the aim is to remove unsafe code, we shouldn't move to Unsafe which isn't any safer than unsafe. In fact the "by ref arithmetics" can lead to hard to find and debug GC holes (happened in runtime repo a few times), so the move should be to Span<T>.

When there's some perf regression that should get investigated and fixed (in runtime).
So not only ASP.NET Core can profit from "better span", but rather all consumers of .NET.

mgravell · 2024-08-21T09:14:52Z

@gfoidl @amcasey yes absolutely the simpler usage should be preferred by default, but in this case I'd argue that the .Length checks justify the limited use of Unsafe, and that this is a pretty common set of elisions for such scenarios - but if people prefer the code as proposed; that's fine too. My views may be biased by absurd library code ;p

Sorry @ladeak - not trying to mess you around with "do, undo" - maybe hold off on editing anything while we reach a consensus, but it sounds like the simpler code might be preferred.

ladeak · 2024-08-21T09:26:47Z

@mgravell , no need to be sorry, I am happy with both ways.

gfoidl · 2024-08-21T09:37:45Z

@mgravell I'm in ❤️ with unsafe, hacks, whatever gives perf also, but from a .NET-PoV the JIT should be teached to avoid / elide these checks, so that the whole .NET ecosystem can profit from these improvements.

So I'd go with span, log an issue in runtime so that the perf-discrepancy can be investigated and fixed over there.
If this PR is on hold until the JIT got fixed or not is another story (I'd keep it on hold, so that there's no regression in shipped product).

src/Shared/ServerInfrastructure/BufferExtensions.cs

mgravell · 2024-08-21T10:26:43Z

So I did a bench with bytesLeftInBlock removed and the span.Length checks moved inline with the if / else if tests (Span2):

| Method | Number | Mean      | Error     | StdDev    |
|------- |------- |----------:|----------:|----------:|
| Fixed  | 8      | 0.5952 ns | 0.0042 ns | 0.0040 ns |
| Span   | 8      | 0.5462 ns | 0.0108 ns | 0.0272 ns |
| Span2  | 8      | 0.5513 ns | 0.0110 ns | 0.0190 ns |
| Fixed  | 16     | 1.0047 ns | 0.0056 ns | 0.0050 ns |
| Span   | 16     | 0.7688 ns | 0.0035 ns | 0.0032 ns |
| Span2  | 16     | 0.7709 ns | 0.0036 ns | 0.0032 ns |
| Fixed  | 143    | 1.3807 ns | 0.0053 ns | 0.0050 ns |
| Span   | 143    | 1.1812 ns | 0.0081 ns | 0.0071 ns |
| Span2  | 143    | 1.1831 ns | 0.0056 ns | 0.0052 ns |

so not an improvement, but both are better than the original fixed version, so: let's just go with the original PR proposal with the simple indexed usage?

This reverts commit 1d1d790.

ladeak · 2024-08-21T15:55:03Z

Reverted to the original proposal.

src/Shared/ServerInfrastructure/BufferExtensions.cs

src/Servers/Kestrel/Core/test/PipelineExtensionTests.cs

src/Shared/ServerInfrastructure/BufferExtensions.cs

ladeak · 2024-08-27T16:15:31Z

Is there anything else I may follow up in this PR? cc. @mgravell

ladeak · 2024-09-02T19:25:42Z

@BrennanConroy and @mgravell , I suppose it might be busy times. Do you have a blocker for this PR or awaiting RC release etc.?

ladeak · 2024-09-24T16:47:58Z

@BrennanConroy could you please review?

BrennanConroy · 2024-10-02T17:01:19Z

/azp run

azure-pipelines · 2024-10-02T17:01:38Z

Azure Pipelines successfully started running 3 pipeline(s).

BrennanConroy · 2024-10-04T21:54:04Z

Thanks @ladeak, nice to see removal of unsafe and a bit of perf improvement at the same time.

Using safe code in BufferExtensions

3fbe9b3

ladeak requested review from BrennanConroy, JamesNK, halter73 and mgravell as code owners August 19, 2024 20:29

ghost added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Aug 19, 2024

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Aug 19, 2024

build-analysis bot mentioned this pull request Aug 19, 2024

Helix failure EPT_S_NOT_REGISTERED #57416

Open

mgravell reviewed Aug 20, 2024

View reviewed changes

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

Review feedback

1d1d790

gfoidl reviewed Aug 21, 2024

View reviewed changes

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

Revert "Review feedback"

47c27c3

This reverts commit 1d1d790.

gfoidl reviewed Aug 21, 2024

View reviewed changes

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

build-analysis bot mentioned this pull request Aug 21, 2024

There are no more endpoints available from the endpoint mapper. dotnet/dnceng#3844

Open

3 tasks

- uint conversion and length comparison in branch

75f9d07

gfoidl reviewed Aug 22, 2024

View reviewed changes

src/Servers/Kestrel/Core/test/PipelineExtensionTests.cs Outdated Show resolved Hide resolved

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

Cleanup usings, explicit casting

5bb4321

ladeak requested review from gfoidl and mgravell August 22, 2024 21:20

gfoidl reviewed Aug 26, 2024

View reviewed changes

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

src/Shared/ServerInfrastructure/BufferExtensions.cs Outdated Show resolved Hide resolved

Renaming parameters and locals, removing simpleWrite variable.

ce1c125

gfoidl approved these changes Aug 27, 2024

View reviewed changes

dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Sep 9, 2024

mkArtakMSFT assigned adityamandaleeka Sep 9, 2024

BrennanConroy approved these changes Oct 2, 2024

View reviewed changes

dotnet-policy-service bot removed the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Oct 2, 2024

build-analysis bot mentioned this pull request Oct 2, 2024

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

3 tasks

BrennanConroy merged commit 9dfdca4 into dotnet:main Oct 4, 2024

dotnet-policy-service bot added this to the 10.0-preview1 milestone Oct 4, 2024

captainsafia pushed a commit that referenced this pull request Dec 31, 2024

Removing unsafe code in BufferExtensions (#57417)

96ab868

captainsafia pushed a commit that referenced this pull request Feb 11, 2025

Removing unsafe code in BufferExtensions (#57417)

bb11b1f

Removing unsafe code in BufferExtensions #57417

Removing unsafe code in BufferExtensions #57417

Uh oh!

Conversation

ladeak commented Aug 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Removing unsafe code in BufferExtensions

Description

Uh oh!

mgravell commented Aug 20, 2024

Uh oh!

Uh oh!

mgravell commented Aug 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ladeak commented Aug 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mgravell commented Aug 20, 2024

Uh oh!

mgravell commented Aug 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ladeak commented Aug 20, 2024

Uh oh!

ladeak commented Aug 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

amcasey commented Aug 20, 2024

Uh oh!

gfoidl commented Aug 21, 2024

Uh oh!

mgravell commented Aug 21, 2024

Uh oh!

ladeak commented Aug 21, 2024

Uh oh!

gfoidl commented Aug 21, 2024

Uh oh!

Uh oh!

mgravell commented Aug 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ladeak commented Aug 21, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ladeak commented Aug 27, 2024

Uh oh!

ladeak commented Sep 2, 2024

Uh oh!

ladeak commented Sep 24, 2024

Uh oh!

BrennanConroy commented Oct 2, 2024

Uh oh!

azure-pipelines bot commented Oct 2, 2024

Uh oh!

BrennanConroy commented Oct 4, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

ladeak commented Aug 19, 2024 •

edited

Loading

mgravell commented Aug 20, 2024 •

edited

Loading

ladeak commented Aug 20, 2024 •

edited

Loading

mgravell commented Aug 20, 2024 •

edited

Loading

ladeak commented Aug 20, 2024 •

edited

Loading

mgravell commented Aug 21, 2024 •

edited

Loading