-
Notifications
You must be signed in to change notification settings - Fork 5.3k
try to port ASCIIUtility.NarrowUtf16ToAscii to x-plat intrinsics #73064
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @dotnet/area-system-text-encoding Issue DetailsFor both x64 and ARM64 I am observing 10-15% regression. x64DetailsBenchmarkDotNet=v0.13.1.1828-nightly, OS=Windows 11 (10.0.22000.795/21H2)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=7.0.100-preview.6.22352.1
[Host] : .NET 7.0.0 (7.0.22.32404), X64 RyuJIT AVX2
Method=GetBytes
Here I was able to use VTune (I also tried uProf but it leaves a lot to desire): From what I can see the regression comes from additional 3 It seems that the first one comes from the new ARM64DetailsBenchmarkDotNet=v0.13.1.1828-nightly, OS=ubuntu 20.04
Unknown processor
.NET SDK=7.0.100-rc.1.22379.1
[Host] : .NET 7.0.0 (7.0.22.37802), Arm64 RyuJIT AdvSIMD
Method=GetBytes
|
|
|
||
| ref byte asciiBuffer = ref *pAsciiBuffer; | ||
| Vector128<byte> asciiVector = ExtractAsciiVector(utf16VectorFirst, utf16VectorFirst); | ||
| Vector128<byte> asciiVector = Vector128.Narrow(utf16VectorFirst, utf16VectorFirst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamsitnik Narrow doesn't know that utf16VectorFirst is already ASCII at this point (from my understanding) so it applies a mask via AND to cut anything above 0xFF (not needed in this case). Consider this:

So for this case for better perf you probably want to keep ExtractAsciiVector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log an issue here for Narrow. There is possibly some codegen improvements we can make in .NET 8
| const ushort asciiMask = ushort.MaxValue - 127; // 0x7F80 | ||
| Vector128<ushort> zeroIsAscii = utf16Vector & Vector128.Create(asciiMask); | ||
| // If a non-ASCII bit is set in any WORD of the vector, we have seen non-ASCII data. | ||
| return !Vector128.EqualsAll(zeroIsAscii, Vector128<ushort>.Zero); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said that it's expected to have a redundant AND here - why? 🙂
|
@tannergooding Since I was not able to get the same perf with |
|
Improvements - dotnet/perf-autofiling-issues#7322 |

For both x64 and ARM64 I am observing 10-15% regression.
x64
Details
Here I was able to use VTune (I also tried uProf but it leaves a lot to desire):
From what I can see the regression comes from additional 3
vpandinstructions:It seems that the first one comes from the new
VectorContainsNonAsciiCharimplementation (this is expected). But the other two fromVector128.Narrow? @tannergooding is that expected?ARM64
Details