JIT: Optimize Memmove unrolling for constant src #108576
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr pgo |
|
Azure Pipelines successfully started running 3 pipeline(s). |
Co-authored-by: Jakob Botsch Nielsen <[email protected]>
|
PTAL @jakobbotsch @dotnet/jit-contrib |
|
@MihuBot -arm64 |
|
@EgorBo Regressions on arm64?
EgorBot/runtime-utils#110 (comment)
|
A load might be faster than the up to 4 instructions needed to compose a 64-bit constant for the "FALS" or "TRUE" UTF-16 strings on arm64, but it's tricky in the real world: the load might be under contention. Anyway, we'll see once the dotnet/performance run finishes. |
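To make the "up to 4 instructions" figure concrete, here is a rough sketch that counts how many `movz`/`movk` instructions a naive arm64 materialization of a 64-bit constant would take: one per non-zero 16-bit chunk. This is a simplification and not how the JIT actually chooses its sequence (real code generators can also use `movn` or bitmask immediates); the helper name `movz_movk_count` is made up for illustration.

```python
def movz_movk_count(value: int) -> int:
    """Count instructions for a naive movz + movk sequence (hypothetical helper).

    Assumption: one instruction per non-zero 16-bit chunk, at least one in total.
    """
    chunks = [(value >> shift) & 0xFFFF for shift in range(0, 64, 16)]
    return max(1, sum(1 for chunk in chunks if chunk != 0))


# Four UTF-16 code units fill all four 16-bit chunks of a 64-bit register.
true_utf16 = int.from_bytes("TRUE".encode("utf-16-le"), "little")
fals_utf16 = int.from_bytes("FALS".encode("utf-16-le"), "little")

# Four UTF-8 bytes only occupy the two low chunks.
true_utf8 = int.from_bytes("True".encode("utf-8"), "little")

print(movz_movk_count(true_utf16), movz_movk_count(fals_utf16), movz_movk_count(true_utf8))
```

Under this model, any four-character ASCII UTF-16 string hits the worst case, because every 16-bit chunk holds a non-zero code unit.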
LowerCallMemmove already does a good job unrolling Memmove. Unfortunately, it leaves some opportunities on the table when the source can be a constant (especially a GPR-sized constant), in which case we can skip a memory load. The reason is that LowerCallMemmove cannot rely on value numbering (VN) for that, as it runs too late.

Codegen diff:
 ; Method Benchmarks:Test_utf16(System.Span`1[ushort]):this (FullOpts)
        sub      rsp, 40
        mov      rax, bword ptr [rdx]
        mov      ecx, dword ptr [rdx+0x08]
        cmp      ecx, 4
        jl       SHORT G_M39687_IG04
-       mov      rcx, 0x21A00204FFC
-       mov      rdx, qword ptr [rcx]
-       mov      qword ptr [rax], rdx
+       mov      rcx, 0x65007500720054
+       mov      qword ptr [rax], rcx
        add      rsp, 40
        ret
 G_M39687_IG04:
        call     [System.ThrowHelper:ThrowArgumentException_DestinationTooShort()]
        int3
-; Total bytes of code: 43
+; Total bytes of code: 40

 ; Method Benchmarks:Test_utf8(System.Span`1[ubyte]):this (FullOpts)
        sub      rsp, 40
        mov      rax, bword ptr [rdx]
        mov      ecx, dword ptr [rdx+0x08]
        cmp      ecx, 4
        jl       SHORT G_M38560_IG04
-       mov      rcx, 0x2C391CC5380      ; static handle
-       mov      edx, dword ptr [rcx]
-       mov      dword ptr [rax], edx
+       mov      dword ptr [rax], 0x65757254
        add      rsp, 40
        ret
 G_M38560_IG04:
        call     [System.ThrowHelper:ThrowArgumentException_DestinationTooShort()]
        int3
-; Total bytes of code: 41
+; Total bytes of code: 33
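
As a sanity check on the immediates in the diff, the sketch below reinterprets a small constant source buffer as a little-endian integer, which is the value a register-sized store would write instead of loading from memory. It assumes the benchmarks copy the literal "True" (the benchmark source is not shown here, so that is inferred from the constants), and `folded_immediate` is a hypothetical helper, not a JIT function.

```python
def folded_immediate(data: bytes) -> int:
    """Reinterpret a GPR-sized constant source as a little-endian immediate.

    Hypothetical helper: it only models the 1/2/4/8-byte case discussed above.
    """
    if len(data) not in (1, 2, 4, 8):
        raise ValueError("only GPR-sized sources are handled in this sketch")
    return int.from_bytes(data, "little")


# Assumption: the copied literal is "True" in both benchmarks.
utf16_imm = folded_immediate("True".encode("utf-16-le"))  # 8 bytes, qword store
utf8_imm = folded_immediate("True".encode("utf-8"))       # 4 bytes, dword store

print(hex(utf16_imm))  # 0x65007500720054
print(hex(utf8_imm))   # 0x65757254
```

Both values match the constants stored by the new codegen (`0x65007500720054` for UTF-16 and `0x65757254` for UTF-8), which is consistent with the source bytes being folded directly into the store.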