Closed
Labels
Priority:3 (Work that is nice to have), area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), optimization
Description
As per https://github.com/dotnet/runtime/pull/53578/files#r645282551, we are currently emitting a pslldq followed by a psrldq to zero the 4th element.
This is decent codegen for the SSE/SSE2 baseline but is inefficient compared to other patterns we could generate for modern hardware.
We should likely replace this with the relevant logic from vector.WithElement(3, 0.0f), which can then generate an insertps xmm, xmm, 0b00_00_1000. With the same register as source and destination, that instruction preserves the first three elements of xmm and zeroes the fourth (index 3).
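
To make the equivalence concrete, here is a minimal lane-level sketch in Python of the two sequences. It is a model for illustration only, not JIT code: the function names `insertps` and `shift_pair` are hypothetical, lanes are plain Python floats with lane 0 as the lowest element, and the 4-byte shift amounts are an assumption, since the issue does not state them. The immediate is decoded as bits 7:6 selecting the source lane, bits 5:4 selecting the destination lane, and bits 3:0 acting as a zero mask.

```python
def insertps(dst, src, imm8):
    """Lane-level model of `insertps xmm1, xmm2, imm8` (register form)."""
    count_s = (imm8 >> 6) & 0b11  # source lane to read
    count_d = (imm8 >> 4) & 0b11  # destination lane to overwrite
    zmask = imm8 & 0b1111         # bit i set => lane i is zeroed afterwards
    out = list(dst)
    out[count_d] = src[count_s]   # the insert happens before the zero mask
    return [0.0 if (zmask >> i) & 1 else out[i] for i in range(4)]


def shift_pair(v):
    """Lane-level model of `pslldq xmm, 4` followed by `psrldq xmm, 4`.

    The 4-byte shift amounts are assumed, not taken from the issue.
    """
    shifted_left = [0.0] + list(v[:3])  # lanes move up by one, lane 3 is lost
    return shifted_left[1:] + [0.0]     # lanes move back down, lane 3 is zero


v = [1.0, 2.0, 3.0, 4.0]
# Same register as source and destination: lane 0 is copied onto itself,
# then the zero mask 0b1000 clears lane 3.
print(insertps(v, v, 0b00_00_1000))
print(shift_pair(v))
```

In this model both calls yield the same vector with lanes 0 through 2 unchanged and lane 3 cleared, which is why a single insertps can stand in for the shift pair when SSE4.1 is available.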
category:cq
theme:codegen
skill-level:expert
cost:small
impact:small