[release/8.0-staging] Ensure we consistently broadcast the result of simd dot product #106031
|
/azp run runtime-coreclr jitstress-isas-x86 |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
@carlossanlop, looks like I can't run any outerloop CI legs due to a permissions issue. Is that expected? This is impacting downlevel hardware, so running the stress job to explicitly validate those scenarios is important. |
What should I be looking at? |
|
All builds in it were cancelled with
|
|
I shared this problem with FR and tagged you. |
|
CC @JulieLeeMSFT, @jeffschwMSFT for servicing approval. As per the template, this was a regression introduced in .NET 8 that affects a small subset of hardware that supports SSE3 but not SSE4.1. |
JulieLeeMSFT
left a comment
Approved. cc @jeffschwMSFT for servicing consideration.
jeffschwMSFT
left a comment
lgtm. we will take for consideration in 8.0.x
|
@tannergooding did this get approved by Tactics? I'm about to close the branch for the September Release. If this is not merged now, we will have to wait until October. |
|
Hasn't been taken to tactics yet, to my knowledge. |
Backport of #105888 to release/8.0-staging
/cc @tannergooding
Customer Impact
Users with hardware that supports SSE3-SSSE3 (2004-2007) and using Vector2 can experience incorrect results in some cases as the result is not correctly replicated to all elements of the vector. This is not an issue with hardware that only supports SSE2 (our baseline ISA) or with hardware that supports SSE4.1+ (2007).
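To make "not correctly replicated to all elements" concrete, here is a minimal lane-level sketch in Python. It models a horizontal-add style dot product on 4-lane vectors and is only an illustration: the helper names are hypothetical, the upper two lanes of a Vector2 are assumed to be zero, and the sequence shown is a generic SSE3-style pattern, not the JIT's actual emitted code.

```python
# Illustrative lane model only; this is NOT the JIT's real instruction sequence.

def hadd(a, b):
    """Model of a 4-lane horizontal add (HADDPS-style pairing of adjacent lanes)."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def broadcast(v, lane=0):
    """Copy one lane into every lane of the vector."""
    return [v[lane]] * 4


def dot2_without_broadcast(x, y):
    # Assumption: a Vector2 occupies lanes 0-1 and lanes 2-3 are zero.
    p = [x[0] * y[0], x[1] * y[1], 0.0, 0.0]
    # A single horizontal add leaves the sum only in lanes 0 and 2.
    return hadd(p, p)


def dot2_with_broadcast(x, y):
    # The explicit broadcast makes every lane hold the dot product.
    return broadcast(dot2_without_broadcast(x, y))


def dot4(x, y):
    # For four elements, two horizontal adds already fill every lane,
    # so an extra broadcast would be redundant in this model.
    p = [a * b for a, b in zip(x, y)]
    h = hadd(p, p)
    return hadd(h, h)


print(dot2_without_broadcast((1.0, 2.0), (3.0, 4.0)))  # [11.0, 0.0, 11.0, 0.0]
print(dot2_with_broadcast((1.0, 2.0), (3.0, 4.0)))     # [11.0, 11.0, 11.0, 11.0]
print(dot4((1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)))  # [70.0, 70.0, 70.0, 70.0]
```

In this model, any consumer that reads a lane other than 0 or 2 of the two-element result sees a wrong value unless the broadcast is performed, which is the kind of failure described above.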
Regression
The bug was introduced in #81335 when an optimization was added to avoid a redundant broadcast. It was missed that for SSE3-SSSE3, and for Vector2 in particular (but not for Vector3/4 or Vector64/128/256/512<T>), a slightly different code path was taken, under which the broadcast was not redundant.
Testing
Explicit tests covering the bug were added. Additional tests covering the repro for the other vector types were also added.
Risk
Low. This impacts a single type and only on relatively old hardware. Users can also see the failure on new hardware if they are doing testing and set the DOTNET_EnableSSE41=0 environment variable.
The effective fix here was to execute the same code path as was already being used by SSE2 hardware (the same baseline that NAOT, Crossgen, and ReadyToRun default to).
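
For anyone who wants to exercise the downlevel path on newer hardware, as the Risk note describes, a minimal Python sketch of launching a test run with DOTNET_EnableSSE41=0 might look like the following. The helper names are hypothetical, and `dotnet test` is only a stand-in for whatever command actually runs your tests.

```python
import os
import subprocess


def env_without_sse41(base=None):
    """Return a copy of the environment with SSE4.1 disabled for the .NET runtime."""
    env = dict(os.environ if base is None else base)
    # The variable name and value come from the PR description above.
    env["DOTNET_EnableSSE41"] = "0"
    return env


def run_with_sse41_disabled(command=("dotnet", "test")):
    # Assumption: `command` is the test invocation for your project.
    # Only the child process sees the modified environment.
    return subprocess.run(list(command), env=env_without_sse41(), check=True)
```

Passing the variable through a copied environment keeps the setting scoped to the one test run instead of changing it for the whole shell session.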