[mono] [llvm-aot] Fixed storing Vector3 into memory#111000
[mono] [llvm-aot] Fixed storing Vector3 into memory#111000jkurdek merged 2 commits intodotnet:mainfrom
Conversation
|
/azp run runtime-llvm |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-extra-platforms |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-llvm, runtime-extra-platforms |
|
Azure Pipelines successfully started running 2 pipeline(s). |
| const int mask_values [] = { 0, 1, 2 }; | ||
|
|
||
| LLVMValueRef truncatedVec3 = LLVMBuildShuffleVector ( | ||
| builder, | ||
| lhs, | ||
| LLVMGetUndef (t), | ||
| create_const_vector_i32 (mask_values, 3), | ||
| "truncated_vec3" | ||
| ); |
There was a problem hiding this comment.
Couldn't this be replaced by something like?
LLVMValueRef truncatedVec3 = LLVMBuildTrunc (builder, lhs, LLVMVectorType (LLVMFloatType (), 3), "truncated_vec3);
However, I haven't tried it locally so I'm unsure whether it would make the same and be more efficient or not.
There was a problem hiding this comment.
I think the trunc instruction does something different. Looking at the LLVM reference it seems to be casting each element of a vector to a different type - not casting a vector and dropping its last element.
There was a problem hiding this comment.
I see, I thought it works the same as on primitives. Thank you for the explanation.
|
/azp run runtime-extra-platforms |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/ba-g failures unrelated |
|
/backport to release/9.0-staging |
|
Started backporting to release/9.0-staging: https://github.com/dotnet/runtime/actions/runs/12600960089 |
| "truncated_vec3" | ||
| ); | ||
|
|
||
| mono_llvm_build_aligned_store (builder, truncatedVec3, dest, FALSE, 1); |
There was a problem hiding this comment.
Why is the alignment being set to 1? A Vector3 is composed of 3x float fields and so is presumed to be 4 byte aligned unless the code is explicitly using something like Unsafe.ReadUnaligned or Unsafe.WriteUnaligned (which use the IL unaligned. prefix).
There was a problem hiding this comment.
This change reuses existing logic. This code path was previously taken for all VectorX writes so I kept it consistent. I agree that this might be incorrect. We should investigate this separately.
| if (mono_class_value_size (ins->klass, NULL) == 12) { | ||
| const int mask_values [] = { 0, 1, 2 }; | ||
|
|
||
| LLVMValueRef truncatedVec3 = LLVMBuildShuffleVector ( |
There was a problem hiding this comment.
Why are we using ShuffleVector here?
In other places, we instead do this via doing LLVMBuildExtractElement + LLVMBuildInsertValue (or Store) 3x. This alternative correctly represents the primitive steps and allows LLVM to optimize it as best fit for the underlying architecture.
Fixes #110820
Vector3 is sometimes handled as Vector4 (Vector3 + 0) for perf purposes. On mono llvm aot it was incorrectly stored into memory with the trailing 0. This change fixes that behaviour.