[X86] Manage atomic store of fp -> int promotion in DAG by jofrn · Pull Request #197166 · llvm/llvm-project

jofrn · 2026-05-12T12:10:54Z

When lowering atomic store <1 x T> vector types with floats (i.e. during scalarization in the selection DAG), selection can fail since
this pattern is unsupported. To support this, floats can be casted to
an integer type of the same size.

Store-side counterpart to #148895. Stacked on top of #197165; and below of #197618.

llvmorg-github-actions · 2026-05-12T12:11:47Z

@llvm/pr-subscribers-backend-x86

Author: jofrn

Changes

When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be casted to
an integer type of the same size.

Store-side counterpart to #148895. Stacked on top of #197165.

Full diff: https://github.com/llvm/llvm-project/pull/197166.diff

2 Files Affected:

(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+4)
(modified) llvm/test/CodeGen/X86/atomic-load-store.ll (+134)

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index aa5b864df5936..fea1caf0854f5 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -2753,6 +2753,10 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
   setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f32, MVT::i32);
   setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f64, MVT::i64);
 
+  setOperationPromotedToType(ISD::ATOMIC_STORE, MVT::f16, MVT::i16);
+  setOperationPromotedToType(ISD::ATOMIC_STORE, MVT::f32, MVT::i32);
+  setOperationPromotedToType(ISD::ATOMIC_STORE, MVT::f64, MVT::i64);
+
   // We have target-specific dag combine patterns for the following nodes:
   setTargetDAGCombine({ISD::VECTOR_SHUFFLE,
                        ISD::SCALAR_TO_VECTOR,
diff --git a/llvm/test/CodeGen/X86/atomic-load-store.ll b/llvm/test/CodeGen/X86/atomic-load-store.ll
index 8e74276019e17..549015c37e2d0 100644
--- a/llvm/test/CodeGen/X86/atomic-load-store.ll
+++ b/llvm/test/CodeGen/X86/atomic-load-store.ll
@@ -222,6 +222,140 @@ define void @store_atomic_vec1_ptr(ptr %x, <1 x ptr> %v) nounwind {
   ret void
 }
 
+define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) {
+; CHECK-SSE-O3-LABEL: store_atomic_vec1_bfloat:
+; CHECK-SSE-O3:       # %bb.0:
+; CHECK-SSE-O3-NEXT:    pextrw $0, %xmm0, %eax
+; CHECK-SSE-O3-NEXT:    movw %ax, (%rdi)
+; CHECK-SSE-O3-NEXT:    retq
+;
+; CHECK-AVX-O3-LABEL: store_atomic_vec1_bfloat:
+; CHECK-AVX-O3:       # %bb.0:
+; CHECK-AVX-O3-NEXT:    vpextrw $0, %xmm0, %eax
+; CHECK-AVX-O3-NEXT:    movw %ax, (%rdi)
+; CHECK-AVX-O3-NEXT:    retq
+;
+; CHECK-SSE-O0-LABEL: store_atomic_vec1_bfloat:
+; CHECK-SSE-O0:       # %bb.0:
+; CHECK-SSE-O0-NEXT:    pushq %rax
+; CHECK-SSE-O0-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SSE-O0-NEXT:    movq %rdi, (%rsp) # 8-byte Spill
+; CHECK-SSE-O0-NEXT:    pextrw $0, %xmm0, %eax
+; CHECK-SSE-O0-NEXT:    movw %ax, %cx
+; CHECK-SSE-O0-NEXT:    # implicit-def: $eax
+; CHECK-SSE-O0-NEXT:    movw %cx, %ax
+; CHECK-SSE-O0-NEXT:    shll $16, %eax
+; CHECK-SSE-O0-NEXT:    movd %eax, %xmm0
+; CHECK-SSE-O0-NEXT:    callq __truncsfbf2@PLT
+; CHECK-SSE-O0-NEXT:    movq (%rsp), %rdi # 8-byte Reload
+; CHECK-SSE-O0-NEXT:    pextrw $0, %xmm0, %eax
+; CHECK-SSE-O0-NEXT:    # kill: def $ax killed $ax killed $eax
+; CHECK-SSE-O0-NEXT:    movw %ax, (%rdi)
+; CHECK-SSE-O0-NEXT:    popq %rax
+; CHECK-SSE-O0-NEXT:    .cfi_def_cfa_offset 8
+; CHECK-SSE-O0-NEXT:    retq
+;
+; CHECK-AVX-O0-LABEL: store_atomic_vec1_bfloat:
+; CHECK-AVX-O0:       # %bb.0:
+; CHECK-AVX-O0-NEXT:    pushq %rax
+; CHECK-AVX-O0-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-AVX-O0-NEXT:    movq %rdi, (%rsp) # 8-byte Spill
+; CHECK-AVX-O0-NEXT:    vpextrw $0, %xmm0, %eax
+; CHECK-AVX-O0-NEXT:    movw %ax, %cx
+; CHECK-AVX-O0-NEXT:    # implicit-def: $eax
+; CHECK-AVX-O0-NEXT:    movw %cx, %ax
+; CHECK-AVX-O0-NEXT:    shll $16, %eax
+; CHECK-AVX-O0-NEXT:    vmovd %eax, %xmm0
+; CHECK-AVX-O0-NEXT:    callq __truncsfbf2@PLT
+; CHECK-AVX-O0-NEXT:    movq (%rsp), %rdi # 8-byte Reload
+; CHECK-AVX-O0-NEXT:    vpextrw $0, %xmm0, %eax
+; CHECK-AVX-O0-NEXT:    # kill: def $ax killed $ax killed $eax
+; CHECK-AVX-O0-NEXT:    movw %ax, (%rdi)
+; CHECK-AVX-O0-NEXT:    popq %rax
+; CHECK-AVX-O0-NEXT:    .cfi_def_cfa_offset 8
+; CHECK-AVX-O0-NEXT:    retq
+  store atomic <1 x bfloat> %v, ptr %x release, align 2
+  ret void
+}
+
+define void @store_atomic_vec1_half(ptr %x, <1 x half> %v) {
+; CHECK-SSE-O3-LABEL: store_atomic_vec1_half:
+; CHECK-SSE-O3:       # %bb.0:
+; CHECK-SSE-O3-NEXT:    pextrw $0, %xmm0, %eax
+; CHECK-SSE-O3-NEXT:    movw %ax, (%rdi)
+; CHECK-SSE-O3-NEXT:    retq
+;
+; CHECK-AVX-O3-LABEL: store_atomic_vec1_half:
+; CHECK-AVX-O3:       # %bb.0:
+; CHECK-AVX-O3-NEXT:    vpextrw $0, %xmm0, %eax
+; CHECK-AVX-O3-NEXT:    movw %ax, (%rdi)
+; CHECK-AVX-O3-NEXT:    retq
+;
+; CHECK-SSE-O0-LABEL: store_atomic_vec1_half:
+; CHECK-SSE-O0:       # %bb.0:
+; CHECK-SSE-O0-NEXT:    pextrw $0, %xmm0, %eax
+; CHECK-SSE-O0-NEXT:    # kill: def $ax killed $ax killed $eax
+; CHECK-SSE-O0-NEXT:    movw %ax, (%rdi)
+; CHECK-SSE-O0-NEXT:    retq
+;
+; CHECK-AVX-O0-LABEL: store_atomic_vec1_half:
+; CHECK-AVX-O0:       # %bb.0:
+; CHECK-AVX-O0-NEXT:    vpextrw $0, %xmm0, %eax
+; CHECK-AVX-O0-NEXT:    # kill: def $ax killed $ax killed $eax
+; CHECK-AVX-O0-NEXT:    movw %ax, (%rdi)
+; CHECK-AVX-O0-NEXT:    retq
+  store atomic <1 x half> %v, ptr %x release, align 2
+  ret void
+}
+
+define void @store_atomic_vec1_float(ptr %x, <1 x float> %v) {
+; CHECK-SSE-O3-LABEL: store_atomic_vec1_float:
+; CHECK-SSE-O3:       # %bb.0:
+; CHECK-SSE-O3-NEXT:    movss %xmm0, (%rdi)
+; CHECK-SSE-O3-NEXT:    retq
+;
+; CHECK-AVX-O3-LABEL: store_atomic_vec1_float:
+; CHECK-AVX-O3:       # %bb.0:
+; CHECK-AVX-O3-NEXT:    vmovss %xmm0, (%rdi)
+; CHECK-AVX-O3-NEXT:    retq
+;
+; CHECK-SSE-O0-LABEL: store_atomic_vec1_float:
+; CHECK-SSE-O0:       # %bb.0:
+; CHECK-SSE-O0-NEXT:    movss %xmm0, (%rdi)
+; CHECK-SSE-O0-NEXT:    retq
+;
+; CHECK-AVX-O0-LABEL: store_atomic_vec1_float:
+; CHECK-AVX-O0:       # %bb.0:
+; CHECK-AVX-O0-NEXT:    vmovss %xmm0, (%rdi)
+; CHECK-AVX-O0-NEXT:    retq
+  store atomic <1 x float> %v, ptr %x release, align 4
+  ret void
+}
+
+define void @store_atomic_vec1_double_align(ptr %x, <1 x double> %v) nounwind {
+; CHECK-SSE-O3-LABEL: store_atomic_vec1_double_align:
+; CHECK-SSE-O3:       # %bb.0:
+; CHECK-SSE-O3-NEXT:    movsd %xmm0, (%rdi)
+; CHECK-SSE-O3-NEXT:    retq
+;
+; CHECK-AVX-O3-LABEL: store_atomic_vec1_double_align:
+; CHECK-AVX-O3:       # %bb.0:
+; CHECK-AVX-O3-NEXT:    vmovsd %xmm0, (%rdi)
+; CHECK-AVX-O3-NEXT:    retq
+;
+; CHECK-SSE-O0-LABEL: store_atomic_vec1_double_align:
+; CHECK-SSE-O0:       # %bb.0:
+; CHECK-SSE-O0-NEXT:    movsd %xmm0, (%rdi)
+; CHECK-SSE-O0-NEXT:    retq
+;
+; CHECK-AVX-O0-LABEL: store_atomic_vec1_double_align:
+; CHECK-AVX-O0:       # %bb.0:
+; CHECK-AVX-O0-NEXT:    vmovsd %xmm0, (%rdi)
+; CHECK-AVX-O0-NEXT:    retq
+  store atomic <1 x double> %v, ptr %x release, align 8
+  ret void
+}
+
 define <2 x i8> @atomic_vec2_i8(ptr %x) {
 ; CHECK-SSE-O3-LABEL: atomic_vec2_i8:
 ; CHECK-SSE-O3:       # %bb.0:

github-actions · 2026-05-12T12:12:00Z

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Developer Policy and LLVM Discourse for more information.

arsenm · 2026-05-12T12:21:16Z

  ret void
 }

+define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) {


The tests look more like they are for vector scalarization. This also isn't removing any of the IR lowering TargetLowering hooks, so I doubt this patch is doing anything as it is

They are for vector scalarization. It casts the types instead of removing hooks. Please check out how it was done for the loads in #148895.

This doesn't make sense, you only changed the rules for the scalar types. This also will not be reached since AtomicExpand will have replaced the float atomics unless you do something to shouldExpandAtomicStoreInIR / shouldCastAtomicStoreInIR

That makes sense. The hooks weren't modified, so we don't need these.

I think the hooks should be modified (and by modified, I mean fully deleted).

I tried doing this a long time ago, I think load and store shouldn't be too much trouble to move the casting into codegen. The tricky one is atomicrmw, since when expanding into cmpxchg loops, putting the cast inside the loop is bad for some targets including x86

The hook in x86 isel lowering (i.e. the base class) only handles scalar floating point types.
virtual AtomicExpansionKind shouldCastAtomicStoreInIR(StoreInst *SI) const { if (SI->getValueOperand()->getType()->isFloatingPointTy()) return AtomicExpansionKind::CastToInteger; return AtomicExpansionKind::None; }
Therefore, we do need these. I understand what you are saying ; it will be much more difficult to modify the base class though. This statement was accurate: #197166 (comment)

OK, this is a convoluted set of circumstances. shouldCastAtomicStoreInIR is buggy, and missed the 1 x vector edge case (it ought to be checking isFPOrFPVectorTy). This patch exclusively fixes the handling of the 1x vector, which is not obvious from the title and description. I'd prefer to fix this by first fixing the buggy cast logic to handle the vector case, and then proceed with the migration to the DAG based promotion

Ok, sure. We can try specializing in it x86.

No, not in x86. In the base implementation (but this is still just pre-work to deleting it)

Right, work prior to the final state is redundant. will leave this as it is then until we get to the final state. thank you.

RKSimon · 2026-05-12T12:23:37Z

  ret void
 }

+define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) {


Suggested change

define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) {

define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) nounwind {

same for others to remove cfi noise

Should it be there for those that don't require it? i.e. those that do not have CFI noise currently.

I tend to blanket add it to most tests - but if that will cause large diffs, just add it to cases that need it

Unaligned atomic vector stores with size >1 are lowered to calls. Adding their tests separately here.

RKSimon

LGTM

`store atomic <1 x T>` is not valid. This change legalizes vector types of atomic store via scalarization in SelectionDAG so that it can, for example, translate from `v1i32` to `i32`.

When lowering atomic <1 x T> vector types with floats, selection can fail since this pattern is unsupported. To support this, floats can be casted to an integer type of the same size.

`store atomic <1 x T>` is not valid. This change legalizes vector types of atomic store via scalarization in SelectionDAG so that it can, for example, translate from `v1i32` to `i32`. This is the store-side counterpart to #148894. Stacked on top of #197372; and below #197166.

Vector types of 2 elements must be widened. This change does this for vector types of atomic store in SelectionDAG so that it can translate aligned vectors of >1 size. Store-side counterpart to #148897. Stacked on top of #197166; and below of #197619.

llvmorg-github-actions Bot added the backend:X86 label May 12, 2026

jofrn requested a review from RKSimon May 12, 2026 12:14

RKSimon requested a review from arsenm May 12, 2026 12:19

arsenm reviewed May 12, 2026

View reviewed changes

RKSimon reviewed May 12, 2026

View reviewed changes

[X86] Add atomic vector store tests for unaligned >1 sizes.

f19c931

Unaligned atomic vector stores with size >1 are lowered to calls. Adding their tests separately here.

jofrn force-pushed the users/jofrn/scalarize-vec-atomic-store branch from 3e365f3 to 3e49a56 Compare May 13, 2026 06:51

jofrn force-pushed the users/jofrn/x86-promote-atomic-store-fp branch from 686ed50 to 47bca23 Compare May 13, 2026 06:59

RKSimon approved these changes May 13, 2026

View reviewed changes

This was referenced May 14, 2026

[SelectionDAG] Widen <2 x T> vector types for atomic store #197618

Merged

[SelectionDAG] Scalarize <1 x T> vector types for atomic store #197165

Merged

jofrn added 2 commits May 19, 2026 06:23

[SelectionDAG] Scalarize <1 x T> vector types for atomic store

bbd3398

`store atomic <1 x T>` is not valid. This change legalizes vector types of atomic store via scalarization in SelectionDAG so that it can, for example, translate from `v1i32` to `i32`.

[X86] Manage atomic store of fp -> int promotion in DAG

21e67f6

When lowering atomic <1 x T> vector types with floats, selection can fail since this pattern is unsupported. To support this, floats can be casted to an integer type of the same size.

jofrn force-pushed the users/jofrn/scalarize-vec-atomic-store branch from 3e49a56 to bbd3398 Compare May 19, 2026 13:28

jofrn force-pushed the users/jofrn/x86-promote-atomic-store-fp branch from 47bca23 to 21e67f6 Compare May 19, 2026 13:28

Base automatically changed from users/jofrn/scalarize-vec-atomic-store to main May 19, 2026 21:12

Merge branch 'main' into users/jofrn/x86-promote-atomic-store-fp

6ce28f9

jofrn merged commit d9a9b3b into main May 19, 2026
6 of 9 checks passed

jofrn deleted the users/jofrn/x86-promote-atomic-store-fp branch May 19, 2026 21:21

adams381 mentioned this pull request Jun 4, 2026

[CIR] Lower constant block addresses for goto #201644

Open

	define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) {
	define void @store_atomic_vec1_bfloat(ptr %x, <1 x bfloat> %v) nounwind {

Conversation

jofrn commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmorg-github-actions Bot commented May 12, 2026

Uh oh!

github-actions Bot commented May 12, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jofrn May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jofrn May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jofrn May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RKSimon left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jofrn commented May 12, 2026 •

edited

Loading

jofrn May 12, 2026 •

edited

Loading

jofrn May 13, 2026 •

edited

Loading

jofrn May 13, 2026 •

edited

Loading