[AtomicExpand] Support non-integer atomic loads.#199310
Conversation
This is arguably an enhancement rather than a bugfix. But AtomicExpandPass already tries to support some non-integer atomic ops using cmpxchg by bitcasting to/from an integer type. We're just missing this one path used by atomic load. Seems easy enough to support it. This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
|
@llvm/pr-subscribers-llvm-transforms Author: Justin Lebar (jlebar) ChangesThis is arguably an enhancement rather than a bugfix. But This bug was found by a large run of Opus 4.7 looking for bugs in LLVM. 2 Files Affected:
diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 7327290f62970..b06f3534b53fc 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -673,12 +673,22 @@ bool AtomicExpandImpl::expandAtomicLoadToCmpXchg(LoadInst *LI) {
Value *Addr = LI->getPointerOperand();
Type *Ty = LI->getType();
- Constant *DummyVal = Constant::getNullValue(Ty);
+
+ // cmpxchg supports only integer and pointer operands. If the load type is
+ // FP or vector, run the cmpxchg on the same-sized integer and bitcast the
+ // result back; mirrors createCmpXchgInstFun.
+ bool NeedBitcast = Ty->isFloatingPointTy() || Ty->isVectorTy();
+ Type *CmpXchgTy = Ty;
+ if (NeedBitcast)
+ CmpXchgTy = Builder.getIntNTy(Ty->getPrimitiveSizeInBits());
+ Constant *DummyVal = Constant::getNullValue(CmpXchgTy);
Value *Pair = Builder.CreateAtomicCmpXchg(
Addr, DummyVal, DummyVal, LI->getAlign(), Order,
AtomicCmpXchgInst::getStrongestFailureOrdering(Order));
Value *Loaded = Builder.CreateExtractValue(Pair, 0, "loaded");
+ if (NeedBitcast)
+ Loaded = Builder.CreateBitCast(Loaded, Ty);
LI->replaceAllUsesWith(Loaded);
LI->eraseFromParent();
diff --git a/llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll b/llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
index 9f973ac5531d1..9d7f8503651b5 100644
--- a/llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
+++ b/llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
@@ -1,10 +1,11 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
; RUN: opt -S %s -passes='require<libcall-lowering-info>,atomic-expand' -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK64
; RUN: opt -S %s -passes='require<libcall-lowering-info>,atomic-expand' -mtriple=i686-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK32
+; RUN: opt -S %s -passes='require<libcall-lowering-info>,atomic-expand' -mtriple=x86_64-linux-gnu -mattr=+cx16,-avx | FileCheck %s --check-prefixes=CX16
-; This file tests the functions `llvm::convertAtomicLoadToIntegerType` and
-; `llvm::convertAtomicStoreToIntegerType`. If X86 stops using this
-; functionality, please move this test to a target which still is.
+; This file tests AtomicExpand's non-integer atomic type conversions and
+; X86-specific expansion choices. If X86 stops using this functionality, please
+; move this test to a target which still is.
define float @float_load_expand(ptr %ptr) {
; CHECK-LABEL: define float @float_load_expand(
@@ -354,3 +355,42 @@ define <4 x float> @atomic_vec4_float(ptr %x) nounwind {
%ret = load atomic <4 x float>, ptr %x acquire, align 16
ret <4 x float> %ret
}
+
+; Vector atomic loads should expand to a same-sized integer cmpxchg followed by
+; a bitcast back to the original type when X86 chooses the cmpxchg expansion.
+
+define <2 x i64> @load_v2i64_cmpxchg(ptr %p) {
+; CX16-LABEL: define <2 x i64> @load_v2i64_cmpxchg(
+; CX16-SAME: ptr [[P:%.*]]) #[[ATTR0:[0-9]+]] {
+; CX16-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[P]], i128 0, i128 0 seq_cst seq_cst, align 16
+; CX16-NEXT: [[LOADED:%.*]] = extractvalue { i128, i1 } [[TMP1]], 0
+; CX16-NEXT: [[TMP2:%.*]] = bitcast i128 [[LOADED]] to <2 x i64>
+; CX16-NEXT: ret <2 x i64> [[TMP2]]
+;
+ %r = load atomic <2 x i64>, ptr %p seq_cst, align 16
+ ret <2 x i64> %r
+}
+
+define <4 x i32> @load_v4i32_cmpxchg(ptr %p) {
+; CX16-LABEL: define <4 x i32> @load_v4i32_cmpxchg(
+; CX16-SAME: ptr [[P:%.*]]) #[[ATTR0]] {
+; CX16-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[P]], i128 0, i128 0 seq_cst seq_cst, align 16
+; CX16-NEXT: [[LOADED:%.*]] = extractvalue { i128, i1 } [[TMP1]], 0
+; CX16-NEXT: [[TMP2:%.*]] = bitcast i128 [[LOADED]] to <4 x i32>
+; CX16-NEXT: ret <4 x i32> [[TMP2]]
+;
+ %r = load atomic <4 x i32>, ptr %p seq_cst, align 16
+ ret <4 x i32> %r
+}
+
+define <16 x i8> @load_v16i8_cmpxchg(ptr %p) {
+; CX16-LABEL: define <16 x i8> @load_v16i8_cmpxchg(
+; CX16-SAME: ptr [[P:%.*]]) #[[ATTR0]] {
+; CX16-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[P]], i128 0, i128 0 seq_cst seq_cst, align 16
+; CX16-NEXT: [[LOADED:%.*]] = extractvalue { i128, i1 } [[TMP1]], 0
+; CX16-NEXT: [[TMP2:%.*]] = bitcast i128 [[LOADED]] to <16 x i8>
+; CX16-NEXT: ret <16 x i8> [[TMP2]]
+;
+ %r = load atomic <16 x i8>, ptr %p seq_cst, align 16
+ ret <16 x i8> %r
+}
|
|
Hiya, yes! this looks good. It is also part of this commit to remove (The below --fn thing is a typo. Please disregard it.) |
Oh, awesome. It looks like you're on top of it, I'll close this PR in favor of yours! |
|
We can submit this one. I'm going to rebase mine on top of yours. |
|
OK, merged! |
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads to integer at the IR level. Instead, handle this in a pre-legalize DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to integer atomic loads plus a bitcast. This depends on #199310 which adds the necessary cmpxchg support for non-integer atomic loads in AtomicExpand.
This is arguably an enhancement rather than a bugfix. But
AtomicExpandPass already tries to support some non-integer atomic ops
using cmpxchg by bitcasting to/from an integer type. We're just missing
this one path used by atomic load. Seems easy enough to support it.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.