[OpenMP][OMPIRBuilder] Avoid querying SmallPtrSet during removal#198690
Conversation
22922a0 to
5db7a42
Compare
The openmp-cli-fuse02.mlir test can fail nondeterministically when fuseLoops leaves a dead block in the generated function. On AArch64 this was reproduced as omp_omp.loop.cond3 being emitted with no predecessors, which breaks CHECK-NEXT in the MLIR LLVM IR translation test. The issue is in removeUnusedBlocksFromParent. It used SmallPtrSet::remove_if while HasRemainingUses also queried the same set. SmallPtrSet iteration order depends on pointer addresses, which can vary with ASLR, allocation order, or object layout. Erasing one block while walking the set can therefore change the keep/erase decision for blocks visited later. The older make_early_inc_range form had the same underlying defect. Fix this by collecting all blocks that still have external uses before erasing any of them. This evaluates every block against the same snapshot of the erase set and removes the iteration-order dependency. Add a llvm unit test that fuses a loop range and verifies that no non-entry dead blocks remain in the function. Assisted with copilot.
|
@llvm/pr-subscribers-flang-openmp Author: Chi-Chun, Chen (chichunchen) ChangesThe openmp-cli-fuse02.mlir test can fail nondeterministically when The issue is in removeUnusedBlocksFromParent. It used Fix this by collecting all blocks that still have external uses before Add a llvm unit test that fuses a loop range and verifies that no Assisted with copilot. 2 Files Affected:
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 06026582538a2..200e6897fd6ab 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6573,8 +6573,17 @@ static void removeUnusedBlocksFromParent(ArrayRef<BasicBlock *> BBs) {
return false;
};
- while (BBsToErase.remove_if(HasRemainingUses)) {
- // Try again if anything was removed.
+ // Compute the keep set before mutating BBsToErase; otherwise the result can
+ // depend on SmallPtrSet iteration order.
+ bool Changed = true;
+ while (Changed) {
+ Changed = false;
+ SmallVector<BasicBlock *, 4> BBsToKeep;
+ for (BasicBlock *BB : BBsToErase)
+ if (HasRemainingUses(BB))
+ BBsToKeep.push_back(BB);
+ for (BasicBlock *BB : BBsToKeep)
+ Changed |= BBsToErase.erase(BB);
}
SmallVector<BasicBlock *, 7> BBVec(BBsToErase.begin(), BBsToErase.end());
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index acfa993855b2e..a7e572bc427a3 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -8461,4 +8461,55 @@ TEST_F(OpenMPIRBuilderTest, ScopeDirectiveNowait) {
}
}
+TEST_F(OpenMPIRBuilderTest, FuseLoopRangeNoDeadBlocks) {
+ using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.initialize();
+ F->setName("func");
+
+ IRBuilder<> Builder(BB);
+ Type *LCTy = F->getArg(0)->getType();
+ Value *TripCount = F->getArg(0);
+
+ auto LoopBodyGenCB = [&](InsertPointTy CodeGenIP, Value *LC) {
+ Builder.restoreIP(CodeGenIP);
+ createPrintfCall(Builder, "%d\n", {LC});
+ return Error::success();
+ };
+
+ OpenMPIRBuilder::LocationDescription Loc1({Builder.saveIP(), DL});
+ ASSERT_EXPECTED_INIT(
+ CanonicalLoopInfo *, Loop1,
+ OMPBuilder.createCanonicalLoop(Loc1, LoopBodyGenCB, TripCount, "loop1"));
+ Builder.restoreIP(Loop1->getAfterIP());
+
+ Value *TripCount2 = Builder.CreateAdd(TripCount, ConstantInt::get(LCTy, 1));
+ OpenMPIRBuilder::LocationDescription Loc2({Builder.saveIP(), DL});
+ ASSERT_EXPECTED_INIT(
+ CanonicalLoopInfo *, Loop2,
+ OMPBuilder.createCanonicalLoop(Loc2, LoopBodyGenCB, TripCount2, "loop2"));
+ Builder.restoreIP(Loop2->getAfterIP());
+
+ Value *TripCount3 = Builder.CreateAdd(TripCount, ConstantInt::get(LCTy, 2));
+ OpenMPIRBuilder::LocationDescription Loc3({Builder.saveIP(), DL});
+ ASSERT_EXPECTED_INIT(
+ CanonicalLoopInfo *, Loop3,
+ OMPBuilder.createCanonicalLoop(Loc3, LoopBodyGenCB, TripCount3, "loop3"));
+ Builder.restoreIP(Loop3->getAfterIP());
+ Builder.CreateRetVoid();
+
+ CanonicalLoopInfo *Fused = OMPBuilder.fuseLoops(DL, {Loop1, Loop2});
+
+ OMPBuilder.finalize();
+ ASSERT_FALSE(verifyModule(*M, &errs()));
+
+ Fused->assertOK();
+ for (BasicBlock &Block : *F) {
+ if (&Block == &F->getEntryBlock())
+ continue;
+ EXPECT_TRUE(Block.hasNPredecessorsOrMore(1))
+ << "Dead block found: " << Block.getName().str();
+ }
+}
+
} // namespace
|
There was a problem hiding this comment.
I just started hitting this, so thank you for having already posted a patch!
I reviewed your fix first, but then thought to look through some blame, and was confused to see no recent changes here. Then I took a look at SmallPtrSets blame and noticed that the contract (or at least the documentation of the contract) for remove_if changed in #197637 from:
It is safe to read the set inside
the predicate function. However, the predicate must not modify the set
itself, only indicate a removal by returning true.
to:
The predicate must not access the
set being modified: it may inspect the element passed to it and return
true to request removal, but must not read (e.g. count()/find()) or
otherwise mutate the set. If anything is removed, all iterators and
references into the set are invalidated.
Which seems like it is the real root cause? @MaskRay is this kind of fallout possible in other places in the codebase, or is remove_if generally not used in this way? Did the contract actually change, or was the old comment just wrong?
| // Compute the keep set before mutating BBsToErase; otherwise the result can | ||
| // depend on SmallPtrSet iteration order. | ||
| bool Changed = true; | ||
| while (Changed) { | ||
| Changed = false; | ||
| SmallVector<BasicBlock *, 4> BBsToKeep; | ||
| for (BasicBlock *BB : BBsToErase) | ||
| if (HasRemainingUses(BB)) | ||
| BBsToKeep.push_back(BB); | ||
| for (BasicBlock *BB : BBsToKeep) | ||
| Changed |= BBsToErase.erase(BB); | ||
| } |
There was a problem hiding this comment.
| // Compute the keep set before mutating BBsToErase; otherwise the result can | |
| // depend on SmallPtrSet iteration order. | |
| bool Changed = true; | |
| while (Changed) { | |
| Changed = false; | |
| SmallVector<BasicBlock *, 4> BBsToKeep; | |
| for (BasicBlock *BB : BBsToErase) | |
| if (HasRemainingUses(BB)) | |
| BBsToKeep.push_back(BB); | |
| for (BasicBlock *BB : BBsToKeep) | |
| Changed |= BBsToErase.erase(BB); | |
| } | |
| // Compute the keep set before mutating BBsToErase; otherwise the result can | |
| // depend on SmallPtrSet iteration order. | |
| bool Changed = false; | |
| do { | |
| SmallVector<BasicBlock *, 4> BBsToKeep; | |
| for (BasicBlock *BB : BBsToErase) | |
| if (HasRemainingUses(BB)) | |
| BBsToKeep.push_back(BB); | |
| for (BasicBlock *BB : BBsToKeep) | |
| Changed |= BBsToErase.erase(BB); | |
| } while (Changed); |
There was a problem hiding this comment.
You suggestion never resets Change to false, causing an infinity loop.
If we are going to nitpick in this, what I would prefer is
SmallVector<BasicBlock *> BBsToKeep;
while (true) {
bool Changed = false;
BBsToKeep.clear();
for (BasicBlock *BB : BBsToErase)
if (HasRemainingUses(BB))
BBsToKeep.push_back(BB);
for (BasicBlock *BB : BBsToKeep)
Changed |= BBsToErase.erase(BB);
if (!Changed)
break;
}such that Changed remains scoped in the loop body.
Declare vectors outside of the loop: https://llvm.org/docs/ProgrammersManual.html#vector
It is recommended to not specify a small size for vectors on the stack: https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h
There was a problem hiding this comment.
I agree, ignore my original comment here. I was working too late, and didn't check my own suggestion carefully, sorry about that!
|
Can you help me understand the problem? The patch is based on a commit before #197637 where SmallPtrSet::remove_if has nothing that would make the data structure invalid between calls of the predicate. I could not make your test fail without your change in OpenMPIRBuilder. |
Yes. I think this was an oversight, as the line is explicitly mentioned in one of the comments.. The explanation in the PR is wrong. The problem is that during the execution of SmallPtrSet::remove_if(), lookup in the set doesn't work correctly until the rehash at the end of remove_if. |
That makes sense, thank you! It seems like copilot may have helped find this (now) broken use of If the Edit: I'm just now reading your comment on the other change. I will comment there instead |
I'll update the PR descrition and title for the correct information.
5db7a42 to
c06bd06
Compare
|
Since the input is a small array of /// Determine which blocks in \p BBs are reachable from outside and remove the
/// ones that are not reachable from the function.
static void removeUnusedBlocksFromParent(ArrayRef<BasicBlock *> BBs) {
DenseMap<BasicBlock *, unsigned> BB2Index(BBs.size());
for (const auto [I, V] : enumerate(BBs))
BB2Index.insert({V, I});
// Optimistically assume all BBs are dead. We iteratively remove
// immediately-used blocks from this set until we reach a fixed point.
SmallBitVector BBsToErase(BBs.size(), true);
// Accumulates immediately-used blocks for one iteration.
SmallBitVector BBsToKeep(BBs.size());
do {
BBsToKeep.reset();
for (unsigned BBToEraseIndex : BBsToErase.set_bits()) {
for (Use &U : BBs[BBToEraseIndex]->uses()) {
auto *UseInst = dyn_cast<Instruction>(U.getUser());
if (!UseInst)
continue;
BasicBlock *UseBB = UseInst->getParent();
auto UseBBIndexI = BB2Index.find(UseBB);
// If this use is from an external block, or from a block in BBs we have
// already proved is transitively used by an external block, then
// this block is also used.
if (UseBBIndexI == BB2Index.end() ||
!BBsToErase.test(UseBBIndexI->second)) {
BBsToKeep.set(BBToEraseIndex);
break;
}
}
}
BBsToErase &= ~BBsToKeep;
} while (BBsToKeep.any());
SmallVector<BasicBlock *> FinalBBsToErase;
for (unsigned BBToEraseIndex : BBsToErase.set_bits())
FinalBBsToErase.push_back(BBs[BBToEraseIndex]);
DeleteDeadBlocks(FinalBBsToErase);
}
|
… slinder1) Track candidate blocks by stable BB indices instead of mutating pointer sets, avoiding SmallPtrSet tombstone/rehash/iteration-order issues while preserving original BB order for deletion.
There was a problem hiding this comment.
The current version LGTM, and landing something so the buildbots aren't randomly failing would be good ASAP
If there is more review we can do it after. Thanks again!
Edit: I had missed the previous revision when I responded with the SmallBitVector suggestion, but either version LGTM!
Meinersbur
left a comment
There was a problem hiding this comment.
This looks like AI-overcomplexity. BBsToKeep is not even needed because it's finding a fixpoint, just reset that bit in BBsToErase directly. In constrast to SmallPtrSet you can actually do that while iterating over it. Also, building the DenseMap over a small list of an estimated 7 elements max is probably overkill.
Here someting simpler by just inverting BBsToErase to BBsToKeep:
static void removeUnusedBlocksFromParent(ArrayRef<BasicBlock *> BBs) {
SmallPtrSet<BasicBlock *, 6> BBsToKeep(llvm::from_range, BBs);
auto HasRemainingUses = [&BBsToKeep](BasicBlock *BB) {
for (Use &U : BB->uses()) {
auto *UseInst = dyn_cast<Instruction>(U.getUser());
if (!UseInst)
continue;
if (!BBsToKeep.count(UseInst->getParent()))
continue;
return true;
}
return false;
};
while (true) {
bool Changed = false;
for (BasicBlock *BB : BBs) {
if (BBsToKeep.count(BB) || !HasRemainingUses(BB))
continue;
BBsToKeep.insert(BB);
Changed = true;
}
if (!Changed)
break;
}
SmallVector<BasicBlock *, 7> BBVec;
for (BasicBlock *BB : BBs) {
if (!BBsToKeep.count(BB))
BBVec.push_back(BB);
}
DeleteDeadBlocks(BBVec);
}| if (!UseInst) | ||
| continue; | ||
| BasicBlock *UseBB = UseInst->getParent(); | ||
| auto UseBBIndexI = BB2Index.find(UseBB); |
There was a problem hiding this comment.
Do you prefer DenseMap<BasicBlock *, unsigned>::iterator UseBBIndexI?
This isn't about "almost always" using auto, iterator types are the canonical example of why auto is helpful for readability.
No AI used in this case, I'm just overcomplicating things. I was biased towards the likely LLM generated original version, though |
|
I think these are just artifacts of you giving the gist of the approach quickly, but I don't think your code works as-is: I assume this should start empty? Assuming a block is in This is the only use of |
|
I think this works, and is what you were suggesting? Some other incidental changes I made are because the static void removeUnusedBlocksFromParent(ArrayRef<BasicBlock *> BBs) {
SmallPtrSet<BasicBlock *, 8> InternalBBs(from_range, BBs);
SmallPtrSet<BasicBlock *, 8> BBsToKeep;
while (true) {
bool Changed = false;
for (BasicBlock *BB : BBs) {
if (BBsToKeep.contains(BB))
continue;
for (Use &U : BB->uses()) {
auto *UseInst = dyn_cast<Instruction>(U.getUser());
if (!UseInst)
continue;
BasicBlock *UseBB = UseInst->getParent();
if (!InternalBBs.contains(UseBB) || BBsToKeep.contains(UseBB)) {
BBsToKeep.insert(BB);
Changed = true;
break;
}
}
}
if (!Changed)
break;
}
SmallVector<BasicBlock *> BBVec;
for (BasicBlock *BB : BBs)
if (!BBsToKeep.count(BB))
BBVec.push_back(BB);
DeleteDeadBlocks(BBVec);
} |
This is essentially post-commit review for #198690 which was landed quickly to fix nondeterminism in tests introduced in #197637 Change-Id: Ib3603ef3c70dde5bb22d0fc04d9249e62ecccf0c Co-authored-by: @Meinersbur Co-authored-by: @chichunchen
openmp-cli-fuse02.mlir can intermittently leave behind a dead block after loop fusion, causing LLVM IR verification to fail. (#197637 (comment))
This happened because
removeUnusedBlocksFromParentqueriedBBsToErasewhile usingSmallPtrSet::remove_ifon the same set, which made the result depend on the set's internal mutation order during removal.This patch tracks candidate blocks by index with
SmallBitVectorinstead of mutating and querying the sameSmallPtrSet. This also preserves the originalBBsorder when collecting the final dead blocks.The original PR was created with assistance from Copilot. The later analysis and fix were revised based on reviewer feedback.
Co-authored-by: @slinder1