[SmallPtrSet] Don't leave tombstones in small mode#96762
Conversation
When erasing elements in small mode, we currently leave behind tombstones. This means that insertion into the SmallPtrSet also has to check for these, making the operation more expensive. We don't really need the tombstones in small mode, because we can just replace with the last element in the set instead. This changes the order, but SmallPtrSet order is fundamentally unstable anyway. However, not leaving tombstones means that the erase() operation now invalidates iterators. This means that consumers that want to remove elements while iterating over the set have to use remove_if() instead. If they fail to do so, there will be an assertion failure thanks to debug epochs, so any such cases are easy to detect (and I have already fixed any inside llvm at least).
|
@llvm/pr-subscribers-llvm-adt Author: Nikita Popov (nikic) ChangesWhen erasing elements in small mode, we currently leave behind tombstones. This means that insertion into the SmallPtrSet also has to check for these, making the operation more expensive than it really should be. We don't really need the tombstones in small mode, because we can just replace with the last element in the set instead. This changes the order, but SmallPtrSet order is fundamentally unstable anyway. However, not leaving tombstones means that the erase() operation now invalidates iterators. This means that consumers that want to remove elements while iterating over the set have to use remove_if() instead. If they fail to do so, there will be an assertion failure thanks to debug epochs, so any such cases are easy to detect (and I have already fixed all cases inside llvm at least). This gives a compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=6c4c44b50ba3a08106b37fd5a739c387fef0b961&to=49f56eb34c4f9a28230d8534be6f1eabeaae7160&stat=instructions%3Au 2 Files Affected:
diff --git a/llvm/include/llvm/ADT/SmallPtrSet.h b/llvm/include/llvm/ADT/SmallPtrSet.h
index 7bf3d825fd03c..c9c7a10fd018b 100644
--- a/llvm/include/llvm/ADT/SmallPtrSet.h
+++ b/llvm/include/llvm/ADT/SmallPtrSet.h
@@ -127,22 +127,11 @@ class SmallPtrSetImplBase : public DebugEpochBase {
std::pair<const void *const *, bool> insert_imp(const void *Ptr) {
if (isSmall()) {
// Check to see if it is already in the set.
- const void **LastTombstone = nullptr;
for (const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
APtr != E; ++APtr) {
const void *Value = *APtr;
if (Value == Ptr)
return std::make_pair(APtr, false);
- if (Value == getTombstoneMarker())
- LastTombstone = APtr;
- }
-
- // Did we find any tombstone marker?
- if (LastTombstone != nullptr) {
- *LastTombstone = Ptr;
- --NumTombstones;
- incrementEpoch();
- return std::make_pair(LastTombstone, true);
}
// Nope, there isn't. If we stay small, just 'pushback' now.
@@ -161,14 +150,27 @@ class SmallPtrSetImplBase : public DebugEpochBase {
/// that the derived class can check that the right type of pointer is passed
/// in.
bool erase_imp(const void * Ptr) {
- const void *const *P = find_imp(Ptr);
- if (P == EndPointer())
+ if (isSmall()) {
+ for (const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
+ APtr != E; ++APtr) {
+ if (*APtr == Ptr) {
+ *APtr = SmallArray[--NumNonEmpty];
+ incrementEpoch();
+ return true;
+ }
+ }
return false;
+ }
- const void **Loc = const_cast<const void **>(P);
- assert(*Loc == Ptr && "broken find!");
- *Loc = getTombstoneMarker();
+ auto *Bucket = FindBucketFor(Ptr);
+ if (*Bucket != Ptr)
+ return false;
+
+ *const_cast<const void **>(Bucket) = getTombstoneMarker();
NumTombstones++;
+ // Treat this consistently from an API perspective, even if we don't
+ // actually invalidate iterators here.
+ incrementEpoch();
return true;
}
@@ -192,9 +194,9 @@ class SmallPtrSetImplBase : public DebugEpochBase {
return EndPointer();
}
-private:
bool isSmall() const { return CurArray == SmallArray; }
+private:
std::pair<const void *const *, bool> insert_imp_big(const void *Ptr);
const void * const *FindBucketFor(const void *Ptr) const;
@@ -352,7 +354,7 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
}
/// erase - If the set contains the specified pointer, remove it and return
- /// true, otherwise return false.
+ /// true, otherwise return false. Invalidates iterators if it returns true.
bool erase(PtrType Ptr) {
return erase_imp(PtrTraits::getAsVoidPointer(Ptr));
}
@@ -362,6 +364,22 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
template <typename UnaryPredicate>
bool remove_if(UnaryPredicate P) {
bool Removed = false;
+ if (isSmall()) {
+ const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
+ while (APtr != E) {
+ PtrType Ptr = PtrTraits::getFromVoidPointer(const_cast<void *>(*APtr));
+ if (P(Ptr)) {
+ *APtr = *--E;
+ --NumNonEmpty;
+ incrementEpoch();
+ Removed = true;
+ } else {
+ ++APtr;
+ }
+ }
+ return Removed;
+ }
+
for (const void **APtr = CurArray, **E = EndPointer(); APtr != E; ++APtr) {
const void *Value = *APtr;
if (Value == getTombstoneMarker() || Value == getEmptyMarker())
@@ -370,6 +388,7 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
if (P(Ptr)) {
*APtr = getTombstoneMarker();
++NumTombstones;
+ incrementEpoch();
Removed = true;
}
}
diff --git a/llvm/unittests/ADT/SmallPtrSetTest.cpp b/llvm/unittests/ADT/SmallPtrSetTest.cpp
index 1ed9133c2b551..b45318d076a3d 100644
--- a/llvm/unittests/ADT/SmallPtrSetTest.cpp
+++ b/llvm/unittests/ADT/SmallPtrSetTest.cpp
@@ -243,45 +243,6 @@ TEST(SmallPtrSetTest, SwapTest) {
EXPECT_TRUE(a.count(&buf[3]));
}
-void checkEraseAndIterators(SmallPtrSetImpl<int*> &S) {
- int buf[3];
-
- S.insert(&buf[0]);
- S.insert(&buf[1]);
- S.insert(&buf[2]);
-
- // Iterators must still be valid after erase() calls;
- auto B = S.begin();
- auto M = std::next(B);
- auto E = S.end();
- EXPECT_TRUE(*B == &buf[0] || *B == &buf[1] || *B == &buf[2]);
- EXPECT_TRUE(*M == &buf[0] || *M == &buf[1] || *M == &buf[2]);
- EXPECT_TRUE(*B != *M);
- int *Removable = *std::next(M);
- // No iterator points to Removable now.
- EXPECT_TRUE(Removable == &buf[0] || Removable == &buf[1] ||
- Removable == &buf[2]);
- EXPECT_TRUE(Removable != *B && Removable != *M);
-
- S.erase(Removable);
-
- // B,M,E iterators should still be valid
- EXPECT_EQ(B, S.begin());
- EXPECT_EQ(M, std::next(B));
- EXPECT_EQ(E, S.end());
- EXPECT_EQ(std::next(M), E);
-}
-
-TEST(SmallPtrSetTest, EraseTest) {
- // Test when set stays small.
- SmallPtrSet<int *, 8> B;
- checkEraseAndIterators(B);
-
- // Test when set grows big.
- SmallPtrSet<int *, 2> A;
- checkEraseAndIterators(A);
-}
-
// Verify that dereferencing and iteration work.
TEST(SmallPtrSetTest, dereferenceAndIterate) {
int Ints[] = {0, 1, 2, 3, 4, 5, 6, 7};
|
kuhar
left a comment
There was a problem hiding this comment.
Nice improvement! Left some comments regarding the documentation; we should also update the programmers manual: https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallptrset-h
| /// erase - If the set contains the specified pointer, remove it and return | ||
| /// true, otherwise return false. | ||
| /// true, otherwise return false. Invalidates iterators if it returns true. |
There was a problem hiding this comment.
nit: In the first line, the documentation doesn't really explain what this function does. Also, I'd remove the erase - prefix -- not sure where this came from.
| template <typename UnaryPredicate> | ||
| bool remove_if(UnaryPredicate P) { |
There was a problem hiding this comment.
This should probably document what happens to iterators to explain what the difference is compared to (a sequence of) erase.
When erasing elements in small mode, we currently leave behind tombstones. This means that insertion into the SmallPtrSet also has to check for these, making the operation more expensive than it really should be. We don't really need the tombstones in small mode, because we can just replace with the last element in the set instead. This changes the order, but SmallPtrSet order is fundamentally unstable anyway. However, not leaving tombstones means that the erase() operation now invalidates iterators. This means that consumers that want to remove elements while iterating over the set have to use remove_if() instead. If they fail to do so, there will be an assertion failure thanks to debug epochs, so any such cases are easy to detect (and I have already fixed all cases inside llvm at least).
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in llvm#96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on llvm#198982 and llvm#199365.
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`, it invalidates iterators and references obtained before the call. Under the current tombstone-based deletion this is purely an LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually mutated for other entries — but it surfaces stale-iterator-after-erase patterns now rather than when DenseMap's deletion scheme changes. Mirrors the SmallPtrSet change in #96762, which dropped tombstones in small mode and likewise had `erase` invalidate iterators. Depends on #198982 and #199365
When erasing elements in small mode, we currently leave behind tombstones. This means that insertion into the SmallPtrSet also has to check for these, making the operation more expensive than it really should be.
We don't really need the tombstones in small mode, because we can just replace with the last element in the set instead. This changes the order, but SmallPtrSet order is fundamentally unstable anyway.
However, not leaving tombstones means that the erase() operation now invalidates iterators. This means that consumers that want to remove elements while iterating over the set have to use remove_if() instead. If they fail to do so, there will be an assertion failure thanks to debug epochs, so any such cases are easy to detect (and I have already fixed all cases inside llvm at least).
This gives a compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=6c4c44b50ba3a08106b37fd5a739c387fef0b961&to=49f56eb34c4f9a28230d8534be6f1eabeaae7160&stat=instructions%3Au