Fix flush-async races in Tsavorite LargeObject flush tests#1711
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR stabilizes Tsavorite LargeObject flush tests by removing a temporary [Explicit] skip and adjusting synchronization/assertions to account for intentional async and race behavior in read-only shifting and flush completion.
Changes:
- Re-enabled
LargeObjectMultiFlushedPagesby removing[Explicit]and adding a deterministic sync point that waits forFlushedUntilAddressto catch up to the already-establishedReadOnlyAddress. - Updated
LargeObjectLinearizeFlushedPagesto avoid asserting a deterministic outcome for a deliberately racingShiftReadOnlyAddresscall, while preserving post-WhenAllinvariants.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1. LargeObjectMultiFlushedPages (Tsavorite ObjectTests): Remove [Explicit] and add a no-op ShiftReadOnlyAddress(currentROA, wait: true) sync point so the FlushedUntilAddress / ReadOnlyAddress assertions see a quiesced state after the async flush pipeline catches up. MonotonicUpdate is a no-op when passed the current ROA; the built-in wait loop in ShiftReadOnlyAddressWithWait blocks until FUA >= ROA, which is exactly the observation the test wants. 2. LargeObjectLinearizeFlushedPages (Tsavorite ObjectTests): Drop the Is.True assertion on the second ShiftReadOnlyAddress in DoFlush. That call intentionally races with the main thread's ShiftReadOnlyToTail; if the latter wins, MonotonicUpdate correctly refuses to move ROA backwards and returns false. The post- Task.WhenAll invariants (ROA == FUA == TailAddress) still fully verify correctness, and the test continues to exercise the overlapping-shift linearization path. 3. ConfigSetHeapMemorySizeUtilizationTest (Garnet RespConfigTests): Rewrite test body. The old test inserted 4 tiny lists (i++ inside the loop meant only 4 unique keys) against a 3MB budget, so the LogSizeTracker never saw over-budget and PostMemoryTrim was never invoked; trimCompleteEvent.Wait timed out at 90s. The post-wait assertion also expected AllocatedPageCount < MinLogAllocatedPageCount (== 2), which is logically impossible. The fix inserts 128 list objects with meaningful heap (16 * 256-byte items each, ~512 KB total), then issues CONFIG SET memory 8192 to trigger the shrink branch of UpdateTargetSize. TotalSize > highTargetSize causes the resizer to compute an eviction range, shift addresses with waitForEviction, and invoke PostMemoryTrim. The test now waits at most ~30s for the callback and asserts that APC or heap actually decreased. Remove [Explicit]. No product code changes. Validated with many consecutive green runs of each affected test, plus a full pass of the LargeObject and RespConfig* suites. Co-authored-by: Copilot <[email protected]>
62172db to
0839224
Compare
badrishc
added a commit
that referenced
this pull request
Apr 20, 2026
Integrates 4 new dev commits: - #1716: Fix use-after-free in GetAllocationForRetry when page is evicted - #1712: Internalize heap-size tracking into Tsavorite (IRecordTriggers refactor) - #1711: Fix three broken/flaky Tsavorite and Garnet tests - #1710: Fix native linux device Key changes from #1712 affecting bf-tree: - IRecordDisposer replaced by IRecordTriggers with OnEvict/OnFlush/OnDispose/OnDiskRead - GarnetRecordDisposer replaced by GarnetRecordTriggers - OnEvict now takes EvictionSource parameter (MainLog vs ReadCache) - Heap-size tracking internalized into Tsavorite Resolution: merged with -X theirs, then restored bf-tree GarnetRecordTriggers implementation (OnEvict for BfTree disposal, OnFlush for snapshot, OnDiskRead for handle invalidation, OnDispose for delete cleanup, OnCheckpoint/OnRecovery for checkpoint lifecycle). Updated OnEvict signature to accept EvictionSource. Fixed unnecessary using in ReadMethods.cs. All tests pass: 55 RangeIndex, 345 RespTests, 34 MultiDatabase. Co-authored-by: Copilot <[email protected]>
badrishc
added a commit
that referenced
this pull request
Apr 20, 2026
Integrates 4 new dev commits: - #1716: Fix use-after-free in GetAllocationForRetry when page is evicted - #1712: Internalize heap-size tracking into Tsavorite (IRecordTriggers refactor) - #1711: Fix three broken/flaky Tsavorite and Garnet tests - #1710: Fix native linux device Key changes from #1712 affecting bf-tree: - IRecordDisposer replaced by IRecordTriggers with OnEvict/OnFlush/OnDispose/OnDiskRead - GarnetRecordDisposer replaced by GarnetRecordTriggers - OnEvict now takes EvictionSource parameter (MainLog vs ReadCache) - Heap-size tracking internalized into Tsavorite via logSizeTracker - OnDisposeValueObject removed; disposal routed through OnDispose Resolution: took dev's Tsavorite src/test wholesale, then restored bf-tree additions: CheckpointTrigger enum, OnRecovery/OnCheckpoint/OnRecoverySnapshotRead hooks in IRecordTriggers/IStoreFunctions/StoreFunctions, WrongType in UpsertAction, HybridLogCheckpointSMTask and Recovery call sites. Updated GarnetRecordTriggers OnEvict to accept EvictionSource parameter. Re-added BfTreeInterop project ref and WRONGTYPE guard in UpsertMethods. All tests pass: 55 RangeIndex (3 runs), 345 RespTests. Format clean. Co-authored-by: Copilot <[email protected]>
badrishc
added a commit
that referenced
this pull request
Apr 20, 2026
Integrates 4 dev commits (#1716, #1712, #1711, #1710). Key change: #1712 internalized heap-size tracking into Tsavorite and refactored IRecordDisposer → IRecordTriggers with OnEvict(EvictionSource). Resolution: - IRecordTriggers: take dev's version (EvictionSource enum, OnEvict with EvictionSource param, OnDisposeDiskRecord, removed OnDisposeValueObject), add bf-tree checkpoint hooks (OnRecovery, OnCheckpoint, OnRecoverySnapshotRead) - GarnetRecordTriggers: update OnEvict to accept EvictionSource; remove OnDisposeValueObject and heap-tracking from OnDispose (now internal to Tsavorite); keep bf-tree dispatch (OnFlush, OnEvict, OnDiskRead, OnDispose, OnCheckpoint, OnRecovery) - GarnetServer: keep cacheSizeTracker + rangeIndexManager constructor - RMWMethods PostCopyUpdater: keep ClearTreeHandle for RIPROMOTE - Fix unnecessary using in ReadMethods.cs All tests pass: 55 RangeIndex (3 runs), 345 RespTests, 4 CacheSizeTracker. Format clean (Garnet + Tsavorite). Co-authored-by: Copilot <[email protected]>
badrishc
added a commit
that referenced
this pull request
Apr 20, 2026
Integrates 4 dev commits (#1716, #1712, #1711, #1710). Key change: #1712 internalized heap-size tracking into Tsavorite and refactored IRecordDisposer → IRecordTriggers with OnEvict(EvictionSource). Resolution: - IRecordTriggers: take dev's version (EvictionSource enum, OnEvict with EvictionSource param, OnDisposeDiskRecord, removed OnDisposeValueObject), add bf-tree checkpoint hooks (OnRecovery, OnCheckpoint, OnRecoverySnapshotRead) - GarnetRecordTriggers: update OnEvict to accept EvictionSource; remove OnDisposeValueObject and heap-tracking from OnDispose (now internal to Tsavorite); keep bf-tree dispatch (OnFlush, OnEvict, OnDiskRead, OnDispose, OnCheckpoint, OnRecovery) - GarnetServer: keep cacheSizeTracker + rangeIndexManager constructor - RMWMethods PostCopyUpdater: keep ClearTreeHandle for RIPROMOTE - Fix unnecessary using in ReadMethods.cs All tests pass: 55 RangeIndex (3 runs), 345 RespTests, 4 CacheSizeTracker. Format clean (Garnet + Tsavorite). Co-authored-by: Copilot <[email protected]>
badrishc
added a commit
that referenced
this pull request
Apr 20, 2026
Integrates 4 dev commits (#1716, #1712, #1711, #1710). Key change: #1712 internalized heap-size tracking into Tsavorite and refactored IRecordDisposer → IRecordTriggers with OnEvict(EvictionSource). Resolution: - IRecordTriggers: take dev's version (EvictionSource enum, OnEvict with EvictionSource param, OnDisposeDiskRecord, removed OnDisposeValueObject), add bf-tree checkpoint hooks (OnRecovery, OnCheckpoint, OnRecoverySnapshotRead) - GarnetRecordTriggers: update OnEvict to accept EvictionSource; remove OnDisposeValueObject and heap-tracking from OnDispose (now internal to Tsavorite); keep bf-tree dispatch (OnFlush, OnEvict, OnDiskRead, OnDispose, OnCheckpoint, OnRecovery) - GarnetServer: keep cacheSizeTracker + rangeIndexManager constructor - RMWMethods PostCopyUpdater: keep ClearTreeHandle for RIPROMOTE - Fix unnecessary using in ReadMethods.cs All tests pass: 55 RangeIndex (3 runs), 345 RespTests, 4 CacheSizeTracker. Format clean (Garnet + Tsavorite). Co-authored-by: Copilot <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No product code changes. Validated with 10 consecutive green runs of both tests (4 parameterized cases each).