Skip to content

Pause VectorManager cleanup before cluster/other resets of log#1775

Merged
badrishc merged 2 commits into
devfrom
badrishc/reset-pause
May 7, 2026
Merged

Pause VectorManager cleanup before cluster/other resets of log#1775
badrishc merged 2 commits into
devfrom
badrishc/reset-pause

Conversation

@badrishc

@badrishc badrishc commented May 6, 2026

Copy link
Copy Markdown
Collaborator
  1. Cover reset with a boolean Initializing check that is epoch-protected, so that concurrent iterators end gracefully.
  2. Add ability to pause vector cleanup, used before we reset as a second line of defense

Copilot AI review requested due to automatic review settings May 6, 2026 20:37

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Tsavorite allocator reset behavior against concurrent scan iterators, and adds an explicit pause/resume mechanism to prevent VectorManager’s background cleanup scan from racing with store resets during cluster replica re-attach.

Changes:

  • Add an AllocatorBase.Initializing gate around Reset() + Initialize(), and update scan iterators to acquire epoch earlier and end iteration cleanly if reset/initialization is in progress.
  • Introduce VectorManager.PauseCleanupAsync() / ResumeCleanup() and use them in diskless/diskbased replica attach flows to quiesce the background cleanup scan before storeWrapper.Reset().
  • Remove allocator-specific Reset() overrides now that AllocatorBase.Reset() performs the full reset + initialize sequence.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorBase.cs Adds Initializing flag and restructures reset into a gated Reset() plus ResetCore() cascade.
libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteScanIterator.cs Acquires epoch before sampling allocator state; exits early if Initializing is true; rechecks stop conditions after snapping to BeginAddress.
libs/storage/Tsavorite/cs/src/core/Allocator/ObjectScanIterator.cs Same iterator safety changes as SpanByteScanIterator.
libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteAllocatorImpl.cs Removes redundant Reset() override (base handles reset+initialize).
libs/storage/Tsavorite/cs/src/core/Allocator/ObjectAllocatorImpl.cs Removes redundant Reset() override (base handles reset+initialize).
libs/storage/Tsavorite/cs/src/core/Allocator/TsavoriteLogAllocatorImpl.cs Removes redundant Reset() override (base handles reset+initialize).
libs/server/Resp/Vector/VectorManager.Cleanup.cs Adds cleanupGate and exposes pause/resume APIs; cleanup loop takes the gate per iteration.
libs/server/Resp/Vector/VectorManager.cs Disposes cleanupGate after draining the cleanup task.
libs/cluster/Server/Replication/ReplicaOps/ReplicaDisklessSync.cs Pauses VectorManager cleanup before calling storeWrapper.Reset() in replica attach path.
libs/cluster/Server/Replication/ReplicaOps/ReplicaDiskbasedSync.cs Pauses VectorManager cleanup before calling storeWrapper.Reset() in replica attach path.

Comment thread libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorBase.cs Outdated
Comment thread libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorBase.cs
Comment thread libs/server/Resp/Vector/VectorManager.cs
@badrishc badrishc force-pushed the badrishc/reset-pause branch 4 times, most recently from 9c842f1 to 7f14504 Compare May 7, 2026 02:30
Comment thread test/Garnet.test/VectorCleanupVsResetRaceTests.cs Outdated
@badrishc badrishc force-pushed the badrishc/reset-pause branch from 7f14504 to c78e394 Compare May 7, 2026 18:02
…nitialize() epoch-gate

Fixes a rare CI fatal AccessViolationException at LogRecord.get_Info /
ObjectScanIterator.GetPhysicalAddressAndAllocatedSize during
VectorManager.RunCleanupTaskAsync. Same crash signature as PR #1765
but a different remaining gap.

## Root cause

PR #1765 made AllocatorBase.Reset()'s page-free section epoch-safe
(2-phase BumpCurrentEpoch barrier around OnPagesClosed +
FreeAllAllocatedPages). It did NOT cover the per-allocator Initialize()
that runs after base.Reset() returns. Initialize() non-monotonically
rewinds HeadAddress / BeginAddress / TailPageOffset back to
FirstValidAddress while pagePointers[i] are mostly 0 (only pages 0 and 1
are reallocated). Race window between HeadAddress reset and
TailPageOffset reset:

  - Iterator reads stopAddress = TailAddress = old high (stale)
  - Iterator passes currentAddress < stopAddress check, proceeds
  - Iterator reads headAddress = HeadAddress = new low
  - Iterator chooses in-memory path (currentAddress >= headAddress)
  - Iterator dereferences pagePointers[X] + offset = 0 + offset → AVE

The cluster re-attach paths (ReplicaDisklessSync, ReplicaDiskbasedSync)
call SuspendPrimaryOnlyTasksAsync() AFTER Reset, and even then it only
covers TaskManager-registered tasks. VectorManager.cleanupTask is
started directly in the constructor (VectorManager.cs:110) as a plain
async Task and is invisible to TaskManager.

## Fix

Two complementary fixes:

### A. Tsavorite-side defense-in-depth

- AllocatorBase: added `internal volatile bool Initializing`.
  Refactored Reset() to be non-virtual, wrapping the existing 2-phase
  epoch logic (renamed to ResetCore()) PLUS the per-allocator
  Initialize() call inside Initializing = true / false (try/finally so
  a throw can't leave iterators stuck).
- Per-allocator Reset() overrides removed in ObjectAllocatorImpl,
  SpanByteAllocatorImpl, TsavoriteLogAllocatorImpl — base now does
  ResetCore() + Initialize() directly under the gate.
- ObjectScanIterator + SpanByteScanIterator (InitializeGetNextAndAcquireEpoch):
  - Move epoch.Resume() BEFORE sampling TailAddress (was sampled outside
    epoch protection — preexisting issue).
  - Spin-wait `while (hlogBase.Initializing) { Thread.Yield(); epoch.ProtectAndDrain(); }`
    with `!assumeInMemory` bypass (avoids deadlock with internal
    MemoryPageScan inside OnPagesClosed which constructs assumeInMemory
    iterators on the same thread that is running the Initializing-gated
    deferred action).
- LoadPageIfNeeded: recheck `currentAddress >= stopAddress` after
  snapping forward to BeginAddress (stale snap could place
  currentAddress at TailAddress with an unparseable record).

### B. Caller-side explicit drain (restores Reset's documented contract)

- VectorManager.Cleanup.cs: added PauseCleanupAsync() / ResumeCleanup()
  API. Cleanup loop uses increment-then-check pattern (bump
  cleanupInProgress first, re-read pauseCleanupRequests, back out if
  non-zero) so a pause caller cannot race past an iteration about to
  start.
- ReplicaDisklessSync.cs + ReplicaDiskbasedSync.cs: wrap
  storeWrapper.Reset() in `await vectorManager.PauseCleanupAsync()` /
  `vectorManager.ResumeCleanup()` try/finally.

A is defense-in-depth (any future background reader is safe); B
restores the documented contract for the known case and skips the
per-iteration spin-wait when callers cooperate.

## Validation

- Build: 0 warnings, 0 errors. dotnet format clean.
- Tsavorite tests: 1453/1453 pass.
- Garnet.test Reset/Recovery/Iterate filter: 30/30 pass.
- ClusterMigrate + ClusterVectorSet (52 tests): all pass.
- MigrateVectorSetWhileModifyingAsync (the originally failing cluster
  test): stable across 8 sequential runs (~18s each) with --blame-crash.
- VectorCleanupVsResetRaceTests (PR #1765's regression): pass [Repeat(5)].
- ClusterResetDuringReplicationTests: 2/2 pass.

Co-authored-by: Copilot <[email protected]>
@badrishc badrishc force-pushed the badrishc/reset-pause branch from c78e394 to 8e819a8 Compare May 7, 2026 19:33
@badrishc badrishc merged commit 60fe54d into dev May 7, 2026
2 checks passed
@badrishc badrishc deleted the badrishc/reset-pause branch May 7, 2026 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants