Skip to content

Simplified latch-free read cache design with eviction improvements#1860

Merged
TedHartMS merged 20 commits into
mainfrom
badrishc/read-cache-cas-entry
Jun 11, 2026
Merged

Simplified latch-free read cache design with eviction improvements#1860
TedHartMS merged 20 commits into
mainfrom
badrishc/read-cache-cas-entry

Conversation

@badrishc

@badrishc badrishc commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

@badrishc badrishc force-pushed the badrishc/read-cache-cas-entry branch from 6ec41a4 to eebb4c6 Compare June 5, 2026 23:59
@badrishc badrishc marked this pull request as ready for review June 6, 2026 00:00
Copilot AI review requested due to automatic review settings June 6, 2026 00:00

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Tsavorite’s read-cache implementation to a latch-free design where value-changing operations (Upsert/RMW/Delete/CopyToTail) detach (drop) any read-cache prefix via a single CAS on the hash-table entry, instead of splicing at the read-cache/main-log boundary and invalidating read-cache records in place. It also adds/updates documentation and tests to reflect the new chain semantics and to guard against historical heap-tracker leaks.

Changes:

  • Remove boundary-splice + in-place invalidation mechanics; make structural updates CAS only the hash entry to detach the read-cache prefix.
  • Simplify TryCopyToReadCache to a single head-insert CAS, relying on the new single-linearization-point model.
  • Update/add tests (including heap-size tracker parity) and add a detailed read-cache design doc; wire the doc into the website sidebar.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
website/sidebars.js Adds the new read-cache doc page to the Tsavorite developer docs sidebar.
website/docs/dev/tsavorite/readcache.md New detailed documentation of the latch-free read-cache chain structure, linearization rules, eviction behavior, and rationale.
libs/storage/Tsavorite/cs/test/test.session/ReadCacheChainTests.cs Updates chain tests to assert detach-on-update semantics (keys no longer remain in the live read-cache chain after value-changing updates).
libs/storage/Tsavorite/cs/test/test.recordops/RecordLifecycleTests.cs Adds a regression test ensuring read-cache heap-size tracking returns to zero after promote + update + eviction.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/TryCopyToTail.cs Updates comments to reflect detach-on-insert behavior when copying to tail.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/TryCopyToReadCache.cs Replaces splice-race checking with a single head-CAS promotion model; CAS-failure remains best-effort.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/RecordSource.cs Removes lowest-read-cache splice-point tracking fields; updates documentation and debugging output accordingly.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/ReadCache.cs Updates read-cache traversal to no longer maintain splice-point fields; removes splice and tail-check helpers that are no longer used.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/InternalUpsert.cs Removes post-copy invalidation flow; updates comments to reflect hash-entry CAS detach behavior.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/InternalRMW.cs Same as Upsert: rely on hash-entry CAS detach; remove old post-copy invalidation flow.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/InternalDelete.cs Same as Upsert/RMW: rely on hash-entry CAS detach; remove old post-copy invalidation flow.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/Helpers.cs Makes CAS-to-hash-entry the only structural insertion path; removes PostCopyToTail helper that invalidated read-cache records.
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Implementation/ContinuePending.cs Updates a comment reflecting removal of LowestReadCache* preservation.

@badrishc

badrishc commented Jun 6, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed both review comments in 51e0b3f (stale lowestReadCache* wording in ReadCache.cs; garbled comment in ContinuePending.cs).

That commit also fixes a read-cache heap-tracker mis-accounting surfaced while auditing the size trackers: an RMW CopyUpdate whose source is a read-cache object record cleared/disposed the source value but decremented the main-log tracker instead of the read-cache tracker (which owns that value heap from promotion). This leaked the read-cache tracker and under-counted the main-log tracker. Now decrements recSrc.AllocatorBase.logSizeTracker (the allocator that owns the source). Pre-existing on main; in scope for this PR's read-cache size-tracker audit. The parity test ReadCacheHeapSizeTrackerReturnsToZeroAfterUpdateAndEvict is now parameterized over {Upsert, RMW} and is proven to catch it (reverting the fix fails the RMW case with exactly 200 bytes leaked). Full test.recordops (146) / test.session (155) / test.stress (48) green.

@badrishc badrishc force-pushed the badrishc/read-cache-cas-entry branch from 51e0b3f to ec7eb6c Compare June 6, 2026 04:11
@badrishc

badrishc commented Jun 6, 2026

Copy link
Copy Markdown
Collaborator Author

Did a robustness/performance/correctness/quality review pass over the PR (commit 955002c):

Correctness — verified the latch-free detach, the removed splice-race guards, and the LowestReadCache* localization are faithful and sound. Confirmed elision is gated on HasMainLogSrc in Upsert/RMW/Delete, so a read-cache source is never elided — the RMW CopyUpdate in-place value-clear (fixed earlier) is the only in-place mutation of a read-cache source, so the heap-accounting fix is complete. No stale references remain to the removed private members (the IRecordTriggers.PostCopyToTail trigger is a separate, still-used public API).

Performance — net positive: CASRecordIntoChain drops a hot-path branch (single hash-entry TryCAS), RecordSource is 16 bytes smaller (two long fields removed), and the read-cache walk no longer writes two struct fields per iteration.

Quality cleanup — removed a now write-only test field (readCacheBelowMidChainKeyEvictionAddress, whose only readers were the removed ReadCacheEvict calls; it escaped CS0414 because it was assigned a non-constant) and documented the RMW CopyUpdate read-cache heap-accounting nuance in readcache.md.

Tsavorite test.session (155) and ReadCache chain tests (124) green; dotnet format clean.

@badrishc badrishc force-pushed the badrishc/read-cache-cas-entry branch 3 times, most recently from 8a4803b to 92231f8 Compare June 10, 2026 15:28
badrishc and others added 18 commits June 10, 2026 17:57
Make the only structural CAS for record insertion target the hash-table
entry. CASRecordIntoChain now always head-CAS'es the new record (whose
PreviousAddress is the first main-log address) into the hash entry,
atomically committing it and detaching any read-cache prefix from the
reachable chain. The read-cache/main-log boundary interior-CAS splice
(SpliceIntoHashChainAtReadCacheBoundary) is removed.

This keeps the hash-entry head monotonic (it only advances to the new
record, never back to a superseded read-cache address), which makes the
insert correct without relying on the bucket latch. Dropped read-cache
records are orphaned and reclaimed by ReadCacheEvict when their page is
closed.

Phase 1 baseline: bystander read-cache entries are dropped on update
(restored as fresh re-insertions in a later phase). Eviction is
unchanged (it only removes records, so it is already monotonic-safe).

Co-authored-by: Copilot <[email protected]>
Rewrite ReadCacheEvictChain to be monotonic-head safe and free of
interior-record CAS. When a key's chain links any readcache record in the
eviction range, detach the entire readcache prefix for the bucket with a
single CAS on the hash entry (pointing it at the first main-log address),
rather than surgically unlinking evicted records via interior
PreviousAddress CAS.

Because eviction now only ever advances the head toward the main log and
never restores a superseded readcache address, it cannot mask a concurrent
insert (the lost-update A-B-A that an existing-survivor reattach would
cause), and it is correct without a bucket latch. Surviving readcache
records above the eviction range are dropped too and re-promoted on the
next read.

A key consequence of the latch-free design: no operation mutates an
existing readcache record's PreviousAddress after creation (inserts,
promotions, and eviction only CAS the hash entry), so interior chain
pointers are immutable and the eviction walk only reads them.

All 48 concurrent readcache stress tests pass (1-8 read/write threads,
forced hash collisions, with eviction).

Co-authored-by: Copilot <[email protected]>
Reverts commit 7ee6fae (hash-entry-CAS-only "drop" eviction).

Eviction is *value-preserving* (it removes redundant read-cache copies of
data that still lives in the main log), so its interior CAS on a survivor's
PreviousAddress is safe even lock-free: a reader that observes the old or new
pointer reaches the same value, and the existing kTempInvalidAddress +
RestartChain protocol (not the bucket latch) coordinates with concurrent
walkers. main's ReadCacheEvictChain therefore already works in a latch-free
environment and, unlike the drop rewrite, retains survivors above the
eviction range (better hit rate).

This restores the asymmetric, value-semantics-driven design:
  - Update (value-CHANGING): publish the new value with a single hash-entry
    CAS (detach), since an interior splice would let a lock-free reader miss
    it; bystander retention is via fresh re-insertion (Phase 3).
  - Eviction (value-PRESERVING): main's interior-CAS keep-survivors, head
    untouched (no A-B-A), so it composes safely with the detach update.

Verified: all 48 concurrent read-cache stress tests pass with Phase 1 detach
+ this eviction. The 5 failing chain unit tests fail only on the Phase 1
update-detach (retention), restored by Phase 3.

Co-authored-by: Copilot <[email protected]>
…mantics

The latch-free design detaches (drops) the entire read-cache prefix on a
value-changing update/delete (bystanders are re-promoted lazily on next
read), instead of splicing the new record at the boundary and keeping the
prefix. Update the 5 chain tests that encoded the old splice-retention
semantics:

- Upsert/RMW/DeleteCacheRecordTest: assert the updated/deleted key is
  dropped from the read cache (AssertNotInReadCache) rather than
  retained-and-invalidated; verify values remain correct. Handle the now-
  pending read of a key that a prior update dropped from the cache.
- DeleteHalfOfAllReadCacheRecordsTest: deleting a chain key now drops the
  whole prefix, so mid/high keys are also dropped (not retained); assert
  they are absent from the cache but still read their correct values.
- SpliceInFromRMWTest: the updated key's read-cache entry is dropped, not
  present-and-invalidated.

Add AssertNotInReadCache helper. All read-cache chain tests pass.

Co-authored-by: Copilot <[email protected]>
Add website/docs/dev/tsavorite/readcache.md describing the read cache:
chain structure, the latch-free "all structural CAS on the hash entry"
design and Monotonic-Head invariant, the value-changing (detach) vs
value-preserving (eviction interior-CAS) split and why interior splice
needs a lock, update/eviction coordination, epoch/memory reclamation, and
checkpointing.

Includes a detailed "fresh re-insertion of bystanders" future-optimization
section with rationale (why fresh, not re-splice-existing - the A-B-A),
the self-cleaning correctness argument, an implementation sketch, and the
correctness obligations, so a future implementer can add it if the
collision-on-update hit-rate ever matters.

Register the page in the Tsavorite sidebar.

Co-authored-by: Copilot <[email protected]>
… comments

Cleanup enabled by the latch-free detach design.

Code change (proven safe, verified by chain + stress tests):
- PostCopyToTail no longer invalidates a read-cache *source* record after a
  value-changing update. With the detach design, that source is part of the
  read-cache prefix that the publishing hash-entry CAS orphaned, so it is
  unreachable: its Invalid bit is irrelevant to readers (they cannot reach
  it) and to eviction (which skips on kTempInvalidAddress and no-ops on
  records not in the live chain), and a reader that did reach it grabbed the
  old head before the detach (linearizes before the update -> correctly
  returns the old value). The explicit invalidation was only needed in the
  old splice design, where the prefix was *retained*.

Comment-only fixes (no behavior change):
- Update Upsert/RMW/Delete/CopyToTail insert comments: the new record is
  CAS'd into the hash entry and detaches the read-cache prefix; it is never
  spliced into the read-cache/main-log boundary.
- Clarify that LowestReadCache* / VerifyInMemoryAddresses / the
  EnsureNoNewMainLogRecordWasSpliced + ReadCacheCheckTailAfterSplice guards
  now serve the *read-promotion* path (preventing a stale copy from
  shadowing a newer main-log record, including the escaped-to-disk case);
  these are intentionally KEPT.
- Refresh RecordSource field docs and TryCopyToReadCache "Consistency Notes"
  that referenced the removed splice / the old XLock model.

Co-authored-by: Copilot <[email protected]>
With the latch-free design there is no splice: every structural insert is a
single head-CAS on the hash entry. The "splice" race these guards protected
against - a value-changing record added BELOW the read-cache prefix without
changing the head - can no longer occur. Removed:

- EnsureNoNewMainLogRecordWasSpliced (and the post-CAS block in
  TryCopyToReadCache that called it; its RECORD_ON_DISK status was already
  discarded - the method returns only bool).
- ReadCacheCheckTailAfterSplice and PostCopyToTail (which only called it);
  the PostCopyToTail calls in CreateNewRecord{Upsert,RMW,Delete} are dropped.

Proof of redundancy. The only remaining race is a read-promotion copy
rc_K(V) shadowing a newer record newM_K(V'). Every way newM_K can appear is
already handled without these guards:
 - committed via a head-CAS (update detach / CTT / another promotion) after
   the promotion sampled the head -> the promotion's head-CAS fails;
 - committed in memory before the promotion sampled the head -> the pending
   read's pre-check (TryFindRecordInMemory) finds it -> no promotion;
 - escaped to disk before completion -> the pending read's re-issue path
   (recSrc.LogicalAddress > initialLatestLogicalAddress) re-reads;
 - escaping to disk during the window -> impossible under epoch protection.
Clincher: the no-read-cache-prefix promotion path always SKIPPED
EnsureNoNew and is correct, proving the head-CAS + pending-read pre-check
are sufficient on their own; the with-prefix path has identical core
protections plus only the now-impossible gap-splice check.

TryCopyToReadCache reduces to: head-CAS; on success keep, else abandon.
Verified: full Tsavorite test.session (155) + test.stress (48) pass,
including the concurrent promotion-vs-update/eviction stress with forced
collisions.

A related conservative guard (the read-cache-boundary branch of
VerifyInMemoryAddresses) is now also unconsumed but harmless; left in place
with a comment, pending its own verification.

Co-authored-by: Copilot <[email protected]>
…moryAddresses

Remove the read-cache-prefix-boundary branch of VerifyInMemoryAddresses
(return RETRY_LATER when LowestReadCacheLogicalAddress fell below
readcacheBase.HeadAddress). It only mattered when something dereferenced the
boundary record after allocation - the old splice (CAS into rc1.PreviousAddress)
and EnsureNoNewMainLogRecordWasSpliced - both of which are gone. Nothing now
reads LowestReadCache* after FindInReadCache except its own walk cursor, so the
boundary's in-memory validity is irrelevant.

Proceeding instead of retrying is correct: nothing dereferences the (possibly
freed) boundary, and eviction is value-preserving interior-CAS keep-survivors -
if the whole prefix is evicted the head changes and the operation's hei.TryCAS
fails (retry); if only the bottom is evicted the head and mainlogHead are
unchanged so the CAS is valid. The allocation loops keep their own independent
"new address is above the insert position" checks (BlockAllocate.cs:95,145).

Also fix the stale "This refreshes the HashEntryInfo" header comment - verified
against main that VerifyInMemoryAddresses has never refreshed hei.

VerifyInMemoryAddresses reduces to the in-memory-source-below-HeadAddress check.
Verified: full test.session (155) + test.stress (48) pass.

Co-authored-by: Copilot <[email protected]>
The LowestReadCacheLogicalAddress/LowestReadCachePhysicalAddress fields on
RecordSource are, after the latch-free read-cache cleanup, only used as a
method-internal walk cursor inside FindInReadCache/SkipReadCache (no caller
consumes their final value). Replace them with method-local variables and
remove the two fields (definition, Set() reset, ToString). Update a stale
comment in ContinuePending that referenced the removed fields.

Co-authored-by: Copilot <[email protected]>
Audit of the read-cache LogSizeTracker accounting after the latch-free cleanup.

Read-cache heap is adjusted at exactly two internal sites: +UpdateSize at
TryCopyToReadCache (after a successful head-CAS) and -IncrementSize at
EvictRecordsInRange (page close), which SKIPS Invalid/Null records on the
contract that an invalidated record was already heap-decremented at its
invalidation site. Garnet's CacheSizeTracker does no manual read-cache heap
adjustment; it relies entirely on this internal accounting.

The load-bearing invariant: no read-cache record that has had its heap
accounted may be marked Invalid before eviction. The old splice design broke
this -- PostCopyToTail (value-changing update over a read-cached key) and
ReadCacheCheckTailAfterSplice (promotion-vs-splice race) invalidated a
heap-bearing read-cache record in place via SetInvalidAtomic without
decrementing, so EvictRecordsInRange skipped it and the +UpdateSize was never
matched: a read-cache heap-tracker leak. The latch-free detach design leaves
orphaned read-cache records Valid, so each is decremented exactly once at page
close; the earlier cleanup that removed those in-place invalidations fixed the
leak.

This commit adds a deterministic regression test
(ReadCacheHeapSizeTrackerReturnsToZeroAfterUpdateAndEvict): promote 50
heap-bearing records into the read cache, Upsert each cached key
(value-changing), evict the read cache, and assert the read-cache tracker heap
returns to zero. Proven to catch the leak: temporarily reintroducing the
SetInvalidAtomic-on-read-cache-source behavior fails the test with exactly 200
bytes (50 x 4) leaked. Also documents the parity invariant in readcache.md.

Co-authored-by: Copilot <[email protected]>
…comments

Bug fix (found by GPT-5.5 review of the read-cache size-tracker audit):
When an RMW does a CopyUpdate whose source is a read-cache record holding an
object value, the source value is cleared/disposed in place and its heap was
being decremented from the MAIN-LOG tracker (hlogBase.logSizeTracker). But a
read-cache record's value heap is charged to the READ-CACHE tracker at
promotion (TryCopyToReadCache). Crediting hlog therefore (a) leaked the
read-cache tracker -- the +UpdateSize from promotion was never matched, since
EvictRecordsInRange later recomputes 0 value heap from the cleared record -- and
(b) under-counted the main-log tracker. Decrement the tracker of the allocator
that OWNS the source record (stackCtx.recSrc.AllocatorBase): hlog for a main-log
source, the read cache for a read-cache source. Pre-existing on main; in scope
for this PR's read-cache size-tracker audit. Only the in-memory CopyUpdate path
is affected (the pending-IO DiskLogRecord path has isMemoryLogRecord == false
and is skipped).

Test: parameterized ReadCacheHeapSizeTrackerReturnsToZeroAfterUpdateAndEvict
over {Upsert, RMW}. The RMW case exercises the CopyUpdate value-clear path and
asserts the read-cache tracker returns to zero. Proven to catch the bug:
reverting the fix fails RMW with exactly 200 (50x4) bytes leaked while Upsert
still passes. Widened TrackedObjectValue.Serializer to cast to the base
TestObjectValue, since an RMW CopyUpdater produces a base TestObjectValue for
the new record that can be flushed in this mixed-value test.

PR review comments addressed:
- ReadCache.cs: drop the stale `lowestReadCache*Address` mention (fields removed
  with the latch-free detach design); clarify updaters walk on to find
  latestLogicalAddress.
- ContinuePending.cs: fix garbled comment ("is the from the request").

Co-authored-by: Copilot <[email protected]>
…nuance

Code-quality pass over the read-cache PR.

- ReadCacheChainTests: remove the now write-only field
  `readCacheBelowMidChainKeyEvictionAddress` (and its comment referencing the
  obsolete "outsplicing" concept). All readers were the `store.ReadCacheEvict(...)`
  calls removed when the retention tests were updated to detach/drop semantics; the
  field was still being assigned in CreateChain but never read. It did not trip CS0414
  (assigned a non-constant), so the build stayed green while the code was dead.

- readcache.md: document the one in-place heap-accounting exception — an RMW
  CopyUpdate from a read-cache object source clears the source value in place and must
  decrement the read-cache tracker (recSrc.AllocatorBase), the allocator that owns the
  value heap, not the main-log tracker. Captures the invariant behind the fix in
  ec7eb6c so it is not regressed.

No production-code change in this commit. Full Tsavorite test.session (155) green;
ReadCache chain tests (124) green; format clean.

Co-authored-by: Copilot <[email protected]>
…ddress

The read-cache design doc used "head" to mean the *highest* read-cache address
("rcN (highest / newest, at the head)"), which clashes with log-allocator
terminology where the highest address is the **TailAddress** and the
**HeadAddress** is the *lowest* in-memory address (the eviction frontier).

Fix the Chain-structure bullets to describe addresses by magnitude (rcN = chain
head = highest read-cache address; rc1 = lowest, at the read-cache/main-log
boundary) and add a short "two opposite heads" terminology note:
 - chain head = newest record the hash entry points to = highest RC address,
   nearest the log TailAddress (where new records are allocated);
 - HeadAddress = lowest in-memory address = eviction frontier, the opposite end
   nearest the oldest record rc1.
Clarifies that "head"/"Monotonic-Head" mean the linked-list chain head, while
"HeadAddress" means the allocator's low-address frontier. Docs-only.

Co-authored-by: Copilot <[email protected]>
"insert at head" was correct (head-insertion into the hash chain — the new
record becomes the chain head the hash entry points to), but tighten "head" ->
"chain head" in the heading and body so it cannot be misread as the allocators
low-address HeadAddress. The promoted record is allocated at the log tail
(highest address) and linked at the chain head, per the terminology note.

Co-authored-by: Copilot <[email protected]>
The latch-free read-cache redesign replaced boundary-splice + per-record
invalidation with a single hash-entry CAS that detaches the whole read-cache
prefix. Two stale descriptions still presented the removed splice mechanism (and
the removed ReadCacheCheckTailAfterSplice helper) as current behavior:

- OperationOptions.cs (public ReadCopyTo.MainLog doc): "...or splice into the
  readcache/mainlog boundary..." -> describe the detach (the prefix is dropped
  and re-promoted on a later read).
- locking.md ReadCache section: rewrote the update flow to detach semantics
  (single HashBucketEntry CAS drops the entire prefix; no interior edit, no
  per-record invalidation), fixed the example chain, removed the
  ReadCacheCheckTailAfterSplice paragraph, and linked the read cache design doc.
  Also corrected a stale RecordSource<TKey,TValue> -> <TStoreFunctions,TAllocator>.

Intentional historical references to the old splice design (readcache.md
rationale sections, Helpers.cs "we never splice", the RecordLifecycleTests
regression comment) are left as-is.

Co-authored-by: Copilot <[email protected]>
The latch-free redesign replaced boundary-splice with a single hash-entry CAS
that inserts the new record into the tag chain (detaching any read-cache
prefix). Rename the now-stale "Splice" test identifiers to the faithful
"ChainInsert" (insert into the tag chain), which is accurate for both the
existing-key (detach) and new-key cases:
 - SpliceInFrom{CTT,Upsert,RMW,Delete}Test -> ChainInsertFrom{...}Test
 - VerifySplicedInKey -> VerifyChainInsertedKey
 - SpliceInNewKey/SpliceInExistingKey -> ChainInsertNewKey/ChainInsertExistingKey
 - VerifyAndUnlockSplicedInKey -> VerifyAndUnlockChainInsertedKey (txn tests)
Also fix a stale RMW-test comment ("...invalidated it" -> "...detached/dropped
the readcache prefix"). Test-only; logic unchanged. test.session 155,
test.session.context 99 pass.

Co-authored-by: Copilot <[email protected]>
…bsoluteAddress

Two fixes from review feedback, both verified:

1. SkipReadCache: GetPhysicalAddress already strips the read-cache bit
   internally (GetPageOfAddress -> AbsoluteAddress; GetOffsetOnPage masks the
   low bits, and the RC bit is bit 47), so the explicit AbsoluteAddress(...)
   wrapper was redundant. Pass the raw address, matching FindInReadCache. Saves
   a couple cycles on the read-cache skip path; no behavior change.

2. readcache.md eviction section: the doc claimed concurrent walkers coordinate
   with eviction via the kTempInvalidAddress stamp. They do not. Walker safety
   comes from epoch + HeadAddress: eviction runs as a deferred drain-list action
   (OnPagesClosedWorker) and readcache.HeadAddress is published before that
   action is queued, so a walker either waits+restarts at the boundary via
   ReadCacheNeedToWaitForEviction (SpinWaitUntilRecordIsClosed) or holds an
   epoch that gates the eviction action until it drains. Decisive proof:
   SkipReadCache walks the same chain WITHOUT any kTempInvalidAddress check and
   is correct. The stamp is eviction-internal bookkeeping (ReadCacheEvict skips
   already-unlinked / CAS-failed records), not a signal read by walkers.
   Rewrote reason #2 accordingly and clarified the stamp's real purpose.

Verified: test.session 155, test.session.context 99, test.stress 48 pass;
format clean.

Co-authored-by: Copilot <[email protected]>
…gate FindInReadCache on hei.IsReadCache

1. FindInReadCache: the `if (recordInfo.PreviousAddress <= kTempInvalidAddress)
   { wait; restart; }` guard was redundant. The current record passed
   ReadCacheNeedToWaitForEviction (>= readcache.HeadAddress), and eviction runs
   as a deferred, epoch-gated drain-list action (OnPagesClosedWorker) with
   HeadAddress published before it is queued — so a walker either waits at the
   boundary or holds an epoch that gates the unlink, and can never observe a
   stamped/unlinked current record here. (SkipReadCache walks the same chain
   without the check and is correct.) Replace the branch with a Debug.Assert
   documenting the invariant. Validated by the concurrent-eviction stress suite
   (48 tests): the assert never fires.

2. TryFindRecordInMemory and TryFindRecordForUpdate gated the NoInlining
   FindInReadCache call on `UseReadCache`. Gate on `stackCtx.hei.IsReadCache`
   instead: it implies UseReadCache and is false when the tag chain has no
   readcache prefix, so we skip the call (and the minRC computation) entirely in
   that common case. Behavior is identical — FindInReadCache early-returns false
   without side effects when !hei.IsReadCache. (InternalRead already gates on
   hei.IsReadCache.)

Verified: test.session 155, test.recordops 146, test.session.context 127,
test.stress 48 pass; format clean.

Co-authored-by: Copilot <[email protected]>
badrishc and others added 2 commits June 10, 2026 17:57
…ReadCache)

AggressiveInlining is only a hint, so a UseReadCache call-site guard may still
incur a real SkipReadCache call when the tag chain has no readcache prefix.
Testing hei.IsReadCache costs the same as the UseReadCache field read but is the
tighter, per-chain condition. Crucially, hei.IsReadCache implies UseReadCache, so
gating on it keeps SkipReadCache's `Debug.Assert(UseReadCache)` reachable: an
RC record present while !UseReadCache now trips the assert instead of being
silently skipped. Switched all four callers:
 - FindRecord.cs (TryFindRecordInMainLogForConditionalOperation)
 - ContainsKeyInMemory.cs
 - AllocatorScan.cs (IterateKeyVersions)
 - TsavoriteIterator.cs (IsTailmostMainKvRecord)
Each already has a FindTag'd hei. This matches the FindInReadCache callers, which
already gate on hei.IsReadCache. Behavior is otherwise identical (SkipReadCache
no-ops when !hei.IsReadCache).

Left Helpers.cs:69 (`UseReadCache && entry.IsReadCache`, a CPR new-version check,
not a readcache walk) as-is.

Verified: test.session 155, test.recordops 146, test.session.context 127,
test.stress 48 pass; format clean.

Co-authored-by: Copilot <[email protected]>
…ith detach semantics

GPT-5.5 review of the PR found locking.md still documented the removed
splice/invalidate machinery as current behavior, which contradicts the
latch-free detach invariant (orphaned read-cache records stay Valid until page
close; following the stale "set readcache Invalid" guidance would leak read-cache
heap accounting since EvictRecordsInRange skips Invalid records). Fixed three
spots:

- RecordSource section: removed the "lowest readcache logical/physical addresses
  used for splicing" description (those fields were removed); LatestLogicalAddress
  is the first main-log address below any readcache prefix, and a value-changing
  update CAS's the new record into the hash entry to detach the whole prefix.
- RecordInfo/Invalid bullet: a value-changing update does NOT invalidate individual
  readcache records; it detaches the prefix (orphans stay Valid). Invalid is used
  for abandoned/failed best-effort read-cache inserts, which traversal skips.
- Locking Flow: an in-ReadCache source is not sealed/elided; the update detaches
  the prefix (main-log sources are still sealed/freelisted as before).

Docs-only. Correctness/concurrency, the FindInReadCache assert invariant,
hei.IsReadCache gating, RMW heap-accounting, and checkpoint SkipReadCacheBucket
were all reviewed clean.

Co-authored-by: Copilot <[email protected]>
@badrishc badrishc force-pushed the badrishc/read-cache-cas-entry branch from 92231f8 to 957ba2b Compare June 11, 2026 00:58
@TedHartMS TedHartMS merged commit 358ced7 into main Jun 11, 2026
384 of 385 checks passed
@TedHartMS TedHartMS deleted the badrishc/read-cache-cas-entry branch June 11, 2026 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants