Cache DateTime.UtcNow in AAD authenticator hot path#1855
Conversation
There was a problem hiding this comment.
Pull request overview
This PR reduces per-command overhead in the AAD authentication hot path by avoiding repeated DateTime.UtcNow calls inside GarnetAadAuthenticator.IsAuthorized(). It introduces a process-wide cached UTC tick source and refactors the authenticator to compare precomputed validity ticks against the cached value.
Changes:
- Added
Garnet.common.CachedTime, a static UTC time cache refreshed periodically viaSystem.Threading.Timer. - Updated
GarnetAadAuthenticator.Authenticate()to precompute_validFromTicks/_validToTicksonce per token. - Rewrote
IsAuthorized()to use a singleCachedTime.UtcNowTicksread plus twolongcomparisons.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| libs/server/Auth/GarnetAadAuthenticator.cs | Switch authorization checks from DateTime.UtcNow comparisons to cached-tick comparisons using precomputed token validity window. |
| libs/common/CachedTime.cs | Introduce a process-wide cached UTC tick source to remove wall-clock time queries from hot paths. |
|
You could also look at Environment.Tickcount64 for your coarse time. It's likely not as fast as your Timer solution, but should be a lot simpler in code. |
|
I updated the PR description with benchmark for |
…ll from hot path IsAuthorized() is checked twice per command (AdminCommands.cs:32 and :131) via the IsAuthenticated property. Under AAD auth, DateTime.UtcNow showed ~60% exclusive CPU in cluster traces because it issues a clock_gettime syscall via vDSO on every call. Introduce Garnet.common.CachedTime, a process-wide UTC time cache refreshed every 100ms by a background Timer (modeled after Redis's mstime). Reads are a pure memory load via Volatile.Read of a static long. GarnetAadAuthenticator now precomputes the token validity window in ticks once in Authenticate() and compares against CachedTime.UtcNowTicks in IsAuthorized(). 100ms granularity is operationally invisible against AAD token windows measured in hours. Co-authored-by: Copilot <[email protected]>
Redis's server.mstime is refreshed both from serverCron (hz floor, default 100ms) and from beforeSleep (every event loop iteration), so under load it updates far more often than 100ms. Rephrase as an illustrative example rather than a precise cadence match. Co-authored-by: Copilot <[email protected]>
libs/common consistently uses bare camelCase for private fields; align. Co-authored-by: Copilot <[email protected]>
GarnetAadAuthenticator is per-RespServerSession (single-threaded), so _validFrom/_validateTo are never accessed cross-thread; there's no DateTime atomicity concern and DateTime comparison compiles to the same mask+compare as long comparison. Strip the fields and inline CachedTime.UtcNow into the original DateTime check. Co-authored-by: Copilot <[email protected]>
…docs Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
…ion) Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
…not syscall Co-authored-by: Copilot <[email protected]>
…eUtcNow Mirrors GarnetAadAuthenticator.IsAuthorized() body verbatim plus the raw clock primitives for context. Results on i7-12700K / .NET 10: IsAuthorized_DateTimeUtcNow : 19.34 ns (baseline) IsAuthorized_CoarseUtcNow : 1.35 ns (0.070x — ~14.4x faster) Raw_DateTimeUtcNow : 18.06 ns Raw_CoarseUtcNowValue : 0.51 ns Co-authored-by: Copilot <[email protected]>
… in AAD hot path CoarseDateTime exposes both UtcNow (DateTime) and UtcNowTicks (long) accessors. GarnetAadAuthenticator now caches token validity as long ticks and compares against CoarseDateTime.UtcNowTicks, eliminating DateTime's per-call Kind-bit masking on the hot path. BDN benchmark shows 1.35 ns -> 0.32 ns vs the DateTime variant (~60x speedup vs DateTime.UtcNow baseline). Co-authored-by: Copilot <[email protected]>
…rrides - Remove the parallel long-ticks Volatile cache (UtcNowTicks). Benchmarks showed the heap-allocated Snapshot<DateTimeOffset> reaches parity with direct ticks under PGO, so the dual cache is unnecessary complexity. - Drop the four delegating TimeProvider overrides (GetTimestamp, TimestampFrequency, LocalTimeZone, CreateTimer). The base class defaults already match TimeProvider.System, so the overrides are pure noise. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
The volatile keyword gives reference fields the same acquire/release semantics as Volatile.Read/Write at the call sites, but reads cleaner. The two writes (init + timer tick) and two reads (GetUtcNow + UtcNow) now look like ordinary field access. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
GarnetAadAuthenticator's ctor already defaults TimeProvider to CoarseTimeProvider.Instance when null, so the settings-level passthrough is unnecessary in production. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
DateTime.Ticks ignores Kind, so .Ticks on a non-UTC DateTime would silently miscompare against UtcTicks. Construct DateTimeOffset with TimeSpan.Zero so UTC interpretation is explicit at the assignment, and a Local-kind DateTime would throw instead of silently mismatching. Also trim the verbose XML doc on GarnetAadAuthenticatorTests. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Drop the Stopwatch.GetTimestamp variant and a verbose test comment. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Singleton via Instance is the only intended use; private ctor prevents accidental allocations. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Problem
GarnetAadAuthenticator.IsAuthorized()is on the per-command hot path —IsAuthenticated(which forwards toIsAuthorized()) is checked twice per command viaAdminCommands.cs:32andAdminCommands.cs:131.In a cluster CPU profile under AAD authentication,
DateTime.get_UtcNowshows up as the single hottest method at ~62.7% exclusive CPU.Fix
Three pieces:
Garnet.common.CoarseTimeProvider— an instance class wrapping anyTimeProvider. ExposesUtcNow(DateTime) andUtcNowTicks(long); both read a cached value via a pure memory load. The cache refresh is scheduled viaTimeProvider.CreateTimerso aFakeTimeProviderin tests drives refreshes when its clock is advanced — production and tests therefore exercise the same hot path.CoarseTimeProvider.Systemis the process-wide singleton backed byTimeProvider.System(amortizes the single background Timer across all callers).Garnet.common.CoarseDateTime— kept as a thin static facade overCoarseTimeProvider.Systemfor back-compat / convenience.GarnetAadAuthenticator:CoarseTimeProvider(defaults toCoarseTimeProvider.System).AadAuthenticationSettingsexposes a publicTimeProviderparameter and owns theCoarseTimeProviderlifecycle (reuses.Systemsingleton when no override → zero per-settings Timer cost; owns and disposes a per-settings instance otherwise).Authenticate()precomputes the token validity window in ticks once (_validFromTicks/_validToTicks).IsAuthorized()reads_coarseTime.UtcNowTicksand compareslongagainstlong, skippingDateTimes per-call Kind-bit masking.Microbenchmark
benchmark/BDN.benchmark/Auth/IsAuthorizedBenchmark.cson Windows 11 24H2, i7-12700K, .NET 10, x64, Server GC,DOTNET_TieredPGO=0:IsAuthorized_DateTimeUtcNowIsAuthorized_TickCount64Environment.TickCount64(per @Tornhoof suggestion)IsAuthorized_CoarseDateTimeCoarseDateTime(DateTimevariant)IsAuthorized_CoarseDateTimeTicksEnvironment.TickCount64is a real win overDateTime.UtcNow(~14×) and dramatically simpler, but on Windows it still reads fromKUSER_SHARED_DATAvia a method call and is ~10× slower than the L1-cachedVolatile.Readof a staticlong. On Linux the gap compresses (bothTickCount64andDateTime.UtcNowgo through vDSOclock_gettime) butCoarseDateTimestill wins. For a per-command hot path checked twice per call,CoarseDateTime.UtcNowTicks(0.12 ns ≈ a singlemov) is the right choice.Correctness considerations
CoarseTimeProvidergranularity is 100ms; AAD token validity windows are measured in hours, so the slop is operationally invisible (smaller than typical inter-host clock skew). A token that just expired may be accepted for up to ~100ms past its true deadline. Under sustained ThreadPool starvation the staleness can exceed 100ms; still harmless for hour-long token windows.GarnetAadAuthenticatoris constructed once perRespServerSession(per client connection) andIsAuthorized()/Authenticate()are only ever called on that session''s RESP processing thread, so_validFromTicks/_validToTicksneed no cross-thread synchronization._validFromTicks = long.MaxValue/_validToTicks = long.MinValueso a previously-cached authorized state cannot leak past a failed re-authentication.CoarseTimeProvider.Systeminstance isstatic readonly, so it survives for process lifetime; ~10 callbacks/sec is negligible ThreadPool load. Cached value writes/reads useVolatile.Write/Volatile.Readof alongfield — atomic on 64-bit, no JIT reordering.Testing
dotnet buildclean.dotnet test test/cluster/Garnet.test.cluster --filter "FullyQualifiedName~Aad":ClusterAadAuthTests(×2) — full serverAUTHflow viaAuthenticate()→IsAuthorized()and subsequent commands re-checking viaIsAuthenticated. Pass on net8.0 and net10.0.GarnetAadAuthenticatorTests(×6) — unit-level tests usingFakeTimeProvider(Microsoft.Extensions.TimeProvider.Testing) plumbed throughCoarseTimeProvider. Covers default fallback to.System, valid window, expiry afterAdvance(), staying valid across multipleAdvance()calls, malformed token, and failed re-auth clearing previously authorized state. Because the authenticator always reads_coarseTime.UtcNowTicks(no test-only branch), advancing theFakeTimeProviderfires the sameTimeProvider.CreateTimerrefresh callback the production cache uses — tests exercise the production hot path.dotnet formatclean on the changed files.