Add AAD as a first-class OperationParams variant in BDN benchmarks#1859
Merged
Conversation
572b963 to
9ad76a9
Compare
added 2 commits
June 5, 2026 14:59
Per review feedback: extend OperationsBase so every Operations.* BDN benchmark runs a 4th variant alongside None / ACL / AOF that exercises GarnetAadAuthenticator on the service hot path. - OperationParams gains a useAad flag (mutually exclusive with useACLs) and a corresponding _AAD suffix in ToString(). - OperationsBase.OperationParamsProvider yields the new variant. - On AAD setup: build an in-process AadAuthenticationSettings backed by a symmetric signing key + a 12h JWT, set opts.AuthSettings, and send a one-shot RESP AUTH command immediately after the embedded session is created so subsequent benchmark iterations exercise IsAuthenticated on every command (twice per command via the AAD code path). - Self-contained: no external IdP, no network IO; uses the same GarnetAadAuthenticator code path production runs. - BDN_Benchmark_Config.json intentionally not updated yet (no baselines for the new variant); github-action-benchmark will pick up the new _AAD chart lines automatically. Gates can be added later. Verified on Operations.BasicOperations.InlinePing: None 977ns, ACL 969ns, AOF 1000ns, AAD 1072ns (cost-of-AAD ~1ns per command — matches the microbenchmark in IsAuthorizedBenchmark).
9ad76a9 to
2862019
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extends the benchmark/BDN.benchmark “Operations.*” BenchmarkDotNet suite to include an additional, self-contained AAD-authenticated run variant, so continuous benchmark charts can catch performance regressions in Garnet’s AAD authentication hot path (IsAuthenticated → GarnetAadAuthenticator.IsAuthorized()).
Changes:
- Added a new
OperationParams.useAadflag and includedAADinOperationParams.ToString()output naming. - Extended
OperationsBase.OperationParamsProvider()to yield a 4th parameter variant withuseAad: true. - Implemented in-process AAD auth setup in
OperationsBase.GlobalSetup()by buildingAadAuthenticationSettings(symmetric key + stub signing token provider) and sending a one-shot RESPAUTH default <jwt>after session creation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| benchmark/BDN.benchmark/Operations/OperationsBase.cs | Adds AAD benchmark variant wiring, creates in-process AAD settings, and issues an initial AUTH to put the session into authenticated state before benchmarking. |
| benchmark/BDN.benchmark/Operations/OperationParams.cs | Adds useAad to benchmark parameter struct and updates parameter naming (ToString) to reflect AAD-enabled runs. |
added 3 commits
June 5, 2026 15:18
…ts/formatting - Add EmbeddedNetworkSender.GetLastResponse() that returns the bytes written by the most recent SendResponse(int, int) call. - EmbeddedRespServer.GetRespSession now accepts an optional sender so callers can hold a reference for response inspection. - OperationsBase.GlobalSetup, on the AAD path, asserts the AUTH response is exactly '+OK\\r\\n' after sending the AUTH command. A failed AUTH would otherwise silently leave the session unauthenticated and the benchmark would measure auth-rejection overhead instead of the authenticated hot path. Verified by deliberately corrupting the JWT: BDN aborts with InvalidOperationException and the AAD result becomes NA — failure is loud, not silent. - Drop the redundant 'matches the production GarnetAadAuthenticator code path exactly' comment and collapse two multi-line constructor invocations to one line per repo formatting preference.
…n comments/formatting" This reverts commit 3367bf9.
- OperationParams.ctor now throws ArgumentException when both useACLs and useAad are true. Per Copilot review (#1859): useAad was documented as mutually exclusive with useACLs but nothing enforced it, so a caller passing (true, false, true) would silently get the ACL path with a confusing 'ACL+AAD' ToString label. - Drop the redundant comment on BuildAadAuthSettings and collapse two multi-line constructor invocations to one line per repo formatting preference (lost in the revert of fb9464c).
tiagonapoli
pushed a commit
that referenced
this pull request
Jun 5, 2026
- OperationParams.ctor now throws ArgumentException when both useACLs and useAad are true. Per Copilot review (#1859): useAad was documented as mutually exclusive with useACLs but nothing enforced it, so a caller passing (true, false, true) would silently get the ACL path with a confusing 'ACL+AAD' ToString label. - Drop the redundant comment on BuildAadAuthSettings and collapse two multi-line constructor invocations to one line per repo formatting preference (lost in the revert of fb9464c).
vazois
approved these changes
Jun 6, 2026
This was referenced Jun 6, 2026
badrishc
pushed a commit
that referenced
this pull request
Jun 6, 2026
…#1863) * Revert "charts: group all Params variants into one chart per benchmark (#1861)" This reverts commit b89cc9e. * charts: add AAD variants for Operations.* benchmarks to Featured PR #1859 added Params: AAD as a 4th first-class OperationParams variant. The Featured Benchmarks view didn't surface the new AAD charts because FEATURED_BENCHMARKS only listed the (Params: None) entry for each benchmark. Add the 8 corresponding (Params: AAD) entries for the Operations.* benchmarks that gained an AAD variant. Network.* and Lua.* are unchanged since OperationsBase's useAad flag only applies to the Operations.* hierarchy. --------- Co-authored-by: Tiago Napoli <[email protected]>
tiagonapoli
pushed a commit
that referenced
this pull request
Jun 8, 2026
- OperationParams.ctor now throws ArgumentException when both useACLs and useAad are true. Per Copilot review (#1859): useAad was documented as mutually exclusive with useACLs but nothing enforced it, so a caller passing (true, false, true) would silently get the ACL path with a confusing 'ACL+AAD' ToString label. - Drop the redundant comment on BuildAadAuthSettings and collapse two multi-line constructor invocations to one line per repo formatting preference (lost in the revert of fb9464c).
tiagonapoli
added a commit
that referenced
this pull request
Jun 9, 2026
* Cache UTC time in GarnetAadAuthenticator.IsAuthorized to remove syscall from hot path IsAuthorized() is checked twice per command (AdminCommands.cs:32 and :131) via the IsAuthenticated property. Under AAD auth, DateTime.UtcNow showed ~60% exclusive CPU in cluster traces because it issues a clock_gettime syscall via vDSO on every call. Introduce Garnet.common.CachedTime, a process-wide UTC time cache refreshed every 100ms by a background Timer (modeled after Redis's mstime). Reads are a pure memory load via Volatile.Read of a static long. GarnetAadAuthenticator now precomputes the token validity window in ticks once in Authenticate() and compares against CachedTime.UtcNowTicks in IsAuthorized(). 100ms granularity is operationally invisible against AAD token windows measured in hours. Co-authored-by: Copilot <[email protected]> * Fix misleading Redis hz=10 reference in CachedTime doc Redis's server.mstime is refreshed both from serverCron (hz floor, default 100ms) and from beforeSleep (every event loop iteration), so under load it updates far more often than 100ms. Rephrase as an illustrative example rather than a precise cadence match. Co-authored-by: Copilot <[email protected]> * Drop underscore prefix on CachedTime private fields libs/common consistently uses bare camelCase for private fields; align. Co-authored-by: Copilot <[email protected]> * Drop unnecessary tick-cached validity fields GarnetAadAuthenticator is per-RespServerSession (single-threaded), so _validFrom/_validateTo are never accessed cross-thread; there's no DateTime atomicity concern and DateTime comparison compiles to the same mask+compare as long comparison. Strip the fields and inline CachedTime.UtcNow into the original DateTime check. Co-authored-by: Copilot <[email protected]> * Extract Timer callback into named UpdateCachedUtc method; tidy stale docs Co-authored-by: Copilot <[email protected]> * Collapse Timer ctor to single line; document benign overlap race Co-authored-by: Copilot <[email protected]> * Drop Redis reference from RefreshPeriodMs doc Co-authored-by: Copilot <[email protected]> * Use block-scoped namespace in CachedTime (matches libs/common convention) Co-authored-by: Copilot <[email protected]> * Drop platform syscall names from CachedTime summary Co-authored-by: Copilot <[email protected]> * Note CachedTime staleness can exceed RefreshPeriodMs under high load Co-authored-by: Copilot <[email protected]> * Drop Timer comment from CachedTime Co-authored-by: Copilot <[email protected]> * Rename CachedTime to CoarseUtcNow; switch accessors to .Value/.Ticks Co-authored-by: Copilot <[email protected]> * Correct CoarseUtcNow doc: DateTime.UtcNow is vDSO/KUSER_SHARED_DATA, not syscall Co-authored-by: Copilot <[email protected]> * Add BDN benchmark for IsAuthorized hot path: DateTime.UtcNow vs CoarseUtcNow Mirrors GarnetAadAuthenticator.IsAuthorized() body verbatim plus the raw clock primitives for context. Results on i7-12700K / .NET 10: IsAuthorized_DateTimeUtcNow : 19.34 ns (baseline) IsAuthorized_CoarseUtcNow : 1.35 ns (0.070x — ~14.4x faster) Raw_DateTimeUtcNow : 18.06 ns Raw_CoarseUtcNowValue : 0.51 ns Co-authored-by: Copilot <[email protected]> * Rename CoarseUtcNow to CoarseDateTime (UtcNow/UtcNowTicks); use Ticks in AAD hot path CoarseDateTime exposes both UtcNow (DateTime) and UtcNowTicks (long) accessors. GarnetAadAuthenticator now caches token validity as long ticks and compares against CoarseDateTime.UtcNowTicks, eliminating DateTime's per-call Kind-bit masking on the hot path. BDN benchmark shows 1.35 ns -> 0.32 ns vs the DateTime variant (~60x speedup vs DateTime.UtcNow baseline). Co-authored-by: Copilot <[email protected]> * Drop raw clock primitive benchmarks; keep only IsAuthorized variants Co-authored-by: Copilot <[email protected]> * Add blank lines around tick-fields block in GarnetAadAuthenticator Co-authored-by: Copilot <[email protected]> * Refactor coarse time to TimeProvider; inject into AAD authenticator; add tests + TickCount64 benchmark Addresses PR review feedback: 1. CoarseDateTime becomes a thin static wrapper around the new CoarseTimeProvider (instance) which wraps an arbitrary TimeProvider and uses TimeProvider.CreateTimer for cache refresh. CoarseTimeProvider.System is the process-wide singleton backed by TimeProvider.System -- production behaviour unchanged. 2. GarnetAadAuthenticator now takes a CoarseTimeProvider (default = .System). Hot path is identical in production and tests; tests inject a FakeTimeProvider-backed CoarseTimeProvider so advancing the fake fires the same Timer-refresh callback the production cache uses. 3. AadAuthenticationSettings exposes the public TimeProvider parameter, internally owning/disposing the CoarseTimeProvider lifecycle (reuses .System singleton when no override). 4. New GarnetAadAuthenticatorTests covers: default fallback, valid window, expiry after Advance(), staying valid across Advances, malformed token, failed reauthentication clears state. 5. IsAuthorizedBenchmark adds the Environment.TickCount64 variant suggested in PR review (Tornhoof) -- monotonic ms-since-boot, validity window projected at Authenticate() time. * CoarseTimeProvider extends TimeProvider; drop CoarseDateTime facade - CoarseTimeProvider now inherits TimeProvider: overrides GetUtcNow to return the cached coarse value; delegates GetTimestamp / TimestampFrequency / LocalTimeZone / CreateTimer to the wrapped provider. Adds an explicit fast UtcNowTicks accessor (single Volatile.Read) that the AAD hot path keeps using. - Dispose() no-ops on the .System singleton so callers can dispose unconditionally; removes the _ownsCoarseTime bool from AadAuthenticationSettings. - Drop the CoarseDateTime static facade -- redundant now that CoarseTimeProvider.System exposes UtcNow/UtcNowTicks directly. Benchmark renamed (CoarseTimeProvider / CoarseTimeProviderTicks). - libs/common naming convention: drop underscore prefix on new private fields in CoarseTimeProvider (matches the rest of libs/common). * Tighten CoarseTimeProvider API; add Stopwatch.GetTimestamp benchmark - Make CoarseTimeProvider ctor private; CoarseTimeProvider.Create is now the only entry point, structurally preventing accidental Timer duplication. - Replace RefreshPeriodMs (int) with RefreshPeriod (static readonly TimeSpan); pass directly to CreateTimer. - Add IsAuthorized_StopwatchTimestamp benchmark for completeness. - Test updated to use CoarseTimeProvider.Create(fakeProvider). * Verify AAD AUTH succeeded in BDN OperationsBase setup; tighten comments/formatting - Add EmbeddedNetworkSender.GetLastResponse() that returns the bytes written by the most recent SendResponse(int, int) call. - EmbeddedRespServer.GetRespSession now accepts an optional sender so callers can hold a reference for response inspection. - OperationsBase.GlobalSetup, on the AAD path, asserts the AUTH response is exactly '+OK\\r\\n' after sending the AUTH command. A failed AUTH would otherwise silently leave the session unauthenticated and the benchmark would measure auth-rejection overhead instead of the authenticated hot path. Verified by deliberately corrupting the JWT: BDN aborts with InvalidOperationException and the AAD result becomes NA — failure is loud, not silent. - Drop the redundant 'matches the production GarnetAadAuthenticator code path exactly' comment and collapse two multi-line constructor invocations to one line per repo formatting preference. * Revert "Verify AAD AUTH succeeded in BDN OperationsBase setup; tighten comments/formatting" This reverts commit fb9464c. * Reject ACL+AAD combo in OperationParams ctor; collapse formatting - OperationParams.ctor now throws ArgumentException when both useACLs and useAad are true. Per Copilot review (#1859): useAad was documented as mutually exclusive with useACLs but nothing enforced it, so a caller passing (true, false, true) would silently get the ACL path with a confusing 'ACL+AAD' ToString label. - Drop the redundant comment on BuildAadAuthSettings and collapse two multi-line constructor invocations to one line per repo formatting preference (lost in the revert of fb9464c). * dotnet format: strip trailing newlines per .editorconfig insert_final_newline=false * Clarify in ClusterAadAuthTests that the tenant ID is a public identifier Co-authored-by: Copilot <[email protected]> * Bump CoarseTimeProvider.RefreshPeriod from 100ms to 1s Coarseness is already documented as ~RefreshPeriod best-effort; AAD token validity windows are measured in hours, so a 1s lag is operationally invisible and reduces Timer wakeups 10x. Co-authored-by: Copilot <[email protected]> * Add the same not-a-secret comment to GarnetAadAuthenticatorTests Co-authored-by: Copilot <[email protected]> * Simplify CoarseTimeProvider; switch authenticator to TimeProvider abstraction CoarseTimeProvider is no longer parameterized on a wrapped TimeProvider: it always caches TimeProvider.System, the refresh Timer is process-wide static, and instances are stateless (Create/Dispose and the wrapping ctor are gone). A new Instance singleton replaces the former System field. GarnetAadAuthenticator and AadAuthenticationSettings now hold an abstract TimeProvider (defaulting to CoarseTimeProvider.Instance) instead of a concrete CoarseTimeProvider. IsAuthorized reads ticks via _timeProvider.GetUtcNow().UtcTicks so tests can inject a FakeTimeProvider directly, and production keeps the cached hot-path read through Instance. Tests updated accordingly, plus a dedicated CoarseTimeProviderTests fixture. Co-authored-by: Copilot <[email protected]> * Cache DateTimeOffset directly in CoarseTimeProvider via Snapshot wrapper Switch the cache from long utcTicks (reconstructed per read) to a heap-allocated Snapshot holding the immutable DateTimeOffset. Readers become a single Volatile.Read on a reference + field load, avoiding the DateTimeOffset constructor's range-checks (ticks bounds, offset bounds) on the hot path. The Timer allocates one Snapshot per refresh tick (~24 B/s, gen-0). Co-authored-by: Copilot <[email protected]> * Simplify CoarseTimeProvider: drop UtcNowTicks cache and redundant overrides - Remove the parallel long-ticks Volatile cache (UtcNowTicks). Benchmarks showed the heap-allocated Snapshot<DateTimeOffset> reaches parity with direct ticks under PGO, so the dual cache is unnecessary complexity. - Drop the four delegating TimeProvider overrides (GetTimestamp, TimestampFrequency, LocalTimeZone, CreateTimer). The base class defaults already match TimeProvider.System, so the overrides are pure noise. Co-authored-by: Copilot <[email protected]> * Drop spurious allocation-rate estimate in Snapshot comment Co-authored-by: Copilot <[email protected]> * Fix DateTimeOffset size in Snapshot comment (it's 16 bytes, not 12) Co-authored-by: Copilot <[email protected]> * Rename Snapshot to DateTimeOffsetSnapshot for clarity Co-authored-by: Copilot <[email protected]> * Use volatile field for snapshot instead of Volatile.Read/Write The volatile keyword gives reference fields the same acquire/release semantics as Volatile.Read/Write at the call sites, but reads cleaner. The two writes (init + timer tick) and two reads (GetUtcNow + UtcNow) now look like ordinary field access. Co-authored-by: Copilot <[email protected]> * Move DateTimeOffsetSnapshot to end and trim comment verbosity Co-authored-by: Copilot <[email protected]> * Trim verbose comments in GarnetAadAuthenticator Co-authored-by: Copilot <[email protected]> * Revert TimeProvider passthrough in AadAuthenticationSettings GarnetAadAuthenticator's ctor already defaults TimeProvider to CoarseTimeProvider.Instance when null, so the settings-level passthrough is unnecessary in production. Co-authored-by: Copilot <[email protected]> * Remove RefreshPeriod XML doc Co-authored-by: Copilot <[email protected]> * Drop quantified speedup from tick comparison comment Co-authored-by: Copilot <[email protected]> * Remove _timeProvider comment Co-authored-by: Copilot <[email protected]> * Make UTC explicit in token validity tick projection DateTime.Ticks ignores Kind, so .Ticks on a non-UTC DateTime would silently miscompare against UtcTicks. Construct DateTimeOffset with TimeSpan.Zero so UTC interpretation is explicit at the assignment, and a Local-kind DateTime would throw instead of silently mismatching. Also trim the verbose XML doc on GarnetAadAuthenticatorTests. Co-authored-by: Copilot <[email protected]> * Comment UTC contract on token validity tick projection Co-authored-by: Copilot <[email protected]> * Drop redundant CoarseTimeProvider/default-clock tests Co-authored-by: Copilot <[email protected]> * Update IsAuthorized BDN to match CoarseTimeProvider API Co-authored-by: Copilot <[email protected]> * Trim IsAuthorized BDN to DateTime vs CoarseTimeProvider Drop the Stopwatch.GetTimestamp variant and a verbose test comment. Co-authored-by: Copilot <[email protected]> * Apply dotnet format Co-authored-by: Copilot <[email protected]> * Make CoarseTimeProvider constructor private Singleton via Instance is the only intended use; private ctor prevents accidental allocations. Co-authored-by: Copilot <[email protected]> * Trim CoarseTimeProvider.Instance xmldoc Co-authored-by: Copilot <[email protected]> * Add minimal CoarseTimeProvider tests Co-authored-by: Copilot <[email protected]> * Apply dotnet format Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Tiago Napoli <[email protected]> Co-authored-by: Copilot <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Garnet's continuous BDN benchmarks (published to https://microsoft.github.io/garnet/charts/) cover the
None/ACL/AOFmatrix forOperations.*— but no variant exercises the AAD authentication code path. Regressions inGarnetAadAuthenticator.IsAuthorized()cannot be caught by these charts and only surface when someone runs a production CPU profile.A recent example:
DateTime.UtcNowaccounted for ~62.7% exclusive CPU in a cluster under AAD auth (fixed in #1855). That regression would have shown up immediately on theInlinePing_AADchart had this variant existed.Fix
Add
AADas a 4th first-classOperationParamsvariant alongsideNone/ACL/AOF:OperationParams— gainsuseAadbool +"AAD"(or"ACL+AAD", etc.) suffix inToString().OperationsBase.OperationParamsProvider— yields the new(false, false, useAad: true)variant.OperationsBase.GlobalSetup— whenParams.useAad:AadAuthenticationSettingsbacked by a symmetric signing key, a 12-hour JWT, and a stubIssuerSigningTokenProvider(no OpenID metadata refresh,authority="").opts.AuthSettings = aadSettings.AUTH default <jwt>so the session is fully authenticated before any benchmark iteration runs.GarnetAadAuthenticatorcode path production runs.Result
Every existing
Operations.*BDN benchmark now runs a 4th time with AAD enabled. The hot path includes theIsAuthenticated→IsAuthorized()check on every RESP command (twice per command, matching production), so any change in AAD-auth overhead immediately moves the_AADchart line.Local verification (
Operations.BasicOperations.InlinePing, .NET 10, i7-12700K, 100 PINGs/op)NoneThe
+1641 nsis the existingDateTime.UtcNowcost on the AAD path. Once #1855 merges, this drops to ~110 ns (a ~15× improvement on the AAD overhead specifically).Costs / caveats
Operations.*test (BasicOperations,ObjectOperations,HashObjectOperations,SortedSetOperations,CustomOperations,RawStringOperations,ScriptOperations,ModuleOperations,JsonOperations,PubSubOperations,SetOperations,TxnOperations). Roughly ~33% more time per matrix cell inci-bdnbenchmark.yml.BDN_Benchmark_Config.json: intentionally not updated. There are no expected baselines for the new_AADentries yet, andgithub-action-benchmarkpicks up the new chart lines from the full BDN report regardless. Gates can be added once baselines settle (recommendWARN-ON-FAIL_*initially).Paramsmodel; can be extended in a follow-up if desired.Testing
dotnet build benchmark/BDN.benchmark— clean on net8.0 and net10.0.Operations.BasicOperationsend-to-end: all 4 variants (None,ACL,AOF,AAD) execute and produce results. AAD-authenticated PING returns successfully, confirming the AUTH command + JWT validation are wired correctly.Related
Cache DateTime.UtcNow in AAD authenticator hot path. With both PRs merged in order (this one first, then Cache DateTime.UtcNow in AAD authenticator hot path #1855), the_AADchart line will show a sharp drop on the commit landing Cache DateTime.UtcNow in AAD authenticator hot path #1855 — concrete proof the fix worked end-to-end on the service hot path.