Skip to content

[V1] Backport: Fix AOF page/memory size defaults and validation (#1838)#1844

Closed
badrishc wants to merge 1 commit into
release/v1from
badrishc/v1-fix-size-defaults
Closed

[V1] Backport: Fix AOF page/memory size defaults and validation (#1838)#1844
badrishc wants to merge 1 commit into
release/v1from
badrishc/v1-fix-size-defaults

Conversation

@badrishc

@badrishc badrishc commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Backport of #1838 to release/v1. Fixes #1811 and #1770.

Why a separate v1 PR

The main PR's cherry-pick almost applies cleanly, but v1 has two divergences that change the right defaults and require a small extra fix in the test helpers:

  1. v1's default main-log PageSize is 32m (vs 16m on main). The new validation rule requires AofPageSize >= 2 * PageSize, so on v1 the minimum AofPageSize that complies with defaults is 64m, not 32m. Defaults are therefore set to AofPageSize=64m / AofMemorySize=128m on v1.
  2. v1's GetAofSettings signature differs ((int dbId, LightEpoch epoch, out TsavoriteLogSettings)) from main's array-based one, so the test was rewritten against v1's API.
  3. v1 has no aof-bench scaffolding in benchmark/Resp.benchmark/Options.cs, so that file from Fix AOF page/memory size defaults and validation (#1811) #1838 is not backported.
  4. Test-helper defaults bumped: TestUtils.cs and ClusterTestContext.cs defaulted aofMemorySize: "64m". Combined with the new server default AofPageSize=64m, this would violate the new 2x rule on every test that uses the helper. Bumped these helper defaults from 64m to 128m.

What this PR does

Same intent as #1838:

Why this also fixes #1770

#1770 reports that an oversized SET throws Tsavorite.core.TsavoriteException: Entry does not fit on page from inside the AOF write path, closes the connection, but leaves the value cached in memory — creating an AOF/cache divergence. The root cause is exactly the misconfiguration the new validation rule prevents: with the old defaults (AofPageSize=4m and main-log PageSize=32m on v1), an entry up to PageSize bytes is committed to the main store but cannot fit on an AOF page. With the new defaults and validation, the misconfiguration cannot reach runtime — it is caught at startup.

Validation

  • dotnet build clean (net10.0, 0 warnings).
  • GarnetServerConfigTests + RespAofTests full suites: 53 passed, 0 failed, 1 skipped on net10.0 Debug.

Related

Backport of #1838 to release/v1.

Tighten startup validation in GetAofSettings to catch all three AOF sizing
mistakes with Garnet-specific errors that name the actual option, the
effective rounded sizes, and the value to set, instead of letting them
reach the generic Tsavorite messages or the runtime 'Entry does not fit
on page' failure:

  1. AofMemorySize <= AofPageSize (was '<', allowed equality) -- root cause
     of the misleading 'MemorySize must be at least twice the page size'
     error reported in #1811.
  2. AofPageSize > AofSegmentSize -- clearer message naming both options.
  3. New: AofPageSize < 2 * main-log PageSize. An AOF entry can be as
     large as the underlying main-log record (raw-string values up to
     PageSize, object commands like LPUSH/HSET can push it higher), so
     the prior default ratio made the runtime 'Entry does not fit on
     page' easy to hit on routine writes. This is the failure mode
     reported in #1770: an oversized SET threw 'Entry does not fit on
     page', closed the connection, but the value remained cached in
     memory creating an AOF/cache inconsistency. With the new pre-flight
     validation the misconfiguration is now caught at startup.

Improve the Tsavorite runtime error in ValidateAllocatedLength to also
include the entry size and a fix hint, while preserving the
'Entry does not fit on page' prefix for backward compat with the
OversizedAofEntryDoesNotHangServer regression test (#1749).

Bump defaults to comply with the new rule. v1's default main-log
PageSize is 32m (vs 16m on main), so AofPageSize must be at least 64m:
  - AofPageSize:   4m  -> 64m
  - AofMemorySize: 64m -> 128m
Applied in defaults.conf and GarnetServerOptions.cs.

Also bump the test-helper aofMemorySize defaults from 64m -> 128m in
TestUtils.cs and ClusterTestContext.cs so the new 2x rule is satisfied
when tests create a Garnet server with the new default AofPageSize.

Update CLI help text (--aof-memory, --aof-page-size) and the public
website doc to explain the 2x rules.

Tests:
  - Added three regression tests in GarnetServerConfigTests covering the
    AofPageSize-vs-AofSegmentSize, AofMemorySize-vs-AofPageSize, and
    AofPageSize-vs-(2x main PageSize) boundaries.
  - Updated OversizedAofEntryDoesNotHangServer to pass pageSize: '1k'
    alongside aofPageSize: '4k' so it still triggers the runtime
    oversize-entry path under the new validation.

Co-authored-by: Copilot <[email protected]>
Copilot AI review requested due to automatic review settings June 1, 2026 17:51

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Backports the AOF sizing validation + default updates from #1838 onto release/v1, to prevent common misconfigurations from reaching runtime (notably the Entry does not fit on page AOF enqueue failure) and to provide clearer, Garnet-specific startup errors and docs.

Changes:

  • Tighten GarnetServerOptions.GetAofSettings validation and bump v1 defaults to AofPageSize=64m / AofMemorySize=128m.
  • Improve TsavoriteLog’s oversize-entry exception message to include entry/page sizes and guidance.
  • Update CLI help, website docs, and tests/test-helpers to align with the new sizing rules.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
website/docs/getting-started/configuration.md Document the new AOF sizing constraints (memory >= 2× page, AOF page >= 2× main PageSize).
test/Garnet.test/TestUtils.cs Bump AOF helper defaults to avoid violating new validation with v1 defaults.
test/Garnet.test/RespAofTests.cs Adjust regression test to satisfy new startup validation while still exercising runtime oversize behavior.
test/Garnet.test/GarnetServerConfigTests.cs Add/extend config regression tests for the new AOF validation boundaries and improved messages.
test/Garnet.test.cluster/ClusterTestContext.cs Bump cluster test helper AOF defaults consistent with new sizing rules.
libs/storage/Tsavorite/cs/src/core/TsavoriteLog/TsavoriteLog.cs Improve oversize-entry exception message for easier diagnosis and remediation.
libs/server/Servers/GarnetServerOptions.cs Enforce new validation rules and update v1 defaults for AOF sizing.
libs/host/defaults.conf Update default config values/comments to match new AOF sizing requirements.
libs/host/Configuration/Options.cs Update CLI help text for AOF-related options to describe the new constraints.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 914 to 916
var memorySizeBits = AofMemorySizeBits();
var pageSizeBits = AofPageSizeBits();
var segmentSizeBits = AofSegmentSizeBits();
Comment on lines 96 to 105
/// <summary>
/// Total AOF memory buffer used in bytes (rounds down to power of 2) - spills to disk after this limit.
/// </summary>
public string AofMemorySize = "64m";
public string AofMemorySize = "128m";

/// <summary>
/// Aof page size in bytes (rounds down to power of 2).
/// </summary>
public string AofPageSize = "4m";
public string AofPageSize = "64m";

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants