Skip to content

Fix missing cluster slot verification for CustomRawStringCmd and CustomObjCmd#1827

Merged
vazois merged 3 commits into
microsoft:mainfrom
Ankith-Kandala:akandala/fix-customcmd-cluster-slot-verify
May 28, 2026
Merged

Fix missing cluster slot verification for CustomRawStringCmd and CustomObjCmd#1827
vazois merged 3 commits into
microsoft:mainfrom
Ankith-Kandala:akandala/fix-customcmd-cluster-slot-verify

Conversation

@Ankith-Kandala

Copy link
Copy Markdown
Contributor

Closes #1826

Problem

RespServerSession.CanServeSlot decides whether a command needs cluster slot verification by checking RespCommand.IsDataCommand(). That check returns false for the four custom-command enum values (CustomRawStringCmd, CustomObjCmd, CustomTxn, CustomProcedure) because they sit past LastDataCommand (EVALSHA) in the enum range.

As a result, CustomRawStringCmd and CustomObjCmd skipped slot verification entirely in cluster mode, so a client could hit a node that doesn't own the key and never get a MOVED redirect back. CustomTxn was unaffected because CustomTransactionProcedure.RegisterKey already runs every key through txnManager.VerifyKeyOwnership. CustomProcedure is left to the procedure author since procedures can touch arbitrary keys. The gap this PR closes is the two remaining types, both of which have a fixed contract: the key is always at parseState[0], enforced by TryCustomRawStringCommand and TryCustomObjectCommand.

What this change does

Adds an explicit switch arm at the top of CanServeSlot that handles CustomRawStringCmd and CustomObjCmd by driving NetworkMultiKeySlotVerify with a synthetic single-key spec.

  • A static readonly CustomCommandSingleKeySpec (SimpleRespKeySpec[]) describes the contract: BeginSearch(index: 1), FindKeys(keyStep: 1, lastKeyOrLimit: 0, isLimit: false). Allocated once at type load, so no per-request allocation is added.
  • csvi.readOnly is derived from the parsed command's type (currentCustomRawStringCommand.type or currentCustomObjectCommand.type == CommandType.Read). cmd.IsReadOnly() cannot be used here because both custom enum values fall above LastReadCommand.
  • The new arm runs before the existing IsDataCommand() bypass, so no other command's slot check is affected.

Testing

Added test/cluster/Garnet.test.cluster.migrate/RedirectTests/TestClusterCustomCommands.cs, which registers four minimal custom commands so the existing slot-verification matrix covers both branches of the new code path:

  • CLUSTERTESTRAWCMD and CLUSTERTESTOBJCMD exercise readOnly=false (CommandType.ReadModifyWrite).
  • CLUSTERTESTRAWREADCMD and CLUSTERTESTOBJREADCMD exercise readOnly=true (CommandType.Read). The read-only stub throws from every write-path method so an accidental write-path dispatch becomes a hard test failure.

ClusterSlotVerificationTests.OneTimeSetUp registers all four via Register.NewCommand on each node and adds them to the TestCommands array, so the existing CLUSTERDOWN/OK/MOVED/ASK/CROSSSLOT/TRYAGAIN matrix automatically runs against the new commands.

Test runs on net10.0 Debug:

  • Garnet.test.cluster.migrate: 54/54
  • Garnet.test.cluster: 137/137
  • Garnet.test.cluster.multilog: 144/144
  • Garnet.test.cluster.replication: 107/107
  • Garnet.test.cluster.replication.asyncreplay: 105/105
  • Garnet.test.cluster.replication.disklesssync: 29/29
  • Garnet.test.cluster.replication.tls: 105/105
  • Garnet.test.cluster.vectorsets: 16/16
  • Garnet.test: 789/792 (3 pre-existing skips unrelated to this change)

dotnet format Garnet.slnx --verify-no-changes and dotnet format Tsavorite.slnx --verify-no-changes both clean. Full solution build: 0 warnings, 0 errors.

Copilot AI review requested due to automatic review settings May 26, 2026 20:16

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds coverage and implementation support to ensure custom raw-string/object commands participate in cluster slot verification, including both read-only and read-write paths.

Changes:

  • Introduces test-only custom command implementations for raw-string and object commands (read/write variants).
  • Registers the new custom commands in cluster slot-verification tests to exercise readOnly=true and readOnly=false verification branches.
  • Updates slot-verification logic to apply a synthetic single-key spec for custom commands that otherwise skip IsDataCommand() checks.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
test/cluster/Garnet.test.cluster.migrate/RedirectTests/TestClusterCustomCommands.cs Adds test-only custom command/function implementations used to validate redirect/slot verification behavior.
test/cluster/Garnet.test.cluster.migrate/RedirectTests/ClusterSlotVerificationTests.cs Registers the new custom commands on each node so existing redirect tests cover custom read/write commands.
libs/server/Resp/RespServerSessionSlotVerify.cs Ensures custom commands go through slot verification by providing a synthetic single-key spec and readOnly routing.

Comment thread libs/server/Resp/RespServerSessionSlotVerify.cs Outdated
Comment thread libs/server/Resp/RespServerSessionSlotVerify.cs Outdated
…omObjCmd

RespServerSessionSlotVerify.CanServeSlot relies on RespCommand.IsDataCommand() to gate slot verification, but CustomRawStringCmd, CustomObjCmd, CustomTxn and CustomProcedure enum values fall after LastDataCommand (EVALSHA), so IsDataCommand() returns false and these commands skipped slot verification entirely in cluster mode. CustomTxn was already covered by TxnKeyManager.VerifyKeyOwnership (PR microsoft#712); CustomProcedure remains the procedure author's responsibility. This commit fixes the gap for CustomRawStringCmd and CustomObjCmd, which both follow a fixed single-key-at-arg-0 contract enforced by TryCustomRawStringCommand / TryCustomObjectCommand.

Implementation: add a switch arm at the top of CanServeSlot for CustomRawStringCmd | CustomObjCmd that drives NetworkMultiKeySlotVerify with a synthetic SimpleRespKeySpec[] describing the single-key contract (BeginSearch index:1, FindKeys keyStep:1, lastKeyOrLimit:0, isLimit:false). The csvi.readOnly flag is derived from the parsed command type (currentCustomRawStringCommand.type / currentCustomObjectCommand.type == CommandType.Read); cmd.IsReadOnly() cannot be used here because both custom enum values fall above LastReadCommand. The new code path executes before the existing IsDataCommand bypass and does not alter any other command's slot check.

Tests: add TestClusterRawStringCmd / TestClusterObjFactory + matching BaseCommand subclasses CLUSTERTESTRAWCMD and CLUSTERTESTOBJCMD to cover the read-modify-write path, and TestClusterRawStringReadCmd / TestClusterObjGet + CLUSTERTESTRAWREADCMD and CLUSTERTESTOBJREADCMD (CommandType.Read, Arity 2) to exercise the readOnly=true branch. All four commands are registered in ClusterSlotVerificationTests OneTimeSetUp and added to TestCommands so the existing CLUSTERDOWN / OK / MOVED / ASK / CROSSSLOT / TRYAGAIN matrix runs against both write and read variants. All 6 ClusterSlotVerificationTests pass.

Work item: ADO 4878986 (Redis JSON: Garnet OSS Add Cluster Mode Support).
Custom raw-string and object commands need slot verification but sort
past LastDataCommand, so the previous code checked for them at the top
of CanServeSlot and every data command paid for that branch evaluation
on dispatch. Move the check into the !IsDataCommand() branch and pull
the verification body into CanServeSlotForCustomCommand so the common
hot path is untouched and the special case lives where it logically
belongs.
@Ankith-Kandala Ankith-Kandala force-pushed the akandala/fix-customcmd-cluster-slot-verify branch from af774d3 to 1372276 Compare May 27, 2026 19:21
@vazois vazois merged commit bba9353 into microsoft:main May 28, 2026
136 of 137 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cluster: missing slot verification for custom raw-string and custom object commands

3 participants