Skip to content

Add quantization tasks and calls into DiskANN#1882

Merged
kevin-montrose merged 52 commits into
mainfrom
users/kmontrose/vectorSetQuantization-v2
Jun 18, 2026
Merged

Add quantization tasks and calls into DiskANN#1882
kevin-montrose merged 52 commits into
mainfrom
users/kmontrose/vectorSetQuantization-v2

Conversation

@kevin-montrose

@kevin-montrose kevin-montrose commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Change expected return from DiskANN's insert so we can signal a request to quantize.

Spin up tasks to pump background quantization when that is requested.

Tasks take two forms:

  • A single task per Vector Set index which calls build_quant_table(context, index)
  • After that finishes and true is returned, some number that proceed in parallel and call backfill_quant_vectors(context, index, task_index, task_count)

It's on DiskANN to find vectors to backfill and split them up based on task_index / task_count.

Since locks are not held over the entire backfill, DiskANN also needs to validate that quantization is still needed when build_quant_table is called (recreation and restarts can mess with that) and backfill_quant_vectors needs to gracefully deal with concurrent calls.


Upon further discussion, we're also cleaning up quantizer options and the FFI between Garnet and DiskANN around vector formats.

Namely:

  • Rename quantizer XPREQ8 -> XNOQUANT_U8
  • Add XBIN_U8 and XBIN_I8
  • Rename format XB8 -> XU8
  • Add format XI8
  • Garnet now knows the "natural" format of different quantizers and only passes vectors of those formats to DiskANN in insert & search calls

Old options are aliased so we can keep XPREQ8 and XB8 in tests and whatnot without issue.

…e yet; rename quantizers and vector value to better align with expectations; add xbin_i8 quantizer; stop asking diskann to handle all vector formats, instead just use the quantizer to pick a 'native' vector type and map to those types as needed; handle alignement requirements on Garnet side as well
…ng a memory corruption occasionally, not sure if that's on the Garnet or the diskann side yet

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

There are correctness issues in the VSIM parsing and error propagation paths (wrong command name in arity errors, missing max-dimension validation for XU8/XI8, and dimension mismatches returning misleading errors).

Pull request overview

This PR extends Garnet’s Vector Set + DiskANN integration to support background quantization/backfill workflows, updates the FFI to reflect DiskANN’s new insert/search signatures, and cleans up/extends quantizer + vector-format handling (including backward-compatible aliases). It also adds a new server option to control quantization task parallelism, updates docs, bumps the diskann-garnet package, and expands test coverage across standalone + cluster suites.

Changes:

  • Add VectorManager background quantization task pipeline (build_quant_table then parallel backfill_quant_vectors) and queue quantization based on DiskANN insert/recreate signals.
  • Update vector quantizer/format enums + parsing to support XNOQUANT_{U8,I8}, XBIN_{U8,I8}, and XU8/XI8 (with aliases for XPREQ8 and XB8).
  • Update DiskANN interop signatures (insert/search) and add extensive tests/docs/config wiring.
File summaries
File Description
website/docs/dev/vector-sets.md Documents quantizers, background quantization/backfill flow, and updated DiskANN FFI surface.
test/standalone/Garnet.test/TestUtils.cs Wires new vectorSetQuantizationTaskCount option into test server construction.
test/standalone/Garnet.test/GarnetServerConfigTests.cs Adds config parsing tests for the new quantization task-count option and reorganizes preview-related tests.
test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs Expands quantizer coverage and adds an async stress/backfill test.
test/standalone/Garnet.test.extensions/DiskANN/DiskANNServiceTests.cs Updates extension tests for new DiskANN interop signatures and renamed quantizer/format.
test/cluster/Garnet.test.cluster/ClusterTestContext.cs Plumbs vectorSetQuantizationTaskCount into cluster server creation helper.
test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs Broadens format/quantizer replication tests and adds additional validation in concurrent scenarios.
libs/server/Storage/Session/MainStore/VectorStoreOps.cs Updates enums (explicit values) and adjusts dimension computation + call plumbing for TryAdd changes.
libs/server/Servers/GarnetServerOptions.cs Introduces VectorSetQuantizationTaskCount option.
libs/server/Resp/Vector/VectorManager.Replication.cs Updates replication apply-path to new TryAdd signature.
libs/server/Resp/Vector/VectorManager.Quantization.cs New background quantization task runner using Channels + Temp RESP sessions.
libs/server/Resp/Vector/VectorManager.Migration.cs Enables quantization request propagation on index recreation during migration.
libs/server/Resp/Vector/VectorManager.Locking.cs Enables quantization request propagation on index (re)create/recreate paths.
libs/server/Resp/Vector/VectorManager.ElementData.cs Adds vector-format conversion/alignment path so Garnet passes “natural” formats to DiskANN.
libs/server/Resp/Vector/VectorManager.cs Queues quantization on insert signals, adds task/channel lifecycle management, and updates similarity path to new DiskANN signature.
libs/server/Resp/Vector/VectorManager.Cleanup.cs Renames cleanup-session factory to temp-session factory usage.
libs/server/Resp/Vector/RespServerSessionVectors.cs Updates VADD/VSIM parsing for new formats/quantizers and propagates new similarity API error output.
libs/server/Resp/Vector/DiskANNService.cs Updates native interop signatures, adds quantization FFI calls, and changes insert/search call shapes.
libs/server/Databases/SingleDatabaseManager.cs Ensures VectorManager is initialized before vector-set recovery.
libs/server/Databases/MultiDatabaseManager.cs Ensures VectorManager is initialized before per-DB vector-set recovery.
libs/server/API/IGarnetApi.cs Extends VectorSetValueSimilarity API to return an error message span.
libs/server/API/GarnetWatchApi.cs Updates WATCH wrapper to match new VectorSetValueSimilarity signature.
libs/server/API/GarnetApi.cs Updates API forwarder to match new VectorSetValueSimilarity signature.
libs/host/defaults.conf Adds default for VectorSetQuantizationTaskCount and fixes trailing comma placement.
libs/host/Configuration/Options.cs Adds CLI/config support for --vector-set-quantization-task-count.
Directory.Packages.props Bumps diskann-garnet dependency version.

Copilot's findings

  • Files reviewed: 26/26 changed files
  • Comments generated: 8

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs
Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs
Comment thread libs/server/Resp/Vector/VectorManager.cs
Comment thread website/docs/dev/vector-sets.md Outdated
Comment thread website/docs/dev/vector-sets.md Outdated
Comment thread libs/server/Storage/Session/MainStore/VectorStoreOps.cs
Comment thread libs/server/Storage/Session/MainStore/VectorStoreOps.cs Outdated
Comment thread test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

Multiple confirmed correctness/safety issues were identified (notably VSIM missing max-dimension guards for XU8/XI8 and quantization tasks using a temp session without switching to the correct database).

Copilot's findings

Comments suppressed due to low confidence (1)

libs/server/Resp/Vector/RespServerSessionVectors.cs:444

  • When REDUCE is used with extension quantizers, this branch currently reports the generic "asked quantization mismatch" error because customErrMsg is left empty. This makes the actual validation failure hard to understand.
                // We need to reject these HERE because validation during create_index is very awkward
                GarnetStatus res;
                VectorManagerResult result;
                ReadOnlySpan<byte> customErrMsg;
                if (quantType is VectorQuantType.XBin_U8 or VectorQuantType.XBin_I8 or VectorQuantType.XNoQuant_U8 or VectorQuantType.XNoQuant_I8 && reduceDim != 0)
                {
                    result = VectorManagerResult.BadParams;
                    res = GarnetStatus.OK;
                    customErrMsg = default;
                }
  • Files reviewed: 26/26 changed files
  • Comments generated: 11

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs
Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs
Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs Outdated
Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs Outdated
Comment thread libs/server/Resp/Vector/DiskANNService.cs
Comment thread test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs Outdated
Comment thread test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs Outdated
Comment thread libs/host/Configuration/Options.cs
Comment thread libs/host/defaults.conf Outdated
Comment thread libs/server/Resp/Vector/VectorManager.Quantization.cs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

There are correctness issues in new/modified code paths (notably a precedence bug in VADD REDUCE validation and a pooled-buffer casting hazard in vector conversion) that can lead to incorrect behavior or runtime faults.

Copilot's findings
  • Files reviewed: 26/26 changed files
  • Comments generated: 7

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread libs/server/Resp/Vector/RespServerSessionVectors.cs
Comment thread libs/server/Resp/Vector/VectorManager.cs
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs
Comment thread website/docs/dev/vector-sets.md Outdated
Comment thread website/docs/dev/vector-sets.md Outdated
Comment thread test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

There are correctness issues in the updated RESP VADD validation (a precedence bug that can reject valid requests) and release-build safety gaps (DiskANN create_index null return is only asserted in Debug).

Copilot's findings

Comments suppressed due to low confidence (1)

libs/server/Resp/Vector/RespServerSessionVectors.cs:444

  • The extension-quantizer REDUCE rejection has a precedence bug: as written, reduceDim != 0 only applies to the last pattern, so XBin_* and XNoQuant_U8 are rejected even when REDUCE is not specified. This also sets customErrMsg to default, which later surfaces as a misleading quantization-mismatch error. Gate on reduceDim != 0 first and return a specific error message.
                if (quantType is VectorQuantType.XBin_U8 or VectorQuantType.XBin_I8 or VectorQuantType.XNoQuant_U8 or VectorQuantType.XNoQuant_I8 && reduceDim != 0)
                {
                    result = VectorManagerResult.BadParams;
                    res = GarnetStatus.OK;
                    customErrMsg = default;
                }
  • Files reviewed: 26/26 changed files
  • Comments generated: 7

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread libs/server/Resp/Vector/VectorManager.cs
Comment thread libs/server/Resp/Vector/DiskANNService.cs
Comment thread website/docs/dev/vector-sets.md Outdated
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs
Comment thread libs/server/Resp/Vector/VectorManager.ElementData.cs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Not ready to approve

The quantization-request signal on index create/recreate is currently stubbed and the new quantization backfill stress test is likely too heavy/flaky for CI without tightening its workload.

Copilot's findings
  • Files reviewed: 26/26 changed files
  • Comments generated: 5

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Comment thread libs/server/Resp/Vector/DiskANNService.cs
Comment thread libs/server/Resp/Vector/VectorManager.Quantization.cs
Comment thread libs/host/Configuration/Options.cs
Comment thread test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs
Comment thread test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs Outdated
@kevin-montrose kevin-montrose merged commit 5fc62fd into main Jun 18, 2026
153 checks passed
@kevin-montrose kevin-montrose deleted the users/kmontrose/vectorSetQuantization-v2 branch June 18, 2026 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants