Add quantization tasks and calls into DiskANN#1882
Conversation
…uired by the time we get around to it
…ackground quantization of Vector Sets
…e yet; rename quantizers and vector value to better align with expectations; add xbin_i8 quantizer; stop asking diskann to handle all vector formats, instead just use the quantizer to pick a 'native' vector type and map to those types as needed; handle alignement requirements on Garnet side as well
…ng a memory corruption occasionally, not sure if that's on the Garnet or the diskann side yet
…rare failures here :/
There was a problem hiding this comment.
⚠️ Not ready to approve
There are correctness issues in the VSIM parsing and error propagation paths (wrong command name in arity errors, missing max-dimension validation for XU8/XI8, and dimension mismatches returning misleading errors).
Pull request overview
This PR extends Garnet’s Vector Set + DiskANN integration to support background quantization/backfill workflows, updates the FFI to reflect DiskANN’s new insert/search signatures, and cleans up/extends quantizer + vector-format handling (including backward-compatible aliases). It also adds a new server option to control quantization task parallelism, updates docs, bumps the diskann-garnet package, and expands test coverage across standalone + cluster suites.
Changes:
- Add VectorManager background quantization task pipeline (
build_quant_tablethen parallelbackfill_quant_vectors) and queue quantization based on DiskANN insert/recreate signals. - Update vector quantizer/format enums + parsing to support
XNOQUANT_{U8,I8},XBIN_{U8,I8}, andXU8/XI8(with aliases forXPREQ8andXB8). - Update DiskANN interop signatures (insert/search) and add extensive tests/docs/config wiring.
File summaries
| File | Description |
|---|---|
| website/docs/dev/vector-sets.md | Documents quantizers, background quantization/backfill flow, and updated DiskANN FFI surface. |
| test/standalone/Garnet.test/TestUtils.cs | Wires new vectorSetQuantizationTaskCount option into test server construction. |
| test/standalone/Garnet.test/GarnetServerConfigTests.cs | Adds config parsing tests for the new quantization task-count option and reorganizes preview-related tests. |
| test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs | Expands quantizer coverage and adds an async stress/backfill test. |
| test/standalone/Garnet.test.extensions/DiskANN/DiskANNServiceTests.cs | Updates extension tests for new DiskANN interop signatures and renamed quantizer/format. |
| test/cluster/Garnet.test.cluster/ClusterTestContext.cs | Plumbs vectorSetQuantizationTaskCount into cluster server creation helper. |
| test/cluster/Garnet.test.cluster.vectorsets/VectorSets/ClusterVectorSetTests.cs | Broadens format/quantizer replication tests and adds additional validation in concurrent scenarios. |
| libs/server/Storage/Session/MainStore/VectorStoreOps.cs | Updates enums (explicit values) and adjusts dimension computation + call plumbing for TryAdd changes. |
| libs/server/Servers/GarnetServerOptions.cs | Introduces VectorSetQuantizationTaskCount option. |
| libs/server/Resp/Vector/VectorManager.Replication.cs | Updates replication apply-path to new TryAdd signature. |
| libs/server/Resp/Vector/VectorManager.Quantization.cs | New background quantization task runner using Channels + Temp RESP sessions. |
| libs/server/Resp/Vector/VectorManager.Migration.cs | Enables quantization request propagation on index recreation during migration. |
| libs/server/Resp/Vector/VectorManager.Locking.cs | Enables quantization request propagation on index (re)create/recreate paths. |
| libs/server/Resp/Vector/VectorManager.ElementData.cs | Adds vector-format conversion/alignment path so Garnet passes “natural” formats to DiskANN. |
| libs/server/Resp/Vector/VectorManager.cs | Queues quantization on insert signals, adds task/channel lifecycle management, and updates similarity path to new DiskANN signature. |
| libs/server/Resp/Vector/VectorManager.Cleanup.cs | Renames cleanup-session factory to temp-session factory usage. |
| libs/server/Resp/Vector/RespServerSessionVectors.cs | Updates VADD/VSIM parsing for new formats/quantizers and propagates new similarity API error output. |
| libs/server/Resp/Vector/DiskANNService.cs | Updates native interop signatures, adds quantization FFI calls, and changes insert/search call shapes. |
| libs/server/Databases/SingleDatabaseManager.cs | Ensures VectorManager is initialized before vector-set recovery. |
| libs/server/Databases/MultiDatabaseManager.cs | Ensures VectorManager is initialized before per-DB vector-set recovery. |
| libs/server/API/IGarnetApi.cs | Extends VectorSetValueSimilarity API to return an error message span. |
| libs/server/API/GarnetWatchApi.cs | Updates WATCH wrapper to match new VectorSetValueSimilarity signature. |
| libs/server/API/GarnetApi.cs | Updates API forwarder to match new VectorSetValueSimilarity signature. |
| libs/host/defaults.conf | Adds default for VectorSetQuantizationTaskCount and fixes trailing comma placement. |
| libs/host/Configuration/Options.cs | Adds CLI/config support for --vector-set-quantization-task-count. |
| Directory.Packages.props | Bumps diskann-garnet dependency version. |
Copilot's findings
- Files reviewed: 26/26 changed files
- Comments generated: 8
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
…thub.com/microsoft/garnet into users/kmontrose/vectorSetQuantization-v2
There was a problem hiding this comment.
⚠️ Not ready to approve
Multiple confirmed correctness/safety issues were identified (notably VSIM missing max-dimension guards for XU8/XI8 and quantization tasks using a temp session without switching to the correct database).
Copilot's findings
Comments suppressed due to low confidence (1)
libs/server/Resp/Vector/RespServerSessionVectors.cs:444
- When REDUCE is used with extension quantizers, this branch currently reports the generic "asked quantization mismatch" error because customErrMsg is left empty. This makes the actual validation failure hard to understand.
// We need to reject these HERE because validation during create_index is very awkward
GarnetStatus res;
VectorManagerResult result;
ReadOnlySpan<byte> customErrMsg;
if (quantType is VectorQuantType.XBin_U8 or VectorQuantType.XBin_I8 or VectorQuantType.XNoQuant_U8 or VectorQuantType.XNoQuant_I8 && reduceDim != 0)
{
result = VectorManagerResult.BadParams;
res = GarnetStatus.OK;
customErrMsg = default;
}
- Files reviewed: 26/26 changed files
- Comments generated: 11
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
There was a problem hiding this comment.
⚠️ Not ready to approve
There are correctness issues in new/modified code paths (notably a precedence bug in VADD REDUCE validation and a pooled-buffer casting hazard in vector conversion) that can lead to incorrect behavior or runtime faults.
Copilot's findings
- Files reviewed: 26/26 changed files
- Comments generated: 7
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
There was a problem hiding this comment.
⚠️ Not ready to approve
There are correctness issues in the updated RESP VADD validation (a precedence bug that can reject valid requests) and release-build safety gaps (DiskANN create_index null return is only asserted in Debug).
Copilot's findings
Comments suppressed due to low confidence (1)
libs/server/Resp/Vector/RespServerSessionVectors.cs:444
- The extension-quantizer REDUCE rejection has a precedence bug: as written,
reduceDim != 0only applies to the last pattern, soXBin_*andXNoQuant_U8are rejected even whenREDUCEis not specified. This also setscustomErrMsgto default, which later surfaces as a misleading quantization-mismatch error. Gate onreduceDim != 0first and return a specific error message.
if (quantType is VectorQuantType.XBin_U8 or VectorQuantType.XBin_I8 or VectorQuantType.XNoQuant_U8 or VectorQuantType.XNoQuant_I8 && reduceDim != 0)
{
result = VectorManagerResult.BadParams;
res = GarnetStatus.OK;
customErrMsg = default;
}
- Files reviewed: 26/26 changed files
- Comments generated: 7
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
There was a problem hiding this comment.
⚠️ Not ready to approve
The quantization-request signal on index create/recreate is currently stubbed and the new quantization backfill stress test is likely too heavy/flaky for CI without tightening its workload.
Copilot's findings
- Files reviewed: 26/26 changed files
- Comments generated: 5
Note
Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.
Change expected return from DiskANN's
insertso we can signal a request to quantize.Spin up tasks to pump background quantization when that is requested.
Tasks take two forms:
build_quant_table(context, index)backfill_quant_vectors(context, index, task_index, task_count)It's on DiskANN to find vectors to backfill and split them up based on
task_index / task_count.Since locks are not held over the entire backfill, DiskANN also needs to validate that quantization is still needed when
build_quant_tableis called (recreation and restarts can mess with that) andbackfill_quant_vectorsneeds to gracefully deal with concurrent calls.Upon further discussion, we're also cleaning up quantizer options and the FFI between Garnet and DiskANN around vector formats.
Namely:
XPREQ8->XNOQUANT_U8XBIN_U8andXBIN_I8XB8->XU8XI8Old options are aliased so we can keep
XPREQ8andXB8in tests and whatnot without issue.