[ENH]: Add option to enable quantization in schema#6295
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
|
The update carries this quantization flag through the Go coordinator schema, Rust type layer, and generated TypeScript client to keep the SPANN schema synchronized end-to-end, and it also introduces the coordinator-side Cmek struct so CMEK-aware schema handling remains consistent with server expectations. Affected Areas• Go coordinator schema definitions This summary was automatically generated by @propel-code-bot |
| EfConstruction *int `json:"ef_construction,omitempty"` | ||
| EfSearch *int `json:"ef_search,omitempty"` | ||
| MaxNeighbors *int `json:"max_neighbors,omitempty"` | ||
| Quantize bool `json:"quantize,omitempty"` |
There was a problem hiding this comment.
[Logic] The Quantize field is missing from the SpannConfiguration struct (the internal configuration used by the system), defined earlier in this file (around line 44). Adding it here to the Schema (SpannIndexConfig) without adding it to the internal configuration means the setting may be lost when converting between schema and internal representations.
Context for Agents
The `Quantize` field is missing from the `SpannConfiguration` struct (the internal configuration used by the system), defined earlier in this file (around line 44). Adding it here to the Schema (`SpannIndexConfig`) without adding it to the internal configuration means the setting may be lost when converting between schema and internal representations.
File: go/pkg/sysdb/coordinator/model/collection_configuration.go
Line: 166| 64 | ||
| } | ||
|
|
||
| pub fn default_quantize() -> bool { |
There was a problem hiding this comment.
[Logic] The quantize field appears to be missing from the InternalSpannConfiguration struct defined in this file. Without it, the quantize setting will be dropped when converting from SpannIndexConfig to InternalSpannConfiguration, preventing the index from receiving this configuration.
Context for Agents
The `quantize` field appears to be missing from the `InternalSpannConfiguration` struct defined in this file. Without it, the `quantize` setting will be dropped when converting from `SpannIndexConfig` to `InternalSpannConfiguration`, preventing the index from receiving this configuration.
File: rust/types/src/spann_configuration.rs
Line: 71| ef_construction: user.ef_construction.or(default.ef_construction), | ||
| ef_search: user.ef_search.or(default.ef_search), | ||
| max_neighbors: user.max_neighbors.or(default.max_neighbors), | ||
| quantize: user.quantize || default.quantize, |
There was a problem hiding this comment.
[Logic] This boolean OR logic prevents users from disabling quantization if the default configuration has it enabled (since false || true results in true).
If the intention is to allow users to override the default, consider changing quantize to Option<bool> in the config struct and using .or() logic similar to other fields:
quantize: user.quantize.or(default.quantize),Context for Agents
This boolean OR logic prevents users from disabling quantization if the default configuration has it enabled (since `false || true` results in `true`).
If the intention is to allow users to override the default, consider changing `quantize` to `Option<bool>` in the config struct and using `.or()` logic similar to other fields:
```rust
quantize: user.quantize.or(default.quantize),
```
File: rust/types/src/collection_schema.rs
Line: 1591| ef_construction: Some(spann_config.ef_construction), | ||
| ef_search: Some(spann_config.ef_search), | ||
| max_neighbors: Some(spann_config.max_neighbors), | ||
| quantize: false, |
There was a problem hiding this comment.
[Logic] This hardcodes quantize to false when converting from InternalCollectionConfiguration. Once InternalSpannConfiguration is updated to include the quantize field, this should map the value from there (spann_config.quantize) instead of hardcoding it, otherwise the setting will be lost during conversion.
Context for Agents
This hardcodes `quantize` to `false` when converting from `InternalCollectionConfiguration`. Once `InternalSpannConfiguration` is updated to include the `quantize` field, this should map the value from there (`spann_config.quantize`) instead of hardcoding it, otherwise the setting will be lost during conversion.
File: rust/types/src/collection_schema.rs
Line: 6181| pub max_neighbors: Option<usize>, | ||
| /// Enable quantization for vector search (cloud-only feature) | ||
| #[serde(default = "default_quantize", skip_serializing_if = "is_false")] | ||
| pub quantize: bool, |
There was a problem hiding this comment.
[Logic] Ideally, this should be Option<bool> to match the pattern of other fields in this struct. Using bool makes it impossible to distinguish between a user explicitly setting false versus the field being missing (and defaulting to false). This distinction is crucial for proper configuration merging (allowing users to override defaults).
Context for Agents
Ideally, this should be `Option<bool>` to match the pattern of other fields in this struct. Using `bool` makes it impossible to distinguish between a user explicitly setting `false` versus the field being missing (and defaulting to `false`). This distinction is crucial for proper configuration merging (allowing users to override defaults).
File: rust/types/src/collection_schema.rs
Line: 2647| /** | ||
| * Enable quantization for vector search (cloud-only feature) | ||
| */ | ||
| quantize?: boolean; |
There was a problem hiding this comment.
[Security] You mentioned this flag should stay server-controlled and hidden from clients, but by adding it to the OpenAPI schema we now emit it in the generated client types:
/**
* Enable quantization for vector search (cloud-only feature)
*/
quantize?: boolean;Because the SDK now exposes quantize, any client can set it in schema updates, and nothing in the merge path ignores client-supplied values (the merge logic user.quantize || default.quantize will happily accept true). If the intent is for the backend to own this flag, it shouldn’t appear in the public API at all (or at minimum should be marked read-only so generators don’t surface it as writable). Please update the OpenAPI definition so the generated types don’t include a writable quantize property, otherwise we’re effectively exposing a knob we wanted to keep server-side.
Context for Agents
You mentioned this flag should stay server-controlled and hidden from clients, but by adding it to the OpenAPI schema we now emit it in the generated client types:
```ts
/**
* Enable quantization for vector search (cloud-only feature)
*/
quantize?: boolean;
```
Because the SDK now exposes `quantize`, any client can set it in schema updates, and nothing in the merge path ignores client-supplied values (the merge logic `user.quantize || default.quantize` will happily accept `true`). If the intent is for the backend to own this flag, it shouldn’t appear in the public API at all (or at minimum should be marked read-only so generators don’t surface it as writable). Please update the OpenAPI definition so the generated types don’t include a writable `quantize` property, otherwise we’re effectively exposing a knob we wanted to keep server-side.
File: clients/new-js/packages/chromadb/src/api/types.gen.ts
Line: 563
Description of changes
Summarize the changes made by this PR.
Test plan
How are these changes tested?
pytestfor python,yarn testfor js,cargo testfor rustMigration plan
None
Observability plan
None
Documentation Changes
None