[ENH] Quantized Spann Segment Reader #6405

Merged
Sicheng-Pan merged 5 commits into main from 02-10-_enh_quantized_spann_segment_reader
Feb 14, 2026

Contributor

@Sicheng-Pan Sicheng-Pan commented Feb 11, 2026

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • Extended existing test_persist to use the reader impl
  • New functionality
    • Introducing quantized spann segment reader

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

Contributor Author

Sicheng-Pan commented Feb 11, 2026

@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from a520611 to ff6baf8 Compare February 11, 2026 20:56
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_wire_up_quantized_writer_in_compaction branch from 3324688 to f823d3d Compare February 11, 2026 21:07
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from ff6baf8 to 9a26a8f Compare February 11, 2026 21:07
@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review February 12, 2026 01:34
@propel-code-bot
Contributor

propel-code-bot bot commented Feb 12, 2026

The new QuantizedSpannSegmentReader is wired into the SpannProvider to handle rotate, navigate, and bruteforce read-path operations over persisted quantized segments. It loads the required blockfile metadata, reuses the USearch centroid index in read-only mode, and enforces dimensionality, with deduplicated scoring keyed by global versions. Complementary utility updates add predicate-aware, deduplicated cluster queries and shared rotation/center constants. These enable integration tests with realistic 1024-dimensional embeddings, alongside the new math and RNG dependencies required to validate accuracy end-to-end.
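As a rough illustration of the "enforcing dimensionality" point, the sketch below rejects an embedding whose length does not match the segment's dimension before applying a rotation. Everything here is an assumption made for illustration: `ReaderError`, `Rotation`, and their methods are hypothetical names, the rotation is a plain row-major matrix-vector product rather than the faer-based multiplication the PR pulls in, and centering is left out. Only the `DimensionMismatch` variant name comes from the summary.

```rust
/// Hypothetical error type; the PR summary only names a `DimensionMismatch` variant.
#[derive(Debug, PartialEq)]
pub enum ReaderError {
    DimensionMismatch { expected: usize, got: usize },
}

/// Hypothetical stand-in for the reader's rotation state.
pub struct Rotation {
    dim: usize,
    /// Row-major `dim x dim` rotation matrix.
    matrix: Vec<f32>,
}

impl Rotation {
    /// Builds a rotation, checking that the matrix really is `dim x dim`.
    pub fn new(dim: usize, matrix: Vec<f32>) -> Result<Self, ReaderError> {
        if matrix.len() != dim * dim {
            return Err(ReaderError::DimensionMismatch {
                expected: dim * dim,
                got: matrix.len(),
            });
        }
        Ok(Self { dim, matrix })
    }

    /// Rotates `embedding`, failing early if its length differs from `dim`.
    pub fn rotate(&self, embedding: &[f32]) -> Result<Vec<f32>, ReaderError> {
        if embedding.len() != self.dim {
            return Err(ReaderError::DimensionMismatch {
                expected: self.dim,
                got: embedding.len(),
            });
        }
        // Plain matrix-vector product: one dot product per matrix row.
        Ok((0..self.dim)
            .map(|r| {
                let row = &self.matrix[r * self.dim..(r + 1) * self.dim];
                row.iter().zip(embedding).map(|(m, x)| m * x).sum()
            })
            .collect())
    }
}
```

Returning an error instead of panicking keeps a malformed query from taking down the read path; whether the actual reader reports the mismatch this way is not stated in the summary.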

Key Changes

• Added QuantizedSpannSegmentReader (constructor, rotate, navigate, load_cluster, bruteforce) plus a SpannProvider::read_quantized_usearch helper that wires in existing providers.
• Extended quantized SPANN writer error enum with DimensionMismatch and exported rotation/center constants so both writer and reader share metadata semantics.
• Enhanced query_quantized_cluster to accept a (id, version) predicate, deduplicate IDs, and sort by the estimated distance before returning results.
• Augmented test_quantized_spann_segment_writer_persist to generate 1024-D random embeddings, run multiple writer cycles, reopen with the new reader, and validate rotate/navigate/bruteforce accuracy.
• Updated Cargo feature gating to pull in faer for matrix multiplication and rand for deterministic embedding generation in tests.
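The predicate, deduplication, and sort steps described for query_quantized_cluster can be sketched roughly as follows. This is not the PR's implementation: `Candidate` and `query_cluster_sketch` are hypothetical names, the distance is taken as a precomputed field instead of being estimated from quantized codes, and keeping the first occurrence of a duplicate ID is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical candidate record; the real cluster layout is not shown in the PR summary.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u32,
    pub version: u32,
    pub estimated_distance: f32,
}

/// Filters candidates with an `(id, version)` predicate, drops repeated IDs,
/// and returns the survivors sorted by estimated distance (ascending).
pub fn query_cluster_sketch<F>(candidates: &[Candidate], is_valid: F) -> Vec<Candidate>
where
    F: Fn(u32, u32) -> bool,
{
    let mut seen = HashSet::new();
    let mut out: Vec<Candidate> = candidates
        .iter()
        // Keep only entries the caller considers current.
        .filter(|c| is_valid(c.id, c.version))
        // `insert` returns false for an ID already seen, so repeats are dropped.
        .filter(|c| seen.insert(c.id))
        .cloned()
        .collect();
    // `total_cmp` gives a total order on f32, so NaN cannot break the sort.
    out.sort_by(|a, b| a.estimated_distance.total_cmp(&b.estimated_distance));
    out
}
```

A caller would typically pass a closure that compares each candidate's version against the latest known version for its ID, so stale entries left behind in a cluster are skipped before scoring.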

Possible Issues

• QuantizedSpannSegmentReader::from_segment unconditionally preloads all PREFIX_VERSION blocks, which may be prohibitively large for big segments; consider lazy loading or per-cluster prefetching.
• bruteforce currently makes one blockfile get per ID to obtain global versions; if clusters contain many points this could still be latency-heavy without batching or caching.
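One way to read the lazy-loading and caching suggestions is a small memoizing lookup that fetches a version only on first use and never asks the backing store twice for the same ID. The sketch below is an assumption-laden illustration: `VersionCache` is a hypothetical type, the `fetch` closure stands in for a blockfile get, and it is synchronous, whereas real blockfile reads are likely asynchronous.

```rust
use std::collections::HashMap;

/// Hypothetical lazy cache of global versions, keyed by point ID.
/// `fetch` stands in for a blockfile lookup and is called at most once per ID.
pub struct VersionCache<F: FnMut(u32) -> Option<u32>> {
    fetch: F,
    cached: HashMap<u32, Option<u32>>,
    fetches: usize,
}

impl<F: FnMut(u32) -> Option<u32>> VersionCache<F> {
    pub fn new(fetch: F) -> Self {
        Self { fetch, cached: HashMap::new(), fetches: 0 }
    }

    /// Returns the version for `id`, hitting the backing store only on a miss.
    /// Missing IDs are cached as `None` so they are not refetched either.
    pub fn get(&mut self, id: u32) -> Option<u32> {
        if let Some(v) = self.cached.get(&id) {
            return *v;
        }
        self.fetches += 1;
        let v = (self.fetch)(id);
        self.cached.insert(id, v);
        v
    }

    /// Number of times the backing store was actually consulted.
    pub fn fetches(&self) -> usize {
        self.fetches
    }
}
```

This trades memory for fewer lookups and avoids preloading versions that a query never touches; it does not address batching, which would need a multi-key read on the storage side.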

This summary was automatically generated by @propel-code-bot

@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_wire_up_quantized_writer_in_compaction branch from f823d3d to d45735e Compare February 12, 2026 01:40
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from 9a26a8f to a601921 Compare February 12, 2026 01:40
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_wire_up_quantized_writer_in_compaction branch from d45735e to d24f142 Compare February 12, 2026 01:41
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from a601921 to 80a5af5 Compare February 12, 2026 01:41
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from 80a5af5 to 514208a Compare February 12, 2026 21:27
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_wire_up_quantized_writer_in_compaction branch from d2f93df to bc0818a Compare February 12, 2026 21:33
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from 514208a to 939cc61 Compare February 12, 2026 21:33
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from 939cc61 to 833b8cc Compare February 13, 2026 00:43
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_wire_up_quantized_writer_in_compaction branch from bc0818a to 465c5fc Compare February 13, 2026 00:43
Contributor Author

Sicheng-Pan commented Feb 14, 2026

Merge activity

  • Feb 14, 1:52 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Feb 14, 2:32 AM UTC: Graphite rebased this pull request as part of a merge.
  • Feb 14, 3:39 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Feb 14, 3:39 AM UTC: @Sicheng-Pan merged this pull request with Graphite.

@Sicheng-Pan Sicheng-Pan changed the base branch from 02-10-_enh_wire_up_quantized_writer_in_compaction to graphite-base/6405 February 14, 2026 01:53
@Sicheng-Pan Sicheng-Pan changed the base branch from graphite-base/6405 to main February 14, 2026 02:30
@Sicheng-Pan Sicheng-Pan force-pushed the 02-10-_enh_quantized_spann_segment_reader branch from 6875112 to ee117ac Compare February 14, 2026 02:31
@Sicheng-Pan Sicheng-Pan merged commit e280642 into main Feb 14, 2026
155 of 193 checks passed
tanujnay112 added a commit that referenced this pull request Feb 18, 2026
- **[ENH]: Cache rust git submodules in mounted volume (#6424)**
- **[CHORE](k8s) increase dev CPU limits from 100m to 200-300m (#6435)**
- **[ENH] replace live cloud tests with k8s integration tests (#6434)**
- **[ENH] Make dirty_log_collections metric mcmr-aware. (#6353)**
- **[ENH] Quantized Spann Segment Writer (#6397)**
- **[ENH] Wire up quantized writer in compaction (#6399)**
- **[ENH] Quantized Spann Segment Reader (#6405)**
- **[ENH] Wire up quantized reader in new orchestrator (#6409)**
- **[ENH] Garbage collect usearch index files (#6416)**
- **[ENH] Trace quantized spann implementation (#6425)**
- **[ENH]: Precompute data chunk len() (#6442)**
- **[BUG]: Compaction version file flush was incomplete on MCMR (#6423)**
- **[DOC]: Fixed broken links in Readme (#6440)**
- **[DOC] Fix link to Rust documentation (#6443)**
- **[ENH]: Allow users to disable FTS in schema (#6214)**

---------

Co-authored-by: Robert Escriva <[email protected]>
Co-authored-by: Macronova <[email protected]>
Co-authored-by: Nilpotent <[email protected]>
Co-authored-by: anderk222 <[email protected]>
Co-authored-by: Sanket Kedia <[email protected]>
@Sicheng-Pan Sicheng-Pan deleted the 02-10-_enh_quantized_spann_segment_reader branch February 25, 2026 23:47
2 participants