[MOD-13843] A Rust implementation of the numeric index by LukeMathWalker · Pull Request #8276 · RediSearch/RediSearch

LukeMathWalker · 2026-02-04T17:08:49Z

Describe the changes in the pull request

A pure Rust implementation of the numeric index (and the associated range tree) from numeric_index.c/numeric_index.h.

Key differences:

All nodes are stored in a contiguous slab of memory (arena-style). Parent nodes refer to their children by index rather than by pointer.
The HLL stored inside the numeric range tree uses wyhash rather than fnv, since #8095 (Port HyperLogLog implementation to Rust) showed wyhash to be faster and more accurate.
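
To make the first difference concrete, here is a minimal sketch of an arena-style tree in which parents refer to children by index. It is not the code from this PR: it uses a plain `Vec` instead of the `slab` crate so that it stays self-contained, and the `Node` and `Arena` types, their fields, and the "values below the split go left" rule are assumptions made for illustration.

```rust
use std::convert::TryFrom;

/// Index of a node inside the arena (hypothetical, loosely mirroring `NodeIndex`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeIndex(u32);

/// A node is either a leaf holding values or an internal node that
/// refers to its children by arena index instead of by pointer.
#[derive(Debug)]
pub enum Node {
    Leaf { values: Vec<f64> },
    Internal { split: f64, left: NodeIndex, right: NodeIndex },
}

/// All nodes live in one contiguous buffer.
#[derive(Debug, Default)]
pub struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Stores `node` and returns its index.
    pub fn insert(&mut self, node: Node) -> NodeIndex {
        let idx = u32::try_from(self.nodes.len()).expect("arena index does not fit in u32");
        self.nodes.push(node);
        NodeIndex(idx)
    }

    /// Looks up a node by index.
    pub fn get(&self, idx: NodeIndex) -> &Node {
        &self.nodes[idx.0 as usize]
    }

    /// Walks from `root` down to the leaf responsible for `value`.
    /// Assumption: values strictly below the split go to the left child.
    pub fn find_leaf(&self, root: NodeIndex, value: f64) -> NodeIndex {
        let mut current = root;
        loop {
            match self.get(current) {
                Node::Leaf { .. } => return current,
                Node::Internal { split, left, right } => {
                    current = if value < *split { *left } else { *right };
                }
            }
        }
    }
}
```

Because children are plain indices, re-linking a subtree only rewrites integers, and the whole tree can be dropped by freeing a single buffer.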

Which additional issues this PR fixes

MOD-...
#...

Main objects this PR modified

...

Mark if applicable

This PR introduces API changes
This PR introduces serialization changes

Release Notes

This PR requires release notes
This PR does not require release notes

If a release note is required (bug fix / new feature / enhancement), describe the user impact of this PR in the title.

Note

High Risk
Adds a brand-new numeric indexing data structure (splitting/balancing, memory accounting, iterator invalidation) and expands workspace dependencies/lints; correctness and performance regressions are plausible despite tests.

Overview
Introduces a new Rust crate numeric_range_tree and wires it into the workspace, providing an arena-allocated numeric range tree with adaptive leaf splitting, AVL-like single-rotation balancing, optional internal-range retention, revision-based iterator invalidation, and memory/statistics tracking.

Updates hyperloglog to include and expose a shared WyHasher implementation (used by the range tree’s HLL for cardinality estimation) and adjusts the benchmarks accordingly. Workspace config is extended with new deps (slab, rstest) and a rustdoc lint allowance, plus corresponding Cargo.lock updates.
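
As an illustration of what "revision-based iterator invalidation" can look like, here is a minimal sketch. It is an assumption about the general technique, not the implementation in this PR: the `Tree`, `Cursor`, and `Stale` names are hypothetical, the "tree" is a flat `Vec`, and every insertion bumps the revision for simplicity (a real tree might bump it only on structural changes such as splits).

```rust
/// Error returned when a cursor outlived a mutation of the tree.
#[derive(Debug, PartialEq)]
pub struct Stale;

/// Stand-in for the tree: a flat list of values plus a revision counter.
#[derive(Default)]
pub struct Tree {
    revision: u64,
    values: Vec<f64>,
}

/// A cursor remembers the revision it was created at.
pub struct Cursor {
    revision: u64,
    pos: usize,
}

impl Tree {
    /// Mutating the tree bumps the revision, which invalidates open cursors.
    pub fn push(&mut self, value: f64) {
        self.values.push(value);
        self.revision += 1;
    }

    /// Creates a cursor bound to the current revision.
    pub fn cursor(&self) -> Cursor {
        Cursor { revision: self.revision, pos: 0 }
    }

    /// Advances the cursor, or reports that it is stale.
    pub fn next(&self, cursor: &mut Cursor) -> Result<Option<f64>, Stale> {
        if cursor.revision != self.revision {
            return Err(Stale);
        }
        let item = self.values.get(cursor.pos).copied();
        cursor.pos += 1;
        Ok(item)
    }
}
```

A caller that receives `Stale` would typically re-create the cursor and restart or resume from a known position.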

Written by Cursor Bugbot for commit 601f53a.

src/redisearch_rs/numeric_range_tree/src/tree.rs

src/redisearch_rs/numeric_range_tree/src/range.rs

src/redisearch_rs/numeric_range_tree/src/node.rs

src/redisearch_rs/numeric_range_tree/src/tree.rs

codecov · 2026-02-04T17:41:58Z

Codecov Report

❌ Patch coverage is 83.48271% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.05%. Comparing base (d3ac7b4) to head (601f53a).
⚠️ Report is 3 commits behind head on master.

Files with missing lines	Patch %	Lines
src/redisearch_rs/numeric_range_tree/src/index.rs	40.00%	54 Missing ⚠️
src/redisearch_rs/numeric_range_tree/src/arena.rs	53.06%	23 Missing ⚠️
src/redisearch_rs/numeric_range_tree/src/node.rs	90.06%	15 Missing ⚠️
...edisearch_rs/numeric_range_tree/src/tree/insert.rs	94.14%	9 Missing and 4 partials ⚠️
...earch_rs/numeric_range_tree/src/tree/invariants.rs	87.62%	5 Missing and 7 partials ⚠️
src/redisearch_rs/hyperloglog/src/wyhash.rs	75.00%	3 Missing ⚠️
src/redisearch_rs/numeric_range_tree/src/range.rs	95.08%	3 Missing ⚠️
...c/redisearch_rs/numeric_range_tree/src/tree/mod.rs	95.83%	3 Missing ⚠️
.../redisearch_rs/numeric_range_tree/src/unique_id.rs	50.00%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8276      +/-   ##
==========================================
+ Coverage   79.34%   83.05%   +3.70%     
==========================================
  Files         389      399      +10     
  Lines       57556    58334     +778     
  Branches    15708    16486     +778     
==========================================
+ Hits        45669    48449    +2780     
+ Misses      11729     9718    -2011     
- Partials      158      167       +9

Flag	Coverage Δ
flow	`84.12% <ø> (+5.54%)`	⬆️
unit	`50.94% <83.48%> (+0.46%)`	⬆️

src/redisearch_rs/numeric_range_tree/src/tree/find.rs

src/redisearch_rs/numeric_range_tree/src/tree/invariants.rs

src/redisearch_rs/numeric_range_tree/src/node.rs

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

LukeMathWalker · 2026-02-08T09:30:11Z

Extracted find-related code to #8302.

src/redisearch_rs/numeric_range_tree/src/lib.rs

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

meiravgri

Super clean!
To avoid losing track: so far I have reviewed most of the core implementation:
src: arena.rs, lib.rs, node.rs, range.rs
src/tree: insert.rs, mod.rs

I will continue reviewing the other files once the comments from this batch have been addressed.

meiravgri · 2026-02-08T11:26:00Z

src/redisearch_rs/numeric_range_tree/src/arena.rs

+    /// Debug-asserts that the resulting key fits in `u32`.
+    pub fn insert(&mut self, node: NumericRangeNode) -> NodeIndex {
+        let key = self.nodes.insert(node);
+        debug_assert!(key <= u32::MAX as usize);


We can't actually guarantee that Slab::insert will return a usize less than u32::MAX, right? Does the memory savings of u32 over u64 justify the silent truncation risk?

Also, what's the purpose of the debug_assert!? I don't see us testing such a scenario.

Debug asserts can be enabled in "release" builds too, they aren't necessarily tied to testing.
E.g. it's possible to produce a release build with debug assertions enabled to re-run particular scenarios and see what triggers.
In this specific case, it's better to make it a production assert, since the overhead is going to be negligible.

On one hand, I don't think we need to crash the server in this case, because the user didn't do anything wrong.
On the other hand, if handling it gracefully adds complexity to the code, it isn't worth it...
Your call.
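
The trade-off discussed in this thread can be sketched as follows. This is a hypothetical `NodeIndex` wrapper, not the one from the PR: `try_from_key` is the graceful variant that lets the caller decide what to do, and `from_key` is the "production assert" variant that panics in every build profile.

```rust
use std::convert::TryFrom;

/// Hypothetical index type: a `u32` keeps node links compact.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeIndex(u32);

impl NodeIndex {
    /// Fallible conversion: the caller decides how to handle overflow.
    pub fn try_from_key(key: usize) -> Option<Self> {
        u32::try_from(key).ok().map(NodeIndex)
    }

    /// Panicking conversion. The check runs in every build profile, unlike
    /// `debug_assert!`, which is compiled out of release builds by default.
    pub fn from_key(key: usize) -> Self {
        Self::try_from_key(key).expect("node key does not fit in u32")
    }

    /// Converts back to a `usize` for indexing into the arena.
    pub fn as_usize(self) -> usize {
        self.0 as usize
    }
}
```

Either variant removes the silent truncation that a bare `key as u32` cast would allow.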

src/redisearch_rs/numeric_range_tree/src/node.rs

meiravgri · 2026-02-08T13:38:09Z

src/redisearch_rs/numeric_range_tree/src/node.rs

+    /// Returns `0.0` for leaf nodes.
+    pub const fn split_value(&self) -> f64 {
+        match self {
+            Self::Leaf(_) => 0.0,


Is 0.0 the right value to return for leaf nodes, given that 0.0 is a valid split value for internal nodes?

Option is the right abstraction here, yes.
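
A minimal sketch of the `Option`-based signature agreed on here. The `Node` enum is simplified to the bare minimum and is not the PR's `NumericRangeNode`.

```rust
/// Simplified stand-in for the tree node type.
#[derive(Debug)]
pub enum Node {
    Leaf,
    Internal { split: f64 },
}

impl Node {
    /// `None` for leaves, `Some(split)` for internal nodes, so a genuine
    /// split at 0.0 can no longer be confused with "this is a leaf".
    pub const fn split_value(&self) -> Option<f64> {
        match self {
            Self::Leaf => None,
            Self::Internal { split } => Some(*split),
        }
    }
}
```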

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

meiravgri · 2026-02-08T15:02:07Z

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

+            while reader.next_record(&mut result).unwrap_or(false) {
+                // SAFETY: We know the result contains numeric data
+                let entry_value = unsafe { result.as_numeric_unchecked() };
+                entries.push((result.doc_id, entry_value));
+            }
+
+            (split, entries)


Do we copy the entire inverted index of the range here?

Yes, we need to collect all its entries into a buffer.

meiravgri · 2026-02-08T15:05:35Z

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

+        // Take the existing range from the leaf and convert to an internal node.
+        let old_range = nodes[node_idx].take_range();
+        let new_node =
+            NumericRangeNode::internal_indexed(split, left_idx, right_idx, old_range, nodes);
+        nodes[node_idx] = new_node;


Don't we need to pass the range only if internal ranges are enabled?

At this stage, we keep it unconditionally, then it's trimmed in node_add based on the max_depth_range that was specified. I could also push the check further down, bringing max_depth_range into this context.

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

GuyAv46

Awesome! First review without the tests.
Deletion (and repair) logic will come in a later PR, I assume? Interested in the slab compaction logic

GuyAv46 · 2026-02-08T08:08:50Z

src/redisearch_rs/numeric_range_tree/src/node.rs

+        match self {
+            Self::Leaf(leaf) => {
+                // Replace with a default range; callers are expected to replace the whole node.
+                Some(std::mem::replace(&mut leaf.range, NumericRange::new(false)))


This is the actual default, right?
Consider using std::mem::take
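
For reference, a minimal sketch of the `std::mem::take` suggestion. The `Range` and `Leaf` types are simplified stand-ins, and the sketch assumes that the type implements `Default` and that its default value is the one the caller wants left behind, which is exactly the question raised above about `NumericRange::new(false)`.

```rust
/// Simplified stand-in for `NumericRange`; assumes a meaningful `Default`.
#[derive(Debug, Default, PartialEq)]
pub struct Range {
    pub entries: Vec<(u64, f64)>,
}

/// Simplified stand-in for a leaf node.
pub struct Leaf {
    pub range: Range,
}

impl Leaf {
    /// Moves the range out of the leaf, leaving `Range::default()` behind.
    /// Equivalent to `std::mem::replace(&mut self.range, Range::default())`.
    pub fn take_range(&mut self) -> Range {
        std::mem::take(&mut self.range)
    }
}
```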

GuyAv46 · 2026-02-08T08:20:23Z

src/redisearch_rs/numeric_range_tree/src/node.rs

+    /// Returns `0.0` for leaf nodes.
+    pub const fn split_value(&self) -> f64 {
+        match self {
+            Self::Leaf(_) => 0.0,


Not sure if it matters much, but 0.0 is an arbitrary valid value.
Consider returning NaN or None, panicking, or even extracting the median instead.

Option is the right abstraction here, yes.

GuyAv46 · 2026-02-08T14:33:48Z

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

+/// cache locality, eliminates per-node heap allocation overhead, and makes
+/// rotations cheaper (index swaps instead of alloc/dealloc).


makes rotations cheaper (index swaps instead of alloc/dealloc)

The C implementation doesn't require [de]allocations, so this is not a real improvement. The rest is true :)

I reworded it around pruning, which is when we see the difference (i.e. one realloc vs multiple deallocs).

GuyAv46 · 2026-02-08T14:50:21Z

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

+    pub const fn mem_usage(&self) -> usize {
+        let base_size = std::mem::size_of::<Self>();
+        // Our tree is a full binary tree, so #nodes = 2 * #leaves - 1
+        let nodes_count = 2 * self.stats.num_leaves.saturating_sub(1) + 1;


Why do we need 2 * self.stats.num_leaves.saturating_sub(1) + 1? num_leaves should always be at least 1, no?

True, but we don't need this either, we know the exact node count thanks to the arena.
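
The arithmetic in this thread can be checked with a short sketch. `nodes_from_leaves` reproduces the expression from the diff; `mem_usage` is a hypothetical, simplified estimate that takes the node count from the backing storage instead, as the reply suggests.

```rust
/// Node count of a full binary tree, derived from its leaf count.
pub fn nodes_from_leaves(num_leaves: usize) -> usize {
    // Same expression as in the diff: 2 * (leaves - 1) + 1 == 2 * leaves - 1.
    // `saturating_sub` only matters for the (unexpected) zero-leaf case.
    2 * num_leaves.saturating_sub(1) + 1
}

/// Hypothetical estimate that uses the exact node count of the storage.
/// Ignores per-node heap data, which a real implementation would add.
pub fn mem_usage<T>(base_size: usize, nodes: &[T]) -> usize {
    base_size + nodes.len() * std::mem::size_of::<T>()
}
```

With at least one leaf, the saturating form and `2 * num_leaves - 1` agree; they differ only at zero leaves, where the former yields 1 instead of underflowing.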

GuyAv46 · 2026-02-08T14:54:43Z

src/redisearch_rs/numeric_range_tree/src/range.rs

+/// hashed into HyperLogLog registers. We hash the raw bytes (bit
+/// representation) rather than the numeric value — see
+/// [`NumericRange::update_cardinality`] for rationale.
+fn update_cardinality(hll: &mut Hll, value: f64) {


Why have this helper, and not simply define it at the NumericRange implementation of update_cardinality?

Later on, when we have the GC code (#8293), we'll need to compute hashes of values outside the context of a NumericRange. Since f64 doesn't implement Hash, I used this helper as the source of truth for how we hash floats.
A different approach could be to make HyperLogLog strongly typed—generic over the type of values it's counting, and then create a thin wrapper around f64 which encodes the desired hashing behaviour.
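
A minimal sketch of the "thin wrapper" alternative mentioned above. The `HashedF64` and `hash_value` names are hypothetical, and the standard library's `DefaultHasher` stands in for wyhash to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Thin wrapper that gives `f64` a `Hash` impl based on its bit pattern.
#[derive(Clone, Copy, Debug)]
pub struct HashedF64(pub f64);

impl Hash for HashedF64 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the raw IEEE 754 bits rather than the numeric value.
        self.0.to_bits().hash(state);
    }
}

/// Single place that decides how a float becomes a 64-bit hash.
/// Assumption: `DefaultHasher` is used here only as a stand-in.
pub fn hash_value(value: f64) -> u64 {
    let mut hasher = DefaultHasher::new();
    HashedF64(value).hash(&mut hasher);
    hasher.finish()
}
```

One consequence of hashing bits is that values which compare equal but have different representations, such as 0.0 and -0.0, are fed to the hasher as different inputs.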

src/redisearch_rs/numeric_range_tree/src/node.rs

src/redisearch_rs/numeric_range_tree/src/tree/invariants.rs

GuyAv46 · 2026-02-08T15:58:00Z

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

+        self.last_doc_id = doc_id;
+
+        let mut rv = AddResult::default();
+        Self::node_add(


Consider making node_add return an AddResult, unless it has performance penalties.

It won't work unless we make node_add non-recursive. It needs to thread rv via function parameters so that each "layer" of the chain can update it in place.

I see. In practice, the child sets it initially and then it gets updated on the way up. Could be an idea for the future.
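
The pattern described in this thread, one result threaded through the recursion by mutable reference, can be sketched as follows. The `AddResult` fields and the boxed `Node` type are assumptions for illustration; the PR uses an arena and its own result fields.

```rust
/// Hypothetical result accumulator; the field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
pub struct AddResult {
    pub num_records: usize,
    pub layers_visited: usize,
}

/// Pointer-based tree, used here only to keep the sketch short.
pub enum Node {
    Leaf(Vec<f64>),
    Internal { split: f64, left: Box<Node>, right: Box<Node> },
}

/// Recursive insert that updates `rv` in place at every layer.
pub fn node_add(node: &mut Node, value: f64, rv: &mut AddResult) {
    match node {
        Node::Leaf(values) => {
            values.push(value);
            // The deepest layer records the insertion itself.
            rv.num_records += 1;
            rv.layers_visited += 1;
        }
        Node::Internal { split, left, right } => {
            let child = if value < *split { left } else { right };
            node_add(child, value, rv);
            // On the way back up, each layer amends the same result.
            rv.layers_visited += 1;
        }
    }
}
```

Returning an `AddResult` from each call and merging it in the parent would behave the same way; whether it costs anything measurable is the open question from the thread.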

GuyAv46 · 2026-02-08T16:13:46Z

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

+            // Collect all entries from the range
+            let mut entries: Vec<(ffi::t_docId, f64)> = Vec::new();


Why do we collect the results, and not use the reader in the "Redistribute entries to children" for loop?

To avoid an annoying borrow-checker issue, but it can be worked around.
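
One plausible shape of the borrow-checker issue, sketched under assumptions: if the source range and both children live in the same storage, iterating the source while pushing into the children needs a shared and a mutable borrow of that storage at once. Buffering the entries first ends the shared borrow. The `Leaf` type and `redistribute` function are hypothetical and are not taken from the PR.

```rust
pub type DocId = u64;

/// Simplified stand-in for a leaf and its inverted index.
#[derive(Debug, Default)]
pub struct Leaf {
    pub entries: Vec<(DocId, f64)>,
}

/// Redistributes the entries of `nodes[src]` into `nodes[left]` and `nodes[right]`.
pub fn redistribute(nodes: &mut [Leaf], src: usize, left: usize, right: usize, split: f64) {
    // Copy the source entries into a buffer. Iterating `nodes[src]` directly
    // would keep `nodes` borrowed while the loop below needs `&mut` access.
    let entries: Vec<(DocId, f64)> = nodes[src].entries.clone();

    for (doc_id, value) in entries {
        // Assumption: values strictly below the split go to the left child.
        let target = if value < split { left } else { right };
        nodes[target].entries.push((doc_id, value));
    }
}
```

The cost is one temporary buffer of the same size as the range being split, which is the copy asked about in the earlier thread on this file.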

src/redisearch_rs/numeric_range_tree/src/tree/insert.rs

LukeMathWalker · 2026-02-09T13:00:10Z

Awesome! First review without the tests. Deletion (and repair) logic will come in a later PR, I assume? Interested in the slab compaction logic

Yes, they are in #8293.

I think I have addressed all feedback in the latest commit. Please have a second look! @meiravgri @GuyAv46

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

src/redisearch_rs/numeric_range_tree/src/iter.rs

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs

sonarqubecloud · 2026-02-09T15:04:32Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions bot added the size:XL label Feb 4, 2026

cursor bot reviewed Feb 4, 2026

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from b3f8540 to 9ccd2fc on February 4, 2026 17:15

cursor bot reviewed Feb 4, 2026

src/redisearch_rs/numeric_range_tree/src/tree.rs (outdated, resolved)

src/redisearch_rs/numeric_range_tree/src/tree.rs (outdated, resolved)

src/redisearch_rs/numeric_range_tree/src/tree.rs (outdated, resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch 6 times, most recently from 24119a9 to 9c26b70 on February 4, 2026 20:41

LukeMathWalker mentioned this pull request Feb 5, 2026

Set per-test timeouts for the Rust unit test suite. #8279 (merged, 4 tasks)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from 9c26b70 to bb910df on February 5, 2026 12:06

cursor bot reviewed Feb 5, 2026

src/redisearch_rs/numeric_range_tree/src/tree/find.rs (outdated, resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch 7 times, most recently from 4486b46 to 8f3faa8 on February 5, 2026 16:49

LukeMathWalker requested a review from meiravgri February 5, 2026 16:50

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from 8f3faa8 to bdf8961 on February 6, 2026 08:28

LukeMathWalker mentioned this pull request Feb 6, 2026

[MOD-13913] Add debug implementation for NumericRangeTree #8292 (merged, 4 tasks)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch 2 times, most recently from 9a7b195 to 4d02a30 on February 6, 2026 11:26

cursor bot reviewed Feb 6, 2026

src/redisearch_rs/numeric_range_tree/src/tree/invariants.rs (resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from 4d02a30 to 853f01f on February 6, 2026 11:42

cursor bot reviewed Feb 6, 2026

src/redisearch_rs/numeric_range_tree/src/node.rs (resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from 853f01f to 7e72e18 on February 6, 2026 16:07

cursor bot reviewed Feb 6, 2026

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs (resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch 4 times, most recently from dbac863 to c4fe674 on February 8, 2026 09:26

LukeMathWalker mentioned this pull request Feb 8, 2026

[MOD-13920] Add find (range query) algorithm for numeric_range_tree #8302 (merged, 4 tasks)

cursor bot reviewed Feb 8, 2026

src/redisearch_rs/numeric_range_tree/src/lib.rs (resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from c4fe674 to 63087b3 on February 8, 2026 10:02

cursor bot reviewed Feb 8, 2026

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs (resolved)

meiravgri reviewed Feb 8, 2026

GuyAv46 reviewed Feb 8, 2026

GuyAv46 reviewed Feb 9, 2026

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs (outdated, resolved)

GuyAv46 reviewed Feb 9, 2026

src/redisearch_rs/numeric_range_tree/src/iter.rs (outdated, resolved)

LukeMathWalker requested review from GuyAv46 and meiravgri February 9, 2026 14:18

cursor bot reviewed Feb 9, 2026

src/redisearch_rs/numeric_range_tree/src/tree/mod.rs (resolved)

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from e09782e to 06f9330 on February 9, 2026 14:59

LukeMathWalker added 3 commits February 9, 2026 16:03

e77d6ce Add numeric_range_tree crate (core data structure)

954c038 Address review feedback.

601f53a Address review feedback, round #2

LukeMathWalker force-pushed the numeric-range-tree-1-rust branch from 06f9330 to 601f53a on February 9, 2026 15:03

meiravgri approved these changes Feb 9, 2026

LukeMathWalker enabled auto-merge February 9, 2026 15:05

GuyAv46 approved these changes Feb 9, 2026

LukeMathWalker added this pull request to the merge queue Feb 9, 2026

Merged via the queue into master with commit c20a807 Feb 9, 2026
50 checks passed
