Optimize ImmutablePointToValues: inline single-value points to reduce RAM usage and increase performance by TY0909 · Pull Request #8235 · qdrant/qdrant

TY0909 · 2026-02-26T08:04:21Z

All Submissions:

Contributions should target the dev branch. Did you create your branch from dev?
Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

Does your submission pass tests?
Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
Have you checked your code using cargo clippy --workspace --all-features command?

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully ran tests with your changes locally?

Summary

Refactors ImmutablePointToValues<N> in lib/segment/src/index/field_index/immutable_point_to_values.rs to store single-value points inline instead of placing them in the shared values_container. This reduces memory indirection and improves performance for the common case where a payload field holds exactly one value per point (e.g. a category or status field).

Motivation

The original implementation stored each point's values as a Range<u32> in point_to_values, with the actual values being stored in the point_to_values_container:

// OLD layout
point_to_values: Vec<Range<u32>>,   // 8 bytes per point
point_to_values_container: Vec<N>,  // all values, including single-value points

For a single-value point this meant:

An 8-byte Range<u32> entry in point_to_values.
The actual value N stored separately inside point_to_values_container.

In practice, the majority of payload fields (keyword, integer, float…) store exactly one value per point. Forcing those values through the container adds unnecessary heap indirection and wastes allocations that could be avoided entirely.

Changes

New `PointValueEntry<N>` enum

Replaces the Range<u32> with a enum:

enum PointValueEntry<N> {
    /// Exactly one value — stored inline, no container access needed.
    Single(N),
    /// Zero or 2+ values — stored as a contiguous slice in `values_container`.
    Slice { start: u32, count: u32 },
}

Cardinality	Old storage	New storage
0 values	`Range<u32>` (empty) + nothing in container	`Slice { start: 0, count: 0 }`
1 value	`Range<u32>` + `N` in container	`Single(N)` — fully inline
2+ values	`Range<u32>` + `N…` in container	`Slice { start, count }` + `N…` in container

Updated struct fields

Old	New
`point_to_values: Vec<Range<u32>>`	`point_entries: Vec<PointValueEntry<N>>`
`point_to_values_container: Vec<N>`	`values_container: Vec<N>`

Smarter `values_container` pre-allocation

The constructor now calculates capacity by excluding single-value points, so the container is pre-allocated to exactly the right size:

let container_capacity = src
    .iter()
    .filter(|values| values.len() != 1)
    .fold(0, |acc, values| acc + values.len());

All public methods updated

new() — dispatches to Single vs Slice during construction.
check_values_any() — matches on the entry variant; Single avoids a container lookup entirely.
get_values() — returns a uniform std::slice::Iter<'_, N> for both variants.
get_values_count() — returns Some(1) directly for Single, no container access.
remove_point() — uses std::mem::take on the entry; Single extracts the value directly without touching the container.

Impact

For a segment where every point has exactly one value in a given field, the values_container will be completely empty after this change — all values will live inline in point_entries. The previously mandatory container entries are completely eliminated.

When searching for a field that has exactly one value per point, we no longer need to get a Range<u32> and then retrieve values from the point_to_values_container. Instead, we can get the value directly from point_entries, avoiding indirection and improving cache locality.

Tests

The existing test_immutable_point_to_values_remove test is preserved and refactored to use the new shared check_values helper. The following new test cases are added:

Test	What it covers
`test_single_value_stored_inline`	Verifies that single-value points do not appear in `values_container`
`test_all_single_values`	Verifies `values_container` is empty when every point has exactly one value
`test_get_values_count`	Correct counts for all cardinalities (0, 1, 2+)
`test_check_values_any`	Predicate evaluation across all variants and edge cases
`test_remove_single_value_point`	Removal correctness for inline `Single` entries
`test_remove_out_of_bounds`	Out-of-bounds removal returns an empty `Vec`
`test_get_values_out_of_bounds`	Out-of-bounds lookup returns `None`
`test_empty_source`	Constructing from an empty `Vec` is a no-op with safe default behaviour

Compatibility

Logic is fully backward-compatible (same public API surface)
All existing tests pass
New tests cover all entry variants and boundary conditions
No unsafe code introduced
u32 is preserved throughout to maintain the existing RAM-efficiency guarantee for indices

Performance Test in a Real Scenario

In our test collection with 260 million points, filtered-query RPS improved from 358 to 569 (about a 58% improvement). The test configuration is shown below.

In this test, each filtered field has at most one value per point (some points have no value).

Collection Config

{
    "vectors": {
        "pv": {
            "size": 768,
            "distance": "Dot",
            "hnsw_config": {
                "m": 48,
                "ef_construct": 512,
                "on_disk": false,
                "payload_m": 48
            },
            "quantization_config": {
                "binary": {
                    "always_ram": true,
                    "encoding": "two_bits"
                }
            },
            "on_disk": true
        }
    },
    "sharding_method": "auto",
    "shard_number": 15,
    "replication_factor": 2,
    "write_consistency_factor": 2,
    "on_disk_payload": true,
    "optimizer_config": {
        "default_segment_number": 4,
        "max_segment_size": 16777216
    }
}

Query Sample

{
    "query": pv_test,
    "using": "pv",
    "params": {
        "hnsw_ef": 48,
        "exact": False,
        "quantization": {"ignore": False, "rescore": True, "oversampling": 2},
    },
    "limit": 60,
    "filter": {
        "must": [
            {
                "key": "A",
                "range": {
                    "gte": A1,
                    "lte": A2,
                },
            },
            {"key": "B", "match": {"any": B_list}},
            {"key": "C", "match": {"any": [C1, C2]}},
            {"key": "D", "match": {"value": D2}},
            {"key": "E", "range": {"gte": E1, "lte": E2}},
            {
                "key": "F",
                "match": {"except": [F1, F2, F3]},
            },
        ]
    },
    "with_payload": ["S", "V"],
    "with_vector": False,
}

Benches

We extracted the core logic from both structs, focused on check_values_any, and ran benchmarks. The results are shown below.

Cardinality	Old storage	New storage
single-value	[1.7936 ms 1.7941 ms 1.7948 ms]	[834.23 µs 835.15 µs 836.41 µs]
sparse values	[1.6957 ms 1.6972 ms 1.6993 ms]	[864.67 µs 864.78 µs 864.91 µs]
multi-values	[3.4935 ms 3.5153 ms 3.5442 ms]	[3.2952 ms 3.2977 ms 3.3012 ms]

single-value means every point has exactly one value.

sparse values means every point has at most one value.

multi-values means every point has a random number of values.

And the detail benchmark file as shown below:

common.rs

#![allow(dead_code)]

use std::time::Duration;

use criterion::BenchmarkGroup;
use criterion::measurement::WallTime;
use rand::{RngExt, SeedableRng};

/// Change this alias to switch the value type used by every benchmark.
pub type ValueType = i64;

use qdrant_filter_test::inline;
use qdrant_filter_test::origin;

// ── Shared constants ─────────────────────────────────────────────────────────
// Change these once and every benchmark will pick up the new values.

/// Seed used to generate the point data.
pub const DATA_SEED: u64 = 42;

/// Seed used to generate random query point IDs.
pub const QUERY_SEED: u64 = 123;

/// Total number of points to generate.
pub const POINTS_NUM: usize = 4000_0000;

/// Upper bound (inclusive) for the random value assigned to each point.
pub const MAX_VALUE: ValueType = 2;

/// Number of point IDs to query during the benchmark.
pub const QUERY_COUNT: usize = 10_0000;

/// The value we pass to `check_values_any` as the predicate target.
pub const CHECK_VALUE: ValueType = 1;

/// Ratio of inner `Vec`s that are empty in the sparse-single-value data type.
/// Must be in the range `0.0..=1.0`. For example, `0.3` means 30 % of entries
/// will be empty.
pub const EMPTY_RATIO: f64 = 0.1;

/// Maximum number of elements in each inner `Vec` for the multi-value data
/// type. Each inner `Vec` will have a random length in `1..=MAX_VALUES_PER_POINT`.
pub const MAX_VALUES_PER_POINT: usize = 5;

// ── Data builders ────────────────────────────────────────────────────────────

/// **Type 1 – single value**: every inner `Vec` has exactly one element.
pub fn build_single_value_data(
    seed: u64,
    points_num: usize,
    max_value: ValueType,
) -> Vec<Vec<ValueType>> {
    let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
    let mut values = Vec::with_capacity(points_num);
    for _ in 0..points_num {
        let value: ValueType = rng.random_range(0..=max_value);
        values.push(vec![value]);
    }
    values
}

/// **Type 2 – sparse single value**: every inner `Vec` has *at most* one
/// element. A proportion of `empty_ratio` entries will be empty.
pub fn build_sparse_single_value_data(
    seed: u64,
    points_num: usize,
    max_value: ValueType,
    empty_ratio: f64,
) -> Vec<Vec<ValueType>> {
    use rand::seq::SliceRandom;

    let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
    let empty_count = (points_num as f64 * empty_ratio).round() as usize;
    let non_empty_count = points_num - empty_count;

    let mut values = Vec::with_capacity(points_num);

    // First, push all non-empty entries
    for _ in 0..non_empty_count {
        let value: ValueType = rng.random_range(0..=max_value);
        values.push(vec![value]);
    }

    // Then, push all empty entries
    for _ in 0..empty_count {
        values.push(vec![]);
    }

    // Shuffle to distribute empties randomly
    values.shuffle(&mut rng);

    values
}

/// **Type 3 – multi value**: every inner `Vec` has a random number of elements
/// in `1..=max_values_per_point`.
pub fn build_multi_value_data(
    seed: u64,
    points_num: usize,
    max_value: ValueType,
    max_values_per_point: usize,
) -> Vec<Vec<ValueType>> {
    let mut rng = rand::rngs::StdRng::seed_from_u64(seed);
    let mut values = Vec::with_capacity(points_num);
    for _ in 0..points_num {
        let count = rng.random_range(1..=max_values_per_point);
        let inner: Vec<ValueType> = (0..count)
            .map(|_| rng.random_range(0..=max_value))
            .collect();
        values.push(inner);
    }
    values
}

// ── TestData ─────────────────────────────────────────────────────────────────

/// Pre-built instances of every implementation, ready for benchmarking.
pub struct TestData {
    pub inline: inline::ImmutablePointToValues<ValueType>,
    pub origin: origin::ImmutablePointToValues<ValueType>,
}

impl TestData {
    /// Construct all implementations from the given raw data.
    pub fn from_data(points_values: Vec<Vec<ValueType>>) -> Self {
        let inline = inline::ImmutablePointToValues::new(points_values.clone());
        let origin = origin::ImmutablePointToValues::new(points_values.clone());

        Self { inline, origin }
    }

    /// Convenience: build from the *single-value* data type using the shared
    /// constants.
    pub fn single_value() -> Self {
        Self::from_data(build_single_value_data(DATA_SEED, POINTS_NUM, MAX_VALUE))
    }

    /// Convenience: build from the *sparse-single-value* data type using the
    /// shared constants.
    pub fn sparse_single_value() -> Self {
        Self::from_data(build_sparse_single_value_data(
            DATA_SEED,
            POINTS_NUM,
            MAX_VALUE,
            EMPTY_RATIO,
        ))
    }

    /// Convenience: build from the *multi-value* data type using the shared
    /// constants.
    pub fn multi_value() -> Self {
        Self::from_data(build_multi_value_data(
            DATA_SEED,
            POINTS_NUM,
            MAX_VALUE,
            MAX_VALUES_PER_POINT,
        ))
    }
}

// ── Query-ID helpers ─────────────────────────────────────────────────────────

/// Generate `QUERY_COUNT` random point IDs in `[0, POINTS_NUM)`.
pub fn build_random_point_ids() -> Vec<u32> {
    let mut rng = rand::rngs::StdRng::seed_from_u64(QUERY_SEED);
    (0..QUERY_COUNT)
        .map(|_| rng.random_range(0..POINTS_NUM as u32))
        .collect()
}

// ── Bench Group Config ───────────────────────────────────────────────────────

/// Configure a benchmark group with standardised timing parameters.
pub fn configure_benchmark(group: &mut BenchmarkGroup<WallTime>) {
    group.measurement_time(Duration::from_secs(30));
}

random.rs

mod common;

use criterion::{Criterion, black_box, criterion_group, criterion_main, configure_benchmark};

use crate::common::{CHECK_VALUE, TestData, build_random_point_ids};

// ── Helper: run all implementations against one TestData ─────────────────────

fn bench_all_impls(
    group: &mut criterion::BenchmarkGroup<criterion::measurement::WallTime>,
    data: &TestData,
    point_ids: &[u32],
) {
    configure_benchmark(group);

    group.bench_function("origin", |b| {
        b.iter(|| {
            for &point_id in point_ids {
                black_box(
                    data.origin
                        .check_values_any(black_box(point_id), |v| *v == CHECK_VALUE),
                );
            }
        });
    });

    group.bench_function("inline", |b| {
        b.iter(|| {
            for &point_id in point_ids {
                black_box(
                    data.inline
                        .check_values_any(black_box(point_id), |v| *v == CHECK_VALUE),
                );
            }
        });
    });
}

// ── Benchmarks ───────────────────────────────────────────────────────────────

fn random_single_value(c: &mut Criterion) {
    let data = TestData::single_value();
    let point_ids = build_random_point_ids();

    let mut group = c.benchmark_group("Random single_value");
    bench_all_impls(&mut group, &data, &point_ids);
    group.finish();
}

fn random_sparse_single_value(c: &mut Criterion) {
    let data = TestData::sparse_single_value();
    let point_ids = build_random_point_ids();

    let mut group = c.benchmark_group("Random sparse_single_value");
    bench_all_impls(&mut group, &data, &point_ids);
    group.finish();
}

fn random_multi_value(c: &mut Criterion) {
    let data = TestData::multi_value();
    let point_ids = build_random_point_ids();

    let mut group = c.benchmark_group("Random multi_value");
    bench_all_impls(&mut group, &data, &point_ids);
    group.finish();
}

criterion_group!(
    benches,
    random_single_value,
    random_sparse_single_value,
    random_multi_value,
);
criterion_main!(benches);

origin.rs

use std::ops::Range;

pub struct ImmutablePointToValues<N: Default> {
    point_to_values: Vec<Range<u32>>,
    point_to_values_container: Vec<N>,
}

impl<N: Default> ImmutablePointToValues<N> {
    pub fn new(src: Vec<Vec<N>>) -> Self {
        let mut point_to_values = Vec::with_capacity(src.len());
        let all_values_count = src.iter().fold(0, |acc, values| acc + values.len());
        let mut point_to_values_container = Vec::with_capacity(all_values_count);
        for values in src {
            let container_len = point_to_values_container.len() as u32;
            let range = container_len..container_len + values.len() as u32;
            point_to_values.push(range.clone());
            point_to_values_container.extend(values);
        }
        Self {
            point_to_values,
            point_to_values_container,
        }
    }

    pub fn check_values_any(&self, idx: u32, check_fn: impl FnMut(&N) -> bool) -> bool {
        let Some(range) = self.point_to_values.get(idx as usize) else {
            return false;
        };

        let range = range.start as usize..range.end as usize;
        if let Some(values) = self.point_to_values_container.get(range) {
            values.iter().any(check_fn)
        } else {
            false
        }
    }
}

inline.rs

#[derive(Debug, Clone)]
enum PointValueEntry<N> {
    Single(N),
    Slice { start: u32, count: u32 },
}

impl<N> Default for PointValueEntry<N> {
    fn default() -> Self {
        PointValueEntry::Slice { start: 0, count: 0 }
    }
}

#[derive(Debug, Clone, Default)]
pub struct ImmutablePointToValues<N: Default> {
    point_entries: Vec<PointValueEntry<N>>,
    values_container: Vec<N>,
}

impl<N: Default> ImmutablePointToValues<N> {
    pub fn new(src: Vec<Vec<N>>) -> Self {
        let mut point_entries = Vec::with_capacity(src.len());
        let container_capacity = src
            .iter()
            .filter(|values| values.len() != 1)
            .fold(0, |acc, values| acc + values.len());
        let mut values_container = Vec::with_capacity(container_capacity);

        for values in src {
            if values.len() == 1 {
                let value = values.into_iter().next().expect("length checked above");
                point_entries.push(PointValueEntry::Single(value));
            } else {
                let start = values_container.len() as u32;
                let count = values.len() as u32;
                point_entries.push(PointValueEntry::Slice { start, count });
                values_container.extend(values);
            }
        }

        Self {
            point_entries,
            values_container,
        }
    }

    pub fn check_values_any(&self, idx: u32, check_fn: impl FnMut(&N) -> bool) -> bool {
        let Some(entry) = self.point_entries.get(idx as usize) else {
            return false;
        };

        match entry {
            PointValueEntry::Single(v) => std::iter::once(v).any(check_fn),
            PointValueEntry::Slice { start, count } => {
                let range = *start as usize..(*start + *count) as usize;
                if let Some(values) = self.values_container.get(range) {
                    values.iter().any(check_fn)
                } else {
                    false
                }
            }
        }
    }
}

- Replace PointMeta with PointValueEntry enum to distinguish single vs multi-value points - Store single values directly in point_entries, reducing container overhead - Only multi-value and empty points use the shared values_container - Update all methods and tests to use new representation - Add tests for edge cases and single-value optimizations

1995chen · 2026-02-26T10:40:53Z

Hi @timvisee @generall, we've implemented some minor optimizations to the filter query component. Initial testing in our environment shows performance gains. Would you be available to review it when convenient? Please don't hesitate to contact us if anything is unclear.

generall · 2026-02-26T11:30:20Z

Hey @TY0909, @1995chen thank you for a very high quality contributions so far! We will review your PR shortly and very likely include it into the next patch release comming in a few weeks.

timvisee

Thank you!

I'd like to add an explicit branch for a zero sized value. I'll commit it to this branch right away.

1995chen · 2026-02-26T13:57:07Z

We found in our tests on check_values_any that adding the extra branch caused a small performance drop, so we ended up removing it. That said, we totally respect your call on this.

Thank you!

I'd like to add an explicit branch for a zero sized value. I'll commit it to this branch right away.

timvisee · 2026-02-26T14:09:40Z

We found in our tests on check_values_any that adding the extra branch caused a small performance drop, so we ended up removing it. That said, we totally respect your call on this.

Thank you!
I'd like to add an explicit branch for a zero sized value. I'll commit it to this branch right away.

I see. Thanks for considering it.

I didn't rerun a benchmark on it, so let me do that now. My guess is that it'd just be this branch.

timvisee · 2026-02-26T14:36:11Z

I get more significant differences between the existing and the inline version, so that's good news in terms of what speed we can gain here!

Cardinality	Old storage	New storage
single-value	[879.80 µs 884.48 µs 890.47 µs]	[289.52 µs 289.61 µs 289.69 µs]
sparse values	[881.96 µs 887.21 µs 893.43 µs]	[338.06 µs 339.56 µs 341.12 µs]
multi-values	[1.7552 ms 1.7587 ms 1.7625 ms]	[1.7858 ms 1.7903 ms 1.7955 ms]

Having the extra branch for zero sized values does indeed appear slightly worse, though it might fall within margin of error:

Random single_value/inline
                        time:   [289.61 µs 291.14 µs 292.53 µs]
                        change: [+0.1420% +0.5605% +0.9003%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Random sparse_single_value/inline
                        time:   [332.71 µs 333.92 µs 335.27 µs]
                        change: [−1.1472% −0.7012% −0.2721%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Random multi_value/inline
                        time:   [1.9469 ms 1.9562 ms 1.9665 ms]
                        change: [+8.7182% +9.4678% +10.246%] (p = 0.00 < 0.05)
                        Performance has regressed.

I'll remove the branch anyhow.

Thanks once again for the detailed PR and benchmarks!

… RAM usage and increase performance (#8235) * optimize: performance improve * Optimize ImmutablePointToValues to store single values inline - Replace PointMeta with PointValueEntry enum to distinguish single vs multi-value points - Store single values directly in point_entries, reducing container overhead - Only multi-value and empty points use the shared values_container - Update all methods and tests to use new representation - Add tests for edge cases and single-value optimizations * Explicit branch for zero sized value * Update comment * Remove zero count branch as it appears to make it slightly slower --------- Co-authored-by: 1995chen <[email protected]> Co-authored-by: timvisee <[email protected]>

timvisee · 2026-05-11T16:11:36Z

We've released Qdrant 1.18.0 includes these changes. You can read more about it in our release blog here.

Thank you for picking this up!

1995chen and others added 2 commits February 26, 2026 07:59

optimize: performance improve

442184b

This comment was marked as resolved.

Sign in to view

qdrant deleted a comment from coderabbitai Bot Feb 26, 2026

IvanPleshkov self-requested a review February 26, 2026 11:23

timvisee self-requested a review February 26, 2026 11:25

Explicit branch for zero sized value

007c24f

timvisee approved these changes Feb 26, 2026

View reviewed changes

Comment thread lib/segment/src/index/field_index/immutable_point_to_values.rs Outdated

Comment thread lib/segment/src/index/field_index/immutable_point_to_values.rs Outdated

Comment thread lib/segment/src/index/field_index/immutable_point_to_values.rs

Update comment

7f1def2

IvanPleshkov approved these changes Feb 26, 2026

View reviewed changes

qdrant deleted a comment from coderabbitai Bot Feb 26, 2026

Remove zero count branch as it appears to make it slightly slower

f279b76

This comment was marked as resolved.

Sign in to view

qdrant deleted a comment from coderabbitai Bot Feb 26, 2026

timvisee merged commit ada8108 into qdrant:dev Feb 26, 2026
16 of 17 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize ImmutablePointToValues: inline single-value points to reduce RAM usage and increase performance#8235

Optimize ImmutablePointToValues: inline single-value points to reduce RAM usage and increase performance#8235
timvisee merged 5 commits into
qdrant:devfrom
TY0909:optimize/immutable_point_to_values

TY0909 commented Feb 26, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

1995chen commented Feb 26, 2026

Uh oh!

generall commented Feb 26, 2026

Uh oh!

timvisee left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

1995chen commented Feb 26, 2026

Uh oh!

timvisee commented Feb 26, 2026

Uh oh!

timvisee commented Feb 26, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

timvisee commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

TY0909 commented Feb 26, 2026

All Submissions:

New Feature Submissions:

Changes to Core Features:

Summary

Motivation

Changes

New PointValueEntry<N> enum

Updated struct fields

Smarter values_container pre-allocation

All public methods updated

Impact

Tests

Compatibility

Performance Test in a Real Scenario

Benches

Uh oh!

This comment was marked as resolved.

Uh oh!

1995chen commented Feb 26, 2026

Uh oh!

generall commented Feb 26, 2026

Uh oh!

timvisee left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

1995chen commented Feb 26, 2026

Uh oh!

timvisee commented Feb 26, 2026

Uh oh!

timvisee commented Feb 26, 2026

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

timvisee commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

New `PointValueEntry<N>` enum

Smarter `values_container` pre-allocation