IndexIVFPQFastScan crashes with certain nlist values #4089

@alisafaya

The IndexIVFPQFastScan index in the Faiss similarity search library crashes when the nlist parameter (number of Voronoi cells) is not byte-aligned. Byte-aligned values such as 256 or 65536 work; other values such as 100 crash. The crash occurs when calling either index.reconstruct(i) or index.reconstruct_batch(ids) after adding vectors.

Platform

  • OS: macOS
  • Installed from: Anaconda
  • Running on: CPU
  • Interface: Python

Reproduction Instructions

import numpy as np
import faiss

faiss.omp_set_num_threads(16)

dim = 64  # Dimensionality of vectors 
train_vectors = np.random.random((100000, dim)).astype("float32")

def run_example(nlist: int):
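    # Create an IVFPQ fast-scan index: dim // 2 sub-quantizers, 4 bits each
    index = faiss.IndexIVFPQFastScan(faiss.IndexFlatL2(dim), dim, nlist, dim // 2, 4, faiss.METRIC_L2)
    # reconstruct() on an IVF index requires a direct map
    index.make_direct_map(True)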
    index.train(train_vectors)
    index.add(train_vectors)
    
    print(index.reconstruct(24545))
    print(index.reconstruct_batch(np.arange(10)))

run_example(256) # This works fine
run_example(100) # Crashes with error below whenever `nlist` is not byte-aligned

Error Message

libc++abi: terminating due to uncaught exception of type faiss::FaissException: Error in faiss::idx_t faiss::Level1Quantizer::decode_listno(const uint8_t *) const at 
/Users/runner/miniconda3/conda-bld/faiss-pkg_1728491206806/work/faiss/IndexIVF.cpp:149: 
Error: 'list_no >= 0 && list_no < nlist' failed

Note: If you set by_residual=True in the IndexIVFPQFastScan constructor, the error only occurs for reconstruct_batch(), not reconstruct().
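
Because the exception is uncaught and terminates the whole Python process, it can be convenient to probe several nlist values from one script by running each attempt in a child interpreter. The following is a minimal sketch, not part of the original report: `REPRO_TEMPLATE`, `run_isolated`, and `probe_nlist` are hypothetical names, and the template simply mirrors the reproduction script above.

```python
import subprocess
import sys

# Hypothetical template mirroring the reproduction script; {nlist} is filled in per run.
REPRO_TEMPLATE = """
import numpy as np
import faiss

dim = 64
xs = np.random.random((100000, dim)).astype("float32")
index = faiss.IndexIVFPQFastScan(faiss.IndexFlatL2(dim), dim, {nlist}, dim // 2, 4, faiss.METRIC_L2)
index.make_direct_map(True)
index.train(xs)
index.add(xs)
index.reconstruct_batch(np.arange(10))
"""


def run_isolated(snippet: str) -> int:
    """Run a Python snippet in a child interpreter and return its exit code."""
    result = subprocess.run([sys.executable, "-c", snippet], capture_output=True)
    return result.returncode


def probe_nlist(nlist: int) -> bool:
    """Return True if the reproduction finishes without terminating the child process."""
    return run_isolated(REPRO_TEMPLATE.format(nlist=nlist)) == 0
```

With Faiss installed, calling `probe_nlist(n)` for values such as 100, 256, and 1000 reports which ones abort, without taking down the calling process.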

Potential Reasons

The issue seems to stem from the values in codes inside IndexIVFPQFastScan::sa_decode().

A likely cause is that these codes are unpacked incorrectly, so the list_no value read from the code is invalid. When nlist is not byte-aligned, such a value can fall outside the valid range, and the bounds check 'list_no >= 0 && list_no < nlist' fails.
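
The reasoning above can be illustrated with a small pure-Python model. This is a sketch, not Faiss code. It assumes the list number is stored little-endian in the smallest whole number of bytes that can hold nlist - 1, and it applies the same bounds check quoted in the error message. Under that assumption, if the bytes handed to the decoder are not a genuine list number, nlist = 256 can never trip the check, while nlist = 100 rejects most byte patterns.

```python
def coarse_code_size(nlist: int) -> int:
    """Bytes needed to store a list number in [0, nlist) (assumed layout)."""
    nbyte = 0
    nl = nlist - 1
    while nl > 0:
        nbyte += 1
        nl >>= 8
    return nbyte


def decode_listno(code: bytes, nlist: int) -> int:
    """Read a little-endian list number and apply the bounds check from the error."""
    list_no = int.from_bytes(code[:coarse_code_size(nlist)], "little")
    if not (0 <= list_no < nlist):
        raise ValueError("'list_no >= 0 && list_no < nlist' failed")
    return list_no


def count_rejected(nlist: int) -> int:
    """Count how many possible byte patterns fail the bounds check."""
    size = coarse_code_size(nlist)
    rejected = 0
    for value in range(256 ** size):
        try:
            decode_listno(value.to_bytes(size, "little"), nlist)
        except ValueError:
            rejected += 1
    return rejected


print(count_rejected(256))  # 0: every 1-byte pattern is a valid list number
print(count_rejected(100))  # 156: patterns 100..255 are out of range
```

This would be consistent with nlist = 256 appearing to work: the check passes for any byte, so a wrongly unpacked code would go unnoticed rather than being decoded correctly.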

Additional note: The call to decode_listno is unnecessary when by_residual is false.
