Summary
faiss 1.80 release supports range query for hsnw algo is not working as expected. Threshold is getting ignored during search operation.
Platform
OS: 5.10.217-205.860.amzn2.x86_64 x86_64 GNU/Linux
Faiss version:
1.8.0
Installed from:
!conda install -c pytorch -y faiss-cpu=1.8.0
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
import faiss
import numpy as np
Define the dimension of the vectors
d = 256 # Example dimension
nb = 10 # Number of database vectors
nq = 2 # Number of query vectors
Generate random database and query vectors
np.random.seed(1234) # For reproducibility
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')
faiss.normalize_L2(xb) # Normalize both query and database vectors
faiss.normalize_L2(xq)
hnsw_index_ip = faiss.IndexHNSWFlat(256, 16, faiss.METRIC_INNER_PRODUCT)
hnsw_index_ip.hnsw.efConstruction = 512
hnsw_index_ip.hnsw.efSearch = 512
hnsw_index_ip.add(xb)
radius = 0.74 # Cosine similarity threshold
lims, D, I = hnsw_index_ip.range_search(xq, radius)
Process results for Example 1
for i in range(len(xq)):
start = lims[i]
end = lims[i+1]
print(f"Query {i} has {end - start} neighbors within cosine similarity {radius}.")
print("Distances (cosine similarities):", D[start:end])
print("Indices:", I[start:end])
Results
Query 0 has 10 neighbors within cosine similarity 0.74.
Distances (cosine similarities): [0.71962094 0.75443786 0.75163096 0.7508825 0.7652079 0.7373253
0.7335136 0.7609381 0.73516536 0.7450698 ]
Indices: [9 7 8 3 1 6 2 4 5 0]
Query 1 has 10 neighbors within cosine similarity 0.74.
Distances (cosine similarities): [0.74524117 0.72365034 0.76518726 0.7913578 0.74097526 0.7454239
0.75055194 0.7481433 0.711897 0.7772374 ]
Indices: [9 7 8 3 1 6 2 4 5 0]
Summary
faiss 1.80 release supports range query for hsnw algo is not working as expected. Threshold is getting ignored during search operation.
Platform
OS: 5.10.217-205.860.amzn2.x86_64 x86_64 GNU/Linux
Faiss version:
1.8.0
Installed from:
!conda install -c pytorch -y faiss-cpu=1.8.0
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
import faiss
import numpy as np
Define the dimension of the vectors
d = 256 # Example dimension
nb = 10 # Number of database vectors
nq = 2 # Number of query vectors
Generate random database and query vectors
np.random.seed(1234) # For reproducibility
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')
faiss.normalize_L2(xb) # Normalize both query and database vectors
faiss.normalize_L2(xq)
hnsw_index_ip = faiss.IndexHNSWFlat(256, 16, faiss.METRIC_INNER_PRODUCT)
hnsw_index_ip.hnsw.efConstruction = 512
hnsw_index_ip.hnsw.efSearch = 512
hnsw_index_ip.add(xb)
radius = 0.74 # Cosine similarity threshold
lims, D, I = hnsw_index_ip.range_search(xq, radius)
Process results for Example 1
for i in range(len(xq)):
start = lims[i]
end = lims[i+1]
print(f"Query {i} has {end - start} neighbors within cosine similarity {radius}.")
print("Distances (cosine similarities):", D[start:end])
print("Indices:", I[start:end])
Results
Query 0 has 10 neighbors within cosine similarity 0.74.
Distances (cosine similarities): [0.71962094 0.75443786 0.75163096 0.7508825 0.7652079 0.7373253
0.7335136 0.7609381 0.73516536 0.7450698 ]
Indices: [9 7 8 3 1 6 2 4 5 0]
Query 1 has 10 neighbors within cosine similarity 0.74.
Distances (cosine similarities): [0.74524117 0.72365034 0.76518726 0.7913578 0.74097526 0.7454239
0.75055194 0.7481433 0.711897 0.7772374 ]
Indices: [9 7 8 3 1 6 2 4 5 0]