Hello,
Great library! I had two questions I couldn't find answers to in past issues or the READMEs:
- Is it possible to add additional shards to an already-built index that was created from a set of shards via `faiss.contrib.ondisk.merge_ondisk()`, or does one need to run `merge_ondisk` again over all the shards? I'm dealing with very large indices...
- In a distributed setting, building an index shard requires knowing the shard's global ID offset (https://github.com/facebookresearch/faiss/blob/main/demos/demo_ondisk_ivf.py#L53). Is there an efficient way to merge index shards whose IDs all start from zero? For example, I may have 10 machines that each create an IVF index shard of 100 elements with local IDs 0-99. In the merge step, can I (efficiently or not) offset the IDs of each shard "on the fly" so that they are globally consistent, perhaps based on the order of the list provided to `merge_ondisk`?
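For context on what I mean by "globally consistent": the offset for each shard would just be the cumulative size of all shards before it in the merge order. A minimal sketch of that mapping (pure Python, independent of faiss; the shard sizes and the `global_id` helper here are made up for illustration):

```python
from itertools import accumulate

# Hypothetical shard sizes: each machine built its shard with local IDs 0..n-1.
shard_sizes = [100, 100, 100]

# Global ID offset for shard i = total count of elements in shards 0..i-1,
# following the order the shards are handed to the merge step.
offsets = [0] + list(accumulate(shard_sizes))[:-1]

def global_id(shard_index: int, local_id: int) -> int:
    """Map a shard-local ID to a globally consistent ID."""
    return offsets[shard_index] + local_id

print(offsets)          # [0, 100, 200]
print(global_id(2, 5))  # 205
```

If I'm reading the API right, `IndexIVF.merge_from(other, add_id)` already accepts such an offset when merging in memory, so I'm essentially asking whether the on-disk merge path can do the same.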
Thank you!