Skip to content

Add on disk 4x compression with Faiss#2425

Open
naveentatikonda wants to merge 10 commits intoopensearch-project:mainfrom
naveentatikonda:faiss_ondisk_4x
Open

Add on disk 4x compression with Faiss#2425
naveentatikonda wants to merge 10 commits intoopensearch-project:mainfrom
naveentatikonda:faiss_ondisk_4x

Conversation

@naveentatikonda
Copy link
Copy Markdown
Member

Description

Add on disk 4x compression with Faiss which accepts fp32 vectors as input and dynamically quantizes them into byte sized vectors using the Faiss SQ8 quantizer.

Related Issues

Resolves #1723

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda naveentatikonda added Features Introduces a new unit of functionality that satisfies a requirement backport 2.x labels Jan 23, 2025
@naveentatikonda naveentatikonda force-pushed the faiss_ondisk_4x branch 2 times, most recently from 6721604 to 7739606 Compare January 23, 2025 23:03
@naveentatikonda naveentatikonda changed the base branch from main to 2.x January 23, 2025 23:03
Copy link
Copy Markdown
Collaborator

@shatejas shatejas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed partially

Comment thread jni/src/faiss_index_service.cpp Outdated
}
}

jlong IndexService::initIndexFromTemplate(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference between this and initIndex is the index creation call to faiss, can we abstract out the logic and reuse the rest please. you can pass in the pointer returned by faiss to reuse the logic and set the index uniq pointer maybe.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored as discussed and also validated that the index is getting deleted

Comment thread jni/src/faiss_index_service.cpp
Comment thread src/main/java/org/opensearch/knn/index/KNNIndexShard.java Outdated
@naveentatikonda
Copy link
Copy Markdown
Member Author

naveentatikonda commented Jan 27, 2025

Update - After looking at the benchmarks and having a chat with other maintainers of k-NN we want to put this feature on hold from releasing in 2.19. Will run more benchmarking tests and compare them with lucene (with rescoring POC) and then decide if we need to switch the default engine for on_disk 4x compression.

Comment on lines +263 to +265
if ((quantizationParams.getTypeIdentifier()).equals(
ScalarQuantizationParams.generateTypeIdentifier(ScalarQuantizationType.EIGHT_BIT)
)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to ensure that NPE doesn't come in case quantizationParams.getTypeIdentifier() == null, we should reverse the check.

ScalarQuantizationParams.generateTypeIdentifier(ScalarQuantizationType.EIGHT_BIT).equal(quantizationParams.getTypeIdentifier())

Comment on lines +266 to +271
quantizationState = quantizationService.train(quantizationParams, knnVectorValues, totalLiveDocs, fieldInfo);
} else {
initQuantizationStateWriterIfNecessary();
quantizationState = quantizationService.train(quantizationParams, knnVectorValues, totalLiveDocs, fieldInfo);
quantizationStateWriter.writeState(fieldInfo.getFieldNumber(), quantizationState);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole logic can be simplified where we create the QS first and then we just see if writer needs to init and it needs to write the state,.

return (indexInfo.getQuantizationState() instanceof ByteScalarQuantizationState);
}

private byte[] getIndexTemplate(BuildIndexParams indexInfo) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit pick: make the param final for all the functions

int dimensions;

if (quantizationState != null) {
if (quantizationState != null && !(quantizationState instanceof ByteScalarQuantizationState)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need java doc on this. This kind of instanceOf check is making me nervous. can we think of something here.

Comment on lines +124 to +126
public QuantizationState train(final TrainingRequest<float[]> trainingRequest, final FieldInfo fieldInfo) throws IOException {
return null;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Comment on lines +69 to +71
public QuantizationState train(final TrainingRequest<float[]> trainingRequest, final FieldInfo fieldInfo) throws IOException {
return null;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

*/
QuantizationState train(TrainingRequest<T> trainingRequest) throws IOException;

QuantizationState train(TrainingRequest<T> trainingRequest, FieldInfo fieldInfo) throws IOException;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add java doc and also why we need this function? I thought we are keeping QF free from fieldInfo and other things.

import static org.opensearch.knn.common.FieldInfoExtractor.extractVectorDataType;
import static org.opensearch.knn.index.codec.transfer.OffHeapVectorTransferFactory.getVectorTransfer;

public class ByteScalarQuantizer implements Quantizer<float[], byte[]> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add java doc on all your new classes.

Comment on lines +60 to +71
if (sampledIndices.length == 0) {
return null;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have some logs here.

@naveentatikonda naveentatikonda deleted the faiss_ondisk_4x branch June 4, 2025 04:02
@naveentatikonda naveentatikonda changed the base branch from 2.x to main June 4, 2025 04:20
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Naveen Tatikonda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Features Introduces a new unit of functionality that satisfies a requirement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Faiss online training to quantize fp32 vectors as int8 vectors

4 participants