Native support for half precision #4324

@cjnolet

Many new embedding models output half-precision (and even single-byte) vectors, and we would like to support these types natively within FAISS.

From what we can tell, the most straightforward way to do this would be to introduce a class-level template on faiss::Index (or on an abstract base class), which we can instantiate or implement for the index types we would like to support. The float instantiation would always be provided, so that we maintain compatibility with the existing FAISS APIs.
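To make the proposal concrete, here is a minimal sketch of what a class-level template with a float alias could look like. Every name in it (`IndexT`, `IndexFlatT`, `half_storage`, the `using` aliases) is hypothetical and chosen for illustration only; none of this is existing FAISS code, and the real `faiss::Index` interface is much larger.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a 16-bit float storage type (assumption:
// a real implementation would use something like CUDA's __half).
struct half_storage {
    std::uint16_t bits;
};

// Hypothetical templated base class: T is the element type of the
// vectors passed in by the caller.
template <typename T>
struct IndexT {
    using value_type = T;

    int d;                  // vector dimension
    std::size_t ntotal = 0; // number of vectors stored so far

    explicit IndexT(int d) : d(d) {}
    virtual ~IndexT() = default;

    // Add n vectors of dimension d, stored contiguously in x.
    virtual void add(std::size_t n, const T* x) = 0;
};

// The float instantiation keeps the familiar name, so existing
// float-based call sites would not need to change.
using Index = IndexT<float>;

// Hypothetical flat index that stores the vectors in their native type.
template <typename T>
struct IndexFlatT : IndexT<T> {
    std::vector<T> codes;

    explicit IndexFlatT(int d) : IndexT<T>(d) {}

    void add(std::size_t n, const T* x) override {
        codes.insert(codes.end(), x, x + n * this->d);
        this->ntotal += n;
    }
};

// Only the types we choose to support get instantiated.
using IndexFlat = IndexFlatT<float>;
using IndexFlatHalf = IndexFlatT<half_storage>;
using IndexFlatInt8 = IndexFlatT<std::int8_t>;

static_assert(std::is_same<Index::value_type, float>::value,
              "the default alias stays float");
```

Each extra instantiation is compiled separately, which is where the binary-size cost would come from; limiting the aliases to the index types that actually need them keeps that cost bounded.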

@mdouze @wickedfoo @algoriddle @alexanderguzhva any other ideas on how we could support this functionality? This is specifically being requested for CAGRA to start, but I suspect we will eventually want to support it more broadly. I also understand this can add to the binary size. At least on the GPU side, cuVS already contains half- and byte-precision implementations, so it's just a matter of calling those APIs. Eventually cuVS will be moving some of the additional types to the new nvJitLink technology, so that they are compiled and linked at runtime.
