Native support for half precision

Many new embedding models output half-precision (and even single-byte) vectors, and we would like to support these types natively within FAISS.

From what we can tell, the most straightforward way to do this would be to introduce a class-level template on `faiss::Index` (or on an abstract base class) that we can instantiate or implement for the index types we want to support. The `float` instantiation would always be provided so that we maintain compatibility with the existing FAISS APIs.
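
To make the proposal concrete, here is a minimal sketch of what a class-level template could look like. Every name in it (`sketch::IndexT`, `sketch::IndexFlatL2T`, the `sketch::Index` alias) is hypothetical and not existing FAISS API. It uses `int8_t` as the narrow type because standard C++ has no portable half type; a half type such as CUDA's `__half` would presumably slot in the same way. Distances are still accumulated and returned as `float`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch only: none of these names exist in FAISS today.
namespace sketch {

using idx_t = int64_t;

// Abstract base class, templated on the element type of the input vectors.
template <typename T>
struct IndexT {
    int d;            // vector dimension
    idx_t ntotal = 0; // number of vectors stored so far

    explicit IndexT(int d) : d(d) {}
    virtual ~IndexT() = default;

    virtual void add(idx_t n, const T* x) = 0;
    // Distances stay float regardless of the element type T.
    virtual void search(idx_t n, const T* x, idx_t k,
                        float* distances, idx_t* labels) const = 0;
};

// The float instantiation is always available under the familiar name,
// so code written against a float-only interface keeps compiling.
using Index = IndexT<float>;

// Brute-force L2 index, implemented once for any supported element type.
template <typename T>
struct IndexFlatL2T : IndexT<T> {
    std::vector<T> codes; // vectors stored in their native precision

    explicit IndexFlatL2T(int d) : IndexT<T>(d) {}

    void add(idx_t n, const T* x) override {
        codes.insert(codes.end(), x, x + n * this->d);
        this->ntotal += n;
    }

    void search(idx_t n, const T* x, idx_t k,
                float* distances, idx_t* labels) const override {
        assert(k <= this->ntotal);
        std::vector<std::pair<float, idx_t>> cand(this->ntotal);
        for (idx_t q = 0; q < n; ++q) {
            for (idx_t i = 0; i < this->ntotal; ++i) {
                float acc = 0.f; // accumulate squared distance in float
                for (int j = 0; j < this->d; ++j) {
                    float diff = float(x[q * this->d + j]) -
                                 float(codes[i * this->d + j]);
                    acc += diff * diff;
                }
                cand[i] = {acc, i};
            }
            // Keep the k smallest distances, in ascending order.
            std::partial_sort(cand.begin(), cand.begin() + k, cand.end());
            for (idx_t r = 0; r < k; ++r) {
                distances[q * k + r] = cand[r].first;
                labels[q * k + r] = cand[r].second;
            }
        }
    }
};

using IndexFlatL2 = IndexFlatL2T<float>; // always instantiated

} // namespace sketch
```

In this sketch a byte-precision caller would write `sketch::IndexFlatL2T<int8_t> index(d)` and pass `int8_t` buffers directly, while float callers continue to use `sketch::IndexFlatL2` unchanged. Each additional element type adds another instantiation, which is where the binary-size cost comes from.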

@mdouze @wickedfoo @algoriddle @alexanderguzhva any other ideas on how we could support this functionality? This is specifically being requested for CAGRA to start, but I suspect we will eventually want to support it more broadly. I also understand this can add to the binary size. At least on the GPU side, cuVS already supports half and byte precision, so it's just a matter of calling those APIs. Eventually cuVS will be moving some of the additional types to the new `nvjitlink` technology so that they are compiled and linked at runtime.


Native support for half precision #4324
