Nearest neighbor search for Ruby and S3 Vectors
Add this line to your application’s Gemfile:
gem "neighbor-s3"

Create a vector bucket and set your AWS credentials in your environment:
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...

Create an index
index = Neighbor::S3::Index.new("items", bucket: "my-bucket", dimensions: 3, distance: "cosine")
index.create

Add vectors
index.add(1, [1, 1, 1])
index.add(2, [2, 2, 2])
index.add(3, [1, 1, 2])

Search for nearest neighbors to a vector
index.search([1, 1, 1], count: 5)

Search for nearest neighbors to a vector in the index
index.search_id(1, count: 5)

IDs are treated as strings by default, but can also be treated as integers
Neighbor::S3::Index.new("items", id_type: "integer", ...)

Add or update a vector
index.add(id, vector)

Add or update multiple vectors
index.add_all([{id: 1, vector: [1, 2, 3]}, {id: 2, vector: [4, 5, 6]}])

Get a vector
index.find(id)

Get all vectors
index.find_in_batches do |batch|
  # ...
end

Remove a vector
index.remove(id)

Remove multiple vectors
index.remove_all(ids)

Add a vector with metadata
index.add(id, vector, metadata: {category: "A"})

Add multiple vectors with metadata
index.add_all([
  {id: 1, vector: [1, 2, 3], metadata: {category: "A"}},
  {id: 2, vector: [4, 5, 6], metadata: {category: "B"}}
])

Get metadata with search results
index.search(vector, with_metadata: true)

Filter by metadata
index.search(vector, filter: {category: "A"})

Filters support the metadata filtering operators documented for S3 Vectors.
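To make the filter semantics concrete, here is a minimal pure-Ruby sketch that needs no AWS access. It assumes that a filter hash such as `{category: "A"}` means an equality match on every listed metadata key. The `matches_filter?` and `filter_records` helpers and the sample records are hypothetical and are not part of Neighbor S3; in real use the filter is evaluated by S3 Vectors, not in Ruby.

```ruby
# Hypothetical in-memory records, shaped like the hashes passed to add_all.
RECORDS = [
  {id: 1, vector: [1, 2, 3], metadata: {category: "A"}},
  {id: 2, vector: [4, 5, 6], metadata: {category: "B"}},
  {id: 3, vector: [7, 8, 9], metadata: {category: "A"}}
].freeze

# Assumed semantics: every key in the filter must equal the
# corresponding metadata value (implicit equality, combined with AND).
def matches_filter?(metadata, filter)
  filter.all? { |key, value| metadata[key] == value }
end

# Keep only the records whose metadata satisfies the filter.
def filter_records(records, filter)
  records.select { |record| matches_filter?(record[:metadata], filter) }
end

filtered_ids = filter_records(RECORDS, {category: "A"}).map { |r| r[:id] }
puts filtered_ids.inspect
```

Only the records tagged with category "A" survive, which is the set a filtered search would then rank by distance.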
Specify non-filterable metadata on index creation
Neighbor::S3::Index.new(name, non_filterable: ["category"], ...)

You can use Neighbor S3 for online item-based recommendations with Disco. We’ll use MovieLens data for this example.
Create an index
index = Neighbor::S3::Index.new("movies", bucket: "my-bucket", dimensions: 20, distance: "cosine")

Fit the recommender
data = Disco.load_movielens
recommender = Disco::Recommender.new(factors: 20)
recommender.fit(data)

Store the item factors
index.add_all(recommender.item_ids.map { |v| {id: v, vector: recommender.item_factors(v)} })

And get similar movies
index.search_id("Star Wars (1977)").map { |v| v[:id] }

See the complete code
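For intuition about what a cosine-distance lookup over item factors computes, the following is a rough pure-Ruby sketch that runs without AWS or Disco. The factor values, the movie names, and the `cosine_distance` and `similar_items` helpers are made up for illustration. Two further assumptions: the query item is excluded from its own results, and each result carries a `:distance` key next to `:id`. The real search is an approximate nearest neighbor query performed by S3 Vectors, not a full scan like this one.

```ruby
# Cosine distance: 1 - (a . b) / (|a| * |b|). 0 means same direction.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

# Hypothetical 3-dimensional item factors (the example above uses 20).
ITEM_FACTORS = {
  "Movie A" => [0.9, 0.1, 0.0],
  "Movie B" => [0.8, 0.2, 0.1],
  "Movie C" => [0.0, 0.1, 0.9]
}.freeze

# Rank every other item by distance to the query item, nearest first.
# Excluding the query item itself is an assumption of this sketch.
def similar_items(factors, id, count: 5)
  query = factors.fetch(id)
  factors
    .reject { |other_id, _| other_id == id }
    .map { |other_id, vector| {id: other_id, distance: cosine_distance(query, vector)} }
    .sort_by { |result| result[:distance] }
    .first(count)
end

puts similar_items(ITEM_FACTORS, "Movie A").map { |v| v[:id] }.inspect
```

Because cosine distance ignores vector length, two items whose factors point in the same direction are treated as identical even if one has larger magnitudes.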
Get index info
index.info

Check if an index exists
index.exists?

Drop an index
index.drop

View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/ankane/neighbor-s3.git
cd neighbor-s3
bundle install
bundle exec rake test