pgvector-r

pgvector examples for R

Supports DBI and dbx

Getting Started

Follow the instructions for your database library, DBI or dbx.

Or check out an example, such as the one in examples/openai.

DBI
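
The snippets in this section assume db is an open DBI connection. A minimal sketch of one way to create it, assuming the RPostgres driver (suggested by the $1 placeholder syntax used in the query further down) and the pgvector_r_test database from the development steps. connectPgvector is a hypothetical helper name, not part of DBI.

```r
# Hypothetical helper: opens a DBI connection through the RPostgres driver.
# Assumption: host, user, and password come from the libpq defaults
# (for example the PGHOST and PGUSER environment variables).
connectPgvector <- function(dbname = "pgvector_r_test") {
  DBI::dbConnect(RPostgres::Postgres(), dbname = dbname)
}

# Usage (needs a running PostgreSQL server with pgvector installed):
# library(DBI)
# db <- connectPgvector()
# ... run the statements from this section ...
# dbDisconnect(db)
```

The usage lines are left as comments because they need a live database.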

Enable the extension

dbExecute(db, "CREATE EXTENSION IF NOT EXISTS vector")

Create a table

dbExecute(db, "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert vectors

# Convert a numeric vector to pgvector's text format,
# e.g. c(1, 2, 3) becomes "[1,2,3]"
encodeVector <- function(vec) {
  stopifnot(is.numeric(vec))
  paste0("[", paste(vec, collapse=","), "]")
}

embeddings <- list(
  c(1, 1, 1),
  c(2, 2, 2),
  c(1, 1, 2)
)

items <- data.frame(embedding=sapply(embeddings, encodeVector))
dbAppendTable(db, "items", items)
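
The table declares the column as vector(3), so PostgreSQL rejects any vector with a different number of elements, and pgvector also rejects NaN and infinite values. A minimal sketch of a check that could run before the insert so that a bad row is reported by position. checkDimensions is a hypothetical helper, not part of DBI or pgvector.

```r
# Hypothetical helper: stops if any embedding is not a finite numeric
# vector of the expected length; returns the input invisibly otherwise.
checkDimensions <- function(embeddings, dimensions) {
  stopifnot(is.list(embeddings), length(dimensions) == 1)
  ok <- vapply(
    embeddings,
    function(vec) is.numeric(vec) && length(vec) == dimensions && all(is.finite(vec)),
    logical(1)
  )
  if (!all(ok)) {
    stop(
      "embeddings at positions ", paste(which(!ok), collapse = ", "),
      " are not finite numeric vectors of length ", dimensions
    )
  }
  invisible(embeddings)
}

embeddings <- list(c(1, 1, 1), c(2, 2, 2), c(1, 1, 2))
checkDimensions(embeddings, 3)  # passes silently
```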

Get the nearest neighbors

params <- list(encodeVector(c(1, 2, 3)))
dbGetQuery(db, "SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5", params=params)
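
pgvector prints a vector as text such as "[1,1,1]". A minimal sketch of turning that text back into a numeric vector, assuming the driver returns the embedding column as character strings. decodeVector is a hypothetical helper that reverses encodeVector.

```r
# Hypothetical helper: parse pgvector's text format back into numbers,
# e.g. "[1,2,3]" becomes c(1, 2, 3)
decodeVector <- function(text) {
  stopifnot(is.character(text), length(text) == 1)
  inner <- gsub("^\\[|\\]$", "", text)  # drop the surrounding brackets
  as.numeric(strsplit(inner, ",", fixed = TRUE)[[1]])
}

decodeVector("[1,2,3]")

# Usage on a query result (assumes result$embedding is a character column):
# result$embedding <- lapply(result$embedding, decodeVector)
```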

Add an approximate index

dbExecute(db, "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
# or
dbExecute(db, "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
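
The operator class of an index should match the distance operator used in the query: pgvector pairs <-> (L2 distance) with vector_l2_ops, <#> (negative inner product) with vector_ip_ops, and <=> (cosine distance) with vector_cosine_ops. A minimal sketch that keeps the two in step for the items table. distanceOps, nearestQuery, and indexStatement are hypothetical names used only for illustration.

```r
# Distance operators and their matching index operator classes in pgvector
distanceOps <- list(
  l2     = list(operator = "<->", opclass = "vector_l2_ops"),
  ip     = list(operator = "<#>", opclass = "vector_ip_ops"),
  cosine = list(operator = "<=>", opclass = "vector_cosine_ops")
)

# Build the nearest-neighbor query for a metric ($1 is the query vector)
nearestQuery <- function(metric, limit = 5) {
  ops <- distanceOps[[metric]]
  stopifnot(!is.null(ops), is.numeric(limit), limit >= 1)
  sprintf("SELECT * FROM items ORDER BY embedding %s $1 LIMIT %d",
          ops$operator, as.integer(limit))
}

# Build the matching HNSW index statement for the same metric
indexStatement <- function(metric) {
  ops <- distanceOps[[metric]]
  stopifnot(!is.null(ops))
  sprintf("CREATE INDEX ON items USING hnsw (embedding %s)", ops$opclass)
}

# Usage (needs an open connection):
# dbExecute(db, indexStatement("cosine"))
# dbGetQuery(db, nearestQuery("cosine"), params=list(encodeVector(c(1, 2, 3))))
```

Because <#> returns the negative inner product, ascending order still puts the best match first.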

See a full example in DBI/example.R

dbx
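
The snippets in this section assume db is a connection opened with dbx. A minimal sketch, assuming the postgres adapter and the pgvector_r_test database from the development steps. connectPgvectorDbx is a hypothetical helper name.

```r
# Hypothetical helper: opens a connection with dbx's postgres adapter.
# Assumption: the remaining connection settings come from the driver defaults.
connectPgvectorDbx <- function(dbname = "pgvector_r_test") {
  dbx::dbxConnect(adapter = "postgres", dbname = dbname)
}

# Usage (needs a running PostgreSQL server with pgvector installed):
# library(dbx)
# db <- connectPgvectorDbx()
# ... run the statements from this section ...
# dbxDisconnect(db)
```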

Enable the extension

dbxExecute(db, "CREATE EXTENSION IF NOT EXISTS vector")

Create a table

dbxExecute(db, "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")

Insert vectors

# Convert a numeric vector to pgvector's text format,
# e.g. c(1, 2, 3) becomes "[1,2,3]"
encodeVector <- function(vec) {
  stopifnot(is.numeric(vec))
  paste0("[", paste(vec, collapse=","), "]")
}

embeddings <- list(
  c(1, 1, 1),
  c(2, 2, 2),
  c(1, 1, 2)
)

items <- data.frame(embedding=sapply(embeddings, encodeVector))
dbxInsert(db, "items", items)

Get the nearest neighbors

params <- list(encodeVector(c(1, 2, 3)))
dbxSelect(db, "SELECT * FROM items ORDER BY embedding <-> ? LIMIT 5", params=params)

Add an approximate index

dbxExecute(db, "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)")
# or
dbxExecute(db, "CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)")

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example in dbx/example.R

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/pgvector/pgvector-r.git
cd pgvector-r
createdb pgvector_r_test
Rscript -e "install.packages('remotes', repos='https://cloud.r-project.org')"
Rscript -e "remotes::install_deps(dependencies=TRUE)"
Rscript DBI/example.R
Rscript dbx/example.R

To run an example:

cd examples/openai
createdb pgvector_example
Rscript -e "remotes::install_deps(dependencies=TRUE)"
Rscript example.R
