Ruby bindings for SlateDB, a cloud-native embedded key-value store built on object storage.
These bindings are still in early development; while SlateDB itself is used in production, these bindings have not yet been. Contributions are welcome!
Roadmap:

- Cross-compile native extensions
Add this line to your application's Gemfile:
```ruby
gem 'slatedb'
```

And then execute:
```shell
bundle install
```

Or install it yourself as:
```shell
gem install slatedb
```

> **Important:** This gem currently requires a working Rust toolchain to install until the dependencies are cross-compiled.
```ruby
require 'slatedb'

# Open a database with in-memory storage (for testing)
db = SlateDb::Database.open("/tmp/mydb")

# Store a value
db.put("hello", "world")

# Retrieve a value
value = db.get("hello") # => "world"

# Delete a value
db.delete("hello")

# Close the database
db.close
```

The block form automatically closes the database when the block exits:
```ruby
SlateDb::Database.open("/tmp/mydb") do |db|
  db.put("key", "value")
  db.get("key") # => "value"
end # automatically closed
```

For persistent storage, provide an object store URL:
```ruby
# Local filesystem
SlateDb::Database.open("/tmp/mydb", url: "file:///tmp/mydb") do |db|
  db.put("key", "value")
end

# S3 (requires AWS credentials)
SlateDb::Database.open("mydb", url: "s3://mybucket/path") do |db|
  db.put("key", "value")
end

# Azure Blob Storage
SlateDb::Database.open("mydb", url: "az://container/path") do |db|
  db.put("key", "value")
end

# Google Cloud Storage
SlateDb::Database.open("mydb", url: "gs://bucket/path") do |db|
  db.put("key", "value")
end
```

SlateDB uses the `object_store` crate, which automatically discovers credentials from standard environment variables and configuration files:
AWS S3:

- Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_REGION`
- Credential files: `~/.aws/credentials`, `~/.aws/config`
- IAM roles (when running on EC2/ECS/EKS)
- Web identity tokens (for IRSA on EKS)

Azure Blob Storage:

- Environment variables: `AZURE_STORAGE_ACCOUNT_NAME`, `AZURE_STORAGE_ACCOUNT_KEY`, `AZURE_STORAGE_SAS_TOKEN`
- Azure CLI credentials: `az login`
- Managed Identity (when running on Azure)

Google Cloud Storage:

- Environment variables: `GOOGLE_SERVICE_ACCOUNT`, `GOOGLE_SERVICE_ACCOUNT_PATH`, `GOOGLE_SERVICE_ACCOUNT_KEY`
- Application Default Credentials: `gcloud auth application-default login`
- Service account key file: `GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json`
Example with explicit AWS credentials:
```ruby
# Set credentials via environment
ENV['AWS_ACCESS_KEY_ID'] = 'your-access-key'
ENV['AWS_SECRET_ACCESS_KEY'] = 'your-secret-key'
ENV['AWS_REGION'] = 'us-east-1'

SlateDb::Database.open("mydb", url: "s3://mybucket/path") do |db|
  db.put("key", "value")
end
```

Writes, reads, and deletes accept per-call options:

```ruby
# Set TTL (time-to-live) in milliseconds
db.put("key", "value", ttl: 60_000) # expires in 60 seconds

# Don't wait for durability
db.put("key", "value", await_durable: false)

# Filter by durability level
db.get("key", durability_filter: "memory")
db.get("key", durability_filter: "remote")

# Include uncommitted data
db.get("key", dirty: true)

# Don't wait for durability on delete
db.delete("key", await_durable: false)
```

Iterate over key ranges using the `scan` method:
```ruby
# Scan all keys from "a" onwards
db.scan("a").each do |key, value|
  puts "#{key}: #{value}"
end

# Scan a specific range [start, end)
db.scan("a", "z").each do |key, value|
  puts "#{key}: #{value}"
end

# Use Enumerable methods
keys = db.scan("user:").map { |k, v| k }
users = db.scan("user:").select { |k, v| v.include?("active") }

# Convert to array
all_entries = db.scan("").to_a
```

Scan all keys with a given prefix using `scan_prefix`:
```ruby
# Scan all keys starting with "user:"
db.scan_prefix("user:").each do |key, value|
  puts "#{key}: #{value}"
end

# Block form
db.scan_prefix("order:") do |key, value|
  puts "#{key}: #{value}"
end

# Works with transactions, snapshots, and readers too
db.transaction do |txn|
  txn.scan_prefix("item:").each do |k, v|
    puts "#{k}: #{v}"
  end
end
```

Merge operations allow you to combine values without reading them first, which is useful for counters, append-only logs, and similar patterns:
```ruby
# Open with a built-in merge operator
SlateDb::Database.open("/tmp/mydb", merge_operator: :string_concat) do |db|
  # Merge appends to existing values (or creates the key if it doesn't exist)
  db.merge("log", "line1\n")
  db.merge("log", "line2\n")
  db.merge("log", "line3\n")

  db.get("log") # => "line1\nline2\nline3\n"
end

# Merge with options
db.merge("key", "value", ttl: 60_000, await_durable: false)

# Works in transactions and batches
db.transaction do |txn|
  txn.merge("counter", "1")
end

db.batch do |b|
  b.merge("key", "a")
   .merge("key", "b")
end
```

You can provide a Ruby Proc/lambda as a custom merge operator:
```ruby
# Counter merge operator (adds numbers)
counter_merge = ->(key, existing, new_value) {
  existing_num = existing ? existing.to_i : 0
  (existing_num + new_value.to_i).to_s
}

SlateDb::Database.open("/tmp/mydb", merge_operator: counter_merge) do |db|
  db.merge("visits", "1")
  db.merge("visits", "1")
  db.merge("visits", "1")
  db.get("visits") # => "3"
end

# Max value merge operator
max_merge = ->(key, existing, new_value) {
  existing_num = existing ? existing.to_i : 0
  new_num = new_value.to_i
  [existing_num, new_num].max.to_s
}

SlateDb::Database.open("/tmp/mydb", merge_operator: max_merge) do |db|
  db.merge("high_score", "100")
  db.merge("high_score", "250")
  db.merge("high_score", "150")
  db.get("high_score") # => "250"
end
```

The proc receives three arguments:

- `key` - The key being merged
- `existing` - The existing value (`nil` if no value exists)
- `new_value` - The new merge operand
Note: Custom `Proc` merge operators work best with direct `db.merge` calls. When used with transactions or batches, some merge operations may be processed on background threads and fall back to string concatenation.
Supported merge operators:

- `:string_concat` (or `:concat`) - Concatenates byte values (built-in)
- Any `Proc` or `lambda` - Custom merge logic
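As a mental model, the merge operator folds pending operands into the existing value. A minimal plain-Ruby sketch of that contract (the `resolve` helper below is illustrative only, not part of the gem; the real engine applies operands during reads and compaction):

```ruby
# Counter merge operator, as in the example above: sums values as integers
counter_merge = ->(key, existing, new_value) {
  existing_num = existing ? existing.to_i : 0
  (existing_num + new_value.to_i).to_s
}

# Illustrative helper: fold a list of merge operands into a final value,
# starting from the existing value (nil if the key is absent)
def resolve(merge_op, key, existing, operands)
  operands.reduce(existing) { |acc, operand| merge_op.call(key, acc, operand) }
end

resolve(counter_merge, "visits", nil, ["1", "1", "1"]) # => "3"
resolve(counter_merge, "visits", "10", ["5"])          # => "15"
```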
Perform multiple writes atomically:
```ruby
# Create a batch manually
batch = SlateDb::WriteBatch.new
batch.put("key1", "value1")
batch.put("key2", "value2", ttl: 60_000)
batch.delete("old_key")
db.write(batch)

# Or use the block helper
db.batch do |b|
  b.put("key1", "value1")
  b.put("key2", "value2")
  b.delete("old_key")
end
```

ACID transactions with snapshot or serializable isolation:
```ruby
# Block form (recommended) - auto-commits on success, rolls back on exception
db.transaction do |txn|
  balance = txn.get("balance").to_i
  txn.put("balance", (balance - 100).to_s)
  txn.put("withdrawal", "100")
end

# With serializable isolation for strict consistency
db.transaction(isolation: :serializable) do |txn|
  counter = txn.get("counter").to_i
  txn.put("counter", (counter + 1).to_s)
end

# Manual transaction management
txn = db.begin_transaction(isolation: :snapshot)
txn.put("key", "value")
txn.commit # or txn.rollback
```

Transaction operations:
```ruby
db.transaction do |txn|
  # Read
  value = txn.get("key")

  # Write
  txn.put("key", "value")
  txn.put("expiring", "data", ttl: 30_000)

  # Delete
  txn.delete("old_key")

  # Scan
  txn.scan("prefix:").each do |k, v|
    puts "#{k}: #{v}"
  end

  # Scan with prefix
  txn.scan_prefix("user:").each do |k, v|
    puts "#{k}: #{v}"
  end
end
```

In serializable transactions, use `mark_read` to explicitly track keys for conflict detection without actually reading them:
```ruby
db.transaction(isolation: :serializable) do |txn|
  # Mark keys as read for conflict detection
  txn.mark_read(["key1", "key2", "key3"])

  # Now if another transaction modifies key1/key2/key3,
  # this transaction will fail on commit
  txn.put("result", "computed_value")
end
```

Create durable checkpoints for backup or read-replica purposes:
```ruby
SlateDb::Database.open("/tmp/mydb", url: "file:///tmp/mydb") do |db|
  db.put("key", "value")
  db.flush

  # Create a checkpoint
  checkpoint = db.create_checkpoint
  puts "Checkpoint ID: #{checkpoint[:id]}"
  puts "Manifest ID: #{checkpoint[:manifest_id]}"

  # Create a named checkpoint with a lifetime
  checkpoint = db.create_checkpoint(
    name: "before-migration",
    lifetime: 3_600_000 # 1 hour in milliseconds
  )
end
```

Snapshots provide point-in-time consistent reads:
```ruby
# Block form (recommended)
db.snapshot do |snap|
  # All reads see the same consistent state
  value1 = snap.get("key1")
  value2 = snap.get("key2")

  snap.scan("prefix:").each do |k, v|
    puts "#{k}: #{v}"
  end
end # automatically closed

# Manual management
snap = db.snapshot
value = snap.get("key")
snap.close
```

Open a database in read-only mode, useful for replicas:
```ruby
# Basic read-only access
SlateDb::Reader.open("/tmp/mydb", url: "s3://bucket/path") do |reader|
  value = reader.get("key")

  reader.scan("prefix:").each do |k, v|
    puts "#{k}: #{v}"
  end
end

# Open at a specific checkpoint
SlateDb::Reader.open("/tmp/mydb",
  url: "s3://bucket/path",
  checkpoint_id: "uuid-here") do |reader|
  reader.get("key")
end
```

Administrative operations for database management:
```ruby
admin = SlateDb::Admin.new("/tmp/mydb", url: "s3://bucket/path")

# Manifests
json = admin.read_manifest                        # Latest manifest as JSON
json = admin.read_manifest(123)                   # Specific manifest by ID
json = admin.list_manifests                       # List all manifests
json = admin.list_manifests(start: 1, end_id: 10) # Range query

# Checkpoints
result = admin.create_checkpoint(name: "backup-2024")
# => { id: "uuid-string", manifest_id: 7 }

checkpoints = admin.list_checkpoints
checkpoints = admin.list_checkpoints(name: "backup") # Filter by name

admin.refresh_checkpoint("uuid", lifetime: 3_600_000) # Extend lifetime
admin.delete_checkpoint("uuid")

# Garbage Collection
admin.run_gc                               # Run with default settings
admin.run_gc(min_age: 3_600_000)           # Min age for all directories (1 hour)
admin.run_gc(manifest_min_age: 86_400_000) # Custom age for manifests (1 day)
admin.run_gc(wal_min_age: 60_000)          # Custom age for WAL (1 minute)
admin.run_gc(compacted_min_age: 60_000)    # Custom age for compacted files (1 minute)
```

Ensure all writes are persisted:
```ruby
db.put("key", "value")
db.flush
```

SlateDB is fully thread-safe and optimized for concurrent access.
- The `Database` class can be safely shared across multiple Ruby threads
- All operations (get, put, delete, scan, transactions) are thread-safe
- The Ruby bindings release the Global VM Lock (GVL) during I/O operations, allowing other Ruby threads to run concurrently
- Perfect for use with multi-threaded Ruby applications like Puma, Sidekiq, and concurrent test suites
```ruby
db = SlateDb::Database.open("/tmp/mydb")

# Safe to use from multiple threads
threads = 10.times.map do |i|
  Thread.new do
    db.put("key-#{i}", "value-#{i}")
    db.get("key-#{i}")
  end
end
threads.each(&:join)
```

Implementation details:
- The underlying SlateDB library uses `Arc` (atomic reference counting) and `RwLock` for internal state management
- I/O operations release the Ruby GVL via `rb_thread_call_without_gvl`, so they don't block other Ruby threads
- A shared Tokio multi-threaded runtime handles all async operations efficiently
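Because the GVL is released during I/O, blocking calls overlap across threads. A quick plain-Ruby illustration of the effect, using `sleep` (which also releases the GVL) as a stand-in for SlateDB I/O:

```ruby
require 'benchmark'

# Ten "I/O calls" of 0.2s each; because the GVL is released while they wait,
# the threads overlap and wall time stays near 0.2s rather than 2s.
elapsed = Benchmark.realtime do
  10.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end
puts format("wall time: ~%.2fs", elapsed)
```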
SlateDB defines several exception classes:
```ruby
begin
  db.put("", "value") # empty key
rescue SlateDb::InvalidArgumentError => e
  puts "Invalid argument: #{e.message}"
rescue SlateDb::TransactionError => e
  puts "Transaction conflict: #{e.message}"
rescue SlateDb::Error => e
  puts "SlateDB error: #{e.message}"
end
```

Exception hierarchy:

- `SlateDb::Error` - Base class (inherits from `StandardError`)
- `SlateDb::TransactionError` - Transaction conflicts
- `SlateDb::ClosedError` - Database has been closed
- `SlateDb::UnavailableError` - Storage/network unavailable
- `SlateDb::InvalidArgumentError` - Invalid arguments
- `SlateDb::DataError` - Data corruption or format errors
- `SlateDb::InternalError` - Internal errors
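Since every SlateDB exception inherits from `SlateDb::Error`, rescuing the base class catches them all; order rescue clauses from most to least specific. A plain-Ruby sketch of that rescue ordering, using stand-in classes so it runs without the gem:

```ruby
# Stand-in classes mirroring the gem's hierarchy (illustrative names only)
class SketchError < StandardError; end          # plays the role of SlateDb::Error
class SketchTransactionError < SketchError; end # plays SlateDb::TransactionError

def classify(error_class)
  raise error_class, "boom"
rescue SketchTransactionError
  "transaction conflict"
rescue SketchError
  "generic slatedb error"
end

classify(SketchTransactionError) # => "transaction conflict"
classify(SketchError)            # => "generic slatedb error"
```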
- Ruby 3.1+
- Rust toolchain (for building from source)
After checking out the repo, run:
```shell
bundle install
bundle exec rake compile
bundle exec rake spec
```

To run specific tests:

```shell
bundle exec rspec spec/database_spec.rb
bundle exec rspec spec/transaction_spec.rb
```

Bug reports and pull requests are welcome on GitHub at https://github.com/catkins/slatedb-rb.
Also, find me on the SlateDB Discord Server.
Apache-2.0