A high-performance Dart wrapper for the Rust-based EmbedAnything library, providing fast and efficient vector embeddings for text using state-of-the-art models from HuggingFace Hub.
EmbedAnythingInDart brings the power of Rust's performance to Dart applications, enabling you to generate high-quality embeddings for semantic search, similarity matching, and other NLP tasks. The library leverages Dart's Native Assets system for seamless cross-platform compilation and provides an idiomatic Dart API with automatic memory management.
Key Benefits:
- Fast Rust-powered embedding generation
- Automatic memory management via NativeFinalizer
- Support for popular BERT and Jina models from HuggingFace
- Batch processing for optimal throughput
- Cross-platform support (macOS, Linux, Windows)
- Zero-copy FFI for maximum performance
- Text Embedding: Generate dense vector representations of text using BERT and Jina models
- Batch Processing: Efficiently process multiple texts with roughly 3-8x speedup over sequential processing, depending on batch size
- Automatic Memory Management: NativeFinalizer ensures native resources are cleaned up automatically
- Flexible Configuration: Customize model loading with precision (F32/F16), normalization, and batch size options
- Semantic Similarity: Built-in cosine similarity computation for comparing embeddings
- Cross-Platform: Native compilation for macOS, Linux, and Windows via Dart Native Assets
- Type-Safe Errors: Comprehensive sealed error hierarchy for robust error handling
Add this to your package's pubspec.yaml:
```yaml
dependencies:
  embedanythingindart:
    git:
      url: https://github.com/yourusername/embedanythingindart.git
      ref: main
```

Then run:

```sh
dart pub get
```

Prerequisites:
- Dart SDK >=3.11.0
- Rust toolchain 1.90.0 or later
- Platform-specific build tools:
- macOS: Xcode Command Line Tools
- Linux: build-essential, pkg-config
- Windows: MSVC Build Tools
```dart
import 'package:embedanythingindart/embedanythingindart.dart';

void main() {
  // Load a model (first load downloads from HuggingFace, ~100-500MB)
  final embedder = EmbedAnything.fromPretrainedHf(
    model: EmbeddingModel.bert,
    modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  );

  // Generate a single embedding
  final result = embedder.embedText('Hello, world!');
  print('Embedding dimension: ${result.dimension}'); // 384

  // Batch processing (much faster for multiple texts)
  final texts = [
    'Machine learning is fascinating',
    'I love programming in Dart',
    'The weather is nice today',
  ];
  final embeddings = embedder.embedTextsBatch(texts);

  // Compute semantic similarity
  final similarity = embeddings[0].cosineSimilarity(embeddings[1]);
  print('Similarity: ${similarity.toStringAsFixed(4)}'); // 0.3245

  // Cleanup (automatic via finalizer, but manual disposal is recommended)
  embedder.dispose();
}
```

Comprehensive documentation is available to help you get the most out of EmbedAnythingInDart. We recommend reading the guides in the following order:
- Getting Started - Installation, prerequisites, and your first embedding
- Core Concepts - Architecture, key classes, and fundamental concepts
- Usage Guide - Common patterns, real-world examples, and best practices
- API Reference - Complete API documentation with all classes and methods
- Models and Configuration - Choosing models, performance comparison, and configuration options
- Error Handling - Error types, handling strategies, and troubleshooting
- Advanced Topics - File embedding, streaming, optimization, and advanced patterns
Each guide includes working code examples extracted from the test suite and example application.
| Model ID | Type | Dimensions | Speed | Quality | Best For |
|---|---|---|---|---|---|
| `sentence-transformers/all-MiniLM-L6-v2` | BERT | 384 | Fast | Good | General purpose |
| `sentence-transformers/all-MiniLM-L12-v2` | BERT | 384 | Medium | Better | Quality-focused tasks |
| `jinaai/jina-embeddings-v2-small-en` | Jina | 512 | Fast | Good | English semantic search |
| `jinaai/jina-embeddings-v2-base-en` | Jina | 768 | Medium | Excellent | High-quality retrieval |
Model Download:
- Models are downloaded from HuggingFace Hub on first use
- Cached locally in `~/.cache/huggingface/hub`
- First load: 2-5 seconds (plus download time)
- Subsequent loads: <100ms (from cache)
```dart
// BERT all-MiniLM-L6-v2 (recommended for most use cases)
final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());

// Jina v2-base-en (best quality)
final embedder = EmbedAnything.fromConfig(ModelConfig.jinaV2Base());
```

```dart
final config = ModelConfig(
  modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  modelType: EmbeddingModel.bert,
  revision: 'main',      // Git branch/tag/commit
  dtype: ModelDtype.f16, // F16 for faster inference
  normalize: true,       // Normalize to unit length
  defaultBatchSize: 64,  // Batch size for processing
);
final embedder = EmbedAnything.fromConfig(config);
```

```dart
final embedder = EmbedAnything.fromPretrainedHf(
  model: EmbeddingModel.bert,
  modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  revision: 'main',
);
```

```dart
final result = embedder.embedText('The quick brown fox');
print('Dimension: ${result.dimension}');
print('First 5 values: ${result.values.take(5)}');
```

Performance:
- Short text (10 words): ~7-8ms
- Medium text (100 words): ~8-15ms (varies by model)
- Long text (500 words): ~15-30ms (varies by model)
- Very long text (>512 tokens): truncated automatically
See benchmark/results.md for detailed performance data.
```dart
final texts = List.generate(100, (i) => 'Text $i');
final results = embedder.embedTextsBatch(texts);

// Process results
for (var i = 0; i < results.length; i++) {
  print('Text $i: dimension ${results[i].dimension}');
}
```

Performance:
- Batch processing is 3-8x faster than sequential, scaling with batch size (measured)
- Recommended batch size: 32-128 items
- BERT L6: ~775 items/sec for batch of 100
- Memory usage scales with batch size
For comprehensive benchmarks, see benchmark/results.md.
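A common use of batch embedding is semantic search: embed a corpus once, then rank documents against a query by cosine similarity. Below is a minimal sketch built only from the calls shown above (the document texts are illustrative):

```dart
// Rank documents by similarity to a query (simple semantic search).
final docs = [
  'How to train a neural network',
  'Best pasta recipes for dinner',
  'Introduction to machine learning',
];
final docEmbeddings = embedder.embedTextsBatch(docs);
final query = embedder.embedText('deep learning tutorials');

// Sort indices by descending similarity to the query.
final ranked = List.generate(docs.length, (i) => i)
  ..sort((a, b) => query
      .cosineSimilarity(docEmbeddings[b])
      .compareTo(query.cosineSimilarity(docEmbeddings[a])));

print('Best match: ${docs[ranked.first]}'); // likely an ML-related text
```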
For large batches (100+ texts), use the async API with progress tracking:
```dart
final results = await embedder.embedTextsBatchAsync(
  largeTextList,
  chunkSize: 50, // Optional: override default (32)
  onProgress: (completed, total) {
    print('Progress: $completed / $total');
  },
);
```

Migration Note (v0.2.0+): `embedTextsBatchAsync()` now automatically chunks large batches to prevent memory issues and system overload. This is a behavior change: previous versions sent all texts to Rust at once. Chunking uses `ModelConfig.defaultBatchSize` (default: 32); to use a specific chunk size, pass the `chunkSize` parameter.
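If you depended on the old all-at-once behavior, you can approximate it by passing the full list length as the chunk size (a sketch based on the `chunkSize` parameter described above; note that it reintroduces the memory pressure chunking was added to avoid):

```dart
// Approximate pre-v0.2.0 behavior: a single chunk containing every text.
final results = await embedder.embedTextsBatchAsync(
  largeTextList,
  chunkSize: largeTextList.length,
);
```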
For large batch operations, you can limit the number of threads used for parallel computation. This is especially important on machines with many CPU cores:
```dart
// Configure BEFORE loading any models (call once at app startup)
EmbedAnything.configureThreadPool(4); // Limit to 4 threads

// Check current thread pool size
print('Thread pool size: ${EmbedAnything.getThreadPoolSize()}');

// Now load models - they will use the configured thread pool
final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());
```

Recommended thread counts:
- 4-8 threads for most use cases
- 2-4 threads for memory-constrained environments
- Default (`num_cpus`) may cause high memory usage on machines with many cores
Note: Thread pool configuration must happen before ANY embedding operations. Once the pool is initialized, it cannot be reconfigured.
```dart
final emb1 = embedder.embedText('I love machine learning');
final emb2 = embedder.embedText('Machine learning is great');
final emb3 = embedder.embedText('I enjoy cooking');

final sim12 = emb1.cosineSimilarity(emb2);
final sim13 = emb1.cosineSimilarity(emb3);

print('Related texts: ${sim12.toStringAsFixed(4)}'); // ~0.87
print('Unrelated texts: ${sim13.toStringAsFixed(4)}'); // ~0.21
```

Interpreting Similarity Scores:
- 0.9-1.0: Nearly identical meaning
- 0.7-0.9: Highly related
- 0.5-0.7: Moderately related
- 0.3-0.5: Somewhat related
- 0.0-0.3: Weakly related or unrelated
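For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes, giving a score in [-1, 1]. A minimal sketch of the computation `cosineSimilarity` performs (assuming the embedding vector is available as a `List<double>`, as `result.values` suggests; with `normalize: true` both magnitudes are ~1, so the score reduces to the dot product):

```dart
import 'dart:math' as math;

/// Cosine similarity: dot(a, b) / (|a| * |b|).
double cosine(List<double> a, List<double> b) {
  assert(a.length == b.length, 'Vectors must have the same dimension');
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}
```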
```dart
try {
  final embedder = EmbedAnything.fromPretrainedHf(
    model: EmbeddingModel.bert,
    modelId: 'invalid/model',
  );
} on EmbedAnythingError catch (e) {
  switch (e) {
    case ModelNotFoundError():
      print('Model not found: ${e.modelId}');
      print('Check https://huggingface.co/ for valid models');
    case InvalidConfigError():
      print('Invalid configuration: ${e.field} - ${e.reason}');
    case EmbeddingFailedError():
      print('Embedding failed: ${e.reason}');
    case MultiVectorNotSupportedError():
      print('Multi-vector embeddings not yet supported');
    case FFIError():
      print('FFI error: ${e.operation}');
  }
}
```

Note: Benchmarks below are representative values from Phase 1 testing. Actual performance depends on hardware, model, and text characteristics.
| Model | Cold Start (first time) | Warm Start (cached) | Memory |
|---|---|---|---|
| BERT all-MiniLM-L6-v2 | 2-5 seconds + download | ~100ms | ~90MB |
| Jina v2-base-en | 3-6 seconds + download | ~150ms | ~280MB |
Note: Cold start includes model download (100-500MB depending on model).
| Text Length | BERT MiniLM-L6 | Jina v2-base |
|---|---|---|
| Short (10 words) | ~5-10ms | ~10-15ms |
| Medium (100 words) | ~8-15ms | ~12-20ms |
| Long (500 words) | ~15-30ms | ~20-40ms |
Platform note: Performance varies by CPU. M1/M2 Macs show ~1.5x speedup over Intel equivalents.
| Batch Size | Processing Time | Speedup vs Sequential |
|---|---|---|
| 10 items | ~20-30ms | ~3x faster |
| 100 items | ~170-250ms | ~5x faster |
| 1000 items | ~1.5-2.5s | ~8x faster |
Recommendation: Use batch processing for 10+ items for best performance.
- Base library overhead: ~10MB
- Per-model memory: 45-280MB (depends on model and dtype)
- Per-embedding overhead: Negligible (<1KB per embedding)
- Batch processing: Temporary memory scales linearly with batch size
```dart
void processTexts() {
  final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());
  final result = embedder.embedText('test');
  // No dispose() needed - finalizer cleans up when embedder is garbage collected
}
```

```dart
void processTexts() {
  final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());
  try {
    final result = embedder.embedText('test');
    // Use result...
  } finally {
    embedder.dispose(); // Immediate cleanup
  }
}
```

- Manual `dispose()` for long-running services or apps with many embedders
- Automatic cleanup for short-lived scripts or infrequent usage
- Reuse embedders instead of creating many instances
- Use batch processing to minimize overhead
- Avoid creating embedders in loops - create once and reuse many times, as sketched below
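A minimal sketch of the create-once, reuse-many pattern (assuming `embedTextsBatch` returns a `List<EmbeddingResult>`, consistent with the examples above; the commented-out variant shows the per-call anti-pattern):

```dart
// Good: create the embedder once and reuse it for every request.
final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());

List<EmbeddingResult> embedAll(List<String> docs) =>
    embedder.embedTextsBatch(docs); // batching also minimizes per-call overhead

// Bad: a new embedder per call reloads the model every time (~100ms+ each).
// EmbeddingResult embedOne(String doc) =>
//     EmbedAnything.fromConfig(ModelConfig.bertMiniLML6()).embedText(doc);
```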
| Platform | Architecture | Status |
|---|---|---|
| macOS | x64 (Intel) | Supported |
| macOS | ARM64 (Apple Silicon) | Supported |
| Linux | x64 | Supported |
| Linux | ARM64 | Supported |
| Windows | x64 | Supported |
macOS:
- macOS 11.0 or later
- Xcode Command Line Tools: `xcode-select --install`

Linux:
- Debian/Ubuntu: `apt install build-essential pkg-config`
- Fedora/RHEL: `dnf install gcc pkg-config`
- Arch: `pacman -S base-devel`
Windows:
- Visual Studio 2019 or later with C++ Build Tools
- Or Windows SDK + MSVC from VS Build Tools
The first build compiles 488 Rust crates and will take 5-15 minutes depending on your hardware. Subsequent builds are incremental and much faster (<30 seconds).
```sh
# First build - patience required!
dart run --enable-experiment=native-assets example/embedanythingindart_example.dart
```

For detailed troubleshooting, see TROUBLESHOOTING.md.
Common issues:
| Issue | Quick Solution |
|---|---|
| First build extremely slow | Expected - compiling 488 crates takes 5-15 minutes |
| Model download fails | Check internet connection, set `HF_TOKEN` for private models |
| Asset not found error | Verify asset name consistency in `Cargo.toml`, `build.dart`, `bindings.dart` |
| Out of memory | Reduce batch size or use F16 dtype |
| Tests fail | Ensure internet connection for model downloads |
For detailed guides with examples and best practices, see the Documentation section above, particularly:
- API Reference - Complete API documentation with signatures, parameters, and examples
- Usage Guide - Practical usage patterns
- Error Handling - Error types and handling strategies
Auto-generated dartdoc API documentation is also available:
```sh
dart doc
dart pub global activate dhttpd
dhttpd --path doc/api
```

Then open http://localhost:8080 in your browser.
Core Classes:
- `EmbedAnything` - Main embedder interface
- `EmbeddingResult` - Embedding vector result
- `ModelConfig` - Model configuration
- `EmbeddingModel` - Model architecture enum
- `ModelDtype` - Model data type enum
- `EmbedAnythingError` - Error hierarchy
Contributions are welcome! Please follow these guidelines:
```sh
# Clone the repository
git clone https://github.com/yourusername/embedanythingindart.git
cd embedanythingindart

# Install dependencies
dart pub get

# Install Rust targets
cd rust && rustup show

# Run tests
dart test --enable-experiment=native-assets

# Run analyzer
dart analyze

# Format code
dart format .
```

```sh
# Run all tests
dart test --enable-experiment=native-assets

# Run specific test file
dart test --enable-experiment=native-assets test/model_config_test.dart

# Run with coverage
dart test --enable-experiment=native-assets --coverage=coverage

# Run Phase 3 file/directory embedding tests specifically
dart test --enable-experiment=native-assets test/phase3_integration_test.dart
```

Phase 3 Test Requirements:
- Test fixtures are located in `test/fixtures/`
- Integration tests require internet connection on first run to download BERT model (~90MB)
- Subsequent runs use cached model from `~/.cache/huggingface/hub`
- Tests verify:
  - File embedding (.txt, .md files)
  - Directory streaming with extension filtering
  - Error handling (FileNotFoundError, UnsupportedFileFormatError)
  - Metadata parsing and ChunkEmbedding utilities
  - Memory management (no leaks)
For detailed information about Phase 3 features, see test fixture documentation in test/fixtures/README.md.
- Follow Effective Dart guidelines
- Add dartdoc comments to all public APIs
- Ensure `dart analyze` passes with zero issues
- Ensure `cargo clippy` passes with zero warnings
- Write tests for new features
- Update CHANGELOG.md with changes
This project uses Dart's Native Assets system. Key files:
- `hook/build.dart` - Native asset build hook
- `rust/Cargo.toml` - Rust crate configuration
- `rust-toolchain.toml` - Rust version pinning
Asset name consistency is critical:
- `rust/Cargo.toml`: `name = "embedanything_dart"`
- `hook/build.dart`: `assetName: 'embedanything_dart'`
- `lib/src/ffi/bindings.dart`: `assetId: 'package:embedanythingindart/embedanything_dart'`
This project is licensed under the MIT License - see the LICENSE file for details.
The underlying EmbedAnything Rust library is licensed under the Apache License 2.0.
- EmbedAnything - The Rust library this wraps
- HuggingFace - For hosting the embedding models
- sentence-transformers - For the BERT models
- Jina AI - For the Jina embedding models
- surrealdartb - SurrealDB client for Dart with vector support
- EmbedAnything - Upstream Rust library
- Candle - Rust ML framework powering EmbedAnything
Phase 2: Production Readiness
- CI/CD pipeline with multi-platform testing
- Automated release process
- Performance optimizations
- Security audit
Phase 3: Multi-Modal Expansion
- PDF, DOCX, Markdown file embedding
- Image embedding (CLIP, ColPali)
- Audio embedding (Whisper)
Phase 4: Advanced Features
- Multi-vector embedding support (ColBERT)
- GPU acceleration (CUDA, Metal)
- Model quantization (INT8, INT4)
- Custom tokenizer configuration
Phase 5: Ecosystem Integration
- Vector database adapters (Pinecone, Weaviate, Qdrant)
- Cloud provider embeddings (OpenAI, Cohere)
- Mobile platform support (iOS, Android)
Questions or Issues?
- Open an issue on GitHub
- Read the comprehensive documentation for guides and examples
- Check Error Handling for troubleshooting common problems
- Review the API Reference for complete API documentation