EmbedAnythingInDart

A high-performance Dart wrapper for the Rust-based EmbedAnything library, providing fast and efficient vector embeddings for text using state-of-the-art models from HuggingFace Hub.

Overview

EmbedAnythingInDart brings Rust's performance to Dart applications, letting you generate high-quality embeddings for semantic search, similarity matching, and other NLP tasks. The library uses Dart's Native Assets system for cross-platform compilation and provides an idiomatic Dart API with automatic memory management.

Key Benefits:

Fast Rust-powered embedding generation
Automatic memory management via NativeFinalizer
Support for popular BERT and Jina models from HuggingFace
Batch processing for optimal throughput
Cross-platform support (macOS, Linux, Windows)
Zero-copy FFI for maximum performance

Features

Text Embedding: Generate dense vector representations of text using BERT and Jina models
Batch Processing: Process multiple texts in one call, several times faster than sequential processing (3-4x measured; see benchmark/results.md)
Automatic Memory Management: NativeFinalizer ensures native resources are cleaned up automatically
Flexible Configuration: Customize model loading with precision (F32/F16), normalization, and batch size options
Semantic Similarity: Built-in cosine similarity computation for comparing embeddings
Cross-Platform: Native compilation for macOS, Linux, and Windows via Dart Native Assets
Type-Safe Errors: Comprehensive sealed error hierarchy for robust error handling

Installation

Add this to your package's pubspec.yaml:

dependencies:
  embedanythingindart:
    git:
      url: https://github.com/yourusername/embedanythingindart.git
      ref: main

Then run:

dart pub get

Prerequisites:

Dart SDK >=3.11.0
Rust toolchain 1.90.0 or later
Platform-specific build tools:
- macOS: Xcode Command Line Tools
- Linux: build-essential, pkg-config
- Windows: MSVC Build Tools

Quick Start

import 'package:embedanythingindart/embedanythingindart.dart';

void main() {
  // Load a model (first load downloads from HuggingFace, ~100-500MB)
  final embedder = EmbedAnything.fromPretrainedHf(
    model: EmbeddingModel.bert,
    modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  );

  // Generate a single embedding
  final result = embedder.embedText('Hello, world!');
  print('Embedding dimension: ${result.dimension}'); // 384

  // Batch processing (much faster for multiple texts)
  final texts = [
    'Machine learning is fascinating',
    'I love programming in Dart',
    'The weather is nice today',
  ];
  final embeddings = embedder.embedTextsBatch(texts);

  // Compute semantic similarity
  final similarity = embeddings[0].cosineSimilarity(embeddings[1]);
  print('Similarity: ${similarity.toStringAsFixed(4)}'); // 0.3245

  // Cleanup (automatic via finalizer, but manual is recommended)
  embedder.dispose();
}

Documentation

Comprehensive documentation is available to help you get the most out of EmbedAnythingInDart. We recommend reading the guides in the following order:

Getting Started - Installation, prerequisites, and your first embedding
Core Concepts - Architecture, key classes, and fundamental concepts
Usage Guide - Common patterns, real-world examples, and best practices
API Reference - Complete API documentation with all classes and methods
Models and Configuration - Choosing models, performance comparison, and configuration options
Error Handling - Error types, handling strategies, and troubleshooting
Advanced Topics - File embedding, streaming, optimization, and advanced patterns

Each guide includes working code examples extracted from the test suite and example application.

Supported Models

Model ID	Type	Dimensions	Speed	Quality	Best For
`sentence-transformers/all-MiniLM-L6-v2`	BERT	384	Fast	Good	General purpose
`sentence-transformers/all-MiniLM-L12-v2`	BERT	384	Medium	Better	Quality-focused tasks
`jinaai/jina-embeddings-v2-small-en`	Jina	512	Fast	Good	English semantic search
`jinaai/jina-embeddings-v2-base-en`	Jina	768	Medium	Excellent	High-quality retrieval

Model Download:

Models are downloaded from HuggingFace Hub on first use
Cached locally in ~/.cache/huggingface/hub
First load: 2-6 seconds depending on model (plus download time)
Subsequent loads: ~100-150ms (from cache)
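
To check whether a model is already cached before the first load, you can look for its folder under the cache directory. The sketch below assumes the standard HuggingFace Hub cache layout (`models--<org>--<name>`); the helper names `cacheFolderName` and `cachedModelPath` are hypothetical and not part of this library, and a custom cache location (for example via `HF_HOME`) is not handled.

```dart
import 'dart:io';

/// Folder name for a model repo, assuming the standard HuggingFace Hub
/// cache layout: "org/name" becomes "models--org--name".
String cacheFolderName(String modelId) =>
    'models--' + modelId.replaceAll('/', '--');

/// Expected cache directory for [modelId] under [cacheRoot].
String cachedModelPath(String cacheRoot, String modelId) =>
    cacheRoot + '/' + cacheFolderName(modelId);

void main() {
  // HOME on macOS/Linux, USERPROFILE on Windows; fall back to the current dir.
  final home = Platform.environment['HOME'] ??
      Platform.environment['USERPROFILE'] ??
      '.';
  final path = cachedModelPath(
    '$home/.cache/huggingface/hub',
    'sentence-transformers/all-MiniLM-L6-v2',
  );
  final status = Directory(path).existsSync() ? 'cached' : 'not cached yet';
  print('$path: $status');
}
```

If the folder is missing, expect the first load to include the download time.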

Usage

Loading a Model

Using Predefined Configurations

// BERT all-MiniLM-L6-v2 (recommended for most use cases)
final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());

// Alternatively, Jina v2-base-en (best quality)
final jinaEmbedder = EmbedAnything.fromConfig(ModelConfig.jinaV2Base());

Using Custom Configuration

final config = ModelConfig(
  modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  modelType: EmbeddingModel.bert,
  revision: 'main',        // Git branch/tag/commit
  dtype: ModelDtype.f16,   // F16 for faster inference
  normalize: true,         // Normalize to unit length
  defaultBatchSize: 64,    // Batch size for processing
);

final embedder = EmbedAnything.fromConfig(config);
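
To illustrate what the normalize option implies, here is a minimal pure-Dart sketch of L2 normalization. It is not the library's implementation; `l2Normalize` and `dot` are hypothetical helpers. It shows that once vectors have unit length, their dot product equals their cosine similarity.

```dart
import 'dart:math' as math;

/// Scales [vector] to unit L2 length. A zero vector is returned as a copy.
List<double> l2Normalize(List<double> vector) {
  var sumOfSquares = 0.0;
  for (final v in vector) {
    sumOfSquares += v * v;
  }
  final norm = math.sqrt(sumOfSquares);
  if (norm == 0) return List<double>.of(vector);
  return [for (final v in vector) v / norm];
}

/// Dot product of two equal-length vectors.
double dot(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length');
  }
  var sum = 0.0;
  for (var i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

void main() {
  final unit = l2Normalize([3.0, 4.0]);
  print(unit); // [0.6, 0.8]
  // For unit vectors the dot product is the cosine similarity,
  // so a vector compared with itself gives approximately 1.0.
  print(dot(unit, unit));
}
```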

Legacy API (Backward Compatible)

final embedder = EmbedAnything.fromPretrainedHf(
  model: EmbeddingModel.bert,
  modelId: 'sentence-transformers/all-MiniLM-L6-v2',
  revision: 'main',
);

Generating Embeddings

Single Text

final result = embedder.embedText('The quick brown fox');
print('Dimension: ${result.dimension}');
print('First 5 values: ${result.values.take(5)}');

Performance:

Short text (10 words): ~7-8ms
Medium text (100 words): ~8-15ms (varies by model)
Long text (500 words): ~15-30ms (varies by model)
Very long text (>512 tokens): truncated automatically

See benchmark/results.md for detailed performance data.

Batch Processing

final texts = List.generate(100, (i) => 'Text $i');
final results = embedder.embedTextsBatch(texts);

// Process results
for (var i = 0; i < results.length; i++) {
  print('Text $i: dimension ${results[i].dimension}');
}

Performance:

Batch processing is 3-4x faster than sequential (measured)
Recommended batch size: 32-128 items
BERT L6: ~775 items/sec for batch of 100
Memory usage scales with batch size

For comprehensive benchmarks, see benchmark/results.md.

Async Batch Processing with Progress Tracking

For large batches (100+ texts), use the async API with progress tracking:

final results = await embedder.embedTextsBatchAsync(
  largeTextList,
  chunkSize: 50,  // Optional: override default (32)
  onProgress: (completed, total) {
    print('Progress: $completed / $total');
  },
);

Migration Note (v0.2.0+): embedTextsBatchAsync() now splits large batches into chunks automatically to prevent memory issues and system overload. This is a behavior change: previous versions sent all texts to Rust at once. The chunk size defaults to ModelConfig.defaultBatchSize (default: 32); pass the chunkSize parameter to override it.
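
The chunking behavior described in the note can be pictured with the following pure-Dart sketch. It is an illustration under stated assumptions, not the library's code: `chunked` and `processInChunks` are hypothetical helpers, and `embedChunk` is a stand-in for the real embedding call.

```dart
/// Splits [items] into consecutive chunks of at most [chunkSize] elements.
List<List<T>> chunked<T>(List<T> items, int chunkSize) {
  if (chunkSize <= 0) {
    throw ArgumentError.value(chunkSize, 'chunkSize', 'must be positive');
  }
  final chunks = <List<T>>[];
  for (var start = 0; start < items.length; start += chunkSize) {
    final end =
        start + chunkSize < items.length ? start + chunkSize : items.length;
    chunks.add(items.sublist(start, end));
  }
  return chunks;
}

/// Processes [texts] one chunk at a time and reports progress after each
/// chunk. [embedChunk] is a hypothetical stand-in for the embedding call.
Future<List<R>> processInChunks<R>(
  List<String> texts, {
  required Future<List<R>> Function(List<String> chunk) embedChunk,
  int chunkSize = 32,
  void Function(int completed, int total)? onProgress,
}) async {
  final results = <R>[];
  for (final chunk in chunked(texts, chunkSize)) {
    results.addAll(await embedChunk(chunk));
    onProgress?.call(results.length, texts.length);
  }
  return results;
}

Future<void> main() async {
  final texts = List.generate(70, (i) => 'Text $i');
  // The fake "embedding" here is just the text length.
  final lengths = await processInChunks<int>(
    texts,
    embedChunk: (chunk) async => [for (final t in chunk) t.length],
    onProgress: (completed, total) => print('Progress: $completed / $total'),
  );
  print(lengths.length); // 70
}
```

With 70 texts and the default chunk size of 32, the progress callback fires three times: at 32, 64, and 70.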

Thread Pool Configuration

For large batch operations, you can limit the number of threads used for parallel computation. This is especially important on machines with many CPU cores:

// Configure BEFORE loading any models (call once at app startup)
EmbedAnything.configureThreadPool(4);  // Limit to 4 threads

// Check current thread pool size
print('Thread pool size: ${EmbedAnything.getThreadPoolSize()}');

// Now load models - they will use the configured thread pool
final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());

Recommended thread counts:

4-8 threads for most use cases
2-4 threads for memory-constrained environments
Default (num_cpus) may cause high memory usage on machines with many cores

Note: Thread pool configuration must happen before ANY embedding operations. Once the pool is initialized, it cannot be reconfigured.
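
One way to pick a value within the recommended ranges is to cap the machine's core count. The sketch below is only a suggestion; `recommendedThreadCount` is a hypothetical helper, not part of the library. Its result would be passed to EmbedAnything.configureThreadPool() before any model is loaded.

```dart
import 'dart:io' show Platform;
import 'dart:math' as math;

/// Caps [cpuCores] at the upper end of the recommended range:
/// 8 threads normally, 4 in memory-constrained environments.
int recommendedThreadCount(int cpuCores, {bool memoryConstrained = false}) {
  final upper = memoryConstrained ? 4 : 8;
  // Never use more threads than cores, and never fewer than one.
  return math.max(1, math.min(cpuCores, upper));
}

void main() {
  final threads = recommendedThreadCount(Platform.numberOfProcessors);
  print('Using $threads threads');
  // Then, at app startup and before loading any model:
  // EmbedAnything.configureThreadPool(threads);
}
```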

Computing Similarity

final emb1 = embedder.embedText('I love machine learning');
final emb2 = embedder.embedText('Machine learning is great');
final emb3 = embedder.embedText('I enjoy cooking');

final sim12 = emb1.cosineSimilarity(emb2);
final sim13 = emb1.cosineSimilarity(emb3);

print('Related texts: ${sim12.toStringAsFixed(4)}');    // ~0.87
print('Unrelated texts: ${sim13.toStringAsFixed(4)}');  // ~0.21

Interpreting Similarity Scores:

0.9-1.0: Nearly identical meaning
0.7-0.9: Highly related
0.5-0.7: Moderately related
0.3-0.5: Somewhat related
0.0-0.3: Weakly related or unrelated
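
For reference, cosine similarity and the score bands above can be expressed in a few lines of plain Dart. This is a minimal sketch, not the library's built-in implementation; `cosineSimilarity` (as a free function) and `describeSimilarity` are hypothetical helpers operating on raw lists of doubles.

```dart
import 'dart:math' as math;

/// Cosine similarity of two equal-length vectors.
/// Returns 0.0 if either vector has zero length.
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length');
  }
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Maps a score to the interpretation bands listed above.
String describeSimilarity(double score) {
  if (score >= 0.9) return 'Nearly identical meaning';
  if (score >= 0.7) return 'Highly related';
  if (score >= 0.5) return 'Moderately related';
  if (score >= 0.3) return 'Somewhat related';
  return 'Weakly related or unrelated';
}

void main() {
  final score = cosineSimilarity([1.0, 0.0], [1.0, 1.0]);
  print(score.toStringAsFixed(4)); // 0.7071
  print(describeSimilarity(score)); // Highly related
}
```

The bands are rules of thumb; useful thresholds vary by model and domain, so calibrate them on your own data.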

Error Handling

try {
  final embedder = EmbedAnything.fromPretrainedHf(
    model: EmbeddingModel.bert,
    modelId: 'invalid/model',
  );
} on EmbedAnythingError catch (e) {
  switch (e) {
    case ModelNotFoundError():
      print('Model not found: ${e.modelId}');
      print('Check https://huggingface.co/ for valid models');
    case InvalidConfigError():
      print('Invalid configuration: ${e.field} - ${e.reason}');
    case EmbeddingFailedError():
      print('Embedding failed: ${e.reason}');
    case MultiVectorNotSupportedError():
      print('Multi-vector embeddings not yet supported');
    case FFIError():
      print('FFI error: ${e.operation}');
  }
}

Performance Characteristics

Note: The figures below are representative values from Phase 1 testing and may differ from the measured results in benchmark/results.md. Actual performance depends on hardware, model, and text characteristics.

Model Loading

Model	Cold Start (first time)	Warm Start (cached)	Memory
BERT all-MiniLM-L6-v2	2-5 seconds + download	~100ms	~90MB
Jina v2-base-en	3-6 seconds + download	~150ms	~280MB

Note: Cold start includes model download (100-500MB depending on model).

Single Embedding Latency

Text Length	BERT MiniLM-L6	Jina v2-base
Short (10 words)	~5-10ms	~10-15ms
Medium (100 words)	~8-15ms	~12-20ms
Long (500 words)	~15-30ms	~20-40ms

Platform note: Performance varies by CPU. M1/M2 Macs show ~1.5x speedup over Intel equivalents.

Batch Throughput

Batch Size	Processing Time	Speedup vs Sequential
10 items	~20-30ms	~3x faster
100 items	~170-250ms	~5x faster
1000 items	~1.5-2.5s	~8x faster

Recommendation: Use batch processing for 10+ items for best performance.

Memory Usage

Base library overhead: ~10MB
Per-model memory: 45-280MB (depends on model and dtype)
Per-embedding overhead: Negligible (<1KB per embedding)
Batch processing: Temporary memory scales linearly with batch size
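
When planning to keep many embeddings in memory, the raw vector payload (separate from the per-embedding overhead above) can be estimated as count × dimension × bytes per value. The sketch below assumes 4 bytes per value (32-bit floats); how the library stores values on the Dart side is not specified here, so treat `bytesPerValue` as an assumption. `vectorPayloadBytes` is a hypothetical helper.

```dart
/// Raw payload size in bytes for [count] vectors of [dimension] values.
/// [bytesPerValue] defaults to 4 (32-bit floats), which is an assumption.
int vectorPayloadBytes({
  required int count,
  required int dimension,
  int bytesPerValue = 4,
}) =>
    count * dimension * bytesPerValue;

void main() {
  // 100,000 embeddings from a 384-dimension model.
  final bytes = vectorPayloadBytes(count: 100000, dimension: 384);
  print('${(bytes / (1024 * 1024)).toStringAsFixed(1)} MiB'); // 146.5 MiB
}
```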

Memory Management

Automatic Cleanup (Recommended for Most Cases)

void processTexts() {
  final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());
  final result = embedder.embedText('test');
  // No dispose() needed - finalizer cleans up when embedder is garbage collected
}

Manual Cleanup (Recommended for Long-Running Applications)

void processTexts() {
  final embedder = EmbedAnything.fromConfig(ModelConfig.bertMiniLML6());
  try {
    final result = embedder.embedText('test');
    // Use result...
  } finally {
    embedder.dispose(); // Immediate cleanup
  }
}

Best Practices

Manual dispose() for long-running services or apps with many embedders
Automatic cleanup for short-lived scripts or infrequent usage
Reuse embedders instead of creating many instances
Use batch processing to minimize overhead
Avoid creating embedders in loops - create once, reuse many times

Platform Support

Supported Platforms

Platform	Architecture	Status
macOS	x64 (Intel)	Supported
macOS	ARM64 (Apple Silicon)	Supported
Linux	x64	Supported
Linux	ARM64	Supported
Windows	x64	Supported

Platform Requirements

macOS:

macOS 11.0 or later
Xcode Command Line Tools: xcode-select --install

Linux:

Debian/Ubuntu: apt install build-essential pkg-config
Fedora/RHEL: dnf install gcc pkg-config
Arch: pacman -S base-devel

Windows:

Visual Studio 2019 or later with C++ Build Tools
Or Windows SDK + MSVC from VS Build Tools

First Build

The first build compiles 488 Rust crates and will take 5-15 minutes depending on your hardware. Subsequent builds are incremental and much faster (<30 seconds).

# First build - patience required!
dart run --enable-experiment=native-assets example/embedanythingindart_example.dart

Troubleshooting

For detailed troubleshooting, see TROUBLESHOOTING.md.

Common issues:

Issue	Quick Solution
First build extremely slow	Expected - compiling 488 crates takes 5-15 minutes
Model download fails	Check internet connection, set HF_TOKEN for private models
Asset not found error	Verify asset name consistency in Cargo.toml, build.dart, bindings.dart
Out of memory	Reduce batch size or use F16 dtype
Tests fail	Ensure internet connection for model downloads

API Reference

Comprehensive Documentation

For detailed guides with examples and best practices, see the Documentation section above, particularly:

API Reference - Complete API documentation with signatures, parameters, and examples
Usage Guide - Practical usage patterns
Error Handling - Error types and handling strategies

Generated API Docs

Auto-generated dartdoc API documentation is also available:

dart doc
dart pub global activate dhttpd
dhttpd --path doc/api

Then open http://localhost:8080 in your browser.

Core Classes:

EmbedAnything - Main embedder interface
EmbeddingResult - Embedding vector result
ModelConfig - Model configuration
EmbeddingModel - Model architecture enum
ModelDtype - Model data type enum
EmbedAnythingError - Error hierarchy

Contributing

Contributions are welcome! Please follow these guidelines:

Development Setup

# Clone the repository
git clone https://github.com/yourusername/embedanythingindart.git
cd embedanythingindart

# Install dependencies
dart pub get

# Install Rust targets (the subshell keeps you in the project root)
(cd rust && rustup show)

# Run tests
dart test --enable-experiment=native-assets

# Run analyzer
dart analyze

# Format code
dart format .

Testing

# Run all tests
dart test --enable-experiment=native-assets

# Run specific test file
dart test --enable-experiment=native-assets test/model_config_test.dart

# Run with coverage
dart test --enable-experiment=native-assets --coverage=coverage

# Run Phase 3 file/directory embedding tests specifically
dart test --enable-experiment=native-assets test/phase3_integration_test.dart

Phase 3 Test Requirements:

Test fixtures are located in test/fixtures/
Integration tests require internet connection on first run to download BERT model (~90MB)
Subsequent runs use cached model from ~/.cache/huggingface/hub
Tests verify:
- File embedding (.txt, .md files)
- Directory streaming with extension filtering
- Error handling (FileNotFoundError, UnsupportedFileFormatError)
- Metadata parsing and ChunkEmbedding utilities
- Memory management (no leaks)

For detailed information about Phase 3 features, see test fixture documentation in test/fixtures/README.md.

Code Standards

Follow Effective Dart guidelines
Add dartdoc comments to all public APIs
Ensure dart analyze passes with zero issues
Ensure cargo clippy passes with zero warnings
Write tests for new features
Update CHANGELOG.md with changes

Build System

This project uses Dart's Native Assets system. Key files:

hook/build.dart - Native asset build hook
rust/Cargo.toml - Rust crate configuration
rust-toolchain.toml - Rust version pinning

Asset name consistency is critical:

rust/Cargo.toml: name = "embedanything_dart"
hook/build.dart: assetName: 'embedanything_dart'
lib/src/ffi/bindings.dart: assetId: 'package:embedanythingindart/embedanything_dart'
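
The relationship between the three names can be written down as a small check. This is a sketch using the values listed above; `expectedAssetId` and `assetNamesConsistent` are hypothetical helpers, and reading the names out of the actual files is left out.

```dart
/// Asset ID that the Dart bindings should use for a native asset.
String expectedAssetId(String packageName, String assetName) =>
    'package:$packageName/$assetName';

/// True when the Cargo crate name, the hook's asset name, and the
/// asset ID in the bindings all agree.
bool assetNamesConsistent({
  required String packageName,
  required String cargoName,
  required String hookAssetName,
  required String bindingsAssetId,
}) =>
    cargoName == hookAssetName &&
    bindingsAssetId == expectedAssetId(packageName, hookAssetName);

void main() {
  final ok = assetNamesConsistent(
    packageName: 'embedanythingindart',
    cargoName: 'embedanything_dart',
    hookAssetName: 'embedanything_dart',
    bindingsAssetId: 'package:embedanythingindart/embedanything_dart',
  );
  print(ok); // true
}
```

A mismatch in any of the three is the usual cause of the "Asset not found" error listed under Troubleshooting.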

License

This project is licensed under the MIT License - see the LICENSE file for details.

The underlying EmbedAnything Rust library is licensed under the Apache License 2.0.

Acknowledgments

EmbedAnything - The Rust library this wraps
HuggingFace - For hosting the embedding models
sentence-transformers - For the BERT models
Jina AI - For the Jina embedding models

Related Projects

surrealdartb - SurrealDB client for Dart with vector support
EmbedAnything - Upstream Rust library
Candle - Rust ML framework powering EmbedAnything

Roadmap

Phase 2: Production Readiness

CI/CD pipeline with multi-platform testing
Automated release process
Performance optimizations
Security audit

Phase 3: Multi-Modal Expansion

PDF, DOCX, Markdown file embedding
Image embedding (CLIP, ColPali)
Audio embedding (Whisper)

Phase 4: Advanced Features

Multi-vector embedding support (ColBERT)
GPU acceleration (CUDA, Metal)
Model quantization (INT8, INT4)
Custom tokenizer configuration

Phase 5: Ecosystem Integration

Vector database adapters (Pinecone, Weaviate, Qdrant)
Cloud provider embeddings (OpenAI, Cohere)
Mobile platform support (iOS, Android)

Questions or Issues?

Open an issue on GitHub
Read the comprehensive documentation for guides and examples
Check Error Handling for troubleshooting common problems
Review the API Reference for complete API documentation
