zvec is an open-source, in-process vector database designed to be lightweight, lightning-fast, and directly embedded into applications (README.md 30-31). Built on Proxima, Alibaba's battle-tested vector search engine, it provides production-grade similarity search with minimal configuration (README.md 30-31).
As an in-process library, zvec eliminates the need for external server management and runs wherever your application code runs, from local notebooks and CLI tools to high-performance servers (README.md 47-48).
For a step-by-step guide on installation and your first query, see Getting Started.
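To make "similarity search" concrete, the sketch below runs an exact, brute-force top-k search over a handful of vectors in pure Python. It does not use zvec and is not how zvec searches internally; the function names `cosine_similarity` and `brute_force_topk` and the sample data are illustrative assumptions. It shows only the result an approximate index is trying to reproduce faster.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_topk(docs, query, k):
    """Return the k (doc_id, score) pairs most similar to query, best first."""
    scored = ((doc_id, cosine_similarity(vec, query)) for doc_id, vec in docs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical sample data: three documents with 3-dimensional embeddings.
docs = {
    "doc_1": [1.0, 0.0, 0.0],
    "doc_2": [0.0, 1.0, 0.0],
    "doc_3": [0.9, 0.1, 0.0],
}
top = brute_force_topk(docs, query=[1.0, 0.0, 0.0], k=2)
print([doc_id for doc_id, _ in top])
```

Scanning every vector costs time proportional to the collection size, which is the cost that index structures such as HNSW or IVF are meant to reduce.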
The zvec architecture is composed of layers that bridge high-level application logic to hardware-accelerated search kernels.
The following diagram illustrates how the major subsystems relate, from the Python/Node.js entry points down to the core engine and persistence layer.
System Component Map
Sources: README.md 51-64, examples/c++/db/main.cc 168-188, CMakeLists.txt 58-62, README.md 36-37
The primary interface for users is the Python package, which provides classes such as CollectionSchema, VectorQuery, and Doc to manage data (README.md 77-103). It uses pybind11 to wrap the C++ implementation (CMakeLists.txt 58-59).
Key entities: zvec.CollectionSchema, zvec.VectorSchema, zvec.Doc, zvec.VectorQuery (README.md 81-99)
Database layer (zvec_db)
This layer manages the lifecycle of collections, schemas, and data persistence. It supports a wide variety of data types, including scalars (INT32, STRING, etc.) and vectors (FP32, FP16, INT8, Sparse) (examples/c++/db/main.cc 21-162).
Key entities: CollectionSchema (examples/c++/db/main.cc 168-169), FieldSchema (examples/c++/db/main.cc 172-173), Doc (examples/c++/db/main.cc 12-14)
Core engine (zvec_core)
This layer is the algorithmic heart of zvec and contains the implementations of Approximate Nearest Neighbor (ANN) search. It uses an IndexFactory to instantiate index types such as HNSW or IVF based on configuration (examples/c++/core/main.cc 4-5, examples/c++/core/main.cc 16).
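The idea of instantiating an index type from configuration can be sketched with a small registry-based factory. This is a toy illustration in Python, not zvec's IndexFactory API: `ToyIndexFactory`, `FlatIndex`, and the config keys `type` and `dim` are all hypothetical, and only an exact (flat) index is implemented. An HNSW or IVF implementation would be registered under its own name in the same way.

```python
class FlatIndex:
    """Hypothetical exact index: scans every stored vector on each query."""

    def __init__(self, dim, **params):
        self.dim = dim
        self.params = params
        self.rows = []

    def add(self, doc_id, vector):
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.rows.append((doc_id, list(vector)))

    def search(self, query, topk):
        # Squared Euclidean distance; smaller means closer.
        scored = [
            (doc_id, sum((a - b) ** 2 for a, b in zip(vec, query)))
            for doc_id, vec in self.rows
        ]
        return sorted(scored, key=lambda pair: pair[1])[:topk]


class ToyIndexFactory:
    """Hypothetical factory: maps a config's 'type' string to an index class."""

    _registry = {}

    @classmethod
    def register(cls, name, index_cls):
        cls._registry[name] = index_cls

    @classmethod
    def create(cls, config):
        kind = config["type"]
        if kind not in cls._registry:
            raise ValueError(f"unknown index type: {kind}")
        # Everything except 'type' and 'dim' is passed through as index parameters.
        params = {k: v for k, v in config.items() if k not in ("type", "dim")}
        return cls._registry[kind](config["dim"], **params)


ToyIndexFactory.register("flat", FlatIndex)

index = ToyIndexFactory.create({"type": "flat", "dim": 2})
index.add("a", [0.0, 0.0])
index.add("b", [1.0, 1.0])
index.add("c", [0.2, 0.1])
hits = index.search([0.0, 0.1], topk=2)
print([doc_id for doc_id, _ in hits])
```

The calling code depends only on `add` and `search`, so switching index types becomes a configuration change rather than a code change.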
Key entities: IndexFactory, Index, HNSWIndexParamBuilder, HNSWQueryParamBuilder (examples/c++/core/main.cc 4-7)
Hardware abstraction library (ailego)
A low-level library providing hardware abstraction. It includes SIMD-accelerated math kernels with auto-detection for CPU features such as AVX-512 and AVX2 (cmake/option.cmake 127-189, CMakeLists.txt 50-53).
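Auto-detection of CPU features usually amounts to picking the widest supported kernel and falling back to a scalar one. The sketch below shows that selection logic only. It is not ailego code: `KERNELS`, `select_kernel`, and the way flags are passed in as a set are assumptions, and every entry shares one pure-Python implementation where a native library would supply separately compiled kernels.

```python
def dot_scalar(a, b):
    """Plain dot product; stands in for every kernel variant in this sketch."""
    return sum(x * y for x, y in zip(a, b))


# Ordered from widest instruction set to narrowest. In a native library each
# entry would point to a different compiled kernel; here they are all the same.
KERNELS = [
    ("avx512f", dot_scalar),
    ("avx2", dot_scalar),
]


def select_kernel(cpu_flags):
    """Return (name, function) for the first kernel whose flag is present."""
    for name, fn in KERNELS:
        if name in cpu_flags:
            return name, fn
    return "scalar", dot_scalar  # portable fallback when no SIMD flag matches


# Hypothetical flag set, e.g. as parsed from the OS; detection itself is omitted.
name, dot = select_kernel({"sse4_2", "avx2"})
print(name, dot([1, 2, 3], [4, 5, 6]))
```

Callers hold only the selected function, so the dispatch decision is made once rather than on every distance computation.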
The following diagram bridges the conceptual subsystems to specific code entities and library targets within the repository.
Code Entity Space Bridge
Sources: examples/c++/db/main.cc 5-8, examples/c++/core/main.cc 4-7, CMakeLists.txt 75-80, README.md 36-37
Zvec includes a robust set of tools for performance evaluation and offline index construction:
local_builder: builds large-scale indices from raw vector files (tools/core/README.md 136)
recall: measures the accuracy of ANN indices against ground truth (tools/core/README.md 142)
bench: runs high-concurrency performance benchmarks (tools/core/README.md 147)
Build options: ENABLE_SKYLAKE_AVX512 and BUILD_PYTHON_BINDINGS (CONTRIBUTING.md 72-76, CMakeLists.txt 58)
Sources: README.md 30-31, CONTRIBUTING.md 12-46, CMakeLists.txt 64-67, tools/core/README.md 91-148
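Measuring ANN accuracy against ground truth is commonly expressed as recall@k: the fraction of the exact top-k neighbours that the approximate result also returned. The sketch below uses that common definition. It is not the implementation of the recall tool, and the function names and sample IDs are illustrative assumptions.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbours present in the approximate top-k."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(approx_results, exact_results, k):
    """Average recall@k over a batch of queries."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)


# Hypothetical results for two queries: exact neighbours vs. ANN output.
exact = [[1, 2, 3, 4], [5, 6, 7, 8]]
approx = [[1, 2, 9, 4], [5, 6, 7, 8]]  # first query misses neighbour 3

r = mean_recall(approx, exact, k=4)
print(r)
```

Order within the top-k is ignored here; a stricter metric would also compare ranks or distances.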