latent

Peer into latent space.

A terminal UI for visualizing high-dimensional text embeddings through dimensionality reduction. latent embeds text with Ollama's nomic-embed-text model (768-dimensional vectors), persists the vectors to a Qdrant vector database over gRPC, and projects them to 2D using either PCA (SVD-based) or UMAP, a nonlinear manifold-approximation method. HDBSCAN clustering reveals semantic structure without requiring the number of clusters (k) to be chosen in advance. Built with Bubble Tea and Lipgloss.
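
To make the projection step concrete, here is a minimal, standard-library-only sketch of reducing vectors to 2D with PCA. It is not latent's implementation: the README describes SVD-based PCA, whereas this sketch swaps in power iteration on the covariance matrix with deflation, which finds the same leading directions on well-behaved data. All function names (center, covariance, dominant, pca2) are hypothetical, and the input is assumed to have at least two rows of equal length.

```go
package main

import (
	"fmt"
	"math"
)

// center returns a copy of x with the per-dimension mean subtracted.
func center(x [][]float64) [][]float64 {
	n, d := len(x), len(x[0])
	mean := make([]float64, d)
	for _, row := range x {
		for j, v := range row {
			mean[j] += v / float64(n)
		}
	}
	out := make([][]float64, n)
	for i, row := range x {
		out[i] = make([]float64, d)
		for j, v := range row {
			out[i][j] = v - mean[j]
		}
	}
	return out
}

// covariance returns the d x d sample covariance of already-centered rows.
func covariance(xc [][]float64) [][]float64 {
	n, d := len(xc), len(xc[0])
	c := make([][]float64, d)
	for a := range c {
		c[a] = make([]float64, d)
		for b := range c[a] {
			for _, row := range xc {
				c[a][b] += row[a] * row[b] / float64(n-1)
			}
		}
	}
	return c
}

func dot(a, b []float64) float64 {
	s := 0.0
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// dominant approximates the leading eigenvector and eigenvalue of a
// symmetric positive semidefinite matrix by power iteration.
func dominant(c [][]float64, iters int) ([]float64, float64) {
	d := len(c)
	v := make([]float64, d)
	for i := range v {
		v[i] = float64(i + 1) // fixed, non-uniform start vector
	}
	n0 := math.Sqrt(dot(v, v))
	for i := range v {
		v[i] /= n0
	}
	lambda := 0.0
	for it := 0; it < iters; it++ {
		w := make([]float64, d)
		for a := range c {
			w[a] = dot(c[a], v)
		}
		norm := math.Sqrt(dot(w, w))
		if norm == 0 {
			break // no remaining variance
		}
		for a := range w {
			w[a] /= norm
		}
		v, lambda = w, norm
	}
	return v, lambda
}

// pca2 projects each row of x onto its first two principal components.
func pca2(x [][]float64) [][]float64 {
	xc := center(x)
	c := covariance(xc)
	pc1, l1 := dominant(c, 200)
	for a := range c { // deflate: remove the first component's variance
		for b := range c[a] {
			c[a][b] -= l1 * pc1[a] * pc1[b]
		}
	}
	pc2, _ := dominant(c, 200)
	out := make([][]float64, len(xc))
	for i, row := range xc {
		out[i] = []float64{dot(row, pc1), dot(row, pc2)}
	}
	return out
}

func main() {
	points := [][]float64{{2, 0, 0}, {-2, 0, 0}, {0, 1, 0}, {0, -1, 0}}
	for _, p := range pca2(points) {
		fmt.Printf("%.3f %.3f\n", p[0], p[1])
	}
}
```

Power iteration keeps the sketch dependency-free; for real 768-dimensional embeddings, an SVD routine from a numerical library would normally be the more robust choice.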

Prerequisites

  • Ollama serving nomic-embed-text on localhost:11434
  • Qdrant running on localhost:6334 (gRPC)

Install

curl -sSL https://raw.githubusercontent.com/alDuncanson/latent/main/install.sh | bash

or

go install github.com/alDuncanson/latent@latest

Usage

latent                    # Start TUI
latent dataset.csv        # Import from CSV (requires `text` column)
latent dataset.json       # Import from JSON (array of strings or {text: ...} objects)
latent --preload          # Seed with demo word list
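
The two JSON shapes accepted above (an array of strings, or an array of objects with a text field) can be read with one pass over the raw elements. This is a sketch of one way to do it, not latent's actual parser; parseTexts is a hypothetical name, and accepting arrays that mix both shapes is an assumption of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseTexts accepts either ["a", "b"] or [{"text": "a"}, {"text": "b"}]
// and returns the texts in order.
func parseTexts(data []byte) ([]string, error) {
	var items []json.RawMessage
	if err := json.Unmarshal(data, &items); err != nil {
		return nil, fmt.Errorf("expected a JSON array: %w", err)
	}
	texts := make([]string, 0, len(items))
	for i, raw := range items {
		// Try the plain-string form first; a pointer distinguishes null from "".
		var s *string
		if err := json.Unmarshal(raw, &s); err == nil && s != nil {
			texts = append(texts, *s)
			continue
		}
		// Fall back to the object form with a text field.
		var obj struct {
			Text *string `json:"text"`
		}
		if err := json.Unmarshal(raw, &obj); err != nil || obj.Text == nil {
			return nil, fmt.Errorf("item %d: expected a string or an object with a text field", i)
		}
		texts = append(texts, *obj.Text)
	}
	return texts, nil
}

func main() {
	texts, err := parseTexts([]byte(`["cat", {"text": "dog"}]`))
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(texts)
}
```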

Hugging Face Datasets

latent --hf-dataset stanfordnlp/imdb --hf-split test --hf-max-rows 50
latent --hf-dataset rajpurkar/squad --hf-column question --hf-max-rows 200

Flags: --hf-dataset, --hf-split (default: train), --hf-column (default: text), --hf-max-rows (default: 100)
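
For reference, the four flags and their documented defaults map naturally onto Go's standard flag package. The sketch below is illustrative only: the hfOptions struct, the parseHFFlags function, and the help strings are hypothetical, and latent may parse its arguments differently.

```go
package main

import (
	"flag"
	"fmt"
)

// hfOptions holds the four documented flags; the struct is hypothetical.
type hfOptions struct {
	Dataset string
	Split   string
	Column  string
	MaxRows int
}

// parseHFFlags parses args using the defaults listed in the README.
func parseHFFlags(args []string) (hfOptions, error) {
	var o hfOptions
	fs := flag.NewFlagSet("latent", flag.ContinueOnError)
	fs.StringVar(&o.Dataset, "hf-dataset", "", "Hugging Face dataset ID")
	fs.StringVar(&o.Split, "hf-split", "train", "dataset split")
	fs.StringVar(&o.Column, "hf-column", "text", "column containing the text")
	fs.IntVar(&o.MaxRows, "hf-max-rows", 100, "maximum number of rows to import")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseHFFlags([]string{
		"--hf-dataset", "rajpurkar/squad",
		"--hf-column", "question",
		"--hf-max-rows", "200",
	})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

In this example --hf-split is omitted, so it falls back to train, matching the second command above.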