English | 中文
A lightweight RAG (Retrieval-Augmented Generation) knowledge base with no heavyweight dependencies, built with FastAPI + Streamlit + ChromaDB + FastEmbed. It includes a framework-agnostic Agent Skill module for semantic search over a local document corpus.
Pre-loaded with classic Chinese literary works as sample data.
| Feature | Description |
|---|---|
| Knowledge CRUD | Add, edit, delete documents via UI or file upload; auto-persisted to disk |
| Semantic Search | Local ONNX embedding (BAAI/bge-small-zh-v1.5) with cosine similarity ranking |
| Streaming Q&A | DeepSeek-powered RAG; falls back gracefully to character-level streaming when no API key is set |
| Agent Skill | Framework-adaptive skill compatible with Hermas, OpenClaw, and plain Python |
| Anti-hallucination | Cosine threshold filter + strict system prompt with citation enforcement |
| Zero Heavy Deps | No PyTorch / CUDA required — runs on CPU via ONNX runtime |
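The cosine ranking and threshold filter listed in the table can be sketched in plain Python. This is only an illustration of the idea, not the project's code: the function names, the `threshold` default of 0.5, and `top_k` are assumptions, and the real embeddings come from FastEmbed rather than hand-written vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_and_filter(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the project
    top_k: int = 3,          # assumed result limit
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs at or above the threshold, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    kept = [(i, s) for i, s in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

When nothing clears the threshold the function returns an empty list, which lets the caller answer "not found" instead of passing weak context to the LLM.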
┌──────────────────────────────────────────────────────────┐
│                       User / Agent                       │
│      Streamlit UI ←→ FastAPI Backend ←→ Agent Skill      │
└────────────────────────┬─────────────────────────────────┘
                         │
          ┌──────────────▼──────────────┐
          │         VectorStore         │
          │  FastEmbed (ONNX, local)    │
          │  ChromaDB (in-memory)       │
          └──────────────┬──────────────┘
                         │
          ┌──────────────▼──────────────┐
          │         RAG Engine          │
          │  DeepSeek Chat API          │
          │  ↳ fallback: char streaming │
          └─────────────────────────────┘
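The "char streaming" fallback in the diagram could look roughly like the sketch below. It assumes that, without an API key, the engine streams the retrieved passages themselves one character at a time; the README does not say what the fallback text contains, so `build_fallback_answer`, `stream_chars`, and the message wording are hypothetical.

```python
from typing import Iterator, List


def build_fallback_answer(passages: List[str]) -> str:
    """Join retrieved passages into a numbered answer (assumed fallback content)."""
    if not passages:
        return "No relevant content found in the knowledge base."
    return "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))


def stream_chars(text: str) -> Iterator[str]:
    """Yield the text one character at a time, mimicking token streaming."""
    for ch in text:
        yield ch
```

A FastAPI route could hand `stream_chars(...)` to a streaming response so that the UI behaves the same with or without an LLM behind it.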
Data persistence: Documents are stored as .txt files under agent/data/ and reloaded into the in-memory vector store on every backend start.
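As a rough sketch of the reload step described above, the snippet below reads every `.txt` file in a directory and splits it into overlapping chunks. The chunk size, the overlap, and the function names are assumptions for illustration; the actual logic lives in `vector_store.py` and may differ.

```python
from pathlib import Path
from typing import Dict, List


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def load_corpus(data_dir: str) -> Dict[str, List[str]]:
    """Map each .txt file's stem to its list of chunks, in sorted file order."""
    corpus: Dict[str, List[str]] = {}
    for path in sorted(Path(data_dir).glob("*.txt")):
        corpus[path.stem] = chunk_text(path.read_text(encoding="utf-8"))
    return corpus
```

On startup, each chunk would then be embedded and added to the in-memory store, which is why the corpus on disk stays the single source of truth.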
lightweight-rag-agent/
├── agent/
│ ├── backend/
│ │ ├── config.py # Centralized config via pydantic-settings
│ │ ├── embeddings.py # FastEmbed ONNX wrapper
│ │ ├── models.py # Pydantic request/response models
│ │ ├── rag.py # Streaming RAG + fallback streamer
│ │ ├── vector_store.py # In-memory ChromaDB + chunking logic
│ │ └── main.py # FastAPI routes + lifespan hooks
│ ├── frontend/
│ │ └── app.py # Streamlit UI (3 tabs)
│ ├── data/ # Persisted document corpus (auto-loaded)
│ ├── .env.example # Environment variable template
│ └── requirements.txt
├── skills/
│ ├── SKILL.md # Skill spec (trigger rules, tool schema)
│ ├── rag_skill.py # Agent Skill implementation
│ └── test_skill.py # Automated test suite (4 cases)
├── DESIGN.md # Architecture & design decisions
├── SETUP.md # Detailed setup & deployment guide
└── README.md # This file
# 1. Install dependencies
cd agent
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# 2. Configure (optional — works without API key via fallback)
cp .env.example .env
# Edit .env and set DEEPSEEK_API_KEY=sk-your-key
# 3. Start backend
uvicorn backend.main:app --host 0.0.0.0 --port 8000
# 4. Start frontend (new terminal)
streamlit run frontend/app.py --server.port 8501

Open http://localhost:8501; the knowledge base is pre-loaded with sample documents.
See SETUP.md for full deployment details.
# With backend running:
./agent/venv/bin/python skills/test_skill.py

This runs 4 automated test cases: normal retrieval, semantic generalization, empty-input guard, and connection-error handling.
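The empty-input guard and connection-error handling that the tests exercise might be structured like the following standard-library sketch. The `/search` route, the JSON payload shape, and the returned dictionary layout are all assumptions made for illustration; consult `skills/rag_skill.py` and `SKILL.md` for the actual interface.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict


def search_knowledge(
    query: str,
    base_url: str = "http://localhost:8000",
    timeout: float = 5.0,
) -> Dict[str, Any]:
    """Query the backend and return a result dict instead of raising on failure."""
    # Guard: reject empty or whitespace-only queries before any network call.
    if not query or not query.strip():
        return {"ok": False, "error": "empty query"}

    payload = json.dumps({"query": query.strip()}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/search",  # hypothetical route, not confirmed by the README
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
            return {"ok": True, "results": body}
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Covers refused connections, timeouts, HTTP errors, and malformed JSON.
        return {"ok": False, "error": f"backend unreachable or invalid response: {exc}"}
```

Returning a structured error rather than raising keeps the skill usable from agent frameworks that expect a tool call to always produce a value.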
| Layer | Technology |
|---|---|
| Backend | FastAPI, Uvicorn |
| Frontend | Streamlit |
| Embeddings | FastEmbed + BAAI/bge-small-zh-v1.5 (ONNX) |
| Vector DB | ChromaDB (in-memory) |
| LLM | DeepSeek Chat API (OpenAI-compatible) |
| Config | pydantic-settings, python-dotenv |
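A lightweight RAG knowledge base demo built with FastAPI + Streamlit + ChromaDB + FastEmbed, with a built-in framework-adaptive Agent Skill module for semantic retrieval over a local document corpus.
Ships with classic Chinese literary works as a sample dataset and works out of the box.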
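| Feature | Description |
|---|---|
| Knowledge Base CRUD | Form entry and file upload supported; changes sync to disk automatically |
| Semantic Search | Local ONNX embedding model (BAAI/bge-small-zh-v1.5), ranked by cosine similarity |
| Streaming Q&A | Powered by DeepSeek; falls back automatically to character-by-character streaming when no API key is set |
| Agent Skill | Cross-framework compatible (Hermas / OpenClaw / native Python) |
| Anti-hallucination | Similarity threshold filter + strict system prompt + mandatory source citation |
| Zero Heavy Deps | No PyTorch / CUDA required; runs on CPU with ONNX only |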
# 1. Install dependencies
cd agent
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# 2. Configure (optional; falls back automatically when not configured)
cp .env.example .env
# Edit .env and set DEEPSEEK_API_KEY=sk-xxx
# 3. Start backend
uvicorn backend.main:app --host 0.0.0.0 --port 8000
# 4. Start frontend (new terminal)
streamlit run frontend/app.py --server.port 8501

Open **http://localhost:8501**; the sample documents are pre-loaded. See SETUP.md for details.