We rewrote this project from scratch. Here's what you get:
| Feature | Old (rag-server-mcp) | New (CodeRAG) |
|---|---|---|
| Startup | 10-30s (ChromaDB + Ollama) | <1s (no external deps) |
| Indexing | Minutes (embedding API calls) | Seconds (TF-IDF + optional vectors) |
| Search | ~500ms (vector only) | <50ms (hybrid search) |
| Memory | 500MB+ (ChromaDB) | <100MB (SQLite) |
| Feature | Old | New |
|---|---|---|
| TF-IDF | ❌ | ✅ StarCoder2 tokenizer |
| Vector Search | ✅ Basic | ✅ Hybrid (TF-IDF + Vector) |
| Code Understanding | ❌ Generic | ✅ Code-aware tokenization |
| Incremental Updates | ❌ Full rebuild | ✅ Smart diff detection |
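The "smart diff detection" row above can be sketched as content-hash comparison: re-index only files whose hash changed since the last run. This is an illustrative sketch, not CodeRAG's actual implementation; `diffFiles` and `HashIndex` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// path -> content hash recorded at the previous indexing run
type HashIndex = Map<string, string>;

function hashContent(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Compare the current files against the previous hash index and
// return only the paths that need (re-)indexing or removal.
function diffFiles(
  files: Map<string, string>, // path -> current content
  previous: HashIndex
): { changed: string[]; removed: string[] } {
  const changed: string[] = [];
  for (const [path, content] of files) {
    if (previous.get(path) !== hashContent(content)) changed.push(path);
  }
  const removed = [...previous.keys()].filter((p) => !files.has(p));
  return { changed, removed };
}
```

With this scheme, an unchanged repository costs one hash pass instead of a full rebuild.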
```bash
# Old way (painful)
docker-compose up -d     # Start ChromaDB
ollama pull nomic-embed  # Download the embedding model
# Wait 30 seconds...
```

```bash
# New way (instant)
npx @sylphx/coderag-mcp
# Done. That's it. 🎉
```

- Smoothed IDF: no term gets ignored, even common ones like `function`
- Logarithmic boosting: stable ranking without score explosion
- StarCoder2 tokenizer: a 4.7MB model trained on code
- Pre-computed magnitudes: memory-efficient cosine similarity
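The first and last points above can be sketched in a few lines. This is a minimal illustration, assuming a common smoothed-IDF formula and standard cosine similarity; it is not CodeRAG's actual code, and the function names are hypothetical.

```typescript
// Smoothed IDF: log(1 + N/(1 + df)) never hits zero, so frequent
// terms like `function` keep a small, stable weight instead of
// being ignored entirely.
function smoothedIdf(totalDocs: number, docFreq: number): number {
  return Math.log(1 + totalDocs / (1 + docFreq));
}

// Pre-compute each document vector's magnitude once at index time...
function magnitude(v: number[]): number {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// ...so cosine similarity at query time is just a dot product and
// one division, with no per-query re-scan of the document vector.
function cosine(a: number[], magA: number, b: number[], magB: number): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot / (magA * magB);
}
```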
Remove the old server from your MCP config:

```json
{
  "mcpServers": {
    "rag-server": { ... } // DELETE THIS
  }
}
```

Then add CodeRAG:

```json
{
  "mcpServers": {
    "coderag": {
      "command": "npx",
      "args": ["-y", "@sylphx/coderag-mcp"]
    }
  }
}
```

No Docker. No Ollama. No ChromaDB. Just works.
| Feature | rag-server-mcp | CodeRAG |
|---|---|---|
| External Services | ChromaDB + Ollama | None |
| Docker Required | Yes | No |
| Startup Time | 10-30s | <1s |
| Search Algorithm | Vector only | Hybrid (TF-IDF + Vector) |
| Code Tokenization | Generic | StarCoder2 (code-aware) |
| Incremental Index | No | Yes |
| Memory Usage | 500MB+ | <100MB |
| Low Memory Mode | No | Yes (SQL-based) |
| Offline Support | No (needs Ollama) | Yes (TF-IDF) |
| Vector Search | Required | Optional |
| Actively Maintained | ❌ No | ✅ Yes |
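The "Hybrid" and "Vector Search: Optional" rows can be sketched as a weighted blend that degrades gracefully to pure TF-IDF when no embedding is available (e.g. offline). The blend weight and function name below are assumptions for illustration, not CodeRAG's actual tuning.

```typescript
// Blend a lexical (TF-IDF) score with an optional vector-similarity
// score. When vector search is unavailable, fall back to TF-IDF only,
// which is why the server still works fully offline.
function hybridScore(
  tfidfScore: number,
  vectorScore: number | undefined,
  alpha = 0.5 // hypothetical weight; real systems tune this
): number {
  if (vectorScore === undefined) return tfidfScore;
  return alpha * tfidfScore + (1 - alpha) * vectorScore;
}
```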
- New Repository: github.com/SylphxAI/coderag
- NPM Package: @sylphx/coderag-mcp
- Documentation: CodeRAG README