
CODEBASE INTELLIGENCE FOR AI AGENTS · OPEN SOURCE · HOSTED
Your AI agent doesn't
understand your codebase.
Index any repo into a documented dependency graph in under 30 seconds — real architecture, ownership, and decisions for your AI agents instead of guesses.
One engine, three interfaces
Install once. Choose the interface that fits your workflow — or use all three. They share the same data, the same intelligence, the same stores.

CLI
For the solo developer

MCP Server
For AI-native workflows

Web UI
For the whole team
Most tools answer one question.
repowise answers five.
Graph structure, code health, git history, generated documentation, and architectural decisions — five layers that compound into genuine understanding.
Every dependency, ranked and traced
- Tree-sitter ASTs across 10+ languages → directed dependency graph
- PageRank and betweenness centrality surface critical symbols
- Edge types: imports, calls, inherits, implements, co-changes
- Scales to 30K+ nodes with an automatic SQLite-backed graph
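To make the ranking idea concrete, here is a minimal sketch of PageRank over a tiny import graph. It is an illustration under stated assumptions, not repowise's implementation: the file names and edges are hypothetical, and it uses a plain-Python power iteration to stay dependency-free. Betweenness centrality is left out for brevity.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (importer, imported) edges."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by files with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical import edges: (importer, imported).
edges = [
    ("cli.py", "graph.py"),
    ("server.py", "graph.py"),
    ("orchestrator.py", "graph.py"),
    ("cli.py", "orchestrator.py"),
]
ranks = pagerank(edges)
most_central = max(ranks, key=ranks.get)
```

In this toy graph the file that everything imports ends up with the highest rank, which is the intuition behind surfacing critical symbols.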
It knows which file breaks next — before it does
- State-of-the-art accuracy: AUC 0.737 at ranking which files are headed for a bug. On the same code and the same real defects, it matches or beats the best commercial tools and published academic models.
- One 1–10 score from 25 deterministic signals: tangled complexity, hidden coupling, missing tests, runaway churn, fragile ownership. No LLM, no cloud — under 30 seconds on a 3,000-file repo.
- The weights are learned from a real defect corpus, not hand-tuned — so it out-predicts “what changed recently” and “what broke before” by 10+ points, and matches published academic defect models on benchmarks it never saw.
- Ranks what to fix first by impact-for-effort, and alerts you the moment a file's health starts sliding.
Proven on 21 real projects across 9 languages
History that writes the documentation
- Hotspot detection: files in the top 25% for churn and complexity are flagged
- Co-change partners: files that change together without imports
- Ownership from git blame — primary owner + top 3 contributors
- Significant commits filtered into generation prompts
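The hotspot and co-change ideas above can be sketched in a few lines. This is a rough illustration, not the product's logic. It assumes a file counts as a hotspot when it sits in the top quartile for both churn and complexity, and that a co-change partner is a pair of files that appear together in at least `min_count` commits without an import edge between them. All file names, numbers, and the `min_count` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_quartile(values):
    """Names whose value falls in the top 25% (always at least one name)."""
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[: max(1, len(ranked) // 4)])


def hotspots(churn, complexity):
    """Assumed rule: top quartile on both churn and complexity."""
    return top_quartile(churn) & top_quartile(complexity)


def co_change_partners(commits, import_edges, min_count=3):
    """Pairs that change together often but share no import edge."""
    pair_counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    linked = {tuple(sorted(edge)) for edge in import_edges}
    return {
        pair: count
        for pair, count in pair_counts.items()
        if count >= min_count and pair not in linked
    }


# Hypothetical per-file commit counts and complexity scores.
churn = {"graph.py": 21, "orchestrator.py": 15, "parser.py": 9, "cli.py": 4,
         "store.py": 3, "server.py": 2, "utils.py": 1, "config.py": 1}
complexity = {"graph.py": 40, "parser.py": 35, "orchestrator.py": 12,
              "store.py": 10, "cli.py": 8, "server.py": 6, "utils.py": 3,
              "config.py": 2}
flagged = hotspots(churn, complexity)

# Hypothetical commit history: each entry lists the files in one commit.
commits = ([["graph.py", "orchestrator.py"]] * 3
           + [["graph.py", "store.py"]] * 3
           + [["cli.py"]])
partners = co_change_partners(commits, import_edges=[("graph.py", "store.py")])
```

Here `graph.py` and `store.py` change together just as often, but they are excluded because an import already explains the link. Only the hidden pair is reported.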
Wiki pages that stay fresh
- 9-level hierarchical generation: symbols → files → modules → repo
- Confidence scoring with git-informed decay — stale pages auto-regenerate
- RAG context via LanceDB or pgvector — each page knows its imports
- Resumable, crash-safe, idempotent — checkpoint after every page
The why behind your architecture
- 4 capture sources: inline markers, git archaeology, README mining, CLI
- Staleness tracking — decisions age when governed files get commits
- get_why() searches decisions before you change anything
- Health dashboard: stale decisions, ungoverned hotspots, proposed reviews
Code health intelligence on every PR
The Repowise PR Bot is a GitHub App that posts one deterministic comment per pull request — hotspots, hidden coupling, declining health, dead code. Zero LLM calls. Green PRs stay silent. Free for public/OSS repos; private repos require the Pro plan.
- One comment per PR. Edited in place on re-pushes.
- Silence rule. Stays quiet unless health degrades, a hotspot is touched, a co-change partner is missing, or dead code shifts.
- Zero LLM cost. Pure tree-sitter, NetworkX, and the 25-biomarker scorer.
- Free forever for OSS. Private repos unlock with the Pro plan.
⚠️ Health: 7.0 → 6.8 (-0.2)
graph.py: 3.3 → 2.2 (▼ -1.1)
untested hotspot, brain method, nested complexity
🔥 Hotspot touched
graph.py — 21 commits/90d, 13 dependents
primary owner: Raghav (62%)
🔗 Hidden coupling
graph.py co-changes with orchestrator.py (8×)
— not in this PR.
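The silence rule can be read as a single predicate over a PR's signals. The sketch below is one way to express it, assuming the four conditions listed above are simply OR-ed together. The `PrSignals` type and its field names are hypothetical and are not part of the PR Bot's API.

```python
from dataclasses import dataclass


@dataclass
class PrSignals:
    """Hypothetical summary of what a PR changed."""
    health_before: float
    health_after: float
    hotspots_touched: int
    missing_co_change_partners: int
    dead_code_delta: int  # change in dead-code findings, in either direction


def should_comment(signals: PrSignals) -> bool:
    """Stay silent unless at least one of the four conditions holds."""
    return (
        signals.health_after < signals.health_before
        or signals.hotspots_touched > 0
        or signals.missing_co_change_partners > 0
        or signals.dead_code_delta != 0
    )


green_pr = PrSignals(7.0, 7.0, 0, 0, 0)
degrading_pr = PrSignals(7.0, 6.8, 1, 1, 0)  # mirrors the sample comment
```

With these inputs, `should_comment(green_pr)` is false and `should_comment(degrading_pr)` is true.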
9 tools your AI agent already knows how to call
- get_overview(): Architecture summary, module map, entry points, tech stack.
- get_answer(): One-call RAG Q&A. Retrieves over the wiki, gates on confidence, returns a cited 2–5 sentence answer.
- get_context(): Docs, ownership, history, decisions, freshness for files, modules, or symbols. Pass multiple targets in one call.
- get_symbol(): Raw source bytes for one indexed symbol with exact line bounds; cheaper and safer than Read + offset math.
- search_codebase(): Semantic search over the full wiki using LanceDB or pgvector. Natural language queries.
- get_risk(): Hotspot score, dependents, co-change partners, risk summary. Also returns top 5 global hotspots.
- get_why(): Three modes: natural language search over decisions, path-based lookup, or health dashboard.
- get_dead_code(): Unreachable files, unused exports, zombie packages, sorted by confidence and cleanup impact.
- get_health(): Per-file health scores from 25 deterministic biomarkers, the worst files, and ranked refactoring targets.
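For a sense of what a tool call looks like on the wire, here is a small sketch of the JSON-RPC 2.0 request shape that MCP uses for its `tools/call` method. The tool name comes from the list above. The `targets` argument name is an assumption made for illustration, so check the server's published tool schema for the real parameters.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "targets" is a hypothetical argument name, not confirmed by the source.
request = build_tool_call(1, "get_risk", {"targets": ["graph.py"]})
payload = json.dumps(request)
```

In practice an MCP client library builds and sends this message for you. The sketch only shows the envelope an agent fills in.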
CLAUDE.md that writes itself
- Architecture overview from the real dependency graph
- Hotspot warnings with churn metrics and owners
- Key design decisions and architectural constraints
- Dead code summary with confidence scores
- Entry points, build commands, and tech stack
- Also generates cursor.md — same data, different format
The full picture, side by side
- Auto-generated docs, git intelligence, decision records, and MCP tools — one package
- Open-source (AGPL-3.0) and fully self-hostable
- 15/15 features vs 4–5/15 for any single competitor
| Feature | repowise | Google CodeWiki | DeepWiki | CodeScene | Sourcegraph |
|---|---|---|---|---|---|
| Self-hostable OSS | ✓ | — | — | — | — |
| Works with private repos | ✓ | — | ✓ | ✓ | ✓ |
| Auto-generated wiki (LLM) | ✓ | ✓ | ✓ | — | — |
| Git intelligence (hotspots / ownership / co-changes) | ✓ | — | — | ✓ | — |
| Dead code detection | ✓ | — | — | — | — |
| Architectural decision records | ✓ | — | — | — | — |
| MCP server for AI agents | ✓ | — | — | — | — |
| Semantic search | ✓ | ✓ | ✓ | — | ✓ |
| Doc freshness / confidence scoring | ✓ | — | — | — | — |
| CLAUDE.md auto-generation | ✓ | — | — | — | — |
| Codebase chat (agentic) | ✓ | ✓ | ✓ | — | — |
| Dependency graph visualization | ✓ | ✓ | ✓ | ✓ | ✓ |
| PR review bot (code-health comments) | ✓ | — | — | ✓ | — |
| Provider choice (4 LLM providers) | ✓ | — | — | — | — |
| Privacy (code never leaves your infra) | ✓ | — | — | ✓ | ✓ |
Self-assessed against publicly documented features as of May 2026. Vendor capabilities change — please verify before committing to any tool.
Guides, comparisons, and deep dives

Does our code-health score actually predict bugs? A leakage-free benchmark
We scored 21 repos six months before their bugs landed to test whether a deterministic code-health score predicts defects. AUC 0.737, and the honest caveats.

Process metrics beat structural metrics for predicting defects
Complexity and code smells are the metrics everyone reaches for. Across 25 biomarkers and 21 repos, the strongest defect predictors were evolutionary, not structural. Here's the data.

Best AI Code Review Tools (LLM-Based and Deterministic)
Best AI code review tools fall into two camps: LLM-based reviewers that try to reason about intent, architecture, and change impact, and deterministic…
Three paths to codebase intelligence
- Self-host — free, forever
- pip install repowise: your machine, your server, your CI
- AGPL-3.0 · full feature set · code never leaves your infra
- Hosted SaaS — live now
- Managed indexing · team workspaces · semantic chat
- Pro at $15/mo with LLM credits included
- Enterprise
- On-prem · SSO · role-based access · dedicated support · SLAs