A fully containerized Graph RAG application for cybersecurity threat intelligence, powered by local LLMs via Ollama.

    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                               Docker Network                                │
    │                                                                             │
    │  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌───────────┐  │
    │  │  Frontend   │     │   Backend   │     │    Neo4j    │     │  Ollama   │  │
    │  │   (Nginx)   │────▶│  (FastAPI)  │────▶│ (Graph DB)  │     │   (LLM)   │  │
    │  │  Port 8501  │     │  Port 8000  │     │  Port 7474  │     │ Port 11434│  │
    │  └─────────────┘     └──────┬──────┘     └─────────────┘     └───────────┘  │
    │                             │                                      ▲        │
    │                             └──────────────────────────────────────┘        │
    │                                                                             │
    │  ┌───────────────────────────────────────────────────────────────────────┐  │
    │  │                         Data Ingestion (Job)                          │  │
    │  │             Loads MITRE ATT&CK data into Neo4j on startup             │  │
    │  └───────────────────────────────────────────────────────────────────────┘  │
    └─────────────────────────────────────────────────────────────────────────────┘

- Image: `neo4j:5.15-community`
- Purpose: Stores threat intelligence as a knowledge graph
- Ports:
  - `7474` - Browser UI
  - `7687` - Bolt protocol
- Data: Persisted via Docker volume
- Image: `ollama/ollama:latest`
- Model: `mistral:7b`
- Purpose:
  - LLM for natural language understanding and generation
  - Embedding generation via `nomic-embed-text`
- Port: `11434`
- Note: Runs on CPU (no GPU available), 8GB memory limit
- Image: Custom Python image
- Purpose:
  - REST API for frontend
  - RAG pipeline orchestration
  - Cypher query generation from natural language
  - Graph traversal and context retrieval
- Port: `8000`
- Endpoints:
  - `POST /query` - Natural language query
  - `GET /graph/stats` - Graph statistics
  - `GET /graph/actors` - List threat actors
  - `GET /graph/techniques` - List techniques
  - `GET /graph/actors/{name}/techniques` - Get actor's techniques
  - `GET /graph/actors/{name}/attack-path` - Get actor's kill chain
  - `GET /graph/techniques/{id}/mitigations` - Get technique mitigations
  - `GET /graph/search?q=` - Search across all entities
  - `GET /graph/visualize` - Get graph data for visualization
  - `GET /health` - Health check
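
As a rough illustration of calling the backend from a script, the sketch below builds requests for `POST /query` and `GET /graph/actors/{name}/techniques` with the Python standard library only. The JSON field name `question` and all helper names are assumptions made for illustration; the real request schema lives in `schemas.py` and may differ.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port listed above


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /query request.

    The "question" field name is an assumption, not the confirmed schema.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def actor_techniques_url(name: str, base_url: str = BASE_URL) -> str:
    """URL for GET /graph/actors/{name}/techniques with the name percent-encoded."""
    encoded = urllib.parse.quote(name, safe="")
    return f"{base_url}/graph/actors/{encoded}/techniques"


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Once the stack is running, `send(build_query_request("What techniques does APT29 use?"))` would issue the call. The generous default timeout reflects that the model runs on CPU, so responses can be slow.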
- Image: Nginx Alpine
- Purpose: Modern web UI for querying threat intelligence
- Port: `8501` (mapped from internal port 80)
- Tech Stack:
  - HTML5/CSS3/JavaScript
  - jQuery for AJAX requests
  - Chart.js for statistics visualization
  - vis-network for interactive graph visualization
  - marked.js for markdown rendering
- Features:
  - Query Page: Natural language queries with example suggestions
  - Explore Page: Browse threat actors, techniques, and search
  - Graph Map: Interactive network visualization with filtering
  - Statistics: Charts showing node/relationship distribution
- Image: Custom Python image
- Purpose: One-time job to load MITRE ATT&CK data
- Data Sources:
  - MITRE ATT&CK Enterprise (STIX format)
  - Relationships: Actors → Techniques → Tactics → Mitigations
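
To make the ingestion step more concrete, here is a minimal sketch of how objects from an ATT&CK STIX bundle might be mapped onto graph nodes and edges. It is not the project's `mitre_attack.py`; the function names and output shapes are hypothetical. The STIX type names, the `uses` and `mitigates` relationship types, and the `kill_chain_phases` field are those used by the ATT&CK STIX data.

```python
from typing import Iterable

# STIX object type -> graph label
STIX_TYPE_TO_LABEL = {
    "intrusion-set": "ThreatActor",
    "attack-pattern": "Technique",
    "x-mitre-tactic": "Tactic",
    "malware": "Malware",
    "tool": "Tool",
    "course-of-action": "Mitigation",
}


def extract_nodes(objects: Iterable[dict]) -> list[dict]:
    """Map STIX objects to node dicts, skipping revoked or deprecated entries."""
    nodes = []
    for obj in objects:
        label = STIX_TYPE_TO_LABEL.get(obj.get("type"))
        if label is None or obj.get("revoked") or obj.get("x_mitre_deprecated"):
            continue
        nodes.append({"label": label, "id": obj["id"], "name": obj.get("name", "")})
    return nodes


def extract_edges(objects: Iterable[dict]) -> list[tuple[str, str, str]]:
    """Return (source, relationship, target) triples.

    BELONGS_TO targets are tactic shortnames (from kill_chain_phases),
    not STIX ids; all other endpoints are STIX ids.
    """
    edges = []
    for obj in objects:
        if obj.get("type") == "relationship":
            src, dst = obj["source_ref"], obj["target_ref"]
            kind = obj.get("relationship_type")
            if kind == "uses" and src.startswith("intrusion-set--"):
                edges.append((src, "USES", dst))
            elif kind == "uses" and src.startswith(("malware--", "tool--")):
                edges.append((src, "EMPLOYS", dst))
            elif kind == "mitigates":
                # STIX points mitigation -> technique; flip it for MITIGATED_BY.
                edges.append((dst, "MITIGATED_BY", src))
        elif obj.get("type") == "attack-pattern":
            for phase in obj.get("kill_chain_phases", []):
                if phase.get("kill_chain_name") == "mitre-attack":
                    edges.append((obj["id"], "BELONGS_TO", phase["phase_name"]))
    return edges
```

Note that technique-to-tactic links are not separate relationship objects in the bundle; they come from each technique's `kill_chain_phases`, which is why the sketch handles them in a separate branch.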
| Label | Properties | Description |
|---|---|---|
| `ThreatActor` | id, name, description, aliases, country | APT groups, criminal orgs |
| `Technique` | id, name, description, platforms, detection | ATT&CK techniques |
| `Tactic` | id, name, description, shortname | ATT&CK tactics (kill chain phases) |
| `Malware` | id, name, description, platforms | Malware families |
| `Tool` | id, name, description | Legitimate tools used maliciously |
| `Mitigation` | id, name, description | Defensive measures |
(:ThreatActor)-[:USES]->(:Technique)
(:ThreatActor)-[:USES]->(:Malware)
(:ThreatActor)-[:USES]->(:Tool)
(:Technique)-[:BELONGS_TO]->(:Tactic)
(:Technique)-[:MITIGATED_BY]->(:Mitigation)
(:Malware)-[:EMPLOYS]->(:Technique)
(:Tool)-[:EMPLOYS]->(:Technique)
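
The relationship patterns above translate fairly directly into parameterized Cypher. The helpers below are a sketch with hypothetical names, not the project's `neo4j_service.py`. They assume `ThreatActor.name` and `Technique.id` are the lookup keys, and that `Technique.id` holds an ATT&CK identifier such as `T1566` rather than a STIX id.

```python
def actor_techniques_query(actor_name: str) -> tuple[str, dict]:
    """Techniques an actor uses directly: (:ThreatActor)-[:USES]->(:Technique)."""
    cypher = (
        "MATCH (a:ThreatActor {name: $name})-[:USES]->(t:Technique) "
        "RETURN t.id AS id, t.name AS name ORDER BY t.id"
    )
    return cypher, {"name": actor_name}


def technique_mitigations_query(technique_id: str) -> tuple[str, dict]:
    """Mitigations for a technique: (:Technique)-[:MITIGATED_BY]->(:Mitigation)."""
    cypher = (
        "MATCH (t:Technique {id: $id})-[:MITIGATED_BY]->(m:Mitigation) "
        "RETURN m.id AS id, m.name AS name ORDER BY m.name"
    )
    return cypher, {"id": technique_id}


def actor_software_techniques_query(actor_name: str) -> tuple[str, dict]:
    """Techniques reached through an actor's malware or tools (USES then EMPLOYS)."""
    cypher = (
        "MATCH (a:ThreatActor {name: $name})-[:USES]->(s)-[:EMPLOYS]->(t:Technique) "
        "WHERE s:Malware OR s:Tool "
        "RETURN s.name AS software, collect(DISTINCT t.name) AS techniques"
    )
    return cypher, {"name": actor_name}
```

Passing user input through `$name` and `$id` parameters, instead of formatting it into the query string, avoids Cypher injection. With the official `neo4j` Python driver, each pair could be executed as `session.run(cypher, params)`.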

    User Query
               │
               ▼
    ┌─────────────────────┐
    │ 1. Query Analysis   │ ← Ollama extracts intent & entities
    └──────────┬──────────┘
               │
               ▼
    ┌─────────────────────┐
    │ 2. Graph Retrieval  │ ← Cypher query against Neo4j
    └──────────┬──────────┘
               │
               ▼
    ┌─────────────────────┐
    │ 3. Context Building │ ← Combine graph results + embeddings
    └──────────┬──────────┘
               │
               ▼
    ┌─────────────────────┐
    │ 4. Response Gen     │ ← Ollama generates final answer
    └─────────────────────┘

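
The four stages can be sketched as a small orchestrator with injectable steps. This is an illustrative skeleton under stated assumptions, not the project's `rag_pipeline.py`. The class and field names are hypothetical, the LLM and Neo4j calls are stand-in callables, and the embedding part of step 3 is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RagPipeline:
    analyze: Callable[[str], dict]          # step 1: question -> intent and entities
    retrieve: Callable[[dict], list[dict]]  # step 2: analysis -> graph rows
    generate: Callable[[str], str]          # step 4: prompt -> final answer

    def build_context(self, question: str, rows: list[dict]) -> str:
        """Step 3: flatten graph rows into a prompt the LLM can ground on."""
        if rows:
            facts = "\n".join(
                "- " + ", ".join(f"{key}={value}" for key, value in row.items())
                for row in rows
            )
        else:
            facts = "(no matching graph records)"
        return (
            f"Graph context:\n{facts}\n\n"
            f"Question: {question}\n"
            "Answer using only the context above."
        )

    def answer(self, question: str) -> str:
        """Run the four stages in order."""
        analysis = self.analyze(question)
        rows = self.retrieve(analysis)
        prompt = self.build_context(question, rows)
        return self.generate(prompt)
```

Keeping each stage as a plain callable makes it possible to exercise the orchestration with stubs before Ollama or Neo4j are available.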
| Natural Language Query | Graph Retrieval |
|---|---|
| "What techniques does APT29 use?" | Match path from actor to techniques |
| "How do I defend against phishing?" | Find mitigations for T1566 |
| "Which actors target healthcare?" | Filter actors by target industry |
| "Show the kill chain for Lazarus" | Traverse actor → techniques → tactics |
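
For the kill-chain row, the traversal result still has to be arranged by phase. A minimal sketch of that post-processing step follows, assuming each row carries a tactic shortname and a technique id. The row shape and function name are hypothetical; the tactic order is the ATT&CK Enterprise order.

```python
from collections import defaultdict

# Rows are assumed to come from a traversal such as:
#   MATCH (a:ThreatActor {name: $name})-[:USES]->(t:Technique)-[:BELONGS_TO]->(ta:Tactic)
#   RETURN ta.shortname AS tactic, t.id AS technique
TACTIC_ORDER = [
    "reconnaissance", "resource-development", "initial-access", "execution",
    "persistence", "privilege-escalation", "defense-evasion", "credential-access",
    "discovery", "lateral-movement", "collection", "command-and-control",
    "exfiltration", "impact",
]


def build_kill_chain(rows: list[dict]) -> list[tuple[str, list[str]]]:
    """Group techniques by tactic and return them in kill-chain order."""
    by_tactic = defaultdict(set)
    for row in rows:
        by_tactic[row["tactic"]].add(row["technique"])
    ordered = [tactic for tactic in TACTIC_ORDER if tactic in by_tactic]
    # Unrecognized tactic names are kept and placed last, in alphabetical order.
    ordered += sorted(tactic for tactic in by_tactic if tactic not in TACTIC_ORDER)
    return [(tactic, sorted(by_tactic[tactic])) for tactic in ordered]
```

Because a technique can belong to several tactics, the same technique id may legitimately appear under more than one phase in the result.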

    graph-rag/
    ├── docker-compose.yml
    ├── .env.example
    ├── Makefile                        # Useful commands
    ├── README.md
    │
    ├── backend/
    │   ├── Dockerfile
    │   ├── requirements.txt
    │   └── app/
    │       ├── main.py                 # FastAPI app
    │       ├── config.py               # Settings
    │       ├── routers/
    │       │   ├── query.py            # Query endpoints
    │       │   └── graph.py            # Graph endpoints
    │       ├── services/
    │       │   ├── neo4j_service.py    # Graph operations
    │       │   ├── ollama_service.py   # LLM operations
    │       │   └── rag_pipeline.py     # RAG orchestration
    │       └── models/
    │           └── schemas.py          # Pydantic models
    │
    ├── frontend/
    │   ├── Dockerfile
    │   ├── nginx.conf                  # Nginx configuration
    │   ├── index.html                  # Main HTML page
    │   ├── css/
    │   │   └── style.css               # Styles
    │   └── js/
    │       └── app.js                  # JavaScript application
    │
    ├── ingestion/
    │   ├── Dockerfile
    │   ├── requirements.txt
    │   ├── ingest.py                   # Main ingestion script
    │   └── parsers/
    │       └── mitre_attack.py         # MITRE ATT&CK parser
    │
    └── data/
        └── .gitkeep                    # Downloaded data stored here

# Neo4j
NEO4J_URI=bolt://neo4j:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=threatintel123
# Ollama
OLLAMA_HOST=http://ollama:11434
OLLAMA_MODEL=mistral:7b
OLLAMA_EMBED_MODEL=nomic-embed-text
# Backend
LOG_LEVEL=INFO

- Docker & Docker Compose installed on target machine
- At least 16GB RAM (for Ollama + Neo4j)
- ~10GB disk space
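
On the configuration side, the variables from `.env.example` could be read by the backend roughly as follows. This is a sketch rather than the contents of `config.py`; the `Settings` class and `load_settings` helper are hypothetical names. The defaults mirror the example values listed earlier, except that the password is deliberately required.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    ollama_host: str
    ollama_model: str
    ollama_embed_model: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables, falling back to example defaults."""
    password = env.get("NEO4J_PASSWORD")
    if not password:
        # No default here, so a missing secret fails loudly at startup.
        raise ValueError("NEO4J_PASSWORD must be set")
    return Settings(
        neo4j_uri=env.get("NEO4J_URI", "bolt://neo4j:7687"),
        neo4j_user=env.get("NEO4J_USER", "neo4j"),
        neo4j_password=password,
        ollama_host=env.get("OLLAMA_HOST", "http://ollama:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "mistral:7b"),
        ollama_embed_model=env.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Accepting the environment as a mapping keeps the loader easy to test without touching the real process environment.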
# Clone repository
git clone https://github.com/encryptedtouhid/graph-rag.git
cd graph-rag
# Copy environment file
cp .env.example .env
# Start all services
docker-compose up -d
# Watch logs
docker-compose logs -f
# Access services
# - Frontend: http://localhost:8501
# - Backend API: http://localhost:8000/docs
# - Neo4j Browser: http://localhost:7474

- Ollama init container will auto-pull `mistral:7b` and `nomic-embed-text` models
- Ingestion job loads MITRE ATT&CK data into Neo4j
- System ready when all health checks pass
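
Since the first start can take a while (model download plus ingestion), a script may want to wait for readiness before sending queries. The sketch below polls the backend's `GET /health` endpoint. The helper names, timeout, and polling interval are assumptions, and it treats any HTTP 200 as healthy without inspecting the response body.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def backend_is_healthy(url: str = "http://localhost:8000/health") -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and HTTP errors all count as "not ready".
        return False


def wait_until_ready(
    check: Callable[[], bool] = backend_is_healthy,
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it succeeds or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_ready()` after `docker-compose up -d` would block until the backend reports healthy or ten minutes pass. This checks the backend only, not the individual container health checks.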
make help # Show all available commands
make build # Build all Docker images
make up # Start all services in background
make up-logs # Start all services with logs
make down # Stop all services
make logs # View logs from all services
make logs-backend # View backend logs only
make status # Show status of all services
make restart # Restart all services
make clean # Stop and remove containers, volumes, images
make rebuild # Clean rebuild and start
make shell-backend # Open shell in backend container
make shell-neo4j # Open cypher-shell in Neo4j
make reset-db # Clear database and re-run ingestion

- Add IOC ingestion (AlienVault OTX)
- Add CVE/NVD data
- Implement semantic search with vector index
- Add query caching
- Add authentication
- Kubernetes deployment manifests
- GPU support for Ollama
