JARVIS

Real-time person intelligence platform. Point smart glasses at someone, get a full dossier in seconds.

Glasses camera → facial recognition → browser agent swarm → social enrichment → corkboard UI

Built for the Web Agents Hackathon (Browser Use + YC).

How It Works

Capture — Meta Ray-Ban glasses stream video frames to the backend (or upload a photo manually)
Detect & Identify — MediaPipe detects faces, ArcFace generates embeddings, PimEyes + reverse image search identifies the person
Research — A swarm of Browser Use agents fans out across LinkedIn, Twitter/X, Instagram, and Google in parallel. Exa API provides a fast-pass enrichment layer
Synthesize — Claude 3.5 Sonnet (or Gemini 2.0 Flash fallback) generates a structured dossier from all collected intel
Display — Results stream live to a military-themed corkboard UI with pushpins, red string connections, and real-time activity feed

Tech Stack

Layer	Technology
Backend	FastAPI (Python 3.11+)
Frontend	Next.js 16, Framer Motion, Tailwind CSS
Real-time DB	Convex
Face Detection	MediaPipe
Face Embeddings	ArcFace (insightface)
Reverse Face Search	PimEyes, PicImageSearch
Browser Agents	Browser Use (cloud sessions)
Fast Search	Exa API
Twitter Scraping	twscrape
Synthesis	Anthropic Claude (primary), Google Gemini (fallback)
Observability	Laminar
Voice Commands	Gemini Flash transcription via WebSocket

Quick Start

Prerequisites

Python 3.11+
Node.js 20+
API keys (see Environment Variables)

1. Clone and set up environment

git clone <repo-url> && cd YC_hackathon
cp .env.example .env
# Fill in your API keys in .env

2. Backend

cd backend
python -m venv .venv && source .venv/bin/activate
pip install -e .
uvicorn main:app --reload --port 8000

The backend starts even with missing API keys — features degrade gracefully. Check what's available:

GET http://localhost:8000/api/health
GET http://localhost:8000/api/services

3. Frontend

cd frontend
npm install
npm run dev

Opens at http://localhost:3000. The UI will show a "BACKEND OFFLINE" indicator if the backend isn't reachable.

4. Convex (real-time database)

cd frontend
npx convex dev

Convex is optional — the app works without it using in-memory storage, but you lose persistence and real-time sync across tabs.

Environment Variables

Copy .env.example to .env and fill in what you have. Nothing is strictly required — the app degrades gracefully.

Variable	Required For	Where to Get It
`NEXT_PUBLIC_CONVEX_URL`	Real-time DB	convex.dev
`EXA_API_KEY`	Fast person search	exa.ai
`BROWSER_USE_API_KEY`	Agent swarm (LinkedIn, Twitter, etc.)	cloud.browser-use.com
`OPENAI_API_KEY`	Agent LLM backbone	platform.openai.com
`GEMINI_API_KEY`	Vision, synthesis fallback, voice	ai.google.dev
`ANTHROPIC_API_KEY`	Dossier synthesis (primary)	console.anthropic.com
`CONVEX_URL`	Backend → Convex writes	Same Convex deployment
`MONGODB_URI`	Persistent storage	mongodb.com/atlas
`TELEGRAM_BOT_TOKEN`	Telegram photo intake	BotFather
`LAMINAR_API_KEY`	Agent tracing	lmnr.ai
`SUPERMEMORY_API_KEY`	Dossier caching	supermemory.ai
`PIMEYES_ACCOUNT_POOL`	Facial recognition search	JSON array of PimEyes accounts

Project Structure

backend/
├── agents/           # Browser Use agent swarm (LinkedIn, Twitter, Google, etc.)
├── capture/          # Camera input, frame extraction, Telegram bot
├── identification/   # Face detection, ArcFace embeddings, PimEyes search
├── enrichment/       # Exa API fast lookups, 6ixfour OSINT
├── synthesis/        # Claude/Gemini dossier generation
├── db/               # Convex client, in-memory fallback
├── memory/           # SuperMemory for dossier caching
├── observability/    # Laminar tracing
├── main.py           # FastAPI app & route handlers
├── pipeline.py       # CapturePipeline orchestrator
├── config.py         # Settings (pydantic-settings)
└── schemas.py        # Request/response models

frontend/
├── src/
│   ├── app/          # Next.js app router (single-page corkboard)
│   ├── components/   # IntelBoard, Corkboard, PersonCard, DossierView, etc.
│   ├── hooks/        # useResearchStream, useFrameCapture, useVoiceCommands
│   └── lib/          # Types, animations, utilities
├── convex/           # Schema, queries, mutations
└── e2e/              # Playwright tests

API Endpoints

Method	Path	Description
`GET`	`/api/health`	Health check & configured services
`GET`	`/api/services`	Detailed service status
`POST`	`/api/capture`	Upload image for identification
`POST`	`/api/capture/frame`	Process video frame (real-time face detection)
`POST`	`/api/capture/identify`	Identify person by name + image
`GET`	`/api/research/{name}/stream`	SSE stream of research results
`GET`	`/api/person/{id}`	Get person dossier
`POST`	`/api/agents/research`	Spawn Browser Use sessions
`GET`	`/api/agents/sessions/{id}`	Poll session status
`WS`	`/ws/audio/{room}`	Audio transcription (glasses mic)

Development

# Backend tests
cd backend && pytest

# Frontend tests
cd frontend && npm test

# Type checking
cd frontend && npm run typecheck

# Linting
cd backend && ruff check .
cd frontend && npm run lint

# E2E tests
cd frontend && npm run test:e2e

Architecture

The system is designed around graceful degradation. Each service is optional:

No Browser Use key? Falls back to Exa-only research
No Exa key? Returns a clear "configure API key" message
No Convex? Uses in-memory storage
No Anthropic key? Falls back to Gemini for synthesis
No Gemini key? Skips synthesis, still returns raw research

The research pipeline streams results via Server-Sent Events so the UI updates incrementally as agents report back.

Further Documentation

Document	What's In It
`SYSTEM_DESIGN.md`	Detailed component architecture, data flows, anti-bot strategies
`TECH_DOC.md`	Implementation guide, 6-phase build order, Convex schema details
`DESIGN_HANDOFF.md`	UI/UX specs, color system, typography, animation details
`ARCHITECTURE.md`	System overview diagrams, component deep-dives
`TASKS.md`	Build task checklist with acceptance criteria
`PRD.md`	Product requirements, success metrics, risk analysis

Name		Name	Last commit message	Last commit date
Latest commit History 82 Commits
.claude		.claude
.github		.github
HumanDetection		HumanDetection
assets		assets
backend		backend
frontend		frontend
samples		samples
scripts		scripts
.env.example		.env.example
.gitignore		.gitignore
.mcp.json		.mcp.json
AGENTS.md		AGENTS.md
ARCHITECTURE.md		ARCHITECTURE.md
CREDITS_CHECKLIST.md		CREDITS_CHECKLIST.md
DESIGN_HANDOFF.md		DESIGN_HANDOFF.md
MULTI_AGENT_EXECUTION_PLAN.md		MULTI_AGENT_EXECUTION_PLAN.md
PRD.md		PRD.md
README.md		README.md
SYSTEM_DESIGN.md		SYSTEM_DESIGN.md
TASKS.md		TASKS.md
TECH_DOC.md		TECH_DOC.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

JARVIS

How It Works

Tech Stack

Quick Start

Prerequisites

1. Clone and set up environment

2. Backend

3. Frontend

4. Convex (real-time database)

Environment Variables

Project Structure

API Endpoints

Development

Architecture

Further Documentation

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

JARVIS

How It Works

Tech Stack

Quick Start

Prerequisites

1. Clone and set up environment

2. Backend

3. Frontend

4. Convex (real-time database)

Environment Variables

Project Structure

API Endpoints

Development

Architecture

Further Documentation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages