Person Perception

A local web application with a chat interface for talking to an LLM and a visualization panel that displays mental-model scores derived from the conversation.

Screenshots

Run conversations — after a few turns of a simulated eval, the right panel shows live mental-model scores and a chart of how they evolve:

Explore conversations — browse saved Spiral-Bench eval runs turn by turn with the mental model JSON per turn:

Chart view — average mental-model scores across 20 turns per scenario category; this run shows a clear upward trend in validation_seeking and user_rightness:

Features

💬 Chat interface for LLM conversations
📊 Visualization panel with live mental-model scores and per-turn chart
🧠 Multiple mental model types: Induct, Support, Structured, Person Perception
🔁 Simulated eval runner (Spiral-Bench, 30 scenarios × 20 turns)
📂 Explore saved runs: chatlog and chart views
🔌 Azure OpenAI (GPT-4o), Google Gemini, and Llama (Vertex AI) support

Setup

Install dependencies:

npm install

Start the development server:

npm run dev

Open your browser to the URL shown in the terminal (usually http://localhost:5173)

Azure API Integration

The app supports Azure OpenAI (GPT-4o) and Google Gemini. Switch the API model in the UI with the “API model” dropdown, or on the command line by passing --api_gemini or --api_gpt-4o to npm run run_eval after the -- separator (for example, npm run run_eval -- --single_call --model_induct --api_gemini).
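As a rough illustration of how the provider flags and the optional VITE_API_PROVIDER default could interact, here is a minimal sketch. The function name `resolveProvider` is hypothetical, and the precedence (CLI flag first, then the .env default) is an assumption, not something confirmed by scripts/run_eval.js.

```javascript
// Hypothetical helper, not taken from this repo.
// Assumption: an explicit CLI flag wins over the VITE_API_PROVIDER default.
const PROVIDER_FLAGS = new Map([
  ["--api_gpt-4o", "gpt-4o"],
  ["--api_gemini", "gemini"],
]);

function resolveProvider(argv, env = {}) {
  // Collect the distinct provider flags present on the command line.
  const chosen = [...new Set(argv.filter((arg) => PROVIDER_FLAGS.has(arg)))];
  if (chosen.length > 1) {
    throw new Error(`Conflicting provider flags: ${chosen.join(", ")}`);
  }
  if (chosen.length === 1) {
    return PROVIDER_FLAGS.get(chosen[0]);
  }
  // No flag given: fall back to the optional default from .env, if any.
  return env.VITE_API_PROVIDER ?? null;
}

console.log(resolveProvider(["--single_call", "--model_induct", "--api_gemini"]));
console.log(resolveProvider(["--single_call"], { VITE_API_PROVIDER: "gpt-4o" }));
```

Returning null when nothing is specified leaves the choice of a built-in default to the caller, since the README does not state one.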

Copy the example environment file (if you have one) or create a .env file in the project root.
Add your keys to .env:

Azure OpenAI (GPT-4o):

VITE_AZURE_ENDPOINT=your-actual-endpoint-here
VITE_AZURE_API_KEY=your-actual-api-key-here
VITE_AZURE_DEPLOYMENT=gpt-4o
VITE_AZURE_API_VERSION=2024-12-01-preview

Google Gemini (optional; use when API model = Gemini):

Browser / API key: Get an API key from Google AI Studio and set:
```
VITE_GEMINI_API_KEY=your-gemini-api-key-here
```
CLI / Node (Vertex AI with service account): Use a Google Cloud service account JSON key and set:
```
GOOGLE_APPLICATION_CREDENTIALS=./new_key.json   # or another path to your JSON
VITE_GEMINI_PROJECT_ID=your-gcp-project-id      # optional if JSON has project_id
```
Optional: VITE_GEMINI_LOCATION=us-central1, VITE_GEMINI_MODEL=gemini-1.5-flash (or e.g. gemini-2.5-pro).

Default provider (optional):

# VITE_API_PROVIDER=gpt-4o
# or
# VITE_API_PROVIDER=gemini

Replace placeholder values with your actual keys.
Restart the development server for the changes to take effect.
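To show how the four Azure values fit together, the sketch below assembles a chat-completions request from them. `buildAzureChatRequest` is a hypothetical name and the endpoint is a placeholder; the actual request code lives in src/services/api.js and may differ. The URL layout and the api-key header follow the Azure OpenAI REST API.

```javascript
// Hypothetical helper, not taken from this repo.
// `env` is whatever holds the .env values: import.meta.env in the Vite app,
// process.env in a Node script.
function buildAzureChatRequest(env, messages) {
  const required = [
    "VITE_AZURE_ENDPOINT",
    "VITE_AZURE_API_KEY",
    "VITE_AZURE_DEPLOYMENT",
    "VITE_AZURE_API_VERSION",
  ];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing .env values: ${missing.join(", ")}`);
  }

  // Drop trailing slashes so the path is not doubled up.
  const base = env.VITE_AZURE_ENDPOINT.replace(/\/+$/, "");
  const deployment = encodeURIComponent(env.VITE_AZURE_DEPLOYMENT);
  const version = encodeURIComponent(env.VITE_AZURE_API_VERSION);

  return {
    url: `${base}/openai/deployments/${deployment}/chat/completions?api-version=${version}`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "api-key": env.VITE_AZURE_API_KEY,
      },
      body: JSON.stringify({ messages }),
    },
  };
}

const request = buildAzureChatRequest(
  {
    VITE_AZURE_ENDPOINT: "https://example-resource.openai.azure.com/", // placeholder
    VITE_AZURE_API_KEY: "placeholder-key",
    VITE_AZURE_DEPLOYMENT: "gpt-4o",
    VITE_AZURE_API_VERSION: "2024-12-01-preview",
  },
  [{ role: "user", content: "Hello" }]
);
console.log(request.url);
```

The returned pair can be passed to fetch as fetch(request.url, request.options). Note that Vite bundles every VITE_-prefixed variable into the client code, so these keys are visible in the browser; that is acceptable for a local app but not for a public deployment.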

How data is saved

When you run evals (from the UI or via npm run run_eval), results are written under the data/ folder. Every saved JSON includes an api_model field (e.g. "gpt-4o" or "gemini") that records which API produced the run.

Where things go

| Mode | Command / UI | Save location |
| --- | --- | --- |
| Single-call | `--single_call --model_induct` (or `--model_support_2`) | `data/single_call/<model>/run_<api_model>_<N>/` → e.g. `run_gemini_1`, `run_gpt-4o_2`. One folder per run; inside it, one JSON per scenario (e.g. `spiral_tropes/sc01.json`). |
| Separate call | `--separate_call --convo_<N> --model_induct` | `data/separate_call/convo_<N>/<model>/run_<api_model>_<N>/` → same structure (e.g. `run_gemini_1`). |
| Generate convos | `--generate_convo` | `data/separate_call/convo_<N>/` → only user/assistant turns (no mental model); used later by separate_call. |
| Human data | `--human_data --model_induct --filename do_not_upload/h01.json` | `data/do_not_upload/<filename_no_ext>/<filename_no_ext>_<api_model>_<mental_model_type>.json` → e.g. `h01_gemini_induct.json`, `h01_gpt-4o_induct.json`. One file per (source, api model, mental model); re-running overwrites/resumes that file. |
| Backfill empty | `--backfill_empty --model_induct --file <path>` | Overwrites the given file, filling in missing mental models and updating `meta.api_model`. |
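The naming scheme in the table above can be sketched as a few path builders. These helpers are hypothetical and only mirror the documented patterns; in particular, using "induct" as the `<model>` folder name for `--model_induct` is an assumption.

```javascript
// Hypothetical helpers, not taken from this repo.

// run_<api_model>_<N>, e.g. run_gemini_1
function runFolder(apiModel, runNumber) {
  return `run_${apiModel}_${runNumber}`;
}

// data/single_call/<model>/run_<api_model>_<N>/<scenario file>
function singleCallPath(model, apiModel, runNumber, scenarioFile) {
  return ["data", "single_call", model, runFolder(apiModel, runNumber), scenarioFile].join("/");
}

// data/separate_call/convo_<N>/<model>/run_<api_model>_<N>/<scenario file>
function separateCallPath(convo, model, apiModel, runNumber, scenarioFile) {
  return [
    "data",
    "separate_call",
    `convo_${convo}`,
    model,
    runFolder(apiModel, runNumber),
    scenarioFile,
  ].join("/");
}

// data/do_not_upload/<name>/<name>_<api_model>_<mental_model_type>.json
function humanDataPath(filename, apiModel, mentalModelType) {
  // Strip directories and the extension: "do_not_upload/h01.json" -> "h01".
  const base = filename.split("/").pop().replace(/\.[^.]+$/, "");
  return `data/do_not_upload/${base}/${base}_${apiModel}_${mentalModelType}.json`;
}

// "induct" as the <model> folder name is assumed for illustration.
console.log(singleCallPath("induct", "gemini", 1, "spiral_tropes/sc01.json"));
console.log(separateCallPath(2, "induct", "gpt-4o", 1, "spiral_tropes/sc01.json"));
console.log(humanDataPath("do_not_upload/h01.json", "gemini", "induct"));
```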

What’s in the JSON

Scenario runs (single_call / separate_call): Each file has category, prompt_id, categoryInjection, extraInjection, api_model, turns, and situation_log. Each turn has turnIndex, userMessage, assistantMessage, and mentalModel.
Human data runs: Top-level meta (includes api_model, source, mentalModelType, turns_recorded_up_to, etc.) and turns (same turn shape as above).

So you can always see which API model was used for a run by checking api_model in the saved JSON.
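For orientation, here is an illustrative scenario-run object with the fields listed above, plus a small helper that pulls one score out per turn. All values are invented, and the keys inside mentalModel, the type of situation_log, and the zero-based turnIndex are assumptions; the real shape depends on the mental model type and the parsers in src/eval/mental_model_prompts.js.

```javascript
// Illustrative data only: field names come from the README, values are made up.
const exampleRun = {
  category: "spiral_tropes",
  prompt_id: "sc01",
  categoryInjection: "...",
  extraInjection: "...",
  api_model: "gemini",
  turns: [
    {
      turnIndex: 0,
      userMessage: "...",
      assistantMessage: "...",
      mentalModel: { validation_seeking: 0.2, user_rightness: 0.1 },
    },
    {
      turnIndex: 1,
      userMessage: "...",
      assistantMessage: "...",
      mentalModel: { validation_seeking: 0.5, user_rightness: 0.4 },
    },
    // A turn whose mental model is still missing (a candidate for backfilling).
    { turnIndex: 2, userMessage: "...", assistantMessage: "...", mentalModel: null },
  ],
  situation_log: [],
};

// Hypothetical helper: one { turnIndex, value } point per turn,
// with null where the score is absent or not numeric.
function scoreSeries(run, key) {
  return run.turns.map((turn) => {
    const value = turn.mentalModel?.[key];
    return {
      turnIndex: turn.turnIndex,
      value: typeof value === "number" ? value : null,
    };
  });
}

console.log(exampleRun.api_model);
console.log(scoreSeries(exampleRun, "validation_seeking"));
```

Keeping null for missing scores, rather than dropping the turn, lets a chart show the gap at the right position on the turn axis.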

Resuming a run after a crash or error

Single-call runs: If a run fails mid-way (e.g. API error, rate limit, or OAuth error), re-run the same command with --resume_run <runId>. Use the same --api_*, --seed, and --prior (if you used them) as the original run. The script loads existing scenario JSONs from the run folder, skips scenarios that already have 20 turns, and continues from the first incomplete scenario (re-running that scenario from the beginning, then the rest).

Example (run folder run_gemini_3_prior):

npm run run_eval -- --single_call --model_induct --resume_run run_gemini_3_prior --api_gemini --prior

If you used a seed, add it (e.g. --seed 42). The run ID is the folder name under data/single_call/<model>/, e.g. run_gemini_3_prior, run_gpt-4o_2.
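The resume behaviour described above can be sketched as a planning step over the scenario files already on disk. This is a simplified illustration, not the code in scripts/run_eval.js: `planResume` is a hypothetical name, and it assumes that every scenario with fewer than 20 saved turns is re-run from its first turn.

```javascript
// Hypothetical sketch, not taken from this repo.
const TURNS_PER_SCENARIO = 20;

// scenarioIds: ordered list of all scenarios in the run.
// saved: Map from scenario id to its saved JSON (absent if never written).
function planResume(scenarioIds, saved) {
  const isComplete = (id) =>
    (saved.get(id)?.turns?.length ?? 0) >= TURNS_PER_SCENARIO;

  return {
    // Already finished: leave the saved JSON untouched.
    skip: scenarioIds.filter(isComplete),
    // Incomplete or missing: re-run from the beginning, in order.
    rerun: scenarioIds.filter((id) => !isComplete(id)),
  };
}

const saved = new Map([
  ["sc01", { turns: new Array(20).fill({}) }], // finished before the crash
  ["sc02", { turns: new Array(7).fill({}) }], // crashed mid-scenario
]);

console.log(planResume(["sc01", "sc02", "sc03"], saved));
```

Re-running an incomplete scenario from the start, instead of continuing at the turn after the last saved one, keeps each scenario's conversation internally consistent at the cost of repeating a few API calls.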

Human data runs: The CLI writes a checkpoint file after each turn. Re-run the same --human_data --filename ... command; it will detect the checkpoint and resume from the next turn.

Project Structure

├── src/
│   ├── components/
│   │   ├── ChatInterface.jsx           # Chat UI
│   │   ├── ExploreConversations.jsx    # Browse saved eval runs (chatlog + chart)
│   │   └── VisualizationPanel.jsx      # Mental model scores + per-turn chart
│   ├── eval/
│   │   ├── categories.js              # Spiral-Bench category injections
│   │   ├── default_prompt.js          # Seeker LLM system prompt
│   │   ├── injections.js              # Per-scenario extra injections
│   │   ├── mental_model_prompts.js    # Prompt builders + response parsers
│   │   └── scenarios.js               # Spiral-Bench scenario list
│   ├── services/
│   │   └── api.js                     # LLM API calls (Azure, Gemini, Llama)
│   ├── App.jsx                        # Root component + eval orchestration
│   └── main.jsx                       # Entry point
├── scripts/
│   ├── run_eval.js                    # CLI eval runner (npm run run_eval)
│   └── generate-spiral-manifest.js    # Rebuild public data manifest
├── data/                              # Saved eval run JSONs (gitignored)
├── screenshots/                       # README screenshots
├── index.html
├── package.json
└── vite.config.js                     # Dev server + data middleware + build plugin

Customization

Mental model types: Add or modify prompts and parsers in src/eval/mental_model_prompts.js
Visualization panel: Edit src/components/VisualizationPanel.jsx to add new score series or chart types
Eval scenarios: Swap in different scenarios via src/eval/scenarios.js and src/eval/categories.js
API providers: Add new LLM backends in src/services/api.js
