Transform casual voice recordings into clean, structured context data for AI applications.
Context Cruncher extracts structured context data from voice recordings using Gemini AI's multimodal capabilities. It processes audio directly, cleaning up natural speech patterns and organizing information into useful context data that AI systems can use for personalization.
Context data refers to specific information about users that grounds AI inference for more personalized results. This tool achieves that by:
- Removing irrelevant information and tangents
- Eliminating duplicates and redundancy
- Reformatting from first person to third person
- Organizing information hierarchically
- Outputting both Markdown and JSON formats
Check out the demo page to see real results from processing example audio about movie preferences.
- 🎤 Flexible Audio Input: Record directly in your browser or upload audio files (MP3, WAV, OPUS)
- 🤖 AI-Powered Extraction: Uses Gemini 2.0 Flash for intelligent audio understanding and context extraction
- 📝 Dual Output Formats: Get both human-readable Markdown and machine-readable JSON
- 👤 Customizable Identification: Choose how you're referred to in the context data (by name or as "the user")
- 📋 Easy Export: Download files or copy directly to clipboard
- Python 3.12+
- A Gemini API key
- Clone the repository:
```bash
git clone https://github.com/danielrosehill/Context-Cruncher.git
cd Context-Cruncher
```
- Create a virtual environment and install dependencies:
```bash
# Using uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

# Or using standard venv
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
- Create a `.env` file with your Gemini API key:
```bash
cp .env.example .env
# Edit .env and add your API key:
# GEMINI_API="your_api_key_here"
```
- Run the application:
Option A: Using the launch script (easiest)
```bash
./run.sh
```
Option B: Manual launch
```bash
source .venv/bin/activate
python app.py
```
The app will launch in your browser at http://localhost:7860.
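The app reads your `GEMINI_API` key from `.env`. For illustration, here is a minimal stdlib-only sketch of how such a file could be parsed; the real app may rely on a library such as python-dotenv instead, so treat this `load_env` helper as an assumption, not the project's actual code:

```python
import os

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file, skipping comments and blanks.

    Hypothetical helper for illustration; the app itself may load .env differently.
    """
    env = {}
    if not os.path.exists(path):
        return env
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip surrounding quotes, as in GEMINI_API="your_api_key_here"
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env

api_key = load_env().get("GEMINI_API")
```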
- Configure: Enter your Gemini API key (or load it from `.env`)
- Choose Identification: Select whether to be referred to by name or as "the user"
- Provide Audio: Either:
- Record directly in the browser using your microphone
- Upload an audio file (MP3, WAV, or OPUS)
- Extract: Click "Extract Context" to process your audio
- Download: Get your structured context data as Markdown or JSON
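The steps above can also be sketched programmatically with the `google-generativeai` SDK, which supports uploading audio files directly. The prompt wording and function names below are illustrative assumptions, not the app's actual code:

```python
def build_prompt(identification: str = "the user") -> str:
    """Assemble extraction instructions; the exact wording here is illustrative."""
    return (
        "Extract structured context data from this audio. "
        "Remove tangents and duplicates, rewrite first person as third person, "
        f"refer to the speaker as '{identification}', and organize the facts "
        "under Markdown headings with bullet points."
    )

def extract_context(audio_path: str, api_key: str,
                    identification: str = "the user") -> str:
    """Upload an audio file (MP3, WAV, or OPUS) to Gemini and return Markdown."""
    import google.generativeai as genai  # requires the google-generativeai package

    genai.configure(api_key=api_key)
    audio = genai.upload_file(audio_path)
    model = genai.GenerativeModel("gemini-2.0-flash")
    response = model.generate_content([build_prompt(identification), audio])
    return response.text
```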
Raw Audio Input:
"Okay so... let's document my health problems and the meds I take for this AI project... ehm.. where do I start... well, I've had asthma since I was a kid. I take a daily inhaler called Relvar for that. I also take Vyvanse for ADHD which is a stimulant medication. Oh.. hey Jay! What's up, man! Yeah see you at the gym. Okay, where was I. Note to self, pick up the laundry later. Oh yeah.. I've been on Vyvanse for three years and think it's great. I get bloods every 3 months."
Structured Output:
```markdown
## Medical Conditions
- the user has had asthma since childhood
- the user has adult ADHD

## Medication List
- the user takes Relvar, daily, for asthma
- the user takes Vyvanse 70mg, daily, for ADHD
```
To regenerate the demo results with the example audio:
```bash
python generate_demo.py
```
This will process the `example-data/movie-prefs.opus` file and save the results to `demo-results/`.
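The Markdown and JSON outputs carry the same information in two forms. As a stdlib-only sketch (an assumption, not the app's actual converter), the Markdown sections above could map to JSON like this:

```python
import json

def markdown_to_json(markdown: str) -> dict:
    """Map '## Heading' sections with '- item' bullets to {heading: [items]}.

    Hypothetical converter for illustration; the app may build its JSON directly.
    """
    data: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:]
            data[current] = []
        elif line.startswith("- ") and current is not None:
            data[current].append(line[2:])
    return data

demo = "## Medical Conditions\n- the user has adult ADHD"
print(json.dumps(markdown_to_json(demo), indent=2))
```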
Your audio is processed using the Gemini API. Review Google's privacy policies before using this tool with sensitive information.
- AI Assistant Personalization: Provide context to chatbots and AI assistants
- Knowledge Management: Convert verbal notes into structured information
- Preference Mapping: Document likes, dislikes, and preferences
- Medical History: Organize health information (note privacy considerations)
- Project Context: Capture project requirements and preferences
- Frontend: Gradio web interface
- AI Model: Gemini 2.0 Flash (with multimodal audio understanding)
- Audio Processing: Direct audio file upload to Gemini API
- Output Formats: Markdown and JSON
```
Context-Cruncher/
├── app.py               # Main Gradio application
├── gemini_processor.py  # Gemini API integration
├── generate_demo.py     # Demo generation script
├── run.sh               # Launch script
├── requirements.txt     # Python dependencies
├── .env.example         # Environment variable template
├── demo.html            # Demo results page
├── example-data/        # Example audio files
└── demo-results/        # Generated demo outputs
```
Contributions welcome! Please feel free to submit issues or pull requests.
MIT License - See LICENSE file for details
Daniel Rosehill
- Website: danielrosehill.com
- GitHub: @danielrosehill