EatWise AI is your personal, multimodal food safety assistant. Using the cutting-edge Gemini Multimodal Live API, it helps you determine if a food product is safe for you to consume based on your specific dietary restrictions, allergies, and lifestyle choices.
Whether you're grocery shopping, dining out, or checking your pantry, EatWise AI listens, sees, and understands your needs in real time.
The agent understands and responds to voice, images, and text. It is designed for natural conversation, handles interruptions, and limits its interactions to food-safety topics.
You can test the agent through the deployed web application at https://eatwise-frontend-477953542175.us-central1.run.app/
Contact [email protected] if you experience any issues.
- 🎙️ Real-time Voice Interaction: Talk to the assistant naturally. Tell it your allergies, ask about a dish, or narrate what you're seeing.
- 📷 Multimodal Intelligence:
  - Product Visuals: Show the assistant a photo of a product to identify it.
  - Ingredient Labels: Snap a photo of a label, and the assistant will read the ingredients to find hidden allergens.
  - Barcode Scanning: Show a barcode or say the number. The system uses Google Search grounding to find exact product details and ingredient lists.
- 💬 Text & URL Support: Type a product name or paste a URL to a product page for a quick safety check.
- 🚀 Fast & Responsive: Optimized for low-latency voice responses with sub-second turn-taking.
- 📱 Mobile Ready: Designed to be used on the go as a Progressive Web App (PWA) on your smartphone.
EatWise AI is built on a modern, decoupled architecture designed for high-performance streaming.
- Frontend: React (Vite) + Vanilla CSS. Uses the `AudioContext` API for real-time PCM audio streaming.
- Backend: FastAPI (Python) handles the orchestration between the client and the AI models.
- AI Core:
  - Gemini 2.5 Flash (Native Audio Preview): Powers the Live WebSocket session for voice and vision.
  - Google Search Tool: Provides the assistant with up-to-date product information for barcode lookups.
- Infrastructure: Fully containerized with Docker and ready for Google Cloud Run deployment.
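To make the "PCM audio streaming" part of the stack more concrete, here is a minimal sketch of the sample-format conversion involved: floating-point samples in the range -1.0 to 1.0 are clamped and packed as little-endian 16-bit integers. In the project this step happens in the browser; the sketch is written in Python only for illustration, and the function name `float_to_pcm16` is hypothetical, not taken from the codebase.

```python
import struct


def float_to_pcm16(samples):
    """Pack float samples in [-1.0, 1.0] as little-endian 16-bit PCM bytes.

    Hypothetical helper for illustration; not part of the EatWise codebase.
    """
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clamp out-of-range samples
        ints.append(int(round(s * 32767)))  # scale to the signed 16-bit range
    return struct.pack("<%dh" % len(ints), *ints)


if __name__ == "__main__":
    chunk = float_to_pcm16([0.0, 1.0, -1.0])
    print(chunk.hex())  # each sample occupies two bytes
```

Each sample occupies two bytes, so one second of mono audio at a given sample rate produces twice that many bytes per second.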
- Connect: The React client opens a secure WebSocket (`wss://`) to the FastAPI backend.
- Stream: Audio from the mic is converted to 16 kHz PCM and streamed directly to Gemini.
- Analyze: When you show a barcode or label, the agent processes the frame, looks up ingredients if needed, and cross-references them against your stated "Dietary Profile."
- Respond: The agent speaks back to you instantly with a clear SAFE or UNSAFE verdict.
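The Analyze and Respond steps above can be pictured with a deliberately simplified sketch. In the real agent the cross-referencing is done by the Gemini model during the live session; the code below only illustrates the idea with plain substring matching. The function name `check_ingredients` and the list-based profile are assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def check_ingredients(ingredients: Iterable[str],
                      avoid: Iterable[str]) -> Tuple[str, List[str]]:
    """Return ("SAFE" | "UNSAFE", flagged ingredients).

    Hypothetical helper: an ingredient is flagged when any avoided term
    appears in it, compared case-insensitively.
    """
    terms = [t.strip().lower() for t in avoid if t.strip()]
    flagged = [ing for ing in ingredients
               if any(term in ing.lower() for term in terms)]
    verdict = "UNSAFE" if flagged else "SAFE"
    return verdict, flagged


if __name__ == "__main__":
    label = ["Sugar", "Peanut oil", "Salt"]
    print(check_ingredients(label, ["peanut", "milk"]))
```

Substring matching is crude: it would flag "coconut milk" for a dairy allergy and miss allergens listed under other names, which is the kind of judgment the model is expected to handle.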
You can find a detailed visual breakdown in architecture.md.
graph LR
User((User)) <--> UI[React App]
UI <-->|Unified WebSocket| Backend[EatWise Agent]
Backend <-->|Live API Session| Gemini[Gemini Live API]
Backend <-->|Search Grounding| Google[Google Search Tool]
To run or deploy this project yourself, please refer to our detailed Deployment Guide.
- Clone the repo.
- Set up environment variables (`GEMINI_API_KEY`).
- Run locally or deploy to Google Cloud Run.
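For the environment-variable step, a backend can fail fast when the key is missing instead of erroring later during a live session. The following is a minimal sketch under that assumption; `load_api_key` is a hypothetical helper, and the Deployment Guide remains the authoritative source for how the project actually reads its configuration.

```python
import os


def load_api_key(env=None):
    """Return the Gemini API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the function
    easy to test. Hypothetical helper, not taken from the project.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before starting the backend."
        )
    return key


if __name__ == "__main__":
    try:
        load_api_key()
        print("GEMINI_API_KEY found")
    except RuntimeError as err:
        print(err)
```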
Powered by Google Gemini 2.5 and the google-genai SDK.