Project Report: Voice-Based AI Assistant and Continuous Speech Recognition
Project Title
Voice-Controlled Smart Assistant using Speech Recognition and Text-to-Speech
Developed By
Bharath
Rehan
Date
June 2025
Project Overview
This project integrates two Python-based systems:
1. Continuous Speech Recognition - a real-time transcription system that converts spoken words into text.
2. AI Voice Assistant (Jarvis) - an interactive assistant capable of executing tasks via voice commands.
Tools and Libraries Used
- speech_recognition - Converts speech to text via the Google Web Speech API (recognize_google)
- pyttsx3 - Converts text to speech (works offline)
- datetime - Retrieves the current date and time
- wikipedia - Retrieves summaries of Wikipedia topics
- webbrowser - Opens URLs in the default browser
- os - Interacts with the operating system
- smtplib - Sends email over SMTP
System Requirements
- Python 3.7 or higher
- Internet connection (needed for Google speech recognition, Wikipedia lookups, and email)
- Microphone
- Install dependencies:
pip install SpeechRecognition pyttsx3 wikipedia pyaudio
Module 1: Continuous Speech Recognition
A simple speech recognizer that listens continuously and prints recognized text in real time until the user says 'stop'.
How it Works:
- Listens using sr.Microphone()
- Transcribes audio via recognize_google
- Adjusts for ambient noise
- Exits the loop on the word "stop"
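The loop described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual source: the function names transcribe_until_stop and make_google_listener are hypothetical, and the stop check assumes the loop ends as soon as a recognized phrase contains the word "stop". The listening step is passed in as a callable so the loop logic can be tried without a microphone.

```python
def transcribe_until_stop(listen_once, stop_word="stop"):
    """Print each recognized phrase until stop_word is heard.

    listen_once is any callable that returns recognized text, or None
    when nothing intelligible was heard. Returns the printed phrases.
    """
    transcript = []
    while True:
        text = listen_once()
        if text is None:
            continue  # unintelligible audio or API error: keep listening
        if stop_word in text.lower().split():
            break  # assumption: any phrase containing the stop word ends the loop
        print(text)
        transcript.append(text)
    return transcript


def make_google_listener():
    """Build a listen_once callable backed by the default microphone."""
    # Imported here so the loop above has no hard dependency on the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    microphone = sr.Microphone()

    # Calibrate the energy threshold against background noise once, up front.
    with microphone as source:
        recognizer.adjust_for_ambient_noise(source, duration=1)

    def listen_once():
        with microphone as source:
            audio = recognizer.listen(source)
        try:
            return recognizer.recognize_google(audio)
        except (sr.UnknownValueError, sr.RequestError):
            # Speech was not understood, or the Google API was unreachable.
            return None

    return listen_once
```

With a microphone attached, the two pieces would be combined as transcribe_until_stop(make_google_listener()).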
Module 2: AI Voice Assistant (Jarvis)
A virtual assistant named Jarvis that responds to voice commands and performs desktop tasks such as opening websites, searching Wikipedia, playing music, telling the time, and sending email.
Example Commands:
- Open YouTube
- Search Python on Wikipedia
- Play music
- What is the time
- Send email
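One way the example commands could be matched to actions is simple keyword routing on the transcribed text. The sketch below is illustrative only: the function name route_command, the action labels, and the keyword rules are assumptions, not the project's confirmed logic. It returns an (action, argument) pair and leaves the side effects (opening the browser, speaking the answer, sending the message) to the caller.

```python
import datetime


def route_command(text):
    """Map a transcribed command to an (action, argument) pair."""
    query = text.lower().strip()

    if "wikipedia" in query:
        # "search python on wikipedia" -> topic "python"
        topic = (
            query.replace("search", "")
            .replace("on wikipedia", "")
            .replace("wikipedia", "")
            .strip()
        )
        return ("wikipedia", topic)
    if "open youtube" in query:
        return ("open_url", "https://www.youtube.com")
    if "play music" in query:
        return ("play_music", None)
    if "time" in query:
        # Crude substring match; good enough for "what is the time".
        return ("tell_time", datetime.datetime.now().strftime("%H:%M"))
    if "email" in query:
        return ("send_email", None)
    return ("unknown", query)
```

A caller would then act on the result, for example passing the URL of an "open_url" action to webbrowser.open, or the topic of a "wikipedia" action to wikipedia.summary before reading the summary aloud with pyttsx3.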
Limitations
- Speech recognition needs internet (Google API)
- Email address and password are hardcoded in the script (a secure storage method is needed)
- No graphical user interface (GUI)
- Can misinterpret accents or unclear speech
Future Enhancements
- Add GUI with Tkinter or PyQt
- Secure the email credentials with OAuth or environment variables loaded from a .env file
- Add more apps and voice services
- Use offline speech models (Vosk, Whisper)
- Add chatbot AI or LLM integration
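As an illustration of the credential enhancement listed above, the email address and password could be read from environment variables instead of being written into the script. This is a minimal sketch under stated assumptions: the variable names ASSISTANT_EMAIL and ASSISTANT_EMAIL_PASSWORD, the function names, and the default Gmail SMTP host and port are all hypothetical choices, not taken from the project. A .env loader such as python-dotenv could populate the environment before these functions run.

```python
import os
import smtplib
from email.message import EmailMessage

# Hypothetical variable names; any names agreed on by the project would work.
REQUIRED_VARS = ("ASSISTANT_EMAIL", "ASSISTANT_EMAIL_PASSWORD")


def load_email_credentials(env=None):
    """Return (address, password) from the environment, or raise if unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ASSISTANT_EMAIL"], env["ASSISTANT_EMAIL_PASSWORD"]


def send_email(to_address, subject, body, host="smtp.gmail.com", port=587):
    """Send a plain-text email over SMTP with STARTTLS.

    The host and port defaults are assumptions; adjust them for the provider.
    """
    sender, password = load_email_credentials()

    message = EmailMessage()
    message["From"] = sender
    message["To"] = to_address
    message["Subject"] = subject
    message.set_content(body)

    with smtplib.SMTP(host, port) as server:
        server.starttls()  # upgrade the connection before sending credentials
        server.login(sender, password)
        server.send_message(message)
```

Failing early with a clear error when a variable is missing keeps the assistant from attempting a login with empty credentials.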
Conclusion
Combining speech recognition with command execution provides a solid foundation for smart virtual assistants.
Bharath and Rehan's implementation shows that a handful of simple libraries is enough to build a functional,
voice-controlled system that performs real-world tasks interactively.