Speak in any language — get instant AI responses
A voice-powered AI assistant built with Streamlit, Google Gemini, and Google Text-to-Speech that lets you ask questions in any language using your microphone and receive intelligent spoken responses.
- 🎙️ Voice Input — Speak your query directly into the microphone using
SpeechRecognition - 🌍 Multilingual Support — Understands and responds to queries in multiple languages
- 🧠 Gemini AI — Powered by Google's Gemini LLM for accurate, context-aware responses
- 🔊 Text-to-Speech Output — Converts AI responses to natural-sounding audio via
gTTS - ⬇️ Downloadable Audio — Save the generated speech as an MP3 file
- 🖥️ Clean Web UI — Minimal dark-themed interface built with Streamlit
A clean, centered interface with a single "Ask me anything" button to kick off the voice interaction.
| Layer | Technology |
|---|---|
| Frontend / UI | Streamlit |
| AI / LLM | Google Gemini (google-generativeai) |
| Speech-to-Text | SpeechRecognition + PyAudio |
| Text-to-Speech | gTTS (Google Text-to-Speech) |
| Environment Config | python-dotenv |
| Language | Python 3.x |
Multilingual-AI-Voice-Assistant/
├── src/
│ └── helper.py # Core logic: voice_input, llm_model, text_to_speech
├── app.py # Streamlit app entry point
├── requirements.txt # Python dependencies
├── setup.py # Package setup
├── template.py # Project scaffold template
└── .gitignore
- Python 3.8+
- A working microphone
- A Google Gemini API key
git clone https://github.com/Pratik5767/Multilingual-AI-Voice-Assistant.git
cd Multilingual-AI-Voice-AssistantOn Windows, PyAudio requires pipwin:
pip install pipwin
pipwin install pyaudio
pip install -r requirements.txtOn Linux/macOS:
sudo apt-get install portaudio19-dev # Linux
brew install portaudio # macOS
pip install -r requirements.txtCreate a .env file in the project root:
GOOGLE_API_KEY=your_gemini_api_key_hereGet your free API key from Google AI Studio.
streamlit run app.pyOpen your browser at http://localhost:8501 and click 🎙️ Ask me anything to start.
[User speaks into mic]
↓
[SpeechRecognition converts audio to text]
↓
[Text sent to Google Gemini API]
↓
[Gemini returns AI-generated response]
↓
[gTTS converts response to speech (speech.mp3)]
↓
[Streamlit displays text + plays audio]
SpeechRecognition==3.10.0
pyaudio
gTTS
google-generativeai
python-dotenv
streamlit
Contributions, issues, and feature requests are welcome! Feel free to open an issue or submit a pull request.
- Fork the repository
- Create your feature branch:
git checkout -b feature/your-feature - Commit your changes:
git commit -m 'Add your feature' - Push to the branch:
git push origin feature/your-feature - Open a Pull Request
This project is open source and available under the MIT License.
Pratik — @Pratik5767
⭐ If you found this project helpful, consider giving it a star on GitHub!
