A web-based speech-to-text application that converts spoken words into text using Google's Speech Recognition API. The application features a modern, responsive user interface and real-time audio processing.
- Real-time audio recording through browser
- Speech-to-text conversion using Google's Speech Recognition API
- Modern and responsive user interface
- Automatic audio format conversion for compatibility
- Error handling and user feedback
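The conversion and recognition features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the pydub and SpeechRecognition packages from requirements.txt plus a working ffmpeg install. The function names (`wav_path_for`, `convert_to_wav`, `transcribe`) and the 16 kHz mono target are hypothetical choices made for the example.

```python
import os


def wav_path_for(source_path: str) -> str:
    """Return the path the converted WAV file is written to (hypothetical naming)."""
    root, _ = os.path.splitext(source_path)
    return root + ".converted.wav"


def convert_to_wav(source_path: str) -> str:
    """Convert a browser recording (for example WebM or Ogg) to mono 16 kHz WAV."""
    # Imported lazily so the path helper above works without the audio stack.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(source_path)  # pydub delegates decoding to ffmpeg
    audio = audio.set_channels(1).set_frame_rate(16000)  # assumed target format
    target = wav_path_for(source_path)
    audio.export(target, format="wav")
    return target


def transcribe(wav_path: str) -> str:
    """Send a WAV file to Google's recognizer and return the transcript."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The audio was received but no speech could be recognized.
        return ""
    except sr.RequestError as exc:
        # The recognition service could not be reached or rejected the request.
        raise RuntimeError(f"Speech recognition request failed: {exc}") from exc
```

With these helpers, `transcribe(convert_to_wav("recording.webm"))` would return the recognized text, or an empty string when no speech is detected.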
- Python 3.x
- pip (Python package installer)
- ffmpeg (for audio processing)
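A quick way to confirm these prerequisites before installing is a small check script. The sketch below uses only the Python standard library. The function name `check_prerequisites` is hypothetical and not part of the project.

```python
import shutil
import sys


def check_prerequisites() -> dict:
    """Report whether each prerequisite appears to be available on this machine."""
    return {
        "python3": sys.version_info.major == 3,
        # pip may be installed as either "pip" or "pip3" depending on the platform.
        "pip": shutil.which("pip") is not None or shutil.which("pip3") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


for name, found in check_prerequisites().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

The check only looks for executables on the `PATH`, so a tool installed in a non-standard location may be reported as missing.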
- Clone the repository:
git clone <your-repository-url>
cd <repository-name>
- Install the required Python packages:
pip install -r requirements.txt
- Install ffmpeg (if not already installed):
- On macOS:
brew install ffmpeg
- On Ubuntu/Debian:
sudo apt-get install ffmpeg
- On Windows: Download from ffmpeg.org
- Start the Flask server:
python app.py
- Open your web browser and navigate to:
http://127.0.0.1:5000
- Click the "Start Recording" button and speak
- Click "Stop Recording" when finished
- The transcribed text will appear below
├── app.py              # Flask backend
├── requirements.txt    # Python dependencies
├── static/
│   └── style.css       # CSS styles
└── templates/
    └── index.html      # Frontend template
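To show how these files might fit together, here is a minimal sketch of what a Flask backend in app.py could look like. It is an illustration under assumptions, not the project's actual code: the `/transcribe` route and the `audio` form field are hypothetical names, and the conversion and recognition step is left as a comment.

```python
from flask import Flask, jsonify, render_template, request

app = Flask(__name__)  # serves static/ and templates/ from the project root


@app.route("/")
def index():
    # Render the frontend from templates/index.html.
    return render_template("index.html")


@app.route("/transcribe", methods=["POST"])  # hypothetical route name
def transcribe():
    upload = request.files.get("audio")  # hypothetical form field name
    if upload is None or upload.filename == "":
        # Report the problem as JSON so the page can show it to the user.
        return jsonify({"error": "No audio file was uploaded."}), 400

    # The real application would convert the upload to WAV and run speech
    # recognition here; this sketch only echoes the received filename.
    return jsonify({"received": upload.filename})
```

In a real app.py, a call such as `app.run(host="127.0.0.1", port=5000)` under an `if __name__ == "__main__":` guard would start the development server at the address given in the usage steps.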
- Python
- Flask
- SpeechRecognition
- PyAudio
- pydub
- HTML5
- CSS3
- JavaScript
This project is open source and available under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.