WhisperPlay is a web application that provides transcription services for audio files. It allows users to upload audio files, transcribe them with Whisper, and manage the resulting transcripts.
Users can play the audio file and navigate the transcript by sentences.
It is a great tool for learning foreign languages.
- Reading by Sentence: Navigate the transcript sentence by sentence and play the audio at the exact timestamps aligned with each sentence (a sketch of this alignment follows the list below).
- Transcription: Automatically transcribes uploaded audio files using a transcription service.
- Audio Upload: Supports multiple audio formats including MP3, WAV, M4A, and OGG.
- User Authentication: Secure login system to manage user sessions.
- Job Management: View, update, and delete transcription jobs.
- Logging: Comprehensive logging for debugging and monitoring.
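The sentence-level navigation implies mapping sentences back to segment timestamps. The project's actual alignment logic lives in the repository; as a rough illustration of the idea only, assuming Whisper-style segment output (`start`, `end`, `text`) and NLTK's punkt tokenizer, it could look like this:

```python
# Illustrative sketch only; the real alignment logic is in the WhisperPlay source.
# Assumes Whisper-style segments: [{"start": float, "end": float, "text": str}, ...]
from nltk.tokenize import sent_tokenize

def sentences_with_timestamps(segments):
    """Attach approximate start/end times to each sentence of the transcript."""
    sentences = []
    for seg in segments:
        # Approximation: sentences inside a segment share that segment's times,
        # and sentences spanning two segments are not merged.
        for sentence in sent_tokenize(seg["text"].strip()):
            sentences.append({
                "text": sentence,
                "start": seg["start"],
                "end": seg["end"],
            })
    return sentences
```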
- Python 3.8 or higher
- Flask
- NLTK
- Bcrypt
- mlx_whisper for macOS
- fast_whisper for Linux with CUDA (see the backend-selection sketch after this list)
- Other dependencies listed in `requirements.txt`
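How `main.py` actually dispatches between the two backends is defined in the source. As a minimal sketch of the idea, assuming that the `fast_whisper` requirement refers to the faster-whisper package for the CUDA path:

```python
# Minimal sketch, not the project's actual dispatch code.
# Assumes "fast_whisper" in the requirements refers to the faster-whisper package.
import sys

def transcribe(audio_path: str):
    if sys.platform == "darwin":
        # Apple MLX backend; uses mlx_whisper's default model unless a
        # converted local model is passed via path_or_hf_repo (see the
        # conversion command further below).
        import mlx_whisper
        result = mlx_whisper.transcribe(audio_path)
        return result["segments"]
    else:
        # CUDA backend via faster-whisper.
        from faster_whisper import WhisperModel
        whisper = WhisperModel("large-v3", device="cuda", compute_type="float16")
        segments, _info = whisper.transcribe(audio_path)
        return [{"start": s.start, "end": s.end, "text": s.text} for s in segments]
```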
- Clone the repository:

  ```bash
  git clone https://github.com/reeyarn/WhisperPlay.git
  cd WhisperPlay
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download NLTK data:

  ```python
  import nltk
  nltk.download('punkt')
  ```

- Run the application:

  ```bash
  python main.py
  ```

- Access the application: Open your web browser and go to `http://localhost:5001`, or change the port in `main.py` to your desired port (a minimal example of the entry point follows below).
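The port is whatever `main.py` passes to Flask's `app.run`; the file in the repository is authoritative, but a typical entry point would look roughly like this (the `5001` value matches the address above, everything else is illustrative):

```python
# Illustrative only: the real main.py in the repository defines the actual app.
from flask import Flask

app = Flask(__name__)

if __name__ == "__main__":
    # Change port=5001 here to serve the app on a different port.
    app.run(host="0.0.0.0", port=5001)
```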
- Upload Audio: Navigate to the upload page and select an audio file to upload (a scripted alternative is sketched after this list).
- Transcribe: The application will automatically transcribe the uploaded audio.
- Manage Jobs: View the status of transcription jobs, update titles, or delete jobs as needed.
- Listen to the Transcript: Navigate the transcript sentence by sentence and play the audio at the timestamps aligned with each sentence.
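The steps above go through the web UI. If you prefer to script uploads, something like the following could work; note that the `/upload` route and the `file` form field are assumptions, so check the routes defined in `main.py` for the actual endpoint names:

```python
# Hypothetical example: the "/upload" route and "file" field name are assumptions;
# consult main.py for the real endpoints. A session cookie from the login flow
# may also be required because the app uses user authentication.
import requests

with open("lecture.mp3", "rb") as audio:
    response = requests.post(
        "http://localhost:5001/upload",
        files={"file": ("lecture.mp3", audio, "audio/mpeg")},
    )
print(response.status_code)
```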
- Environment Variables: Set `CUDA_VISIBLE_DEVICES` to specify which GPU to use.
- Application Settings: Modify `app.config` in `main.py` to change the upload and transcript directories, and other settings (see the sketch below).
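As a concrete illustration, configuration might be adjusted like this; the key names `UPLOAD_FOLDER` and `TRANSCRIPT_FOLDER` are assumptions to be checked against the keys actually used in `main.py`:

```python
# Key names below are illustrative; check main.py for the real configuration keys.
import os

# Select the GPU before the transcription backend is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

from flask import Flask

app = Flask(__name__)
app.config["UPLOAD_FOLDER"] = "uploads"          # where uploaded audio is stored
app.config["TRANSCRIPT_FOLDER"] = "transcripts"  # where transcripts are written
```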
To use a custom Whisper checkpoint with the MLX backend, convert it from its Hugging Face format to MLX format with the `convert.py` script from Apple's mlx-examples repository. For example, to convert the German model `primeline/whisper-large-v3-german`:

```bash
python mlx-examples/whisper/convert.py --torch-name-or-path primeline/whisper-large-v3-german --mlx-path mlx_models/whisper-large-v3-german
```
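Once converted, the local model directory can be passed to mlx_whisper when transcribing. A minimal sketch, with the path mirroring the `--mlx-path` above:

```python
# Sketch: transcribe with the locally converted MLX model.
import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="mlx_models/whisper-large-v3-german",
)
print(result["text"])
```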
I would like to express my gratitude to the following projects for their contributions and inspiration:
- Cursor, Anthropic, devv.ai, and all the other AI code editors and services that helped me build this app.
- whisper_mlx @ ml-explore: Apple's MLX framework for macOS, used to run OpenAI's Whisper.
- fast_whisper: A project to run efficient transcription processing with CUDA.
- HuggingFace community.
Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.
For questions or support, please contact [email protected].

