WhisperPlay is a web application that provides transcription services for audio files. It allows users to upload audio files, transcribe them with Whisper, and manage the resulting transcripts.
Users can play the audio file and navigate the transcript by sentences.
It is a great tool for learning foreign languages.
- Reading by Sentence: Navigate the transcript sentence by sentence and play the audio at the exact timestamps aligned with each sentence (a sketch of this alignment follows the list below).
- Transcription: Automatically transcribes uploaded audio files using a transcription service.
- Audio Upload: Supports multiple audio formats including MP3, WAV, M4A, and OGG.
- User Authentication: Secure login system to manage user sessions.
- Job Management: View, update, and delete transcription jobs.
- Logging: Comprehensive logging for debugging and monitoring.
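The sentence-level navigation implies mapping sentences back to segment timestamps. The project's actual alignment logic lives in the repository; as a rough illustration of the idea only, assuming Whisper-style segment output (`start`, `end`, `text`) and NLTK's punkt tokenizer, it could look like this:

```python
# Illustrative sketch only; the real alignment logic is in the WhisperPlay source.
# Assumes Whisper-style segments: [{"start": float, "end": float, "text": str}, ...]
from nltk.tokenize import sent_tokenize

def sentences_with_timestamps(segments):
    """Attach approximate start/end times to each sentence of the transcript."""
    sentences = []
    for seg in segments:
        # Approximation: sentences inside a segment share that segment's times,
        # and sentences spanning two segments are not merged.
        for sentence in sent_tokenize(seg["text"].strip()):
            sentences.append({
                "text": sentence,
                "start": seg["start"],
                "end": seg["end"],
            })
    return sentences
```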
- Python 3.8 or higher
- Flask
- NLTK
- Bcrypt
- mlx_whisper for macOS
- fast_whisper for Linux with CUDA (see the backend-selection sketch after this list)
- Other dependencies listed in `requirements.txt`
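How `main.py` actually dispatches between the two backends is defined in the source. As a minimal sketch of the idea, assuming that the `fast_whisper` requirement refers to the faster-whisper package for the CUDA path:

```python
# Minimal sketch, not the project's actual dispatch code.
# Assumes "fast_whisper" in the requirements refers to the faster-whisper package.
import sys

def transcribe(audio_path: str):
    if sys.platform == "darwin":
        # Apple MLX backend; uses mlx_whisper's default model unless a
        # converted local model is passed via path_or_hf_repo (see the
        # conversion command further below).
        import mlx_whisper
        result = mlx_whisper.transcribe(audio_path)
        return result["segments"]
    else:
        # CUDA backend via faster-whisper.
        from faster_whisper import WhisperModel
        whisper = WhisperModel("large-v3", device="cuda", compute_type="float16")
        segments, _info = whisper.transcribe(audio_path)
        return [{"start": s.start, "end": s.end, "text": s.text} for s in segments]
```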
- Clone the repository:

  ```bash
  git clone https://github.com/reeyarn/WhisperPlay.git
  cd WhisperPlay
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download NLTK data:

  ```python
  import nltk
  nltk.download('punkt')
  ```

- Run the application:

  ```bash
  python main.py
  ```

- Access the application: Open your web browser and go to `http://localhost:5001`, or change the port in `main.py` to your desired port (a minimal example of the entry point follows below).
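The port is whatever `main.py` passes to Flask's `app.run`; the file in the repository is authoritative, but a typical entry point would look roughly like this (the `5001` value matches the address above, everything else is illustrative):

```python
# Illustrative only: the real main.py in the repository defines the actual app.
from flask import Flask

app = Flask(__name__)

if __name__ == "__main__":
    # Change port=5001 here to serve the app on a different port.
    app.run(host="0.0.0.0", port=5001)
```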
- Upload Audio: Navigate to the upload page and select an audio file to upload (a scripted alternative is sketched after this list).
- Transcribe: The application will automatically transcribe the uploaded audio.
- Manage Jobs: View the status of transcription jobs, update titles, or delete jobs as needed.
- Listen to the Transcript: Navigate the transcript sentence by sentence and play the audio at the timestamps aligned with each sentence.
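The steps above go through the web UI. If you prefer to script uploads, something like the following could work; note that the `/upload` route and the `file` form field are assumptions, so check the routes defined in `main.py` for the actual endpoint names:

```python
# Hypothetical example: the "/upload" route and "file" field name are assumptions;
# consult main.py for the real endpoints. A session cookie from the login flow
# may also be required because the app uses user authentication.
import requests

with open("lecture.mp3", "rb") as audio:
    response = requests.post(
        "http://localhost:5001/upload",
        files={"file": ("lecture.mp3", audio, "audio/mpeg")},
    )
print(response.status_code)
```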
- Environment Variables: Set `CUDA_VISIBLE_DEVICES` to specify which GPU to use.
- Application Settings: Modify `app.config` in `main.py` to change the upload and transcript directories, and other settings (see the sketch below).
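As a concrete illustration, configuration might be adjusted like this; the key names `UPLOAD_FOLDER` and `TRANSCRIPT_FOLDER` are assumptions to be checked against the keys actually used in `main.py`:

```python
# Key names below are illustrative; check main.py for the real configuration keys.
import os

# Select the GPU before the transcription backend is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

from flask import Flask

app = Flask(__name__)
app.config["UPLOAD_FOLDER"] = "uploads"          # where uploaded audio is stored
app.config["TRANSCRIPT_FOLDER"] = "transcripts"  # where transcripts are written
```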
To use a custom Whisper checkpoint with the MLX backend, convert it from its Hugging Face format to MLX format with the `convert.py` script from Apple's mlx-examples repository. For example, to convert the German model `primeline/whisper-large-v3-german`:

```bash
python mlx-examples/whisper/convert.py --torch-name-or-path primeline/whisper-large-v3-german --mlx-path mlx_models/whisper-large-v3-german
```
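Once converted, the local model directory can be passed to mlx_whisper when transcribing. A minimal sketch, with the path mirroring the `--mlx-path` above:

```python
# Sketch: transcribe with the locally converted MLX model.
import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="mlx_models/whisper-large-v3-german",
)
print(result["text"])
```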
I would like to express my gratitude to the following projects for their contributions and inspiration:
- Cursor, Anthropic, devv.ai, and all the other AI code editors and services that helped me build this app.
- whisper_mlx @ ml-explore: Apple's MLX framework for macOS, used to run OpenAI's Whisper.
- fast_whisper: A project to run efficient transcription processing with CUDA.
- HuggingFace community.
Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.
For questions or support, please contact [email protected].

