CLI tool for transcribing audio and video files using the OpenAI Whisper API.
- Audio Transcription: Transcribe MP3, WAV, FLAC, AAC, M4A files
- Video Support: Extract and transcribe audio from MKV, MP4, AVI, MOV
- Batch Processing: Process entire directories with concurrent API calls
- Multiple Output Formats: Plain text (TXT) and subtitles (SRT)
- Large File Support: Automatic chunking for files >25MB
- Resume Support: Continue interrupted transcriptions
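The Large File Support item above implies that oversized files are split before upload. The following is a minimal sketch of how chunk boundaries could be planned, not the tool's actual implementation. The helper name `plan_chunks`, its parameters, and the constant-bitrate assumption are all hypothetical; only the 25 MB threshold comes from the feature list.

```python
import math

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # 25 MB threshold from the feature list


def plan_chunks(file_size_bytes, duration_seconds, max_bytes=MAX_UPLOAD_BYTES):
    """Return (start, end) offsets in seconds for chunks that fit under max_bytes.

    Hypothetical helper: assumes a roughly constant bitrate, so that file
    size scales linearly with duration.
    """
    if file_size_bytes <= 0 or duration_seconds <= 0:
        raise ValueError("file size and duration must be positive")
    total = float(duration_seconds)
    if file_size_bytes <= max_bytes:
        return [(0.0, total)]  # small enough to send in one request
    n_chunks = math.ceil(file_size_bytes / max_bytes)
    chunk_len = total / n_chunks
    return [
        (i * chunk_len, min((i + 1) * chunk_len, total))
        for i in range(n_chunks)
    ]
```

Each (start, end) pair could then be passed to FFmpeg to cut the corresponding segment. A real implementation would probably also leave headroom below the limit, since bitrates are rarely constant.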
- Python 3.9+
- FFmpeg 4.0+
- OpenAI API key
Linux (Ubuntu/Debian):
sudo apt update && sudo apt install ffmpeg -y

macOS (Homebrew):
brew install ffmpeg

Windows (Chocolatey):
choco install ffmpeg -y

pip install transcribe-cli

Or install from source:
git clone https://github.com/jmagly/transcribe-cli.git
cd transcribe-cli
pip install -e .

export OPENAI_API_KEY=sk-your-api-key-here

Or create a .env file:
cp .env.example .env
# Edit .env and add your API key

# Transcribe a single file
transcribe audio.mp3
# Transcribe video (extracts audio automatically)
transcribe video.mkv
# Output as SRT subtitles
transcribe audio.mp3 --format srt
# Batch process a directory
transcribe batch ./recordings
# Extract audio only (no transcription)
transcribe extract video.mkv

transcribe <file> [OPTIONS]
Options:
  -o, --output-dir PATH  Output directory (default: current)
  -f, --format TEXT      Output format: txt, srt (default: txt)
  -l, --language TEXT    Language code or 'auto' (default: auto)
  --verbose              Enable verbose output
  --help                 Show help message

transcribe batch <directory> [OPTIONS]
Options:
  -o, --output-dir PATH  Output directory
  -f, --format TEXT      Output format: txt, srt
  -c, --concurrency INT  Max concurrent jobs (1-20, default: 5)
  -r, --recursive        Scan subdirectories
  --dry-run              Preview files without processing
  --verbose              Enable verbose output
  --help                 Show help message

Examples:
# Preview what would be processed
transcribe batch ./recordings --dry-run
# Process subdirectories
transcribe batch ./media --recursive
# Combine options
transcribe batch ./videos --recursive --format srt --concurrency 3

transcribe extract <file> [OPTIONS]
Options:
  -o, --output PATH  Output audio file path
  -f, --format TEXT  Output format: mp3, wav (default: mp3)
  --verbose          Enable verbose output
  --help             Show help message

transcribe config [OPTIONS]
Options:
  --show       Show current configuration
  --init       Create default config file
  --locations  Show config file search paths
  --help       Show help message

Create a transcribe.toml file in your project directory:
transcribe config --init

Example configuration:
[output]
format = "txt"
[processing]
concurrency = 5
language = "auto"
recursive = false
[logging]
verbose = false

Config files are searched in this order:
- ./transcribe.toml
- ./.transcriberc
- ~/.config/transcribe/config.toml
- ~/.transcriberc
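
The search order above amounts to a "first existing file wins" lookup. A minimal sketch of that logic follows. The function name `find_config` is hypothetical and the tool's real loader may behave differently, for example by merging several files instead of stopping at the first match.

```python
from pathlib import Path

# Search paths in the documented order; the first existing file wins.
SEARCH_PATHS = [
    "./transcribe.toml",
    "./.transcriberc",
    "~/.config/transcribe/config.toml",
    "~/.transcriberc",
]


def find_config(candidates=SEARCH_PATHS):
    """Return the first candidate that exists as a file, or None.

    Hypothetical helper, not part of transcribe-cli's documented API.
    """
    for candidate in candidates:
        path = Path(candidate).expanduser()  # resolve a leading "~"
        if path.is_file():
            return path
    return None
```

To see which paths the installed tool actually checks, run `transcribe config --locations`.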
Settings can also be configured via environment variables:
| Variable | Description | Default |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key (required) | - |
| TRANSCRIBE_OUTPUT_DIR | Default output directory | . |
| TRANSCRIBE_FORMAT | Default output format | txt |
| TRANSCRIBE_CONCURRENCY | Max concurrent jobs | 5 |
| TRANSCRIBE_LANGUAGE | Default language | auto |
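
As an illustration of how the variables and defaults in this table could be read, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, and applying the 1-20 concurrency range from the batch options to the environment variable is an assumption. How the tool resolves conflicts between environment variables, config files, and command-line flags is not specified here.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical container mirroring the environment variable table."""
    api_key: str
    output_dir: str = "."
    output_format: str = "txt"
    concurrency: int = 5
    language: str = "auto"


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default), using the table's defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    concurrency = int(env.get("TRANSCRIBE_CONCURRENCY", "5"))
    # Assumption: the env var shares the 1-20 range documented for --concurrency.
    if not 1 <= concurrency <= 20:
        raise ValueError("TRANSCRIBE_CONCURRENCY must be between 1 and 20")
    output_format = env.get("TRANSCRIBE_FORMAT", "txt")
    if output_format not in ("txt", "srt"):
        raise ValueError("TRANSCRIBE_FORMAT must be 'txt' or 'srt'")
    return Settings(
        api_key=api_key,
        output_dir=env.get("TRANSCRIBE_OUTPUT_DIR", "."),
        output_format=output_format,
        concurrency=concurrency,
        language=env.get("TRANSCRIBE_LANGUAGE", "auto"),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise in tests.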
# Clone repository
git clone https://github.com/jmagly/transcribe-cli.git
cd transcribe-cli
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# or: venv\Scripts\activate # Windows
# Install with dev dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install

# Run tests
pytest
# Run with coverage
pytest --cov=src/transcribe_cli --cov-report=html
# Run linting
black src tests
flake8 src tests
mypy src

src/transcribe_cli/
├── cli/ # CLI commands (Typer)
├── config/ # Configuration management
├── core/ # Audio extraction, transcription
├── output/ # Output formatters (TXT, SRT)
├── models/ # Data models
└── utils/ # Utilities
License: MIT
- Fork the repository
- Create a feature branch
- Make changes with tests
- Run pytest and pre-commit run --all-files
- Submit a pull request
- OpenAI Whisper for speech recognition
- FFmpeg for audio/video processing
- Typer for CLI framework