DBAT - Deepgram Batch Audio Transcription 🎙️

Fast batch audio transcription powered by Deepgram Nova-3. Upload multiple MP3 files and get accurate transcriptions with concurrent processing.

Features

✅ Batch Processing - Upload any number of files; they are processed in batches of 10
✅ Concurrent Transcription - Up to 10 files transcribed simultaneously, roughly 10x faster than one at a time
✅ Multi-Language Support - English, Polish, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean
✅ Customizable Options - Toggle smart formatting and punctuation
✅ Download as ZIP - Get all transcriptions in one convenient file
✅ Simple Web UI - Clean Streamlit interface, no coding required

Quick Start

Prerequisites

Python 3.8 or higher
Deepgram API key (get one free at https://console.deepgram.com)

macOS Setup

If you don't have Python installed on macOS:

# Using Homebrew (install if needed: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)")
brew install python3

# Verify installation
python3 --version

Windows Setup

If you don't have Python installed on Windows:

Download Python from python.org
Run the installer and check "Add Python to PATH" during installation
Verify installation in Command Prompt or PowerShell:

python --version

Installation

Clone the repository:

git clone https://github.com/bsisduck/DBAT.git
cd DBAT

Create virtual environment:

macOS/Linux:

python3 -m venv venv
source venv/bin/activate

Windows:

python -m venv venv
venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Usage

Start the application:

streamlit run app.py

Open your browser at http://localhost:8501
Enter your Deepgram API key in the sidebar
Select your language and options
Upload MP3 files and click "Transcribe All"
Download the ZIP file with all transcriptions

Configuration

Transcription Options

Smart Format: Improves readability with proper formatting
Punctuation: Adds punctuation marks to transcriptions
Language: Select the language of your audio files
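As a rough illustration of how these three options could translate into a request, the sketch below builds a query string for Deepgram's pre-recorded transcription endpoint. The parameter names (model, language, smart_format, punctuate) are Deepgram query parameters; the build_query helper and the LANGUAGE_CODES mapping are hypothetical and may not match what transcription_service.py actually does.

```python
from urllib.parse import urlencode

# Hypothetical mapping from UI labels to language codes (only a few shown).
LANGUAGE_CODES = {"English": "en", "Polish": "pl", "Spanish": "es", "German": "de"}


def build_query(language: str, smart_format: bool = True, punctuate: bool = True) -> str:
    """Return a query string carrying the selected transcription options."""
    params = {
        "model": "nova-3",
        "language": LANGUAGE_CODES[language],
        # Booleans are sent as lowercase strings in the query string.
        "smart_format": str(smart_format).lower(),
        "punctuate": str(punctuate).lower(),
    }
    return urlencode(params)


print(build_query("Polish", smart_format=True, punctuate=False))
# model=nova-3&language=pl&smart_format=true&punctuate=false
```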

API Key

You can provide your API key in two ways:

Enter it in the sidebar UI (recommended)
Set DEEPGRAM_API_KEY in .env file
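A minimal sketch of how the two sources might be combined, assuming the sidebar value takes precedence over the environment. The resolve_api_key helper is hypothetical; in the real app, python-dotenv's load_dotenv() would typically be called first so that values from .env appear in os.environ.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(sidebar_value: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer the key typed into the sidebar, else fall back to DEEPGRAM_API_KEY."""
    # Assumption: an empty or whitespace-only sidebar field means "not provided".
    key = sidebar_value.strip() or env.get("DEEPGRAM_API_KEY", "").strip()
    return key or None


print(resolve_api_key("", {"DEEPGRAM_API_KEY": "from-env"}))  # from-env
print(resolve_api_key("from-sidebar", {"DEEPGRAM_API_KEY": "from-env"}))  # from-sidebar
```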

How It Works

Files are split into batches of 10
Each batch processes up to 10 files concurrently using a thread pool
Deepgram Nova-3 API transcribes the audio
Transcriptions are saved as .txt files
All files are packaged into a timestamped ZIP
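The steps above can be sketched roughly as follows. This is not the code in batch_processor.py or zip_utils.py: fake_transcribe stands in for the real Deepgram call, and the function names and ZIP naming pattern are assumptions chosen for illustration.

```python
import io
import zipfile
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from pathlib import Path

BATCH_SIZE = 10


def fake_transcribe(name: str, audio: bytes) -> str:
    """Stand-in for the real Deepgram request (hypothetical)."""
    return f"transcript of {name} ({len(audio)} bytes)"


def chunk(items, size=BATCH_SIZE):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def transcribe_all(files, transcribe=fake_transcribe):
    """files is a list of (name, bytes); returns {txt_name: transcript}."""
    results = {}
    for batch in chunk(files):
        # One worker per file, so a full batch runs 10 requests at once.
        with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
            texts = pool.map(lambda f: transcribe(*f), batch)
            for (name, _), text in zip(batch, texts):
                results[Path(name).with_suffix(".txt").name] = text
    return results


def build_zip(transcripts):
    """Package transcripts into an in-memory ZIP with a timestamped name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for txt_name, text in transcripts.items():
            archive.writestr(txt_name, text)
    zip_name = f"transcriptions_{datetime.now():%Y%m%d_%H%M%S}.zip"
    return zip_name, buffer.getvalue()


files = [(f"clip{i}.mp3", b"x" * i) for i in range(28)]
print([len(batch) for batch in chunk(files)])  # [10, 10, 8]
zip_name, zip_bytes = build_zip(transcribe_all(files))
```

Threads suit this workload because each transcription spends most of its time waiting on a network response rather than using the CPU.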

Performance

Example, assuming each file takes about 1 minute to transcribe:

Sequential: 28 files × 1 min = ~28 minutes
Concurrent (10 at once): 3 batches (10 + 10 + 8 files) × 1 min = ~3 minutes

Processing time depends on audio length and API response time.
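The arithmetic behind this estimate can be reproduced with a few lines of Python. The estimate_minutes helper is hypothetical, and it assumes every file takes the same time and that a batch finishes when its slowest file does.

```python
import math


def estimate_minutes(num_files, minutes_per_file=1.0, batch_size=10):
    """Return (sequential, concurrent) time estimates in minutes."""
    batches = math.ceil(num_files / batch_size)
    sequential = num_files * minutes_per_file
    concurrent = batches * minutes_per_file
    return sequential, concurrent


sequential, concurrent = estimate_minutes(28)
print(sequential, concurrent)  # 28.0 3.0
```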

Requirements

streamlit==1.28.1
deepgram-sdk==3.5.0
python-dotenv==1.0.0

Project Structure

DBAT/
├── app.py                    # Main Streamlit application
├── transcription_service.py  # Deepgram API wrapper
├── batch_processor.py        # Concurrent batch processing
├── zip_utils.py              # ZIP file creation
├── requirements.txt          # Python dependencies
├── .env.example              # Environment template
├── LICENSE                   # MIT License
└── README.md                 # This file

Supported Languages

English, Polish, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean

More languages available - check Deepgram docs

Troubleshooting

API key not working?

Verify key is valid at https://console.deepgram.com
Check for typos when entering the key

Files not uploading?

Only MP3 files are supported
Max file size: 2000 MB per file
Ensure stable internet connection
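To check a file against the first two constraints before uploading, a small pre-flight test along these lines may help. The check_upload helper is hypothetical and assumes 1 MB means 1024 × 1024 bytes; the app itself may measure the limit differently.

```python
# 2000 MB limit from the list above (assumption: 1 MB = 1024 * 1024 bytes).
MAX_BYTES = 2000 * 1024 * 1024


def check_upload(name: str, size_bytes: int):
    """Return an error message, or None if the file looks acceptable."""
    if not name.lower().endswith(".mp3"):
        return "Only MP3 files are supported"
    if size_bytes > MAX_BYTES:
        return "File exceeds 2000 MB"
    return None


print(check_upload("interview.wav", 1024))  # Only MP3 files are supported
print(check_upload("interview.MP3", 1024))  # None
```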

Slow processing?

Processing time scales with audio length, so longer files take proportionally longer
Check your Deepgram plan limits

License

MIT License - see LICENSE file for details

Credits

Built with Streamlit
Powered by Deepgram Nova-3
Concurrent processing with Python ThreadPoolExecutor

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Support

Deepgram API Docs
Streamlit Docs
For bugs, open an issue on GitHub
