███████╗██████╗ ███████╗ █████╗ ██╗ ██╗ ████████╗██╗ ██╗██████╗ ██████╗ ██████╗
██╔════╝██╔══██╗██╔════╝██╔══██╗██║ ██╔╝ ╚══██╔══╝██║ ██║██╔══██╗██╔══██╗██╔═══██╗
███████╗██████╔╝█████╗ ███████║█████╔╝ ██║ ██║ ██║██████╔╝██████╔╝██║ ██║
╚════██║██╔═══╝ ██╔══╝ ██╔══██║██╔═██╗ ██║ ██║ ██║██╔══██╗██╔══██╗██║ ██║
███████║██║ ███████╗██║ ██║██║ ██╗ ██║ ╚██████╔╝██║ ██║██████╔╝╚██████╔╝
╚══════╝╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝ ╚═════╝
~90ms to first sound. Realistic. Local. Private. Fast.
speakturbo "Hello world" → ⚡ 92ms → ▶ 93ms → ✓ done
For AI Agents (Claude Code, Cursor, Windsurf):
npx skills add EmZod/Speak-Turbo
CLI only:
pip install pocket-tts uvicorn fastapi
cd speakturbo-cli && cargo build --release
speakturbo "Hello world" # Play instantly
speakturbo "Hello" -o out.wav # Save to file
speakturbo "Hello" -q # Quiet mode
speakturbo --list-voices # Show voices
alba ██████████ Female (default)
marius ██████████ Male
javert ██████████ Male
jean ██████████ Male
fantine ██████████ Female
cosette ██████████ Female
eponine ██████████ Female
azelma ██████████ Female
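The commands above can also be driven from a script. The following is a minimal sketch, not part of the project: the helper names `build_speakturbo_args` and `speak` are hypothetical, and only the `-o` and `-q` flags shown above are used. Combining `-o` and `-q` in one call is an assumption.

```python
import shutil
import subprocess
from typing import List, Optional


def build_speakturbo_args(
    text: str, output: Optional[str] = None, quiet: bool = False
) -> List[str]:
    """Build the argument list for one speakturbo invocation (hypothetical helper)."""
    args = ["speakturbo", text]
    if output is not None:
        # -o saves the audio to a file instead of playing it
        args += ["-o", output]
    if quiet:
        # -q enables quiet mode
        args.append("-q")
    return args


def speak(text: str, output: Optional[str] = None, quiet: bool = False) -> int:
    """Run speakturbo and return its exit code (hypothetical helper)."""
    if shutil.which("speakturbo") is None:
        raise FileNotFoundError("speakturbo not found on PATH")
    completed = subprocess.run(build_speakturbo_args(text, output, quiet), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Prints the command without running it
    print(build_speakturbo_args("Hello", output="out.wav"))
```

Passing the text as its own list element avoids shell quoting problems with spaces or quotes in the spoken text.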
Time to first sound ░░░░░░░░░░░░░░░░░░░░ ~90ms
First run (cold) ████░░░░░░░░░░░░░░░░ 2-5s
Real-time factor ████████████████░░░░ ~4x faster than real time
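To make the real-time factor concrete, here is a small worked calculation. It assumes "4x faster" means audio is generated four times faster than it plays back; the function name `generation_seconds` is hypothetical and the figure is the approximate one quoted above, not a measurement.

```python
def generation_seconds(audio_seconds: float, speedup: float = 4.0) -> float:
    """Estimate synthesis time for a clip, given a speedup over real time.

    Assumption: speedup=4.0 means 1 second of audio takes 0.25 seconds to generate.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


if __name__ == "__main__":
    # A 10-second clip at 4x real time
    print(generation_seconds(10.0))  # 2.5
```

Under that assumption, generation stays ahead of playback once it has started, so the latency that matters is the time to first sound.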
┌─────────────────┐
│   speakturbo    │
│  (Rust, 2.2MB)  │
└────────┬────────┘
         │ HTTP :7125
         ▼
┌─────────────────┐
│     daemon      │
│ (Python + MLX)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Audio Output   │
│     (rodio)     │
└─────────────────┘
| Problem | Fix |
|---|---|
| No audio | curl http://127.0.0.1:7125/health |
| Daemon stuck | pkill -f "daemon_streaming" |
| Slow first run | Normal - model loading (2-5s) |
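The health check in the table can be scripted as well. This is a rough sketch using only the Python standard library: the function name `daemon_is_healthy` is hypothetical, and it assumes that any HTTP 200 from `/health` means the daemon is up, since the response body format is not documented here.

```python
import urllib.error
import urllib.request

# Endpoint taken from the troubleshooting table
HEALTH_URL = "http://127.0.0.1:7125/health"


def daemon_is_healthy(url: str = HEALTH_URL, timeout: float = 1.0) -> bool:
    """Return True if the health URL answers with HTTP 200 (hypothetical helper).

    Assumption: the response body is not inspected, only the status code.
    """
    # An empty ProxyHandler keeps the localhost request from going through a proxy
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not healthy
        return False


if __name__ == "__main__":
    print("daemon up" if daemon_is_healthy() else "daemon not reachable")
```

If this reports the daemon as unreachable right after a first run, the 2-5s model load mentioned in the table may still be in progress, so retrying after a few seconds is reasonable before killing the process.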
Need voice cloning? Emotion tags? Try speak.
MIT License · Built on Pocket TTS