Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tutorial for Kokoro Web integration with Open WebUI for TTS #451

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 90 additions & 0 deletions docs/tutorials/text-to-speech/kokoro-web-integration.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
---
sidebar_position: 2
title: "🗨️ Kokoro Web - Effortless TTS for Open WebUI"
---

:::warning
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration on how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
:::

## What is `Kokoro Web`?

[Kokoro Web](https://github.com/eduardolat/kokoro-web) provides a lightweight, OpenAI-compatible API for the powerful [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) text-to-speech model, seamlessly integrating with Open WebUI to enhance your AI conversations with natural-sounding voices.

## 🚀 Two-Step Integration

### 1. Deploy Kokoro Web API (One Command)

```yaml
services:
kokoro-web:
image: ghcr.io/eduardolat/kokoro-web:latest
ports:
- "3000:3000"
environment:
# Change this to any secret key to use as your OpenAI compatible API key
- KW_SECRET_API_KEY=your-api-key
volumes:
- ./kokoro-cache:/kokoro/cache
restart: unless-stopped
```

Run with: `docker compose up -d`

### 2. Connect OpenWebUI (30 Seconds)

1. In OpenWebUI, go to `Admin Panel` → `Settings` → `Audio`
2. Configure:
- Text-to-Speech Engine: `OpenAI`
- API Base URL: `http://localhost:3000/api/v1`
(If using Docker: `http://host.docker.internal:3000/api/v1`)
- API Key: `your-api-key` (from step 1)
- TTS Model: `model_q8f16` (best balance of size/quality)
- TTS Voice: `af_heart` (default warm, natural english voice). You can change this to any other voice or formula from the [Kokoro Web Demo](https://voice-generator.pages.dev)

**That's it! Your OpenWebUI now has AI voice capabilities.**

## 🌍 Supported Languages

Kokoro Web supports 8 languages with specific voices optimized for each:

- English (US) - en-us
- English (UK) - en-gb
- Japanese - ja
- Chinese - cmn
- Spanish - es-419
- Hindi - hi
- Italian - it
- Portuguese (Brazil) - pt-br

Each language has dedicated voices for optimal pronunciation and natural flow. See the [GitHub repository](https://github.com/eduardolat/kokoro-web) for the complete list of language-specific voices or use the [Kokoro Web Demo](https://voice-generator.pages.dev) to preview and create your own custom voices instantly.

## 💾 Optimized Models for Any Hardware

Choose the model that fits your hardware needs:

| Model ID | Optimization | Size | Ideal For |
|----------|-------------|------|-----------|
| model_q8f16 | Mixed precision | 86 MB | **Recommended** - Best balance |
| model_quantized | 8-bit | 92.4 MB | Good CPU performance |
| model_uint8f16 | Mixed precision | 114 MB | Better quality on mid-range CPUs |
| model_q4f16 | 4-bit & fp16 weights | 154 MB | Higher quality, still efficient |
| model_fp16 | fp16 | 163 MB | Premium quality |
| model_uint8 | 8-bit & mixed | 177 MB | Balanced option |
| model_q4 | 4-bit matmul | 305 MB | High quality option |
| model | fp32 | 326 MB | Maximum quality (slower) |

## ✨ Try Before You Install

Visit the [**Kokoro Web Demo**](https://voice-generator.pages.dev) to preview all voices instantly. This demo:

- **Runs 100% in your browser** - No server required
- **Free forever** - No usage limits or registration needed
- **Zero installation** - Just visit the website and start creating
- **All features included** - Test any voice or language immediately

## Need More Help?

For additional options, voice customization guides, and advanced settings, visit the [GitHub repository](https://github.com/eduardolat/kokoro-web).

**Enjoy natural AI voices in your OpenWebUI conversations!**