
lucasjinreal/Kokoros

🔥🔥🔥 Kokoro Rust

ASMR

video-1737110239209.webm

Digital Human

output2_added_subtitle.mp4

Give a star ⭐ if you like it!

Kokoro is one of the top two trending TTS models on Hugging Face. This repo provides very fast Kokoro inference in Rust, so you can build your own TTS engine powered by Kokoro and run inference with a single koko command.

kokoros is a Rust crate that provides easy-to-use TTS. You can call koko directly in a terminal to synthesize audio.

kokoros uses a relatively small model (82M parameters) yet produces extremely good voice quality.

Language support:

  • English;
  • Chinese (partly);
  • Japanese (partly);
  • German (partly);

🔥🔥🔥🔥🔥🔥🔥🔥🔥 The Rust version of Kokoros is getting a lot of attention. If you are also interested in very fast inference, embedded builds, wasm support, etc., please star this repo! We keep updating it.

New Discord community: https://discord.gg/E566zfDWqD. Please join us if you are interested in Rust Kokoro.

Updates

  • 2025.01.22: 🔥🔥🔥 Streaming mode supported. You can now use --stream to have fun with stream mode, kudos to mroigo;
  • 2025.01.17: 🔥🔥🔥 Style mixing supported! You can now hear an ASMR effect by simply specifying a mixed style: af_sky.4+af_nicole.5;
  • 2025.01.15: OpenAI-compatible server supported; the OpenAI format is still being polished!
  • 2025.01.15: Phonemizer supported! Kokoros can now run inference end to end without any other dependencies! Kudos to @tstm;
  • 2025.01.13: Espeak-ng tokenizer and phonemizer supported! Kudos to @mindreframer;
  • 2025.01.12: Released Kokoros;

Installation

  1. Install required Python packages:
pip install -r scripts/requirements.txt
  2. Initialize voice data:
python scripts/fetch_voices.py

This step fetches the required voices.json data file, which is necessary for voice synthesis.

  3. Build the project:
cargo build --release

Usage

View available options

./target/release/koko -h

Generate speech for some text

./target/release/koko text "Hello, this is a TTS test"

The generated audio will be saved to tmp/output.wav by default. You can customize the save location with the --output or -o option:

./target/release/koko text "I hope you're having a great day today!" --output greeting.wav

Generate speech for each line in a file

./target/release/koko file poem.txt

For a file with 3 lines of text, the speech audio files tmp/output_0.wav, tmp/output_1.wav, and tmp/output_2.wav are written by default. You can customize the save location with the --output or -o option, using {line} as a placeholder for the line number:

./target/release/koko file lyrics.txt -o "song/lyric_{line}.wav"
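The {line} substitution described above can be pictured with a small Python sketch. This is only an illustration of the naming scheme, not the crate's implementation: the helper name `expand_output_paths` is hypothetical, the zero-based numbering is inferred from the default file names (output_0.wav, output_1.wav, ...), and how koko treats blank lines is not covered here.

```python
from pathlib import Path


def expand_output_paths(template: str, text: str) -> list[Path]:
    """Return one output path per line of `text`.

    Hypothetical helper: replaces the literal "{line}" placeholder with a
    zero-based line index, mirroring the default names shown in the README.
    """
    lines = text.splitlines()
    return [Path(template.replace("{line}", str(i))) for i in range(len(lines))]


# Three lines of input give three numbered output files.
for path in expand_output_paths("song/lyric_{line}.wav", "first\nsecond\nthird"):
    print(path.as_posix())
```

A sketch like this can be handy for predicting which files a batch run will produce before post-processing them.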

OpenAI-Compatible Server

  1. Start the server:
./target/release/koko openai
  2. Make API requests using either curl or Python:

Using curl:

curl -X POST http://localhost:3000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anything can go here",
    "input": "Hello, this is a test of the Kokoro TTS system!",
    "voice": "af_sky"
  }' \
  --output sky-says-hello.wav

Using Python:

python scripts/run_openai.py
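For readers who want to see the request spelled out in Python, here is a minimal sketch that mirrors the curl call using only the standard library. It is not the contents of scripts/run_openai.py, which are not shown here; the function names `build_speech_request` and `synthesize` are made up for this example, and the URL, JSON fields, and output file name are copied from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example; assumes the server runs locally on port 3000.
URL = "http://localhost:3000/v1/audio/speech"


def build_speech_request(text: str, voice: str = "af_sky") -> urllib.request.Request:
    """Build the same POST request that the curl example sends (hypothetical helper)."""
    payload = {"model": "anything can go here", "input": text, "voice": voice}
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str, voice: str = "af_sky") -> None:
    """Send the request and save the response body, like curl --output."""
    with urllib.request.urlopen(build_speech_request(text, voice)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)


# Example call (needs `koko openai` running first):
# synthesize("Hello, this is a test of the Kokoro TTS system!", "sky-says-hello.wav")
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running server.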

Streaming

The stream subcommand starts the program, which reads lines of input from stdin and writes WAV audio to stdout.

Use it in conjunction with piping.

Typing manually

./target/release/koko stream > live-audio.wav
# Start typing some text to generate speech for and hit enter to submit
# Speech will append to `live-audio.wav` as it is generated
# Hit Ctrl D to exit

Input from another source

echo "Suppose some other program was outputting lines of text" | ./target/release/koko stream > programmatic-audio.wav
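The same piping can be driven from a script. The sketch below, using Python's standard subprocess module, feeds a list of lines to koko stream on stdin and redirects its stdout to a file. The helper names `encode_lines` and `speak_lines` are hypothetical, and the binary path assumes the release build from the Installation section.

```python
import subprocess

# Path to the binary produced by `cargo build --release` (assumed location).
KOKO = "./target/release/koko"


def encode_lines(lines: list[str]) -> bytes:
    """Join lines into newline-terminated UTF-8 bytes, one line per utterance."""
    return "".join(line.rstrip("\n") + "\n" for line in lines).encode("utf-8")


def speak_lines(lines: list[str], out_path: str) -> None:
    """Pipe `lines` into `koko stream` and write its stdout to `out_path`."""
    with open(out_path, "wb") as out:
        subprocess.run(
            [KOKO, "stream"],
            input=encode_lines(lines),  # sent to the child's stdin, then closed
            stdout=out,                 # audio bytes go straight to the file
            check=True,                 # raise if koko exits with an error
        )


# Example call (needs the built binary):
# speak_lines(["First sentence.", "Second sentence."], "programmatic-audio.wav")
```

Because stdin is closed once the input is sent, the process sees end of input just as it does after Ctrl D in the manual example.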

With docker

  1. Build the image
docker build -t kokoros .
  2. Run the image, passing options as described above
# Basic text to speech
docker run -v ./tmp:/app/tmp kokoros text "Hello from docker!" -o tmp/hello.wav

# An OpenAI server (with appropriately bound port)
docker run -p 3000:3000 kokoros openai

Roadmap

Because Kokoro's capabilities are not yet finalized, this repo will keep tracking its status. Hopefully we can eventually support languages including English, Mandarin, Japanese, German, French, etc.

Copyright

Copyright reserved by Lucas Jin. Released under the Apache License.

About

🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time, high-quality TTS.
