Bible-TTS

Create a virtual environment and install the dependencies

python3 -m venv .venv
. .venv/bin/activate
pip install -U pip
pip install -r requirements.txt

Download the data and unzip archives (Hausa)

./download_data.sh data/hausa/
./unzip_data.sh data/hausa

Preprocess the data (Hausa)

./prepare_data.sh data/hausa/text_by_book/release/USX_2 hausa

Preprocess the data (Luo)

./prepare_data.sh data/luo/text_by_book/release/USX_1 luo

Forced Alignment (Hausa)

docker build -t ctc-aligner -f align.dockerfile .
docker run --rm --runtime=nvidia --gpus '"device=0"' -v $PWD/data/hausa/text_by_chapter/processed:/app/text_dir \
                      -v $PWD/data/hausa/chapters:/app/audio_dir \
                      -v $PWD/data/hausa/tts_data:/app/data \
                      ctc-aligner /app/text_dir /app/audio_dir /app/data hausa
sudo chown -R $USER:$USER data/hausa/tts_data

Forced Alignment (Luo)

docker run --rm --runtime=nvidia --gpus '"device=0"' -v $PWD/data/luo/text_by_chapter/processed:/app/text_dir \
                      -v $PWD/data/luo/chapters:/app/audio_dir \
                      -v $PWD/data/luo/tts_data:/app/data \
                      ctc-aligner /app/text_dir /app/audio_dir /app/data luo --sample_rate 24000
sudo chown -R $USER:$USER data/luo/tts_data

Forced Alignment (Chichewa)

docker run --rm --runtime=nvidia --gpus '"device=0"' -v $PWD/data/chichewa/text_by_chapter/processed:/app/text_dir \
                      -v $PWD/data/chichewa/chapters:/app/audio_dir \
                      -v $PWD/data/chichewa/tts_data:/app/data \
                      ctc-aligner /app/text_dir /app/audio_dir /app/data chichewa --sample_rate 24000
sudo chown -R $USER:$USER data/chichewa/tts_data

Use the resample.py script to resample the audio files to the 22 kHz sample rate required by the XTTS model. The YourTTS checkpoint we use was trained on 24 kHz audio files.
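Before training, it can help to confirm that every clip is at the sample rate the model expects. The following is a minimal sketch, not part of this repository: the function name `find_mismatched_wavs` is hypothetical, it only handles PCM WAV files readable by Python's standard `wave` module, and the value 22050 in the usage note assumes that "22 kHz" means 22,050 Hz.

```python
import wave
from pathlib import Path


def find_mismatched_wavs(root, expected_rate):
    """Return (path, rate) pairs for WAV files under root whose rate differs.

    Hypothetical helper for illustration; it does not resample anything.
    """
    mismatched = []
    for path in sorted(Path(root).rglob("*.wav")):
        # wave.open only reads the header here, so this is cheap per file.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate != expected_rate:
            mismatched.append((path, rate))
    return mismatched
```

For example, `find_mismatched_wavs("data/luo/tts_data_22khz", 22050)` would list any clips that still need resampling; an empty list means all files match.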

Isolate speakers and generate train, dev, and test splits (Hausa)

python process_manifest.py data/hausa/tts_data/manifest.jsonl hausa -55

Replace -55 with the threshold you want to use to filter out unaligned segments. We use a threshold of -140 for Luo and -45 for Chichewa.

For Chichewa, we also use the --drop_numbers flag to remove samples containing numbers from the dataset, because the numbers have not been converted to words.
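The filtering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic of process_manifest.py: the field names `score` and `text` are hypothetical, and the sketch assumes that scores are log-probability-like values where a score closer to zero means a better alignment, so entries below the threshold are dropped.

```python
import json
import re

DIGITS = re.compile(r"\d")


def filter_manifest(lines, threshold, drop_numbers=False):
    """Keep manifest entries whose alignment score is at or above threshold.

    "score" and "text" are hypothetical field names used for illustration.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the JSONL input
        entry = json.loads(line)
        if entry["score"] < threshold:
            continue  # poorly aligned segment
        if drop_numbers and DIGITS.search(entry["text"]):
            continue  # text still contains digits that were not spelled out
        kept.append(entry)
    return kept
```

With a threshold of -55, an entry scored -10 is kept and an entry scored -80 is dropped. Passing `drop_numbers=True` additionally removes any entry whose text contains a digit.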

Train VITS model (Hausa)

docker build -t bible-tts -f train_vits.dockerfile .
docker run --ipc=host --runtime=nvidia --gpus all -v $PWD/data:/app/data \
                      -v $PWD/vits_hausa/:/app/vits_hausa \
                      bible-tts \
                      --coqpit.batch_size 26 \
                      --coqpit.eval_batch_size 26 \
                      --coqpit.batch_group_size 5 \
                      --coqpit.max_audio_len 264600
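The max_audio_len value is expressed in audio samples, so the clip duration it allows depends on the sample rate. As a quick worked example, and assuming the VITS data is at 22,050 Hz (an assumption, not stated in this README), 264600 samples corresponds to 12 seconds:

```python
def max_audio_seconds(max_audio_len, sample_rate):
    """Convert a maximum length in samples to seconds."""
    return max_audio_len / sample_rate


# Assumed sample rate of 22,050 Hz; at 24,000 Hz the same limit is 11.025 s.
print(max_audio_seconds(264600, 22050))  # 12.0
```

Clips longer than this limit would typically be excluded from training, so adjust the value if your sample rate or memory budget differs.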

Train YourTTS model (Luo)

docker build -t train_yourtts -f train_yourtts.dockerfile .
docker run --ipc=host --runtime=nvidia --gpus all -v $PWD/data/luo/tts_data:/app/data \
                      -v $PWD/yourtts_luo/:/app/yourtts_luo \
                      train_yourtts --restore_path /app/yourtts_luo/checkpoints_yourtts_cml_tts_dataset/best_model.pth \
                      --coqpit.batch_size 20 \
                      --coqpit.eval_batch_size 20

Train XTTS model (Luo)

Prepare data

python convert_to_coqui.py data/luo/tts_data_22khz/manifest_train.jsonl
python convert_to_coqui.py data/luo/tts_data_22khz/manifest_dev.jsonl
python convert_to_coqui.py data/luo/tts_data_22khz/manifest_test.jsonl
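To give a sense of what such a conversion involves, here is a minimal sketch. It is not the code of convert_to_coqui.py. It assumes the target is the pipe-delimited metadata layout read by the Coqui TTS `coqui` formatter (a header row with `audio_file`, `text`, and `speaker_name` columns), and the input field names `audio_filepath`, `text`, and `speaker` are hypothetical.

```python
import json


def manifest_to_coqui_rows(lines):
    """Turn JSONL manifest lines into pipe-delimited metadata rows.

    Input field names are hypothetical; adapt them to the real manifest.
    """
    rows = ["audio_file|text|speaker_name"]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # A literal "|" in the text would break the column layout,
        # so it is replaced with a space in this sketch.
        text = entry["text"].replace("|", " ")
        rows.append(f'{entry["audio_filepath"]}|{text}|{entry["speaker"]}')
    return rows
```

The returned rows can be written out with `"\n".join(rows)` to a metadata file next to the audio clips.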

DVAE fine-tuning

docker build -t train_dvae -f train_dvae.dockerfile .
docker run --rm --ipc=host --runtime=nvidia --gpus '"device=0"' -v $PWD/data/luo/tts_data_22khz:/app/data \
                      -v $PWD/xtts_luo/:/app/xtts_luo \
                      train_dvae -o /app/xtts_luo -l luo

GPT2 fine-tuning

docker build -t train_gpt -f train_gpt.dockerfile .
docker run --rm --ipc=host --runtime=nvidia --gpus '"device=0"' -v $PWD/data/luo/tts_data_22khz:/app/data \
                      -v $PWD/xtts_luo:/app/xtts_luo \
                      train_gpt

Evaluate an XTTS checkpoint on the test set

docker build -t evaluate -f evaluate.dockerfile .
docker run --rm --ipc=host --runtime=nvidia --gpus '"device=0"' \
                      -v $PWD/xtts_hausa/:/app/xtts_hausa \
                      -v $PWD/data:/app/data \
                      -v $PWD/evaluation:/app/evaluation \
                      evaluate synthesize.py \
                      --model_path /app/xtts_hausa/path/to/checkpoint.pth \
                      --manifest_path /app/data/manifest_test.jsonl \
                      --output_dir /app/evaluation/xtts_hausa_test \
                      --speaker_audio_file /app/data/clips/path/to/audio/file.wav \
                      --calculate_mcd
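The --calculate_mcd flag refers to mel cepstral distortion (MCD), a distance between the mel cepstral coefficients of reference and synthesized audio. The sketch below shows the commonly used formula only; it is not the implementation in synthesize.py, which may differ. It assumes the two sequences are already time-aligned frame by frame (for example by dynamic time warping) and that the 0th coefficient is excluded, which is a common convention.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Average MCD in dB over aligned frames of cepstral coefficients.

    Each frame is a sequence of coefficients; index 0 is skipped.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("sequences must be non-empty and equally long")
    const = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        # Squared Euclidean distance over coefficients 1..D for this frame.
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += const * math.sqrt(squared)
    return total / len(ref_frames)
```

Identical inputs give an MCD of 0, and lower values indicate synthesized speech that is spectrally closer to the reference.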

Evaluate an XTTS training run on the dev set

docker run --rm --ipc=host --runtime=nvidia --gpus '"device=0"' \
                      -v $PWD/xtts_hausa/:/app/xtts_hausa \
                      -v $PWD/data:/app/data \
                      evaluate evaluate_run.py \
                      --run_dir /app/xtts_hausa/path/to/run/dir/ \
                      --manifest_path /app/data/manifest_dev.jsonl \
                      --speaker_audio_file /app/data/clips/path/to/audio/file.wav

Evaluate a VITS training run on the dev set

docker build -t evaluate_vits -f evaluate_vits.dockerfile .
docker run --rm --ipc=host --runtime=nvidia --gpus '"device=0"' \
                      -v $PWD/vits_hausa/:/app/vits_hausa \
                      -v $PWD/data:/app/data \
                      evaluate_vits evaluate_run.py \
                      --run_dir /app/vits_hausa/path/to/run/dir/ \
                      --manifest_path /app/data/manifest_dev.jsonl
