
Graph-TTS

Training

  1. Download and extract the LJ Speech dataset.
  2. Create a preprocessed folder in the LJSpeech directory, and create char_seq, phone_seq, and melspectrogram folders inside it (see the sketch after this list).
  3. Set data_path in hparams.py to the LJSpeech folder.
  4. Using prepare_data.ipynb, prepare the melspectrogram and text (converted into indices) tensors.
  5. Run python train.py
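
A minimal sketch of step 2, assuming a hypothetical LJSpeech path; the folder names follow this README:

```python
import os

# Hypothetical path; use the same value you set as data_path in hparams.py.
data_path = "/path/to/LJSpeech-1.1"

# Create preprocessed/char_seq, preprocessed/phone_seq,
# and preprocessed/melspectrogram inside the LJSpeech directory.
for sub in ("char_seq", "phone_seq", "melspectrogram"):
    os.makedirs(os.path.join(data_path, "preprocessed", sub), exist_ok=True)
```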

Training curve (Orange: transformer-tts / Navy: graph-tts / Red: graph-tts-iter5 / Blue: gae)

  • Stop prediction loss (train / val)
  • Guided attention loss (train / val)
  • L1 loss (train / val)

Alignments

  • Encoder-Decoder Alignments

  • Melspectrogram

  • Stop prediction

Audio Samples

You can listen to the audio samples here.
You can also listen to the audio samples obtained from Transformer-TTS here.

Notice

  1. Unlike the original paper, I did not use the encoder prenet, following ESPnet.
  2. I apply an additional guided attention loss to the two heads of the last two layers (a sketch of this loss follows this list).
  3. Batch size is important, so I use gradient accumulation (see the training-loop sketch below).
  4. You can also use DataParallel; change n_gpus, batch_size, and accumulation appropriately.
  5. To draw attention plots for every head, I changed the return value of torch.nn.functional.multi_head_attention_forward():

```python
# before
return attn_output, attn_output_weights.sum(dim=1) / num_heads

# after
return attn_output, attn_output_weights
```

  6. Among the num_layers*num_heads attention matrices, the one with the highest focus rate is saved (see the focus-rate sketch below).
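
Regarding item 2: a minimal sketch of a guided attention loss (Tachibana et al., 2017), which penalizes attention mass far from the diagonal. The function name and shapes are illustrative, not this repo's actual code:

```python
import torch

def guided_attention_loss(attn, g=0.2):
    # attn: (B, T_dec, T_enc) attention weights of one head; g controls
    # how quickly the penalty grows away from the diagonal.
    B, T_dec, T_enc = attn.shape
    n = torch.arange(T_dec, device=attn.device).float() / T_dec  # decoder positions
    t = torch.arange(T_enc, device=attn.device).float() / T_enc  # encoder positions
    # W is near 0 on the diagonal and approaches 1 far from it.
    W = 1.0 - torch.exp(-((n[:, None] - t[None, :]) ** 2) / (2 * g ** 2))
    return (attn * W.unsqueeze(0)).mean()
```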
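
Regarding item 3: gradient accumulation in a generic PyTorch loop, sketched with a stand-in model and data rather than this repo's training code:

```python
import torch

model = torch.nn.Linear(80, 80)                  # stand-in model
optimizer = torch.optim.Adam(model.parameters())
accumulation = 4                                 # effective batch = 4x actual

optimizer.zero_grad()
for step in range(100):
    batch = torch.randn(8, 80)                   # stand-in data
    loss = model(batch).pow(2).mean() / accumulation  # scale so grads average
    loss.backward()                              # gradients accumulate across steps
    if (step + 1) % accumulation == 0:
        optimizer.step()                         # one update per `accumulation` batches
        optimizer.zero_grad()
```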
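
Regarding item 6: the focus rate (as defined in FastSpeech, Ren et al., 2019) averages, over decoder steps, the largest attention weight at each step, so a sharper, more diagonal alignment scores higher. A sketch with an assumed stack of per-head attention matrices:

```python
import torch

def focus_rate(attn):
    # attn: (T_dec, T_enc); mean over decoder steps of the max weight per step
    return attn.max(dim=-1).values.mean()

# Assumed tensor of shape (num_layers * num_heads, T_dec, T_enc); dummy weights here.
attn_stack = torch.softmax(torch.randn(12, 50, 30), dim=-1)
best = max(range(attn_stack.size(0)), key=lambda i: focus_rate(attn_stack[i]).item())
```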

Reference

  1. NVIDIA/tacotron2: https://github.com/NVIDIA/tacotron2
  2. espnet/espnet: https://github.com/espnet/espnet
