Thank you for this great work. When I try zero-shot TTS, the speaker similarity between the prompt audio (spk_smp) and the generated audio is low. My prompt audio, prompt_text, and generated audio are in audios.zip. What might be causing this, and do you have any advice for improving it? Thanks.
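# Setup assumed by this snippet (not part of the original post)
import torch
import torchaudio
import ChatTTS
from tools.audio import load_audio  # helper shipped in the ChatTTS repository

chat = ChatTTS.Chat()
chat.load()  # load the pretrained models

audio_file = 'sample.wav'
# Double quotes are needed because the transcript contains an apostrophe
prompt_text = "I chance to leave him alone, but[uv_break] no[uv_break]. She just wanted to see him again[uv_break]. Anna[uv_break], you don't know how it feels to lose a sister[uv_break]."
# Encode the prompt audio (loaded at 24 kHz) as the zero-shot speaker sample
spk_smp = chat.sample_audio_speaker(load_audio(audio_file, 24000))
params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_smp=spk_smp,
    txt_smp=prompt_text,
    temperature=0.3,
    top_P=0.7,
    top_K=20,
)
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_5]',
)
text = "I do love books, but I think I like writing about them more than selling them."
wav = chat.infer(
    text,
    params_infer_code=params_infer_code,
    split_text=False,
    params_refine_text=params_refine_text,
)
# Save the first generated waveform as a mono 24 kHz file
torchaudio.save("sample_generated.wav", torch.from_numpy(wav[0]).unsqueeze(0), 24000)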
Zero-shot works best with audio generated by ChatTTS itself. If you want to use external audio, make sure the audio is of good quality and that the transcript, txt_smp, exactly matches the audio, including marks such as [lbreak].
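As a rough aid for the transcript advice above, here is a minimal sketch of a sanity check to run on txt_smp before inference. It is not part of ChatTTS: the function name check_transcript, the control-token pattern, and the 20-character threshold are all assumptions made for illustration. It cannot compare the transcript with the audio; it only lists the bracketed marks it finds and flags words long enough to suggest that spaces were lost.

```python
import re

# Bracketed control marks such as [uv_break] or [lbreak].
# This pattern is an assumption, not something defined by ChatTTS.
CONTROL_TOKEN = re.compile(r"\[[a-z]+(?:_[a-z0-9]+)*\]")


def check_transcript(txt_smp, max_word_len=20):
    """Return (control_tokens, suspicious_words) for a prompt transcript.

    Hypothetical helper: words longer than max_word_len are treated as
    likely run-together text that will not match the spoken audio.
    """
    control_tokens = CONTROL_TOKEN.findall(txt_smp)
    # Replace control marks with spaces so they are not counted as words.
    plain = CONTROL_TOKEN.sub(" ", txt_smp)
    words = re.findall(r"[A-Za-z']+", plain)
    suspicious = [w for w in words if len(w) > max_word_len]
    return control_tokens, suspicious


tokens, suspicious = check_transcript(
    "Hello there[uv_break], thanksforlisteningeveryone[lbreak]"
)
print(tokens)      # control marks found in the transcript
print(suspicious)  # words that look like several words fused together
```

Any word reported in the second list is worth checking by ear against the recording before passing the transcript as txt_smp.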