Issue with gruut/espeak-ng when using ru voice in Xtts-2 #4164
-
Hi, from my understanding XTTS 2 uses gruut as a phonemizer. Oddly, it doesn't seem to work with the ru voice. I tried installing additional dependencies (gruut-ru, and also espeak-ng just to be safe), but they don't seem to be used. To be precise, the generation itself works as intended, but I'm having issues with correct stress. I have checked that the correct pronunciation exists in both the gruut dictionary and the espeak dictionary, but it doesn't seem to be used. The word I'm using is "госпиталь", which should have the stress on the first syllable, but it is always placed on the last. Am I doing something wrong? I have seen that it's possible to force this behaviour somewhere in the code. Is that how it should be done, or is something missing on my side?
Replies: 2 comments
-
XTTS doesn't use a phonemizer at all; it goes directly from text to speech, so there's no way to manually provide the correct stress position. The model would have to be trained on much more Russian speech to learn it on its own; it has seen "only" 147 hours.
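To illustrate the point, here is a minimal sketch of how XTTS is typically called through the Coqui TTS Python API. The model name and the `tts_to_file` call follow the project's README as far as I know; the helper names `build_xtts_kwargs` and `synthesize`, and the file names `reference.wav` and `out.wav`, are hypothetical and only there for illustration. Note that the inputs are raw text, a language code, and a reference clip: nothing in this call takes phonemes or a stress position.

```python
# Model name as listed in the Coqui TTS model catalogue (assumed unchanged).
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_xtts_kwargs(text: str, speaker_wav: str, out_path: str) -> dict:
    """Collect the arguments for a Russian XTTS request (hypothetical helper)."""
    return {
        "text": text,                # raw orthographic text, not phonemes
        "speaker_wav": speaker_wav,  # reference clip used for voice cloning
        "language": "ru",            # language code only, no phonemizer option
        "file_path": out_path,       # where the generated audio is written
    }


def synthesize(kwargs: dict) -> None:
    """Run XTTS with the given arguments (requires the TTS package and model download)."""
    # Imported lazily so the helper above can be used without TTS installed.
    from TTS.api import TTS

    tts = TTS(XTTS_MODEL)
    tts.tts_to_file(**kwargs)
```

Calling `synthesize(build_xtts_kwargs("госпиталь", "reference.wav", "out.wav"))` would generate the audio; where the stress lands is decided by the model alone.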
-
@eginhard thank you for the fast response!