epseak phonemizer implementation #55

heabeounMKTO · 2025-02-08T10:34:55Z

Hello, I see the espeak section of the phonemizer is not yet implemented. Coincidentally, I was working on running Kokoro with Rust using onnxruntime without knowing this repo existed. I've written bindings for espeak-ng for grapheme-to-phoneme generation here, and I was wondering if they would be at all useful for your project.

lucasjinreal · 2025-02-08T13:13:02Z

Thanks for the contribution.

The current espeak-ng integration actually seems to work; it uses the one from: https://github.com/thewh1teagle/piper-rs

However, I wonder if you could help support languages other than English.

Neither espeak-rs nor this repo actually supports Chinese or Japanese.

heabeounMKTO · 2025-02-08T13:45:24Z

> Thanks for the contribution.
>
> The current espeak-ng integration actually seems to work; it uses the one from: https://github.com/thewh1teagle/piper-rs
>
> However, I wonder if you could help support languages other than English.
>
> Neither espeak-rs nor this repo actually supports Chinese or Japanese.

I've only tested with English, but I think it should work with other languages like zh/jp with a build config change. I will ping you when I finish implementing and testing :>

lucasjinreal · 2025-02-08T14:13:13Z

Thanks so much for the interest! Hoping it can support CN with the latest Kokoro 1.0 model!

heabeounMKTO · 2025-02-08T14:35:10Z

> Thanks so much for the interest! Hoping it can support CN with the latest Kokoro 1.0 model!

I've added g2p support for Chinese and Japanese. Currently, you can run the example in the lazy_phonememize repo and get phonemes for Chinese like so:

❯ cargo run --example lp_cli --release -- --input-text "这是一篇懒惰的文字" --lang cmn
   Compiling lazy_phonememize v0.1.2 (/Users/aa/Documents/lazy_phonememize)
    Finished `release` profile [optimized] target(s) in 13.57s
     Running `target/release/examples/lp_cli --input-text '这是一篇懒惰的文字' --lang cmn`
[DEBUG] [LazyPhonemizer] `lazy_p buffer len 72
INPUT_TEXT: 这是一篇懒惰的文字
PHONEMIZED: ts.ˈo-5 s.ˈi.5 ji5phˈiɛ5n lˈa2n tˈuo5 tˈə1 wˈuəɜn tsˈi̪5

Would you like to use the same language codes as espeak (as in cmn for Chinese - Mandarin), or would you like something else? For reference, here's the full list that's supported. Please also confirm that the phonemes are correct, because I do not speak Chinese 😅, thank you!

lucasjinreal · 2025-02-08T14:56:07Z

Hi, it looks like it's not exactly right. You can listen here:

https://ipa-reader.com/

What are the differences in how your lib links to espeak compared with the piper-rs I linked previously?

I am not sure what else is needed to confirm support for Chinese or Japanese. We might need to make Kokoro work as the final goal.

heabeounMKTO · 2025-02-08T16:01:55Z

> Hi, it looks like it's not exactly right. You can listen here:
> https://ipa-reader.com/

Is there any resource I can look up to see the correct phonemes? Is the Google Translate voice sufficient for comparison with the link above?

> What are the differences in how your lib links to espeak compared with the piper-rs I linked previously?

Sorry, for this I am not sure what the differences are, because I just quickly wrapped libespeak-ng in Rust specifically for the g2p functionality, then just ran the model. So I am not sure of the implementation details of the other libraries (I will take a look through all of them though).

> I am not sure what else is needed to confirm support for Chinese or Japanese. We might need to make Kokoro work as the final goal.

I think if the g2p part works, it should work.

lucasjinreal · 2025-02-09T02:32:27Z

Are you able to speak Japanese? Japanese could also be used to check whether the result is OK or not.

But also, a known correct Chinese phoneme and sentence pair can be used to check.

I think Kokoro 1.0 should have some examples to check.

shanzhengliu · 2025-02-09T05:04:32Z

@heabeounMKTO
It looks like your library import fails in the project?

I fixed it by running brew install automake.

I suggest adding automake as a dependency in the README.

heabeounMKTO · 2025-02-09T05:10:22Z

> @heabeounMKTO It looks like your library import fails in the project?

Hello, for your error, I think you need to install autotools first.
Please make sure all the build dependencies are met; you can view them here.

heabeounMKTO · 2025-02-09T05:11:16Z

> Are you able to speak Japanese? Japanese could also be used to check whether the result is OK or not.
> But also, a known correct Chinese phoneme and sentence pair can be used to check.

Sorry, I only speak English and my native language, Khmer, but I will look up some examples online and fix it.

> I think Kokoro 1.0 should have some examples to check.

I will check this too.

lucasjinreal · 2025-02-09T08:51:50Z

thewh1teagle/kokoro-onnx#99 (comment)

The Python version, kokoro-onnx, supports Chinese, so it seems we can use it as a reference (at least to compare the phoneme output).
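A comparison like that could be sketched roughly as below. This is only a minimal sketch under assumptions: `strip_whitespace`, `phonemes_match`, and `first_mismatch` are hypothetical helper names (not part of kokoro-onnx or any espeak binding), and it assumes that spacing differences between two phonemizers should not count as mismatches.

```rust
/// Remove all Unicode whitespace so that spacing differences between
/// two phonemizers are ignored. (Hypothetical helper for illustration.)
fn strip_whitespace(phonemes: &str) -> String {
    phonemes.chars().filter(|c| !c.is_whitespace()).collect()
}

/// True when both phoneme strings are identical once whitespace is ignored.
fn phonemes_match(a: &str, b: &str) -> bool {
    strip_whitespace(a) == strip_whitespace(b)
}

/// Index (in characters, whitespace ignored) of the first position where the
/// two strings differ, or `None` when they match.
fn first_mismatch(a: &str, b: &str) -> Option<usize> {
    let a: Vec<char> = strip_whitespace(a).chars().collect();
    let b: Vec<char> = strip_whitespace(b).chars().collect();
    let common = a.len().min(b.len());
    (0..common)
        .find(|&i| a[i] != b[i])
        // No differing character in the shared prefix: the strings differ
        // only if one is longer than the other.
        .or(if a.len() != b.len() { Some(common) } else { None })
}
```

Calling `first_mismatch(rust_output, python_output)` on the two phoneme strings would point at the first character worth inspecting by hand.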

heabeounMKTO · 2025-02-10T03:52:41Z

> thewh1teagle/kokoro-onnx#99 (comment)
>
> The Python version, kokoro-onnx, supports Chinese, so it seems we can use it as a reference (at least to compare the phoneme output).

Hello, I've updated lazy_phonemizer to match the phoneme output from kokoro-onnx. From what I gathered, my wrapper library was adding more "details" to the syllables with the extra 5's and 2's, so I just removed them and the output matched the kokoro-onnx Python version.

output from `lazy_phonemizer`:
[DEBUG] [LazyPhonemizer] `lazy_p buffer len 72
INPUT_TEXT: 这是一篇懒惰的文字
PHONEMIZED: ts.ˈo s.ˈi. jiphˈiɛn lˈan tˈuo tˈə wˈuəɜn tsˈi

output from kokoro-onnx:

DEBUG [__init__.py:84] [DEBUG] phonemes ts. ˈo s. ˈi. jiphˈiɛn lˈan tˈuo tˈə wˈuəɜn tsˈi
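The digit removal described above could look roughly like this. It is a minimal sketch and not the actual lazy_phonememize change: `strip_tone_digits` is a hypothetical helper name, and it assumes the extra digits (which appear to be espeak-ng tone numbers) are the only thing to drop. The output posted above also lost the `-` in `ts.ˈo-5` and the dental diacritic in `tsˈi̪5`, which this sketch leaves alone.

```rust
/// Drop ASCII digits from a phoneme string, e.g. the trailing 5 in "tˈuo5".
/// (Hypothetical helper for illustration; other characters are kept as-is.)
fn strip_tone_digits(phonemes: &str) -> String {
    phonemes.chars().filter(|c| !c.is_ascii_digit()).collect()
}
```

Filtering on `char` rather than on bytes keeps multi-byte IPA symbols such as `ˈ` and `ɛ` intact.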

Can you test with a Chinese voice to confirm that it sounds correct?

lucasjinreal · 2025-02-10T13:41:21Z

If so, then the IPA should be right.

Have you successfully generated some voices with kokoro-onnx? Can you attach some Chinese / Japanese samples so I can have a listen? I can tell whether the voices are correct or not.

Once it sounds correct, we can migrate to lazy_phoneme.

heabeounMKTO · 2025-02-13T03:42:54Z

> If so, then the IPA should be right.
>
> Have you successfully generated some voices with kokoro-onnx? Can you attach some Chinese / Japanese samples so I can have a listen? I can tell whether the voices are correct or not.
>
> Once it sounds correct, we can migrate to lazy_phoneme.

zh_test_zfxiaoxiao1.mp4

Hello, sorry for the late reply, I have been a bit busy, but here is the test audio for
"这是一个懒惰的测试". I am using the zf_xiaoxiao voice, from the Rust implementation.

lucasjinreal · 2025-02-13T06:28:39Z

The voice overall is workable but sounds weird.
Can you attach the one that kokoro-onnx generated as well?

heabeounMKTO · 2025-02-13T06:53:55Z

> The voice overall is workable but sounds weird. Can you attach the one that kokoro-onnx generated as well?

zf_xiaoxiao_onnx_py.mp4

This is the Python version.

lucasjinreal · 2025-02-13T09:17:37Z

Holy moly, kokoro-onnx is wrong.

The Chinese is not right. Let me link an issue to it.

lucasjinreal · 2025-02-13T09:21:56Z

But the good news is that the Rust version seems aligned with it. So once kokoro-onnx fixes the Chinese issue, we might have a correct voice for Chinese and Japanese.

heabeounMKTO · 2025-02-13T09:35:11Z

> But the good news is that the Rust version seems aligned with it. So once kokoro-onnx fixes the Chinese issue, we might have a correct voice for Chinese and Japanese.

For Chinese, I think the issue might be with tokenization/normalization too, and for Japanese, espeak-ng doesn't work well; it's an issue with espeak g2p itself.
I will extend lazy_p to add support for better g2p for Japanese.

❯ cargo run --example lp_cli --release -- --input-text "空を見上げる" --lang ja
   Compiling lazy_phonememize v0.1.1-rc (/Users/aa/Documents/lazy_phonememize)
    Finished `release` profile [optimized] target(s) in 19.68s
     Running `target/release/examples/lp_cli --input-text '空を見上げる' --lang ja`
[DEBUG] [LazyPhonemizer] `lazy_p buffer len 48
INPUT_TEXT: 空を見上げる
PHONEMIZED: (en)tʃˈaɪniːz(ja)lˈe̞tə ˈo̞ (en)tʃˈa

Some of the characters fall back to either Chinese or English phonemes.
Do you have any suggestions for Chinese g2p?
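The fallback is visible in the output above as parenthesised language markers such as `(en)` and `(ja)`. A rough way to flag it automatically might look like the sketch below. The helper names `language_tags` and `fallback_tags` are hypothetical, and the sketch assumes that markers always have the form `(xx)` with ASCII letters or hyphens inside, which is only what the sample output shows.

```rust
/// Collect every "(xx)"-style language marker found in a phoneme string.
/// (Hypothetical helper; assumes tags contain only ASCII letters or '-'.)
fn language_tags(phonemes: &str) -> Vec<String> {
    let mut tags = Vec::new();
    let mut rest = phonemes;
    while let Some(open) = rest.find('(') {
        // '(' is a single byte, so open + 1 is a valid char boundary.
        let after_open = &rest[open + 1..];
        match after_open.find(')') {
            Some(close) => {
                let tag = &after_open[..close];
                if !tag.is_empty()
                    && tag.chars().all(|c| c.is_ascii_alphabetic() || c == '-')
                {
                    tags.push(tag.to_string());
                }
                rest = &after_open[close + 1..];
            }
            None => break, // unmatched '(': nothing more to parse
        }
    }
    tags
}

/// Markers that differ from the requested language, i.e. likely fallbacks.
fn fallback_tags(phonemes: &str, expected: &str) -> Vec<String> {
    language_tags(phonemes)
        .into_iter()
        .filter(|tag| tag.as_str() != expected)
        .collect()
}
```

A non-empty result from `fallback_tags(output, "ja")` would suggest that part of the input was not phonemized as Japanese.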

lucasjinreal · 2025-02-13T13:26:02Z

I'm not sure how the original Kokoro does it. It uses misaki, which was written by the author himself.

https://github.com/hexgrad/misaki

He seems to mainly use the espeak backend, and the language support for Chinese and Japanese has some modifications.

The essential step should actually be to align lazy_g2p with misaki.

heabeounMKTO · 2025-02-14T02:50:17Z

> I'm not sure how the original Kokoro does it. It uses misaki, which was written by the author himself.
>
> https://github.com/hexgrad/misaki
>
> He seems to mainly use the espeak backend, and the language support for Chinese and Japanese has some modifications.
>
> The essential step should actually be to align lazy_g2p with misaki.

Ohh, I wasn't aware of this one. I think I'll have a look and change lazy_p accordingly!

g2p impl

569680e

lucasjinreal mentioned this pull request Feb 13, 2025

Chinese results sounds not right thewh1teagle/kokoro-onnx#106

Open
