Hacker News
by
popalchemist
495 days ago
Alternatively, text sent to these services should first be passed through a normalization step, e.g. using LLAMA to convert kanji to hiragana or a romanization. The TTS output is then much better.
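A rough sketch of that normalization step, in case it helps. Everything named here is hypothetical: `normalize_for_tts`, `toy_converter`, and the lookup table are illustrative stand-ins, and a real setup would replace `toy_converter` with a call to whatever LLM or reading dictionary you actually use.

```python
from typing import Callable

def normalize_for_tts(text: str, to_reading: Callable[[str], str]) -> str:
    """Convert text to a reading (kana or romanization) before sending it to TTS."""
    reading = to_reading(text).strip()
    # Fall back to the original text if the converter returns nothing usable.
    return reading or text

# Stand-in for the LLM: a tiny lookup table (assumption, not a real model call).
TOY_READINGS = {"東京": "とうきょう", "日本": "にほん"}

def toy_converter(text: str) -> str:
    # Replace each known kanji word with its hiragana reading.
    for kanji, kana in TOY_READINGS.items():
        text = text.replace(kanji, kana)
    return text

normalized = normalize_for_tts("東京にいます", toy_converter)
```

The converter is passed in as a plain callable so the same wrapper works whether the reading comes from an LLM, a dictionary, or a morphological analyzer.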
laurieg
495 days ago
Unfortunately, a simple normalization of kanji --> hiragana throws away pronunciation information.
popalchemist
494 days ago
You could just as easily use the LLM to convert the kanji into phonemes.
xyzhg
492 days ago
You can't afford to lose word boundaries, and phonemes don't tell you which part of the word is emphasized.
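To illustrate the point, here is a minimal sketch assuming a hypothetical `Word` record that keeps boundaries and an accent position alongside the phonemes. The `accent` field (the mora after which pitch drops, 0 for no drop) and the `flatten` helper are my own illustration, not any TTS engine's format. 箸 (chopsticks) and 橋 (bridge) share the same phonemes in standard Tokyo Japanese and differ only in accent.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Word:
    surface: str
    phonemes: Tuple[str, ...]
    accent: int  # mora after which pitch drops; 0 means no drop (illustrative field)

hashi_chopsticks = Word("箸", ("h", "a", "sh", "i"), 1)
hashi_bridge = Word("橋", ("h", "a", "sh", "i"), 2)

def flatten(words: List[Word]) -> str:
    # A bare phoneme stream: word boundaries and accent are both discarded.
    return "".join(p for w in words for p in w.phonemes)

same_stream = flatten([hashi_chopsticks]) == flatten([hashi_bridge])
still_distinct = hashi_chopsticks != hashi_bridge
```

Once flattened, the two words are indistinguishable, and a two-word sequence becomes one unbroken string with no trace of where the first word ended.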
popalchemist
492 days ago
Modern TTS engines use tokenizers to convert words to phonemes. See:
https://github.com/FunAudioLLM/CosyVoice/issues/202