by laurieg 495 days ago
Unfortunately, a simple normalization of kanji --> hiragana throws away pronunciation information.
1 comment

You could just as easily use the LLM to convert the kanji into phonemes.
You can't afford to lose word boundaries, though, and phonemes alone don't tell you which part of the word is emphasized.
Modern TTS engines use tokenizers to convert words to phonemes. See: https://github.com/FunAudioLLM/CosyVoice/issues/202
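
To make the point about lost information concrete, here is a rough Python sketch. It assumes "emphasized" means Japanese pitch accent, and it uses a tiny hand-written lexicon for the standard Tokyo-dialect はし example. The names `LEXICON`, `to_hiragana` and `to_hiragana_with_accent` are made up for illustration and don't come from any real TTS library.

```python
# Toy lexicon: kanji -> (hiragana reading, Tokyo pitch-accent type).
# Accent type 0 means no pitch drop; n means the pitch drops after mora n.
# These three entries are the usual textbook example, not a real dictionary.
LEXICON = {
    "箸": ("はし", 1),  # chopsticks
    "橋": ("はし", 2),  # bridge
    "端": ("はし", 0),  # edge
}


def to_hiragana(word: str) -> str:
    """Hypothetical normalizer that keeps only the kana reading."""
    reading, _accent = LEXICON[word]
    return reading


def to_hiragana_with_accent(word: str) -> tuple[str, int]:
    """Same lookup, but keeps the accent type next to the reading."""
    return LEXICON[word]


if __name__ == "__main__":
    for word in LEXICON:
        print(word, to_hiragana(word), to_hiragana_with_accent(word))
```

All three words collapse to the same string after plain kana normalization, so anything downstream that only sees はし can't tell them apart. Keeping the accent type alongside the reading is one way to carry that distinction through.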