Hacker News
by xiande04 319 days ago
Yeah, that bit about each phoneme sounding exactly the same every time really made a lot of sense. Even if a TTS phoneme sounds nothing like how a human would say it, once you've heard it enough times, you just memorize it.

I guess sounding "natural" really just amounts to adding variation across the sentence, which destroys phoneme-level accuracy.