Hacker News
by tlrobinson 2182 days ago
It still sounds very robotic to me. I think Google's WaveNet sounds much more natural: https://cloud.google.com/text-to-speech#section-2
1 comment

There's an element of personal taste: I agree with you that certain WaveNet voices sound better (I've actually used them for video narration with some success). The breathing caught me off guard: it took me a minute to identify THAT as the element that was present but that I wasn't expecting to hear.

The breathing, the pausing at commas and full stops, and the general cadence were frankly superior to what I've heard from Google Cloud Voice, which is why I was curious whether any preprocessing was done. I've generally had to do multiple manual passes with Google Cloud Voice to get audio output that didn't sound robotic.