by p-e-w 1074 days ago
What do you mean? Text-to-speech systems from the past few years are indistinguishable from an actual human voice.

I agree that they’ve gotten pretty good, but there’s still something they get wrong: their intonation is a kind of passable average. If you want to distinguish them from actual human speech, pay close attention to intonation and inflection. They’re still very usable; I’m not claiming otherwise.
I can't hear much difference in the Studio voice:

https://cloud.google.com/text-to-speech/docs/wavenet

I'm fairly sure I couldn't tell Studio voices and real people apart in a blind test.

It would be a good test. I don't think they are yet indistinguishable, for what it's worth.
I don’t think they’ll ever become indistinguishable, because humans have variability. Too much consistency and it starts to smell artificial. Look at ChatGPT, for example: you can read a perfectly written answer and still sense that it was written by ChatGPT.