by monk_e_boy 3394 days ago
OK, that went from uncanny valley to flipping amazing. I could picture the person speaking. An old lady. A young woman. It was hard to picture an algorithm in a machine.

It's amazing that it all boils down to 1s and 0s and some boolean logic.

1 comments

You've misunderstood what you're listening to. I suggest reading the post again.

The recordings at the bottom are just recordings of an old lady and a young woman.

Yeah, I understood that. The ones in the middle are generated using their voices. You don't find that amazing?
I mean, it's sort of amazing, but it wasn't completely generated by machine. Those sound clips in the middle were generated by copying the inflections from actual recordings, not by generating the inflections from scratch. It seems the current system they have sounds like the robotic voices at the very top.
It's not TEXT to speech, it's speech to speech. I think it will be amazing when we have TTS of that quality.