Hacker News
by sillysaurusx 2398 days ago
I am more curious how they did the audio than the video. From experience, it's not nearly as easy to clone someone's voice as you might think.

It might be that they just found a good voice actor. That's what most deepfake videos do now. But maybe someday it will be possible to press a button and hear a beautiful result.

2 comments

The audio is also generated. We used speech2speech voice conversion for this, so it is indeed more involved than TTS, for instance, but also more expressive and controllable. Here's another example: https://youtu.be/t5yw5cR79VA
Possibly also generated. From the top of HN last week: "AI Clones Your Voice After Listening for 5 Seconds" https://news.ycombinator.com/item?id=21525878