| HN Mirror



	by sillysaurusx 2444 days ago
	I am more curious how they did the audio than the video. From experience, it's not nearly as easy to clone someone's voice as you might think. It might be that they just found a good voice actor. That's what most deepfake videos do now. But maybe someday it will be possible to press a button and hear a beautiful result.

2 comments

belevtsoff 2444 days ago

The audio is also generated. We used speech2speech voice conversion for this, so it is indeed more involved than TTS, for instance, but also more expressive and controllable. Here's another example: https://youtu.be/t5yw5cR79VA


wtfrmyinitials 2444 days ago

Possibly also generated. From the top of HN last week: "AI Clones Your Voice After Listening for 5 Seconds" https://news.ycombinator.com/item?id=21525878
