| HN Mirror



	by dmckinno 625 days ago
	This is a bit different. These audio clips use the default voice of each of these systems. I was asking about zero-shot voice cloning, i.e. transferring a recorded voice and synthesizing speech in that voice. I tried zero-shot voice cloning in all of the top OSS models in the Arena and performance was bad.

1 comment

popalchemist 617 days ago

Most of those models DO do zero-shot cloning. The best is VoiceCraft. It's nearly 11Labs quality. Check it out.


dmckinno 616 days ago

Thanks for the flag. VoiceCraft is indeed the best ZS OSS voice cloning tool, despite appearing at the bottom of the TTS Arena. They have a really easy-to-use Gradio demo on their repo if anyone else wants to give it a try.

There is still a big gap between 11Labs and Character.ai, and the VoiceCraft voices would not be mistaken for the real speaker, but this is much closer.
