| HN Mirror


	by wczekalski 950 days ago
	One thing I've seen done for style cloning is a high-quality fine-tuned TTS -> RVC pipeline to "enhance" the output: TTS for intonation and pronunciation, RVC for voice texture. With StyleTTS and this pipeline you should get close to ElevenLabs.
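
A minimal sketch of the two-stage idea described above, assuming each stage can be wrapped as a plain function. The names `StyleClonePipeline`, `dummy_tts`, and `dummy_rvc` are hypothetical stand-ins and not the API of StyleTTS or RVC; real models would replace the dummy functions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Placeholder audio type: a list of mono samples (an assumption for illustration).
Audio = List[float]


@dataclass
class StyleClonePipeline:
    """Chains a TTS stage (intonation, pronunciation) into a
    voice-conversion stage (voice texture)."""
    tts: Callable[[str], Audio]
    voice_convert: Callable[[Audio], Audio]

    def synthesize(self, text: str) -> Audio:
        draft = self.tts(text)            # stage 1: text -> draft speech
        return self.voice_convert(draft)  # stage 2: draft -> target voice


def dummy_tts(text: str) -> Audio:
    # Hypothetical stand-in for a fine-tuned TTS model: one "sample" per word.
    return [float(len(word)) for word in text.split()]


def dummy_rvc(audio: Audio) -> Audio:
    # Hypothetical stand-in for a voice-conversion model: rescales each sample
    # while keeping the timing (length) of the draft unchanged.
    return [0.5 * sample for sample in audio]


pipeline = StyleClonePipeline(tts=dummy_tts, voice_convert=dummy_rvc)
result = pipeline.synthesize("hello open source")
```

The point of the split is that the second stage only changes how the draft sounds, not what is said or when, so either stage can be swapped independently.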

2 comments

eigenvalue 950 days ago

I suspect they are doing many more things to make it sound better. I certainly hope open source solutions can approach that level of quality, but so far I've been very disappointed.


KolmogorovComp 949 days ago

RVC? R… Voice Model?


a2128 949 days ago

Retrieval-based Voice Conversion - https://github.com/RVC-Project/Retrieval-based-Voice-Convers...


stavros 949 days ago

Retrieval-based voice conversion, apparently.
