|
|
|
|
|
by wczekalski
950 days ago
|
|
One thing I've seen done for style cloning is a high quality fine tuned TTS -> RVC pipeline to "enhance" the output. TTS for intonation + pronunciation, RVC for voice texture. With StyleTTS and this pipeline you should get close to ElevenLabs. |
|