by Nouser76
851 days ago
I've used coqui.ai's TTS models[0] and library[1] with great success. Rendering a cloned voice took about 80% of the resulting clip's duration, so slightly faster than real time, and I believe you can also stream the response. Do note the model license for XTTS: it's one they wrote themselves (the Coqui Public Model License), and it carries some restrictions.

[0] https://huggingface.co/coqui/XTTS-v2

[1] https://github.com/coqui-ai/TTS
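For anyone wanting to reproduce the timing figure, here is a minimal sketch, not the exact setup described above. It assumes the `TTS` package from [1] is installed and uses the XTTS v2 model name from that project's README. The helper names (`real_time_factor`, `wav_duration_seconds`, `clone_voice_to_file`) and the file paths are made up for illustration, and it assumes the output is a PCM WAV file that Python's `wave` module can read.

```python
import time
import wave


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Render time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file, computed from its frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def clone_voice_to_file(text: str, speaker_wav: str, out_path: str,
                        language: str = "en") -> float:
    """Synthesize `text` in the voice of `speaker_wav` and return the real-time factor.

    The measured time covers synthesis only, not model download or loading.
    """
    # Imported lazily so the helpers above work without the TTS package installed.
    from TTS.api import TTS

    # Model name as listed in the coqui-ai/TTS README; downloaded on first use.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

    start = time.perf_counter()
    tts.tts_to_file(text=text, speaker_wav=speaker_wav,
                    language=language, file_path=out_path)
    elapsed = time.perf_counter() - start

    return real_time_factor(elapsed, wav_duration_seconds(out_path))
```

Calling `clone_voice_to_file("Hello there.", "reference_speaker.wav", "out.wav")` with your own reference recording would return a value around 0.8 if your hardware matches the figure above. Actual numbers depend heavily on GPU versus CPU and on clip length.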