by jamez
1320 days ago
I haven't tried Tortoise, thanks for pointing me to it.
The voices were cloned by fine-tuning a VITS model with coqui.ai. I used about two hours of speech for each speaker. With more time and resources, I'm certain it's possible to make those voices considerably better.