by randkyp
806 days ago
I played with it some more and I have to agree. For actual voice _cloning_, XTTS2 sounds much, much closer to the original speaker. But the resulting output is also much more unpredictable and sometimes downright glitchy compared to OpenVoice. XTTS2 also tries to "act out" the implied emotion/tone/pitch/cadence in the input text, for better or worse. But my use case is just to have a nice-sounding local TTS engine, and current text-to-phoneme conversion quirks aside, OpenVoice seems promising. It's fast, too.