| I am trying to run it locally but it doesn't quite work for me. I was able to run the demos allright, but when trying to use another reference speaker (in demo_part1), the result doesn't sound at all like the source (it's just a random male voice). I'm also trying to produce French output, using a reference audio file in French for the base speaker, and a text in French. This triggers an error in api.py line 75 that the source language is not accepted. Indeed, in api.py line 45 the only two source languages allowed are English and Chineese; simply adding French to language_marks in api.py line 43 avoids errors but produces a weird/unintelligible result with a super heavy English accent and pronunciation. I guess one would need to generate source_se again, and probably mess with config.json and checkpoint.pth as well, but I could not find instructions on how to do this...? Edit -- tried again on https://app.myshell.ai/ The result sounds French alright, but still nothing like the original reference. It would be absolutely impossible to confuse one with the other, even for someone who didn't know the person very well. |
But my use case is just to have a nice-sounding local TTS engine, and current text-to-phoneme conversion quirks aside, OpenVoice seems promising. It's fast, too.