Hacker News
by anotherevan 900 days ago
Hi and thanks for the suggestions. Looking through them, it seems you need to do what they call "voice banking" before you lose your voice, which basically means reading a script they provide.

Unfortunately my friend's voice is too far gone for that to be possible. Hoping for something where they can use old recordings to generate a voice.

3 comments

From my tests, I think Audiobox from Meta is the most promising (even better than Eleven Labs). Too bad it's closed source and they force you to read some randomly generated sentences, to prevent someone from generating a cloned voice without consent.

Right now Eleven Labs is your best bet.

xTTS is just not there quality-wise. The version available in the studio is marginally better than the OSS version, but it's still pretty far from being believable.

The non-nerfed version of Tortoise (the author decided to ruin their own project, but forks exist) was decent at voice cloning, but it took a lot of tries.

I'm pretty sure we already have the technology to do what you want and help your friend; it's just a matter of time until it gets better and more software comes out.

Some of the recent transformer models can work with audio clips just a few seconds long. I'm sure the final output is not as good, but perhaps your friend has audio clips that would work for that, e.g. from home movies.