by dijksterhuis 901 days ago
> It's obviously computer generated to my ear.

From the README

    Disclaimer

    This is an open-source implementation that approximates the performance of the internal voice clone technology of myshell.ai. The online version in myshell.ai has better 1) audio quality, 2) voice cloning similarity, 3) speech naturalness and 4) computational efficiency.

So this paper is a thinly veiled ad for myshell.ai's services?
Yes. And I used myshell.ai out of interest. It’s also absolutely terrible.
I downloaded the open-source version yesterday and tried it with my own voice through the microphone and with a few saved WAV files.

It was terrible. Absolutely terrible. Like, given how much hype I saw about this, I expected something half decent. It was not. It was bad, so bad bad bad.

I was thinking maybe I did something wrong, but then I watched some of the YouTube reviews. These guys were SO excited at the start of the video, and then at the end they all literally said, "Uh, well, you be the judge."

I still can't help but feel there's some kind of trick to it: get the right input sample, spoken with the right intonation, and maybe you can generate anything.

I came here just for your comment. Thank you for doing this work so the rest of us don't have to.
Like 50% of arXiv. SV figured out that people read papers in 202x, not PRNewsWire, and have adjusted accordingly.
Not totally unexpected, unfortunately. Any other OSS players on the market?
RVC