by jjoonathan 1745 days ago
Very cool! This is fun and I can see big improvements from v1 to v2. I look forward to watching this evolve!

https://vo.codes/tts/result/TR:s1rj02g34ppc7bhq1m8bf6p3thny6

1 comments

Thanks!!

A few areas for improvement in the clip you posted:

I need to add better duration estimation. The clip is unfortunately truncated.

A lot of the community-trained voices don't fully leverage phonetic annotation, so some of the words fall flat.

I think the synthesizer has too much noise in it (you can see this in the image). The person who trained it probably used noisy data.

Finally, the universal vocoder isn't handling James Earl Jones' deep voice very well. It should be fine-tuned.