by dheera
2202 days ago
Which generator works best, qualitatively? I come from a vision/ML background but haven't played with speech at all, so it's completely new to me, and I'm wondering what the state of the art is. I've been wanting to create a TTS voice of myself so I can take phone calls on headphones and type back what I want to say, so that I don't have to yell private information out loud in public places. It would be nice if, in non-COVID times, I could sit in a train seat and take phone calls completely silently.
Here's some recent work that has a good comparison of several vocoders: https://wavenode-example.github.io/
Edit: WaveRNN struck a good balance for me in the past, but it isn't shown in the link. Tons of new work is coming out, though!