| I was very impressed with the TTS examples in the original DeepMind article (https://deepmind.com/blog/wavenet-generative-model-raw-audio...). Can someone elaborate on how useful this implementation is for text-to-speech? I'm keen to experiment with voice synthesis: I want to create dialog, from multiple voice sources, for some characters in a VR application that I'm working on. Perhaps this lib (https://github.com/ibab/tensorflow-wavenet) is a better option for TTS? I guess I could do with an ELI5 on how I'd approach this with either of these libraries. I'm not familiar with any deep learning frameworks, but I am pretty handy with Python and have implemented SciKit stuff. I'm also thinking this will give me a reason to try an Azure K80 instance instead of the AWS GPU instances I've been using for other stuff. That said, is a Tesla K80 the only option for WaveNet? I'm guessing I could run it on other GPUs, but I had read that memory might be an issue on some cards. If so, what's the lowest-end card I can run it on, and will one of the AWS GPU instances suffice? I also have a GTX 970 at home, but I'm guessing that won't cut it. |
I'm looking forward to seeing faster implementations in the future; playing around with this looks like a lot of fun.