by rprenger
2979 days ago
Full disclosure: I currently work at Nvidia on speech synthesis. You can definitely do this on a GPU. We use the older autoregressive WaveNet (not Parallel WaveNet) for inference on GPUs, with the newly released nv-wavenet code. Here's a link to a blog post about it: https://devblogs.nvidia.com/nv-wavenet-gpu-speech-synthesis That code will generate audio samples at 48 kHz, or, if you're worried about throughput, it'll do a batch of 320 parallel utterances at 16 kHz.
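
To put those two numbers in perspective, here's a rough back-of-the-envelope sketch in plain Python. It is not part of nv-wavenet, and the function names are made up for illustration. It assumes "at 48 kHz" means one stream generated in real time, and that each of the 320 batched utterances is also generated at its real-time rate of 16 kHz.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumption: both figures describe real-time generation rates.

def per_sample_budget_us(sample_rate_hz: int) -> float:
    """Time available per audio sample, in microseconds, for one real-time stream."""
    return 1_000_000 / sample_rate_hz


def aggregate_samples_per_second(batch_size: int, sample_rate_hz: int) -> int:
    """Total samples produced per second across a batch of parallel utterances."""
    return batch_size * sample_rate_hz


if __name__ == "__main__":
    # An autoregressive model produces one sample at a time, so at 48 kHz
    # each sample has to be finished in roughly 20.8 microseconds.
    print(f"{per_sample_budget_us(48_000):.1f} us per sample at 48 kHz")

    # Batching gives up per-stream speed for total throughput.
    print(f"{aggregate_samples_per_second(320, 16_000):,} samples/s for 320 x 16 kHz")
```

Under those assumptions the batched mode produces roughly 100x more total audio per second than the single 48 kHz stream, which is the latency vs. throughput trade-off.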