by striking
3367 days ago
1. If you can pay for around 24.6 hours of VA speech data, you can get enough data to run this process at the same quality Google presented (that figure is from the "Experiments" section). That's not cheap, and definitely not free, especially considering the amount of quality control you have to apply, but it's not expensive either.

2. You can rent a 96GB GDDR5 GPU instance from Google's cloud for pretty cheap (https://cloud.google.com/compute/docs/gpus/). I don't think you need anything more powerful than that, but feel free to prove me wrong.

I think your last paragraph is totally misguided/uninformed. You can download models for cheap or free (for non-commercial/edu use) from UPenn (https://www.ldc.upenn.edu/language-resources/data/obtaining). People don't give away models for free with zero strings attached because they're a pain to make. And if you want something you can run on a home computer for cheap or free, you can try DeepSpeech: https://github.com/mozilla/DeepSpeech. All you need is an Nvidia GPU.