by ganfortran
3368 days ago
> How difficult
Very difficult:
1. Machine learning is heavily data dependent, and building those datasets is very expensive. Google is not likely to give them away for free, because they are its competitive advantage.
2. The infrastructure to train those models is hard to get outside of Google. Pretty sure it is 10s or 100s of GPUs with InfiniBand-connected PS servers, running for days or weeks.
Even with the source code published, people will still have to scratch their heads to duplicate Google's performance. Until the day some organization equivalent to GNU democratizes data access for the public, and some mighty algorithm is discovered that dramatically reduces the computational requirements for training those models, "Google succeeds by just being Google" is unlikely to change.
2. You can rent a 96GB GDDR5 GPU instance from Google's cloud for pretty cheap (https://cloud.google.com/compute/docs/gpus/). I don't think you need anything more powerful than that (but feel free to prove me wrong).
I think your last paragraph is totally misguided/uninformed. You can download models for cheap/free (for non-commercial/edu use) from UPenn (https://www.ldc.upenn.edu/language-resources/data/obtaining). People don't give away models for free with 0 strings attached because they're a pain to make.
And if you want something you can run on a home computer for cheap/free, you can try DeepSpeech: https://github.com/mozilla/DeepSpeech. All you need is an Nvidia GPU.