by rhizome
2743 days ago
The built-in offline Android speech recognizer is really bad. Giants like Google and Facebook are blessed with data, so they can train state-of-the-art speech recognition models (much, much better than what you get out of the built-in Android recognizer) and then provide speech recognition as a service. They can control the recognition because it happens on their servers and is independent of Android or any other OS.

Is the implication that offline Android recognition does not train on the owner's voice at all? I imagine a lot of phones these days are at least as powerful as the Pentium 200s used to train (successfully!) Dragon Dictate et al. 20+ years ago.
Secondly, when I say "train", I mean it in a totally different context than how you seem to be using the term. You are using it in the context of adapting an acoustic model to an individual speaker to improve performance. I am talking about building the initial model. Typical RNN-based or even convolution-based algorithms require a lot of time and processing power to train. What's even harder to get than the processing power, though, is of course the data to train on.