by jedberg
917 days ago
I can because I've worked in the space. You have to build a model for every keyword you are looking for. Those models take up space and need a lot of compute to train. That's why you can't set an arbitrary wake word for your Alexa/Google/Siri and have to choose from a short list: those are the only models they have trained. It would not make sense to train a model for every advertiser and then upload it to the phone. It would only make sense to capture the audio and send it to the cloud for generic speech processing. But that would not make sense either, because speech processing takes a lot of compute, not to mention that you'd see all the data being uploaded from your device, plus the cost of receiving, processing, and storing all that data. I'm 99.9% sure that this is not happening today, but we are on the verge of the tech being good enough to do local speech processing, and then there are no bandwidth limitations, no storage issues, and the consumer pays for the compute.
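For a rough sense of the upload volume involved, here is a back-of-envelope sketch. The bitrates are assumptions chosen for illustration (16 kHz, 16-bit mono PCM for raw capture, and a 16 kbit/s compressed speech stream), not measurements from any real device or service.

```python
# Back-of-envelope: data volume of always-on audio capture.
# All figures below are illustrative assumptions, not measurements.

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(bitrate_bits_per_second: int) -> int:
    """Bytes produced per day by a continuous stream at the given bitrate."""
    return bitrate_bits_per_second * SECONDS_PER_DAY // 8


# Assumed raw capture: 16,000 samples/s, 16 bits per sample, one channel.
RAW_PCM_BPS = 16_000 * 16
# Assumed compressed speech stream at 16 kbit/s.
COMPRESSED_BPS = 16_000

raw = bytes_per_day(RAW_PCM_BPS)
compressed = bytes_per_day(COMPRESSED_BPS)

print(f"raw PCM:    {raw / 1e9:.2f} GB/day")
print(f"compressed: {compressed / 1e6:.1f} MB/day")
```

Under these assumptions, even the compressed stream is well over a hundred megabytes per device per day, which is the kind of traffic that would be hard to hide.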
|
Here's a very basic offline app for Android: https://f-droid.org/packages/org.stypox.dicio . It works pretty badly for me, with its tiny, non-specialized models, but it's still good enough for some purposes.
By the way, you can use an online model to confirm what was recognized locally.
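A minimal sketch of that local-then-online idea, assuming two hypothetical recognizer functions that each take audio bytes and return a `(text, confidence)` pair. Neither is a real library API, and the thresholds are made up; the point is only that audio leaves the device when the small local model thinks it heard something.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interface: audio bytes in, (text, confidence in [0, 1]) out.
Recognizer = Callable[[bytes], Tuple[str, float]]


def recognize_with_confirmation(
    audio: bytes,
    local: Recognizer,
    online: Recognizer,
    candidate_threshold: float = 0.5,  # assumed value
    confirm_threshold: float = 0.9,    # assumed value
) -> Optional[str]:
    """Run the cheap local model first; ask the online model only for candidates."""
    _, local_confidence = local(audio)
    if local_confidence < candidate_threshold:
        # Nothing plausible heard locally, so nothing is uploaded.
        return None
    online_text, online_confidence = online(audio)
    if online_confidence >= confirm_threshold:
        return online_text
    return None


# Stand-in recognizers, for demonstration only.
online_calls = []


def fake_local(audio: bytes) -> Tuple[str, float]:
    return ("set a timer", 0.6) if audio else ("", 0.0)


def fake_online(audio: bytes) -> Tuple[str, float]:
    online_calls.append(audio)
    return ("set a timer", 0.95)


heard = recognize_with_confirmation(b"\x01\x02", fake_local, fake_online)
silence = recognize_with_confirmation(b"", fake_local, fake_online)
```

The trade-off is that the local model's misses are never corrected, since only its positive detections are sent for confirmation.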