by nemo846 2798 days ago
If they could, I believe they would. They'd rather not be reliant on whoever is providing their speech recognition.

A limited subset of functionality should already be possible, especially if they can have the speech recognition trained specifically to the user.
I get the impression that this is already happening to a certain extent and they are testing the water. On newer/faster devices, you’ll notice Siri transcribing on the screen nearly instantaneously, faster than any server round trip, but with poor accuracy. Then a few seconds later once the server connection is established you’ll see some of the transcription change to be more accurate.
Voice Control does this, but it’s not particularly great. I doubt it’s seen significant updates in years.
> They'd rather not be reliant on whoever is providing their speech recognition.

Apple has their own speech recognition tech.

They don't have the dataset needed to train the speech recognition engine, nor do they seem to be willing to deploy some of that $250b in cash to hire engineers smart enough to make a model which only needs the local user's input for training.
It might be an issue with processing power. The new NPU might fit nicely in this case.
Apple has had stripped-down versions of this before on both iOS and macOS via Voice Control.
There's a big difference between recognizing "open the photos app, computer" and "what time is Avengers showing today, Siri?"
How do you plan to get the Avengers showtimes without connecting to the internet?
You misunderstand what the internet connection is for. Right now most machine learning models run on server farms: your voice snippet is sent to the server, where the model processes it. They then send the interpretation back.

"Local model" means the voice snippet is processed on your device. Never beamed off to a server farm.

The interpretation might be:

  Action: internetSearch
  QueryString: movie times for "Avengers" [near me || {userPreferences.movieTheater}]
  UseLocation: true
Which then kicks off whatever process Siri has for handling internet search actions.
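
To make the split concrete, here is a rough Python sketch of that flow. Everything in it is hypothetical: `interpret_locally` is a toy keyword matcher standing in for an on-device model, and the `Interpretation` fields and handler names just mirror the example above. None of this is Siri's actual API. The point is only that interpretation happens on the device, and the network is touched solely by the handler for an action that needs it.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Interpretation:
    # Hypothetical structured output of the on-device model.
    action: str
    query_string: str
    use_location: bool


def interpret_locally(transcript: str) -> Interpretation:
    """Toy stand-in for a local model: no network involved at this step."""
    text = transcript.lower().strip()
    if text.startswith("open "):
        # Simple device command, fully resolvable offline.
        return Interpretation("openApp", text[len("open "):], False)
    # Anything else is treated as a search; guess whether location matters.
    needs_location = "near me" in text or "showing" in text
    return Interpretation("internetSearch", transcript, needs_location)


def open_app(i: Interpretation) -> str:
    return f"launching {i.query_string} (no network needed)"


def internet_search(i: Interpretation) -> str:
    # Only this handler would go out to the internet.
    where = " near the user" if i.use_location else ""
    return f"network request: search {i.query_string!r}{where}"


# Map each action name to whatever process handles it.
HANDLERS: Dict[str, Callable[[Interpretation], str]] = {
    "openApp": open_app,
    "internetSearch": internet_search,
}


def dispatch(i: Interpretation) -> str:
    return HANDLERS[i.action](i)


if __name__ == "__main__":
    print(dispatch(interpret_locally("open photos")))
    print(dispatch(interpret_locally("What time is Avengers showing today?")))
```

So "open photos" never leaves the device, while the showtimes question still hits the internet, but only as a search query rather than as a raw voice recording.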