by kkielhofner
893 days ago
Snapchat filters, iPhone photo processing, speech to text, always-on Hey Siri, OCR, object detection and segmentation - countless applications and features have been doing this on device for years. For something like the RAG approach I mentioned, syncing and coordinating your local content with a remote API would tax the battery more, just in terms of the radio, than what we already see from the on-device neural engines and TPUs leveraged by the functionality I described. These applications would also likely be very upload heavy (photo/video inference - massive upload, tiny JSON response), which could very likely end up taxing cell networks further. Even RAG is thousands of tokens in and a few hundred out (in most cases).
There's also the issue of Nvidia GPUs having > 1 yr lead times and the exhaustion of GPUs available from various cloud providers. LLMs especially use tremendous resources for training, and this growth is leading to more and more contention for available GPU resources. People are going to be looking more and more to save the clouds and big GPUs for what really needs to be done there - big training. Plus, not everyone can burn $1m/day like ChatGPT.
If AI keeps expanding and eating more and more functionality the remote-first approach just isn't sustainable. There will likely always be some sort of blend (with the serious heavy lifting in the cloud, of course), but it's going to shift more and more to local and on-device. There's just no other way.
But those are peanuts compared to what will be possible in the (near) future. You think content-aware fill is neat? Wait until you can zoom out of a photo 50% or completely change the angle.
That’ll cost gobs of processing power, and thus time and battery, much more than a 20MB burst transfer of a photo and the backsynced modifications.
> If AI keeps expanding and eating more and more functionality the remote-first approach just isn't sustainable.
It’ll definitely create a large moat around companies with lots of money or extremely efficient proprietary models.