by vineyardmike 1307 days ago
> The idea that you could run these kinds of ML inference tasks is economically fanciful. You would need a huge investment in hardware and the opex would be ridiculous.

Google, Apple, Amazon, and even Sonos are all releasing voice assistants that run locally on their relatively low-powered speakers.

Apple seems to be ahead on how much runs locally, while Google’s seems to be the smartest. (Sonos doesn’t have a cloud, but it’s not ‘general purpose’ afaik.)

Sure, you can’t amortize the work across a bunch of TPUs, but they can ship custom hardware instead. A TPU needs to be big and support parallel streams; a home server may only ever need to serve one stream. There are Arduino-style devices that can run basic TensorFlow audio models in real time now. And obviously most phones can do this locally now, so depending on your opinion that may be considered affordable.