Hacker News
by hn8726 529 days ago
That's what I thought, but the link doesn't say anything about off-device inference; it's only about storing and retrieving the model. There's just one offhand note about cloud inference.

In any case, yeah, you can avoid downloading the model to the device at all, but then you have to deal with the other angle - making sure the endpoint isn't abused.

Maybe a hybrid approach would work - run just part of the model (some of the layers?) in the cloud, and then carry on the inference on the device? I'm not familiar with exactly what AI models look like or how they work, but I feel like hiding even a tiny portion of the model would make the rest unusable on its own in practice.
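To make the hybrid idea concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: the tiny 3-layer network, the split point, and the `cloud_part` / `device_part` names are all assumptions, and the "cloud" is just a local function standing in for a server endpoint. It only shows that running the first layers in one place and the rest in another gives the same result as running the whole model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-layer network: 8 inputs -> 16 -> 16 -> 4 classes.
sizes = [8, 16, 16, 4]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]


def run_layers(x, start, stop):
    """Apply layers [start, stop): ReLU on hidden layers, softmax on the last."""
    for i in range(start, stop):
        x = x @ weights[i] + biases[i]
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)       # hidden layer activation
        else:
            e = np.exp(x - x.max())      # numerically stable softmax
            x = e / e.sum()
    return x


def cloud_part(x, split=1):
    # Stand-in for a server endpoint. In a real setup only the server
    # would hold weights[:split]; here everything is in one process.
    return run_layers(x, 0, split)


def device_part(hidden, split=1):
    # In a real setup the device would ship with weights[split:] only.
    return run_layers(hidden, split, len(weights))


x = rng.standard_normal(8)
hidden = cloud_part(x)                   # this is what travels over the network
probs = device_part(hidden)
full = run_layers(x, 0, len(weights))    # reference: whole model in one place
```

The thing that has to cross the network is `hidden`, so where you put the split decides both how much of the model stays private and how much data each request transfers.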

1 comment

Your second note is very interesting; having looked at the model myself, I think this is very plausible.

For models that have a lot of input nodes and a lot of "hidden layers", and that just perform a softmax at the end, this may become infeasible because of the amount of data you would have to transfer.
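To put rough numbers on the transfer cost, here is a back-of-the-envelope calculation. The shapes are assumptions picked to resemble a typical image classifier, not taken from the model in the article, and I'm assuming uncompressed 32-bit floats.

```python
def activation_bytes(shape, bytes_per_value=4):
    """Size of one tensor sent over the network, assuming float32 values."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_value


# Hypothetical shapes, for illustration only.
image_input = (224, 224, 3)         # raw input image
early_feature_map = (112, 112, 64)  # output of an early convolutional layer
softmax_output = (1000,)            # final class probabilities

input_size = activation_bytes(image_input)
early_size = activation_bytes(early_feature_map)
output_size = activation_bytes(softmax_output)

print(input_size, early_size, output_size)
```

Under these assumptions, splitting after an early layer means sending more data per request than the raw input itself, while splitting just before the softmax sends very little but leaves almost nothing on the device.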

You may have inspired a second article :)