by ithkuil 1179 days ago
(assuming we're not talking about the near future)

I think this can be a scenario of converging incentives: on one side, large models will incentivize hardware manufacturers to increase the memory available on devices, while on the other side, model developers will be incentivized to trim the fat from their models and devise compression mechanisms that don't compromise quality too much.

It's not unthinkable that a handheld device could run full inference locally a few device generations from now.