Running LLMs on device is the future. It's more secure and solves the problem of too much demand for inference compared to data center supply, and it also would use less electricity. It's just a matter of getting the performance good enough. Most users don't need frontier model performance.
> solves the problem of too much demand for inference
False. It creates consumer demand for inference chips, which will be badly utilised.
> also would use less electricity
What makes you think that? (MAYBE you can save power on cooling, but not if the data center is close to a natural heat sink.)
> It's just a matter of getting the performance good enough.
The performance limitations are inherent to the limited compute and memory.
> Most users don't need frontier model performance.
What makes you think that?