by otabdeveloper4
100 days ago
> most people won't ever run local inference because it sucks and is a resource hog most can't afford

a) Local inference for chats sucks. Using LLMs for chatting is stupid though.

b) Local inference is cheap if you're not selling a general-purpose chatbot. There's lots of fun stuff you can get with a local LLM that previously wasn't economically possible. Two big ones are gaming (for example, text adventure games or complex board games like Magic: The Gathering) and office automation (word processors, Excel tables).
If you can use the NPU to process embeddings quickly, you get some incredible functionality, from photo search by subject to near-match email search.
For consumer applications, that's what I'm most excited about. It turns something that used to require large teams, data, and bespoke models into a commodity that any app can use.
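
To make the embedding search idea concrete, here is a minimal sketch of the lookup step, assuming the embeddings have already been computed somewhere (on an NPU or otherwise). The vectors below are made-up toy values, and `cosine`, `search`, and the file names are hypothetical, purely for illustration; a real app would get its vectors from an embedding model and would probably use a proper vector index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=3):
    """Return the ids of the top_k items most similar to query_vec."""
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:top_k]]

# Toy 3-dimensional vectors standing in for real embeddings (assumed values).
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.1],
    "invoice.eml": [0.0, 0.1, 0.9],
}

# A query vector that would come from embedding the user's search text.
print(search([1.0, 0.0, 0.0], index, top_k=1))
```

The brute-force scan is fine for a few thousand items; the expensive part is computing the embeddings in the first place, which is where fast local hardware would matter.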