by brucethemoose2
1022 days ago
One misleading thing is the notion that you need a 1-2B parameter model to run on commodity hardware. This is not really true. Llama 7B runs with Vulkan/llama.cpp on smartphones with ~8GB of RAM and laptops with ~12GB. It is only going to get easier over time, as lower-RAM hardware drops out of the market and the Vulkan implementations become more widespread. For users trying to run LLMs on machines with 8GB or less, the AI Horde approach of distributed models seems much more practical anyway.
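For a rough sense of why a 7B model can fit on an ~8GB device, here is a back-of-the-envelope sketch. It assumes the weights are quantized (the 8-bit and 4-bit rows are my assumption; the comment above does not say which format is used), and it counts only the weights, ignoring the KV cache, runtime overhead, and whatever else the OS is holding in RAM. The function name `weight_memory_gb` is just illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB.

    Assumption: every weight is stored at the same bit width; real
    quantization formats add a small overhead for scales and metadata.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Compare a 7B-parameter model at a few common bit widths.
for bits in (16, 8, 4):
    print(f"7B at {bits}-bit: ~{weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions the 16-bit weights alone would not fit in 8GB, while a 4-bit version leaves several GB of headroom, which is roughly consistent with the claim above.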
(Not even /s - while the developers of LLM applications may have 64GB RAM in their laptops or desktops, the less-technical early adopters of LLMs running locally are likely to be power users with lower-powered laptops, much more stringent RAM limits, and numerous line-of-business applications and browser tabs contending for that RAM. Causing those applications to be swapped onto disk will almost certainly result in a degraded overall experience that could easily be blamed on the LLM application itself.)