by Terretta
742 days ago
> Doing everything on-device would result in a horrible user experience. They might as well not participate in this generative AI rush at all if they hoped to keep it on-device.

On the contrary, over the last few months I've been shocked at how plausibly "on device" on a MacBook Pro or Mac Studio competes with last year's early GPT-4, leveraging Llama 3 70b or Qwen2 72b.

There are surprisingly few things you "need" 128GB of so-called "unified RAM" for, but with M-series processors and their memory bandwidth, this is a use case that shines.

From this thread covering performance of llama.cpp on Apple Silicon M-series:

https://github.com/ggerganov/llama.cpp/discussions/4167

"Buy as much memory as you can afford would be my bottom line!"
And whilst the LLMs running locally are cool, they're still pretty damn slow compared to ChatGPT or Meta's LLM.