by t0lo
524 days ago
Is anyone blown away by how fast we got to running something this powerful locally? I know it's easy to get burnt out on LLMs, but this is pretty incredible. I genuinely think we're only 2 years away from fully custom, local voice-to-voice LLM assistants that grow with you like JOI in BR2049, and it's going to change how we think about being human and being social, and how we grow up.
|
I've been experimenting with running local LLMs for nearly two years now, ever since the first LLaMA release back in March 2023.
About six months ago I had mostly lost interest in them. They were fun to play around with but the quality difference between the ones I could run on my MacBook and the ones I could access via an online API felt insurmountable.
This completely changed in the second half of 2024. The models I can run locally took a leap in quality, and they feel genuinely GPT-4 class now.
They're not as good as the best hosted models (GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet) but they're definitely good enough to be extremely useful.
This started with the Qwen 2 and 2.5 series, but I also rate Llama 3.3 70B and now Phi-4 as GPT-4 class models that run on my laptop.
I wrote more about this here: https://simonwillison.net/2024/Dec/31/llms-in-2024/#some-of-...