It highly depends on the task. For math and coding, sure. But for knowledge tasks
GPT-4 is way better than even SOTA ~100B models. For my knowledge test cases the lines get blurry at >400B.
There seems to be a mass delusion about how capable SOTA models actually are. That's my only explanation for the gap between how poorly I find them performing on basic knowledge tasks and how others describe their prowess.
I understand you to be implying that I shouldn't trust my perception that there's a meaningful difference in how much different models hallucinate. I will take that under advisement, but I am still interested in the answer to my original question.
I am eagerly awaiting being able to run a strong local model. I'd hand Apple $5k right now for a Claude in a box. I know the cost might not be there now, just saying that is around my ideal price point.
$10k might even be worth it - but I'm assuming that the more expensive it is, the beefier it is too, which also means more electricity... and I already run ~6 computers/servers in my house. If a power surge happens I'm going to go live in the woods lol.
I would do the same but my issue is that the models are changing so fast, so I don't want to be left out of the next model cuz it only runs on an even newer GPU or something like that.
But maybe my limited understanding is thinking of this wrong.
I've run the latest local models over the last year, including the recent Qwen 3.6 30B A3B, on a 9-year-old GTX 1080 and 32 GB of RAM I have lying around[0]. If I can do that, I don't think hardware will be a problem for you in the near term. The only updates I've needed were to llama.cpp when a new class of model was released.
[0]: In my case, I want to see how local models perform on limited hardware, sacrificing context size and intelligence compared to SOTA models, so I have to really limit my expectations.
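For a rough sense of why that fits, here's a back-of-envelope sketch. The ~4.5 bits per weight is an assumed 4-bit-style quantization, not a measured figure, and I'm reading "30B A3B" as roughly 30B total parameters with about 3B active per token. It only counts weights, so KV cache and runtime overhead come on top.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


VRAM_GB = 8   # GTX 1080
RAM_GB = 32   # system memory

BITS = 4.5    # assumption: typical 4-bit-style quantization

total = weights_gb(30, BITS)   # all weights have to live somewhere
active = weights_gb(3, BITS)   # weights actually touched per token (assumed ~3B)
fits = total < VRAM_GB + RAM_GB

print(f"total weights:  ~{total:.1f} GB")
print(f"active weights: ~{active:.1f} GB per token")
print(f"fits in {VRAM_GB} GB VRAM + {RAM_GB} GB RAM: {fits}")
```

The small active set is the part that matters on old hardware: most of the weights can sit in system RAM while only a fraction is read for each token.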
> I would do the same but my issue is that the models are changing so fast, so I don't want to be left out of the next model cuz it only runs on an even newer GPU or something like that.
I think the same, and it's why I stopped caring about running llama/etc at home last year. That, coupled with the models being dumb by comparison to SOTA, really makes me fine with waiting.
But in a year or two it's going to be difficult to resist at home, assuming the pace of improvement holds.
I do, though to my understanding they're not as bulletproof as you'd hope. Hell, I have one at the house level too, since I have an EV sitting behind that as well.