by nl
1 hour ago
> I am very excited for local LLMs. I think we may have GPT 5.5-xhigh level of performance for under 2000 EUR

We are maybe 10 years off that.

RAM prices are going to continue to increase for the next 2 years at least. Even putting that aside, it currently costs around 40,000-70,000 EUR to run this with an FP8 quantization (which you need to get close to maximum performance). To actually get GPT 5.5-xhigh performance in the real world you need more headroom to support things like subagents (which will fill up your KV cache).

I like local models, but realism is important. The sweet spot for the next 3 years will continue to be ~35B MoE models. They might match GPT 5.5-xhigh for chat-style problems, but not for coding.
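For a rough sense of why subagents eat into headroom, here is a back-of-envelope sketch of the memory arithmetic. Every number in it is a made-up placeholder (parameter count, layer count, KV heads, head size, context length, number of concurrent agents), not the spec of any real model, and the KV formula assumes plain grouped-query attention with a 16-bit cache. Architectures that compress the cache will come out lower.

```python
# Back-of-envelope memory estimate. All model numbers below are
# hypothetical placeholders, not the specs of any real model.

GB = 1e9  # decimal gigabytes


def weight_bytes(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone (FP8 is 1 byte per parameter)."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, n_sequences: int,
                   bytes_per_value: int = 2) -> int:
    """KV cache size for standard (grouped-query) attention.

    The factor of 2 covers keys plus values. Each concurrent sequence
    (for example the main agent plus each subagent) needs its own cache.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens * n_sequences


if __name__ == "__main__":
    # Hypothetical large model: 1e12 parameters stored as FP8.
    weights = weight_bytes(n_params=1e12, bytes_per_param=1)

    # Hypothetical attention shape, one main agent plus three subagents,
    # each holding a 128k-token context.
    cache = kv_cache_bytes(n_layers=60, n_kv_heads=8, head_dim=128,
                           context_tokens=128_000, n_sequences=4)

    print(f"weights:  {weights / GB:.1f} GB")
    print(f"KV cache: {cache / GB:.1f} GB")
    print(f"total:    {(weights + cache) / GB:.1f} GB")
```

The point is only the shape of the calculation: the weights are a fixed cost, while the cache grows linearly with both context length and the number of agents running at once, so it has to be budgeted on top of whatever the weights need.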