by khalic 4 days ago
Yep, any American closed model is now a de facto existential risk for any company relying on it.

The latest open models are so good that they're worth the 6-8 month delay in capabilities. At least for coding.

2 comments

The problem is there’s a real wall on the VRAM side. While fused main memory is OK, the inference speeds on larger models are impractical. With VRAM on a GPU, the machine class, power requirements, GPU costs, and other factors put it out of most people’s reach. Cloud GPUs require a second job to keep available and hot. What closed providers offer is packing and scale advantages as well as infrastructure. The scaling laws here aren’t the same as Moore’s law - in fact they predict more required hardware and more scale over time. Moore’s law isn’t keeping up with expanded needs, and the ability to fab and produce at scale the specific things that weren’t needed a few years ago is lagging. So it’s not a 6-8 month lag; it’s a lag induced by hardware scarcity, and it will keep growing until something fundamentally changes with matmul.
I will use the best available while it is available. Codex as it was 8 months ago would be intolerable today.
I believe we have somewhat plateaued, and each percentage point gained seems to take exponentially more effort.

Fable was around 10x GPT5 pricing and 100x Chinese models' pricing. Was it really 100x better? I don't think so.

If you want a personal story, I just solved a complicated coding problem with Kimi 2.7 that GPT 5.4 failed on.

5.5 is far ahead of 5.4. I don't see any plateau (from using these 16+ hours a day, 7 days a week).
On a personal level, ditto. But at a business level, this kind of uncertainty will kill you. You need to be able to plan ahead.