

	by jmward01 59 days ago
	Haiku not getting an update is becoming telling. I suspect we are reaching a point where the low end models are cannibalizing high end and that isn't going to stop. How will these companies make money in a few years when even the smallest models are amazing?

4 comments

blixt 59 days ago

Isn't it pretty common for the smaller models to release a little while after the bigger ones, for all the big model providers?

jmward01 59 days ago

The last update for Haiku was in October, or in startup land, 10 years ago.

mvkel 59 days ago

It seems to be a rule that older models are more expensive than newer ones. The low end models have higher $CPT and worse output. I wonder if the move is to just have one model and quantize if you hit compute constraints

deaux 59 days ago

> It seems to be a rule that older models are more expensive than newer ones.

It isn't. Gemini has gotten more expensive with each release. Anthropic has stayed pretty similar over time, no? When is the last time OpenAI dropped API prices? OpenAI started very high because they were the first, so there was a ton of low hanging fruit and there was much room to drop.

mvkel 59 days ago

I'm talking about gross margins, not revenue.

It's well known that GPT-4 is much more expensive to operate than the GPT-5 family.

Of course they won't drop the prices; it's pure profit if they make models more efficient.

qingcharles 59 days ago

Google is putting a lot of research into small models. Most of my AI budget is now going to small models because I am doing lots of tiny tasks that the small models do great with. I would think a decent chunk of Goog's API revenue probably comes from their small models.

dkhenry 59 days ago

The Gemma models are at this point. A 31B model that can fit on a consumer card is as good as Sonnet 4.5. I haven't put it through as much on the coding front or tool calling as I have the Claude or GPT models, but for text processing it is on par with the frontier models.

make3 59 days ago

Absolutely not on par, you're smoking.

dkhenry 59 days ago

You make a compelling argument, but thankfully I have data to back up my anecdotal experience.

This comparison shows them neck and neck https://benchlm.ai/compare/claude-sonnet-4-5-vs-gemma-4-31b

As does this one https://llm-stats.com/models/compare/claude-sonnet-4-6-vs-ge...

And the pelican benchmark even shows them pretty close https://simonwillison.net/2026/Apr/2/gemma-4/ https://simonwillison.net/2025/Sep/29/claude-sonnet-4-5/

Also this isn't a fringe statement, you can see most people who have done an evaluation agree with me

jmward01 59 days ago

One area I find hard to get around is context length. Everything self-hosted is so limited on length that it is marginal to use. Additionally, I think tools like Claude Code are clearly in the training mix for Anthropic's models, so they seem to get a boost over other models pushed into that environment. That being said, open source and local inference is -really- good and only going to get better. There is no doubt that the current frontier business model is not sustainable.

make3 59 days ago

If you look at the details of the benchmark numbers you shared, Sonnet 4.5 crushes Gemma 4. Somehow the first link doesn't run Sonnet on the multimodal benchmark, which is why the top score looks close; it beats Gemma on every benchmark they actually ran. The arena in the second link shows that it actually destroys Gemma 4 as well. Not close.

dkhenry 58 days ago

The second one is Sonnet 4.6, not 4.5. If you change it to 4.5, Gemma 4 actually beats it.

lostmsu 59 days ago

Just to be clear, did you notice the parent said 4.5?

cmorgan31 59 days ago

They are also on par in a lot of classification tasks. I did have to actually use gemma4 and fine-tune it a bit, but that is part of the value add.

make3 59 days ago

I did, what's your point?
