	by nicoburns 82 days ago
	The cost factors on the new models compared to the old models.

3 comments

jeremyjh 82 days ago

Qwen3.6 9B is as good as GPT-4o and runs on my M2 MacBook Air. Models are getting stronger and less costly at the same time, but these are somewhat separate branches of research. Frontier labs are spending more because they are still getting marginal returns and there is more capacity to spend than there was a year ago.

gertop 82 days ago

Qwen 3.6 9B doesn't exist.

If you meant 3.5 9B and you truly believe it's as good as 4o then I can only assume you have a very basic use case.

jeremyjh 82 days ago

You are right, I was mistaken about the version. I evaluated it on general chat-assistant prompts plucked from my history across a range of topics, but did not use it for coding. There was never a time when I thought 4o was “good enough” for agentic coding.

bdelmas 82 days ago

You are conflating cost and progress. Rising costs do not, by themselves, mean that progress is slowing down.

nicoburns 82 days ago

They are intrinsically linked beyond a certain point. If we're making progress but costs are spiraling exponentially then it stands to reason that we will soon reach a point where we can no longer afford the increasing costs and thus progress will slow.

(barring some breakthrough that reduces costs, which of course may happen, but recent model improvements are not strong evidence for one)
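
The affordability argument above can be made concrete with a small sketch. The function name, the growth factor, and the budget figures are all hypothetical; nothing in the thread supplies real numbers. It only counts how many years a cost that multiplies by a fixed factor each year takes to exceed a fixed budget.

```python
def years_until_unaffordable(initial_cost, budget, growth_per_year):
    """Count whole years until a compounding cost first exceeds a fixed budget.

    All inputs are illustrative assumptions, not figures from the discussion.
    """
    if initial_cost <= 0:
        raise ValueError("initial_cost must be positive")
    if growth_per_year <= 1:
        raise ValueError("growth_per_year must be greater than 1")

    years = 0
    cost = initial_cost
    # Multiply the cost once per year until it no longer fits in the budget.
    while cost <= budget:
        cost *= growth_per_year
        years += 1
    return years


# Hypothetical example: cost triples yearly, budget is 100x today's cost.
print(years_until_unaffordable(1, 100, 3))
```

Under these assumed numbers, even a budget 100 times today's cost is exhausted within a handful of years, which is the sense in which exponential cost growth and the pace of progress become linked.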

aspenmartin 82 days ago

The cost of a specific level of performance decreases 10x per year. This has been a pretty consistent property for a while now.
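
As a rough sketch of what a 10x yearly decline implies, assuming the rate holds constant (the function name and the starting cost are hypothetical, chosen only for illustration):

```python
def cost_at_fixed_performance(initial_cost, years, yearly_reduction=10.0):
    """Cost of reaching the same performance level after `years`,
    assuming it falls by a constant factor each year (10x by default)."""
    if yearly_reduction <= 0:
        raise ValueError("yearly_reduction must be positive")
    return initial_cost / yearly_reduction ** years


# Hypothetical starting point: 100 cost units today.
for year in range(4):
    print(year, cost_at_fixed_performance(100.0, year))
```

At that assumed rate, a capability that costs 100 units today would cost 1 unit two years later.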

butlike 79 days ago

I guess within the domain of AI, a pertinent question would be: "do I want to use anything but the best?" In my eyes, the errors older models make are directly analogous to being stupider.

aspenmartin 79 days ago

It depends. Many tasks in various pipelines have a reasonable Pareto frontier, with diminishing returns beyond a certain level of performance. You may also just have a tight budget constraint (YouTube computing ASR subtitles, for example, is not going to use the best ASR models because that would be too expensive). For myself, with a coding agent, I’m going to get the best thing I can afford.
