| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by golol 377 days ago
	The first figure in the paper with Accuracy vs Complexity makes the whole point moot. The authors find that the performance of Claude 3.7 collapses around complexity 3 while Claude 3.7 thinking collapsed around complexity 7. A massive improvement in the complexity horizon that can be dealt with. It's real, it's quantitative, so what's the point of philosophical atguments about whether it is truly "reasoning" or not. All LLMs have various horizons, a context horizon/length, a complexity horizon etc. Reasoning pushes this out further, but not to some infinite algorithmically perfect recurrent reasoning effect. But I bet humans pretty much just have a complexity horizon of 12 or 20 or whatever and bigger models trained on bigger data with bigger reasoning posttraining and better distillation will push the horizons further and further.

1 comments

threeseed 377 days ago

> bigger models trained on bigger data with bigger reasoning posttraining and better distillation will push the horizons further and further

There is no evidence this is the case.

We could be in an era of diminishing returns where bigger models do not yield substantial improvements in quality but instead they become faster, cheaper and more resource efficient.

link

golol 377 days ago

I would claim that o1 -> o3 is evidence of exactly that, and supposedly in half a year we will have even better reasoning models (further complexity horizon), so what could that be besides what I am describing.

link

threeseed 377 days ago

Is there some breakthrough in reasoning between o1 and o3 that we are all missing.

And no one cares what we may have in the future. OpenAI etc already have an issue with credibility.

link

golol 377 days ago

No breakthrough, it's just better in some quantitatively measurable way.

link

energy123 377 days ago

The empirical scaling laws are evidence. They're not deductive evidence, but still evidence.

The scaling laws themselves advertise diminishing returns, something like a natural log. This was never debated by AI optimists, so it's odd to suggest otherwise as if it contradicts anything the AI optimists have been saying.

The scaling laws are kind of a worst case scenario, anyway. They assume no paradigm shift in methodology. As we saw when the test-time scaling law was discovered, you can't bet on stasis here.

link