by Jensson 542 days ago

> What they’ve proven here is that it can be done.

No, they haven't. These results do not generalize, as mentioned in the article:

"Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute"

Meaning they haven't solved AGI, and the tasks themselves do not represent programming well; these models do not perform that well on engineering benchmarks.

whynotminot 542 days ago

Sure, AGI hasn’t been solved today.

But what they’ve done is show that progress isn’t slowing down. In fact, it looks like things are accelerating.

So sure, we’ll be splitting hairs for a while about when we reach AGI. But the point is that just yesterday people were still talking about a plateau.

peepeepoopoo97 542 days ago

About 10,000 times the cost for twice the performance sure looks like progress is slowing to me.

whynotminot 542 days ago

Just to be clear: is your position that the cost of inference for o3 will not go down over time (which would be the first time that has happened for any of these models)?

peepeepoopoo97 542 days ago

Even if compute costs drop by 10X a year (which seems like a gross overestimate IMO), you're still looking at 1000X the cost for a 2X annual performance gain. Costs outpacing progress is the very definition of diminishing returns.
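
For anyone who wants to check the arithmetic, here is a rough sketch. It takes the 10,000X starting cost ratio and the 10X-per-year cost drop as given from the comments above; neither is a measured figure. The function name `relative_cost_after` and its parameters are made up for illustration.

```python
def relative_cost_after(years, initial_cost_ratio=10_000.0, yearly_cost_drop=10.0):
    """Cost relative to the baseline after `years` of steady price drops.

    Both defaults are the thread's assumed figures, not measurements.
    """
    return initial_cost_ratio / (yearly_cost_drop ** years)


if __name__ == "__main__":
    # Print the remaining cost multiple for each of the first few years.
    for year in range(5):
        print(f"year {year}: {relative_cost_after(year):,.0f}X baseline cost")
```

Under those assumptions the cost multiple is 1000X after one year and only reaches parity with the baseline after four years.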

whynotminot 542 days ago

From their charts, o3 mini outperforms o1 using less energy. I don’t see the diminishing returns you’re talking about. Improvement outpacing cost. By your logic, perhaps the very definition of progress?

You can also use the full o3 model, consume insane power, and get insane results. Sure, it will probably take longer to drive down those costs.

You’re welcome to bet against them succeeding at that. I won’t be.
