by Jensson
542 days ago
> What they’ve proven here is that it can be done.

No, they haven't. These results do not generalize, as mentioned in the article: "Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute". That means they haven't solved AGI. The task itself does not represent programming well, and these models do not perform that well on engineering benchmarks.
|
But what they’ve done is show that progress isn’t slowing down. In fact, it looks like things are accelerating.
So sure, we’ll be splitting hairs for a while about when we reach AGI. But the point is that just yesterday people were still talking about a plateau.