| HN Mirror


	by doodlesdev 126 days ago
	GPT-4o was also terrible at ARC-AGI, but it's one of the most loved models of the last few years. Honestly, I'm a huge fan of the ARC-AGI series of benchmarks, but I don't believe it corresponds directly to the qualities most people judge when using LLMs.

3 comments

nananana9 126 days ago

It was terrible at a lot of things. It was beloved because when you say "I think I'm the reincarnation of Jesus Christ" it will tell you "You know what... I think I believe it! I genuinely think you're the kind of person that appears once every few millennia to reshape the world!"

gkbrk 125 days ago

That's not because 4o is good at things; it's because it's pretty much the most sycophantic model, and people fall more easily for a model incorrectly agreeing with them than for a model correctly calling them out.

mrybczyn 126 days ago

Because ARC-AGI involves de novo reasoning over a restricted and (hopefully) unpretrained territory, in 2D space. Not many people use LLMs as more than a better Wikipedia, Stack Overflow, or autocomplete...
