| HN Mirror



	by adroniser 727 days ago
	I don't see how the point about the typical human is relevant. Either you can reason or you can't; the ARC test is supposed to be an objective way to measure this. Clearly a vanilla LLM currently cannot do this, and somehow an expert crafting a super-specific prompt is supposed to be impressive.

1 comment

eigenvalue 727 days ago

The point is that if you have some test of whether an AI is intelligent that the vast majority of living humans would fail, or do worse on than GPT-4o (let alone future LLMs), then it’s not a very persuasive argument.
