| HN Mirror

	by egeozcan 22 days ago
	I think there are so many variables, from harnesses to tasks, that it's very hard to put the models in a pecking order unless one beats another in virtually every task (as with Opus vs. DeepSeek). But all in all, I don't think we disagree.