	by campbel 842 days ago
	Opus got it correct for me. It seems the models give both correct and incorrect responses on this. I think testing one question a single time really isn't worth much as an accurate representation of capability.