| HN Mirror



	by wongarsu 71 days ago
	Most of the 'coding benchmarks' are deeply flawed too. This one at least makes it explicit. And so far, the ability to make SVGs of $animal on $vehicle seems to correlate surprisingly well with model 'intelligence'.