| HN Mirror



	by hopinhopout 70 days ago
	LLMs are really causing serious brainrot if HTML pelican drawings are a usage basis for your programming projects. Even all these shitty benchmarks don't say or mean anything if companies secretly tweak them on the go.

1 comment

Most of the 'coding benchmarks' are deeply flawed too. This one at least makes it explicit.

And so far, the ability to make SVGs of $animal on $vehicle seems to correlate surprisingly well with model 'intelligence'.